Google is very rude.
Nov. 13th, 2003 11:15 am
There is such a thing as too much of a good thing. In the last 13 days, Google's spider has visited my site (Catlove) 473 times and used 41.64 MB of bandwidth. The next most frequent spider came 29 times and used 1.11 MB of bandwidth.
As far as I can tell, robots.txt will only keep a spider out or let it in. I thought I recalled that you could restrict them to like "once a month" or something like that, but I can't find any reference to that now...
Can you tell I'm really bored?
no subject
Date: 2003-11-13 11:47 am (UTC)
How much of your site do you want Google to archive? Just the main page? Pages but not images? Etc.? You can be very specific.
The robots exclusion protocol is described here; it works for Google and other search engines:
http://www.robotstxt.org/wc/exclusion.html
Google-specific info is here: http://www.google.com/webmasters/faq.html
If you need further help beyond that, ask =)
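To make "pages but not images" concrete, here is a minimal sketch that checks a set of robots.txt rules with Python's standard `urllib.robotparser`. The rules themselves, the `/images/` directory, and the helper name `allowed` are assumptions made up for illustration, not anything taken from the actual site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: Googlebot may crawl pages but nothing under /images/;
# every other robot is allowed everywhere (an empty Disallow means "allow all").
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /images/

User-agent: *
Disallow:
"""


def allowed(agent: str, path: str, rules: str = ROBOTS_TXT) -> bool:
    """Return True if `agent` may fetch `path` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    checks = [
        ("Googlebot", "/index.html"),
        ("Googlebot", "/images/cat.jpg"),
        ("OtherBot", "/images/cat.jpg"),
    ]
    for agent, path in checks:
        print(agent, path, allowed(agent, path))
```

Note that rules like these only say where a robot may go. They do not say how often it may come back.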
no subject
Date: 2003-11-13 12:57 pm (UTC)
"How much of your site do you want Google to archive?"
I'm fine with them crawling it all, but 473 visits in 13 days? That's just seriously excessive...
no subject
Date: 2003-11-13 01:40 pm (UTC)
no subject
Date: 2003-11-13 01:47 pm (UTC)
no subject
Date: 2003-11-13 05:28 pm (UTC)
If you do decide to move the images to images/ you can provide a list of redirects from /whatever.jpg to /images/whatever.jpg. Normal users will follow the image redirect automatically and robots would be blocked.
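A rough sketch of building that redirect list in Python, assuming the images currently sit in the site root. The extension set and the function name `image_redirects` are hypothetical. How each pair becomes a real redirect depends on the web server, so that part is left out.

```python
from pathlib import PurePosixPath

# Assumed set of image extensions; adjust to whatever the site actually uses.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".gif", ".png"}


def image_redirects(paths, target_dir="/images"):
    """Map each root-level image URL path to its new home under target_dir."""
    pairs = []
    for raw in paths:
        path = PurePosixPath(raw)
        is_image = path.suffix.lower() in IMAGE_SUFFIXES
        in_root = str(path.parent) == "/"
        if is_image and in_root:
            pairs.append((str(path), f"{target_dir}/{path.name}"))
    return pairs


if __name__ == "__main__":
    urls = ["/whatever.jpg", "/index.html", "/images/already-moved.jpg"]
    for old, new in image_redirects(urls):
        print(f"{old} -> {new}")
```

Only root-level images are picked up, so pages and anything already under /images/ are left alone.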
A cheap trick is to make a symlink "images" to "." - then both urls will work. This doesn't keep people out but it gives you time to change / to /images everywhere without breaking anything (well, copying all the images to both places works too, but symlinks are so much geekier)
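The symlink trick could look something like this in Python, assuming a Unix-like filesystem and a web server that follows symlinks. The function name `link_images_to_root` and the temporary directory standing in for the site root are made up for the demo.

```python
import os
import tempfile


def link_images_to_root(site_root: str, name: str = "images") -> str:
    """Create site_root/<name> as a symlink to "." so both URL forms resolve."""
    link_path = os.path.join(site_root, name)
    # lexists() is also true for a dangling link, so we never clobber anything.
    if not os.path.lexists(link_path):
        os.symlink(".", link_path)
    return link_path


if __name__ == "__main__":
    # A throwaway directory stands in for the real site root.
    with tempfile.TemporaryDirectory() as root:
        open(os.path.join(root, "whatever.jpg"), "wb").close()
        link_images_to_root(root)
        # The same file is now reachable through both paths.
        print(os.path.exists(os.path.join(root, "whatever.jpg")))
        print(os.path.exists(os.path.join(root, "images", "whatever.jpg")))
```

Because the link points back at its own directory, images/images/whatever.jpg resolves too. That is harmless for a transition period, but it is one more reason not to leave the link in place forever.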
no subject
Date: 2003-11-13 06:05 pm (UTC)
You know, I bet you can. Thanks for the idea and the other info as well!
no subject
Date: 2003-11-13 06:31 pm (UTC)