Discuss what's fucking going on, and which programs are best and worst. One-time "program" announcements from "established" webmasters are allowed. |
|
#1 |
Too lazy to set a custom title
Join Date: Mar 2002
Location: Australia
Posts: 17,393
|
Googlebot... how arrogant can they be?
Google ignores crawl-delay in robots.txt. Instead, they force you to register your site in Webmaster Tools so you can set a custom crawl rate. That setting is then RESET to the default after 90 days, so at that point you have to log in and change it again! How fucking arrogant is that?
Other search engines respect crawl-delay. Imagine if they all wanted us to create accounts and log in every 90 days to stop their bots hammering our servers? |
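For readers who have not used the directive, here is a minimal sketch of how a crawler that does honour crawl-delay might read it, using Python's standard `urllib.robotparser`. The robots.txt content, the 10-second value, and the `ExampleBot` name are made up for illustration; nothing here changes how Googlebot itself behaves.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt asking every bot to wait 10 seconds between fetches.
ROBOTS_TXT = """\
User-agent: *
Crawl-delay: 10
Disallow: /private/
"""


def crawl_delay_for(robots_txt: str, user_agent: str):
    """Return the Crawl-delay (in seconds) that applies to user_agent, or None."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.crawl_delay(user_agent)


delay = crawl_delay_for(ROBOTS_TXT, "ExampleBot")
print(f"ExampleBot should wait {delay} seconds between requests")
```

A polite crawler would sleep for that many seconds between requests; a bot that ignores the directive simply never looks at it.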
#2 |
Registered User
Industry Role:
Join Date: Feb 2006
Posts: 22,511
|
yeah that damn google bot racking up bandwidth.
|
#3 |
So Fucking What
Industry Role:
Join Date: Jul 2006
Posts: 17,189
|
__________________
best host: Webair | best sponsor: Kink | best coder: 688218966 | Go Fuck Yourself ![]() |
#4 |
Too lazy to set a custom title
Join Date: Mar 2002
Location: Australia
Posts: 17,393
|
|
#5 |
Too lazy to set a custom title
Join Date: Mar 2002
Location: Australia
Posts: 17,393
|
And the point isn't so much what they're doing specifically to my site, more that they're arrogant enough to ignore (de facto?) robots.txt settings that every other major search engine bot respects. The webmaster dashboard robots.txt checker even helpfully points out that each of the crawl-delay lines in my robots.txt is ignored!
|
#6 |
So Fucking Banned
Join Date: Jun 2004
Posts: 539
|
So Googlebot is ignoring crawl-delay? I set delays because I update my sites weekly, and I don't want to see Google every day on a site that only changes once a week. If crawl-delay doesn't work, that's really sick, because six days out of the week Google sees my site as static, not updated. Oh snap.
|
#7 | |
Too lazy to set a custom title
Join Date: Mar 2002
Location: Australia
Posts: 17,393
|
Quote:
http://www.google.com/support/webmas...n&answer=48620
My site has 200 million pages, so technically Googlebot isn't fetching fast enough: at 120k fetches per day it would take about 4 1/2 years to index everything. At this point the benefit of indexing 100% of the site (or at least as much as it's trying to) isn't worth the load it's placing on the server. |
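The 4 1/2 year figure can be checked with a couple of lines. This is just the arithmetic from the post (200 million pages at 120k fetches per day); the function name is made up for illustration.

```python
def days_to_crawl(total_pages: int, fetches_per_day: int) -> float:
    """Days needed to fetch every page once at a constant daily rate."""
    return total_pages / fetches_per_day


days = days_to_crawl(200_000_000, 120_000)
years = days / 365.25
print(f"{days:.0f} days, about {years:.1f} years")
```

This assumes the rate stays constant and no page is ever fetched twice, so a real crawl of the whole site would take at least that long.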
|
#8 | |
So Fucking Banned
Join Date: Jun 2004
Posts: 539
|
Quote:
|
|
#9 |
<&(©¿©)&>
Industry Role:
Join Date: Jul 2002
Location: Chicago
Posts: 47,882
|
User-agent: Googlebot
Disallow: /

problem solved
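A quick way to see what those two lines do is to feed them to Python's standard `urllib.robotparser`. This is only a sketch of how a compliant parser reads the rule; the `example.com` URL and the `OtherBot` name are placeholders.

```python
from urllib.robotparser import RobotFileParser

# The two-line rule from the post above.
BLOCK_GOOGLEBOT = """\
User-agent: Googlebot
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


url = "http://example.com/some/page.html"
print("Googlebot allowed:", is_allowed(BLOCK_GOOGLEBOT, "Googlebot", url))
print("OtherBot allowed:", is_allowed(BLOCK_GOOGLEBOT, "OtherBot", url))
```

The rule shuts Googlebot out of the whole site and leaves other bots alone, so it stops the crawl load at the cost of dropping out of Google entirely.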
__________________
Custom Software Development, email: woj#at#wojfun#.#com to discuss details or skype: wojl2000 or gchat: wojfun or telegram: wojl2000 Affiliate program tools: Hosted Galleries Manager Banner Manager Video Manager ![]() Wordpress Affiliate Plugin Pic/Movie of the Day Fansign Generator Zip Manager |
#10 |
Too lazy to set a custom title
Join Date: Mar 2002
Location: Australia
Posts: 17,393
|
|
#11 |
Confirmed User
Industry Role:
Join Date: Aug 2006
Posts: 5,594
|
|
#12 |
Too lazy to set a custom title
Industry Role:
Join Date: Jul 2005
Posts: 10,057
|
What kind of site has 200,000,000 pages?
That's a lot of pages. |
#13 |
Confirmed User
Industry Role:
Join Date: Mar 2006
Location: Australia
Posts: 3,800
|
I wish everyone would just do this!
|
#14 |
Hmm
Industry Role:
Join Date: Sep 2005
Location: On an endless road around the world for rock and roll.
Posts: 12,642
|
|
#15 |
ICQ:649699063
Industry Role:
Join Date: Mar 2003
Posts: 27,763
|
They can be very arrogant.
__________________
Send me an email: [email protected] |
#16 |
So Fucking Banned
Industry Role:
Join Date: Oct 2009
Location: The Whitehouse
Posts: 17,349
|
Annoying, isn't it? I've had Googlebot hit my servers up to 15 times per second, for hours at a time. Dynamic pages really make Googlebot go nuts.
|
#17 |
there's no $$$ in porn
Industry Role:
Join Date: Jul 2005
Location: icq: 195./568.-230 (btw: not getting offline msgs)
Posts: 33,063
|
They're evil, plain and simple....
|
#18 |
Confirmed User
Join Date: May 2008
Posts: 232
|
15 times a second?! Really annoying! Well, it's a smart idea to disallow Googlebot. Thanks for the advice, guys.
__________________
Adult Reviews by the Stroke King. |
#19 |
Webmaster Extraordinaire
Industry Role:
Join Date: Jul 2002
Location: A beautiful beach...
Posts: 10,748
|
woj's solution is good if you want to keep googlebot off your site completely. But are you sure you want to do that?
|
#20 |
So Fucking Lame
Industry Role:
Join Date: Jun 2009
Location: St. Petersburg, FL
Posts: 12,156
|
|
#21 |
Sick Fuck
Industry Role:
Join Date: Feb 2004
Location: www
Posts: 9,491
|
Sue them.
|
#22 |
Too lazy to set a custom title
Join Date: Mar 2002
Location: Australia
Posts: 17,393
|
|
#23 |
Unregistered Abuser
Industry Role:
Join Date: Oct 2007
Posts: 15,547
|
|
#24 |
Too lazy to set a custom title
Industry Role:
Join Date: Jul 2005
Posts: 10,057
|
What kind of site has 200,000,000 pages?
|
#25 |
Too lazy to set a custom title
Join Date: Mar 2002
Location: Australia
Posts: 17,393
|
|
#26 | |
Confirmed User
Join Date: Oct 2002
Posts: 3,745
|
Quote:
pieces of fake SE spam crap? If you tried to spam the crap out of Google by creating 200 million bogus pages, I'd say you got what you deserved, and really what you asked for. If you pretended to have 200 million pages so that Google would spider you 200 million times, that was your decision. You can't blame Google if you chose to create fake stuff for them to spider. Note the repeated use of "IF" - I'm asking IF that's what you did.
__________________
For historical display only. This information is not current: support@bettercgi.com ICQ 7208627 Strongbox - The next generation in site security Throttlebox - The next generation in bandwidth control Clonebox - Backup and disaster recovery on steroids |
|
#27 |
Confirmed User
Industry Role:
Join Date: Apr 2002
Location: Los Angeles
Posts: 6,986
|
Yeah, I was wondering when someone would post this. Bottom line: if it's causing you more problems than it's worth, just block it. If they are hitting you that hard, you should be getting some good traffic because of it. More traffic = more money, so just upgrade the servers. |
#28 |
Confirmed User
Industry Role:
Join Date: Feb 2003
Location: USA
Posts: 855
|
I return a 503 page and they seem to respect that. Then I set it up so they can crawl during my off-peak hours, which they seem to do.
It should work, and I know this is documented somewhere in Google's FAQ. I just can't seem to find the link right now.
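The post doesn't say how the 503 is served, so the following is only one possible sketch using Python's standard `http.server`. The peak window (08:00 to 22:00), the one-hour `Retry-After` value, and all the names are assumptions for illustration, and matching on the User-Agent string is a simplification since that header can be spoofed.

```python
from datetime import datetime, time
from http.server import BaseHTTPRequestHandler, HTTPServer

# Assumed peak window and retry hint; tune these to your own traffic.
PEAK_START = time(8, 0)
PEAK_END = time(22, 0)
RETRY_AFTER_SECONDS = 3600


def should_throttle(user_agent: str, now: time) -> bool:
    """True if the request looks like Googlebot and arrives during peak hours."""
    is_googlebot = "googlebot" in user_agent.lower()
    in_peak = PEAK_START <= now < PEAK_END
    return is_googlebot and in_peak


class ThrottlingHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        user_agent = self.headers.get("User-Agent", "")
        if should_throttle(user_agent, datetime.now().time()):
            # 503 plus Retry-After tells the client to come back later.
            self.send_response(503)
            self.send_header("Retry-After", str(RETRY_AFTER_SECONDS))
            self.end_headers()
            return
        body = b"ok\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def main() -> None:
    """Serve on localhost:8000 until interrupted."""
    HTTPServer(("127.0.0.1", 8000), ThrottlingHandler).serve_forever()
```

Call `main()` to try it locally. On a real site the same check would more likely live in the web server or load balancer configuration than in application code.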
__________________
Obama Said: "We can absorb a terrorist attack." |
#29 | |
Too lazy to set a custom title
Join Date: Mar 2002
Location: Australia
Posts: 17,393
|
Quote:
raymor: not useless spam, it's all genuine profiling of... domains. |
|
#30 |
Too lazy to set a custom title
Join Date: Mar 2002
Location: Australia
Posts: 17,393
|
February 22, 2010
New crawl rate: Custom rate
1.000 requests per second
1.000 seconds per request
This new crawl rate will stay in effect for 90 days.

Funny, 2 weeks later Googlebot is still requesting 120k+ pages per day, which is roughly 140% or more of the rate set above (1 request per second works out to 86,400 per day). Their webmaster tools system also sent me a notification encouraging me to increase the rate so they can fetch more pages. Looks like their bot is doing it anyway. |
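To compare the custom rate with what the bot is actually doing, convert requests per second into a daily total. A minimal sketch using the numbers from the post (1 request per second configured, 120k fetches per day observed); the function names are made up for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def allowed_per_day(requests_per_second: float) -> float:
    """Daily fetch budget implied by a constant per-second crawl rate."""
    return requests_per_second * SECONDS_PER_DAY


def overshoot_ratio(observed_per_day: float, requests_per_second: float) -> float:
    """Observed daily fetches divided by the configured daily budget."""
    return observed_per_day / allowed_per_day(requests_per_second)


ratio = overshoot_ratio(120_000, 1.0)
print(f"budget: {allowed_per_day(1.0):.0f}/day, observed/allowed = {ratio:.2f}")
```

Since the post says "120k+", the real ratio is at least this figure.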