Welcome to the GoFuckYourself.com - Adult Webmaster Forum forums.

Old 02-21-2010, 12:07 PM   #1
rowan
Too lazy to set a custom title
 
Join Date: Mar 2002
Location: Australia
Posts: 17,393
Googlebot... how arrogant can they be?

Google ignores crawl-delay in robots.txt. Instead, they force you to register your site in webmaster tools so you can set a custom crawl rate. This is then RESET to the default after 90 days, so at that point you have to log in and change it all over again! How fucking arrogant is that?

Other search engines respect crawl-delay. Imagine if they all wanted us to create accounts and log in every 90 days to stop their bots hammering our servers.
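
For reference, here is a minimal sketch of how a crawler that does honour the directive might read it, using Python's standard `urllib.robotparser`. The robots.txt content, the 10-second value and the bot name `ExampleBot` are all made up for illustration, and none of this says anything about how Googlebot itself behaves.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt; the 10-second delay is only an example value.
ROBOTS_TXT = """\
User-agent: *
Crawl-delay: 10
Disallow: /private/
"""

def read_crawl_delay(robots_txt: str, user_agent: str):
    """Return the Crawl-delay (in seconds) that applies to user_agent, or None."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.crawl_delay(user_agent)

# A well-behaved bot would sleep this long between fetches.
delay = read_crawl_delay(ROBOTS_TXT, "ExampleBot")
print(delay)
```

A polite crawler would wait `delay` seconds between requests to the same host; a bot that ignores the line simply never looks it up.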
Old 02-21-2010, 12:09 PM   #2
Agent 488
Registered User
 
Industry Role:
Join Date: Feb 2006
Posts: 22,511
yeah that damn google bot racking up bandwidth.
Old 02-21-2010, 12:15 PM   #3
2012
So Fucking What
 
2012's Avatar
 
Industry Role:
Join Date: Jul 2006
Posts: 17,189
Old 02-21-2010, 12:16 PM   #4
rowan
Too lazy to set a custom title
 
Join Date: Mar 2002
Location: Australia
Posts: 17,393
Quote:
Originally Posted by Agent 488 View Post
yeah that damn google bot racking up bandwidth.
Server load, actually. They're hitting me 120k times a day.
Old 02-21-2010, 12:20 PM   #5
rowan
Too lazy to set a custom title
 
Join Date: Mar 2002
Location: Australia
Posts: 17,393
And the point isn't so much what they're doing to my site specifically; it's that they're arrogant enough to ignore (de facto?) robots.txt settings that every other major search engine bot respects. The webmaster dashboard robots.txt checker even helpfully points out that each of the crawl-delay lines in my robots.txt is ignored!
Old 02-21-2010, 12:23 PM   #6
ColetteX
So Fucking Banned
 
Join Date: Jun 2004
Posts: 539
So Googlebot is ignoring crawl-delay? I have delays set because of how I update my sites; I don't want to see Google every day on a site that I update weekly. If crawl-delay doesn't work, that's really sick, because six days out of the week Google sees my site as static and not updated. Oh snap.
Old 02-21-2010, 12:28 PM   #7
rowan
Too lazy to set a custom title
 
Join Date: Mar 2002
Location: Australia
Posts: 17,393
Quote:
Originally Posted by ColetteX View Post
So Googlebot is ignoring crawl-delay? I have delays set because of how I update my sites; I don't want to see Google every day on a site that I update weekly. If crawl-delay doesn't work, that's really sick, because six days out of the week Google sees my site as static and not updated. Oh snap.
Crawl-delay is related to how fast a bot hits your site if it has multiple pages, not how long it will wait between refetching the same page.

http://www.google.com/support/webmas...n&answer=48620

My site has 200 million pages so technically googlebot isn't fetching fast enough... at the rate of 120k fetches per day it would take 4 1/2 years to index everything. At this point the benefit of indexing 100% of the site (or at least as much as it's trying to) isn't worth the load it's placing on the server.
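
The 4 1/2 year figure can be checked with a quick back-of-the-envelope calculation. This sketch uses only the two numbers given in the post and assumes a flat 365-day year and a constant fetch rate with no refetching.

```python
TOTAL_PAGES = 200_000_000   # pages on the site, from the post
FETCHES_PER_DAY = 120_000   # observed Googlebot fetches per day, from the post

# Days needed to fetch every page exactly once at a constant rate.
days = TOTAL_PAGES / FETCHES_PER_DAY
years = days / 365

print(f"{days:.0f} days, about {years:.1f} years")
```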
Old 02-21-2010, 12:31 PM   #8
ColetteX
So Fucking Banned
 
Join Date: Jun 2004
Posts: 539
Quote:
Originally Posted by rowan View Post
Crawl-delay is related to how fast a bot hits your site if it has multiple pages, not how long it will wait between refetching the same page.

http://www.google.com/support/webmas...n&answer=48620

My site has 200 million pages so technically googlebot isn't fetching fast enough... at the rate of 120k fetches per day it would take 4 1/2 years to index everything. At this point the benefit of indexing 100% of the site (or at least as much as it's trying to) isn't worth the load it's placing on the server.
thank you man, there is still much to learn. bump for your answer
Old 02-21-2010, 12:36 PM   #9
woj
<&(©¿©)&>
 
woj's Avatar
 
Industry Role:
Join Date: Jul 2002
Location: Chicago
Posts: 47,882
User-agent: Googlebot
Disallow: /

problem solved
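
As a quick sanity check of what those two lines do, here is a small sketch that feeds them to Python's standard `urllib.robotparser`. The URL and the second bot name `OtherBot` are placeholders, and the parser only models a rule-abiding crawler.

```python
from urllib.robotparser import RobotFileParser

# The two-line rule set from the post above.
RULES = """\
User-agent: Googlebot
Disallow: /
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())

# Placeholder URL; any path on the site gives the same answer.
url = "http://example.com/any/page"
googlebot_allowed = parser.can_fetch("Googlebot", url)
otherbot_allowed = parser.can_fetch("OtherBot", url)
print(googlebot_allowed, otherbot_allowed)
```

The rule shuts Googlebot out of the whole site while leaving every other crawler untouched, which also means dropping out of Google's index.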
Old 02-21-2010, 12:38 PM   #10
rowan
Too lazy to set a custom title
 
Join Date: Mar 2002
Location: Australia
Posts: 17,393
Quote:
Originally Posted by woj View Post
User-agent: Googlebot
Disallow: /

problem solved
I was waiting for it. I'm surprised this "solution" wasn't posted sooner.
Old 02-21-2010, 02:02 PM   #11
CunningStunt
Confirmed User
 
CunningStunt's Avatar
 
Industry Role:
Join Date: Aug 2006
Posts: 5,594
Quote:
Originally Posted by woj View Post
User-agent: Googlebot
Disallow: /

problem solved


To the OP - agreed. The trouble is, they can do what the hell they want with over 70% of the search market.
Old 02-21-2010, 04:32 PM   #12
MrMaxwell
Too lazy to set a custom title
 
Industry Role:
Join Date: Jul 2005
Posts: 10,057
What kind of site has 200,000,000 pages?
That's a lot of pages
Old 02-21-2010, 04:34 PM   #13
Adam_M
Confirmed User
 
Adam_M's Avatar
 
Industry Role:
Join Date: Mar 2006
Location: Australia
Posts: 3,800
Quote:
Originally Posted by woj View Post
User-agent: Googlebot
Disallow: /

problem solved
I wish everyone would just do this!
Old 02-21-2010, 04:55 PM   #14
Cyber Fucker
Hmm
 
Cyber Fucker's Avatar
 
Industry Role:
Join Date: Sep 2005
Location: On an endless road around the world for rock and roll.
Posts: 12,642
Quote:
Originally Posted by Adam_WildCash View Post
I wish everyone would just do this!
Lol
Old 02-21-2010, 04:57 PM   #15
fatfoo
ICQ:649699063
 
Industry Role:
Join Date: Mar 2003
Posts: 27,763
They can be very arrogant.
Old 02-21-2010, 06:19 PM   #16
Waddymelon
So Fucking Banned
 
Industry Role:
Join Date: Oct 2009
Location: The Whitehouse
Posts: 17,349
Annoying, isn't it? I've had Googlebot hit my servers up to 15 times per second, for hours at a time. Dynamic pages really make Googlebot go nuts.
Old 02-21-2010, 06:30 PM   #17
u-Bob
there's no $$$ in porn
 
u-Bob's Avatar
 
Industry Role:
Join Date: Jul 2005
Location: icq: 195./568.-230 (btw: not getting offline msgs)
Posts: 33,063
They're evil, plain and simple....
Old 02-21-2010, 07:30 PM   #18
StrokeKing
Confirmed User
 
Join Date: May 2008
Posts: 232
Quote:
Originally Posted by Waddymelon View Post
Annoying, isn't it? I've had Googlebot hit my servers up to 15 times per second, for hours at a time. Dynamic pages really make Googlebot go nuts.
15 times in a second?! Really annoying! Well, it's a smart idea to disallow Googlebot. Thanks for the advice, guys.
Old 02-21-2010, 07:38 PM   #19
czarina
Webmaster Extraordinaire
 
czarina's Avatar
 
Industry Role:
Join Date: Jul 2002
Location: A beautiful beach...
Posts: 10,748
woj's solution is good if you want to keep googlebot off your site completely. But are you sure you want to do that?
Old 02-21-2010, 07:48 PM   #20
epitome
So Fucking Lame
 
epitome's Avatar
 
Industry Role:
Join Date: Jun 2009
Location: St. Petersburg, FL
Posts: 12,156
Quote:
Originally Posted by MrMaxwell View Post
What kind of site has 200,000,000 pages?
That's a lot of pages
Google.

Rowan actually founded Google and he's frustrated because his baby is stuck in a loop.

Even as the founder, he cannot get support at Google and has to do the same thing as the rest of us.
Old 02-21-2010, 08:33 PM   #21
Dirty Dane
Sick Fuck
 
Dirty Dane's Avatar
 
Industry Role:
Join Date: Feb 2004
Location: www
Posts: 9,491
Sue them.
Old 02-21-2010, 09:44 PM   #22
rowan
Too lazy to set a custom title
 
Join Date: Mar 2002
Location: Australia
Posts: 17,393
Quote:
Originally Posted by epitome View Post
Google.

Rowan actually founded Google and he's frustrated because his baby is stuck in a loop.

Even as the founder, he cannot get support at Google and has to do the same thing as the rest of us.


Yeah, I got pushed out by The Man! Fuck them!
Old 02-21-2010, 11:40 PM   #23
papill0n
Unregistered Abuser
 
Industry Role:
Join Date: Oct 2007
Posts: 15,547
Quote:
Originally Posted by StrokeKing View Post
15 times in a second?! Really annoying! Well, it's a smart idea to disallow Googlebot. Thanks for the advice, guys.
Old 02-22-2010, 05:58 AM   #24
MrMaxwell
Too lazy to set a custom title
 
Industry Role:
Join Date: Jul 2005
Posts: 10,057
What kind of site has 200,000,000 pages?
Old 02-22-2010, 06:35 AM   #25
rowan
Too lazy to set a custom title
 
Join Date: Mar 2002
Location: Australia
Posts: 17,393
Quote:
Originally Posted by MrMaxwell View Post
What kind of site has 200,000,000 pages?
Think of something on the net that there might be 200,000,000 of to profile.
Old 02-22-2010, 11:40 AM   #26
raymor
Confirmed User
 
Join Date: Oct 2002
Posts: 3,745
Quote:
Originally Posted by rowan View Post
My site has 200 million pages so technically googlebot isn't fetching fast enough... at the rate of 120k fetches per day it would take 4 1/2 years to index everything. At this point the benefit of indexing 100% of the site (or at least as much as it's trying to) isn't worth the load it's placing on the server.
200 MILLION pages? I'm curious, is that 200 million legitimate pages, or 200 million pieces of fake SE spam crap? If you tried to spam the crap out of Google by creating 200 million bogus pages, I'd say you got what you deserved, and really what you asked for. If you pretended to have 200 million pages so that Google would spider you 200 million times, that was your decision. You can't blame Google if you chose to create fake stuff for them to spider.

Note the repeated use of "IF" - I'm asking IF that's what you did.
Old 02-22-2010, 12:26 PM   #27
tiger
Confirmed User
 
tiger's Avatar
 
Industry Role:
Join Date: Apr 2002
Location: Los Angeles
Posts: 6,986
Quote:
Originally Posted by woj View Post
User-agent: Googlebot
Disallow: /

problem solved


Yeah I was wondering when someone would post this.

Bottom line is, if it's causing you more problems than it's worth, just block it. If they are hitting you that hard, you should be getting some good traffic because of it. More traffic = more money, so just upgrade the servers.
Old 02-22-2010, 12:38 PM   #28
DamnGoodRatio
Confirmed User
 
Industry Role:
Join Date: Feb 2003
Location: USA
Posts: 855
I return a 503 page and they seem to respect that. Then I set it so they can crawl during my off-peak hours, which they seem to do.
It should work, and I know this is documented somewhere in Google's FAQ; I just can't seem to find the link right now.
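
A rough sketch of that idea, written as a plain decision function so it can be dropped into whatever server or framework is in use. The peak window, the `Retry-After` value and the function name are assumptions for illustration, not details from the post, and the user-agent check is naive (the header can be spoofed). Whether Googlebot pays attention to `Retry-After` is not something this thread confirms.

```python
from datetime import time

# Assumed values; neither the peak window nor the retry hint comes from the thread.
PEAK_START = time(18, 0)
PEAK_END = time(23, 0)
RETRY_AFTER_SECONDS = 3600

def crawler_response(user_agent: str, now: time):
    """Return (status, headers): 503 for Googlebot during peak hours, else 200."""
    is_googlebot = "googlebot" in user_agent.lower()
    in_peak = PEAK_START <= now < PEAK_END
    if is_googlebot and in_peak:
        # Temporary refusal with a hint about when to come back.
        return 503, {"Retry-After": str(RETRY_AFTER_SECONDS)}
    return 200, {}

UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
print(crawler_response(UA, time(19, 30)))
print(crawler_response(UA, time(3, 0)))
```

Regular visitors are never affected because the check only triggers on the crawler's user-agent string.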
Old 02-22-2010, 03:07 PM   #29
rowan
Too lazy to set a custom title
 
Join Date: Mar 2002
Location: Australia
Posts: 17,393
Quote:
Originally Posted by DamnGoodRatio View Post
I return a 503 page and they seem to respect that. Then I set it so they can crawl during my off-peak hours, which they seem to do.
It should work, and I know this is documented somewhere in Google's FAQ; I just can't seem to find the link right now.
I've noticed that a connection or server error will make G-bot back right off, but I'm concerned that doing this routinely might affect my rank. I've seen recent articles saying that site response time may become a factor in the future.

raymor: not useless spam, it's all genuine profiling of... domains.
Old 03-03-2010, 03:59 AM   #30
rowan
Too lazy to set a custom title
 
Join Date: Mar 2002
Location: Australia
Posts: 17,393
February 22, 2010

New crawl rate: Custom rate

1.000 requests per second

1.000 seconds per request

This new crawl rate will stay in effect for 90 days.


Funny, 2 weeks later Googlebot is still requesting 120k+ pages per day, which is about 150% of the rate in the above setting.

Their webmaster tools system also sent me a notification encouraging me to increase the rate so they can fetch more pages. Looks like their bot is doing it anyway.
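
To put numbers on that comparison, here is a quick calculation using the configured rate of 1 request per second and a flat 120,000 fetches per day. The "+" in "120k+" is not modelled, so the real overshoot would be somewhat higher than what this prints.

```python
SECONDS_PER_DAY = 24 * 60 * 60
CONFIGURED_RATE = 1.0        # requests per second, from the crawl rate setting
OBSERVED_PER_DAY = 120_000   # lower bound of the observed daily fetches

# Daily ceiling implied by the configured rate, and how far the bot exceeds it.
allowed_per_day = CONFIGURED_RATE * SECONDS_PER_DAY
ratio = OBSERVED_PER_DAY / allowed_per_day

print(f"{allowed_per_day:.0f} allowed per day, observed is {ratio:.0%} of that")
```

At exactly 120k per day the bot is running at roughly 139% of the configured rate; hitting 150% would take about 129,600 fetches per day.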
Old 03-03-2010, 04:36 AM   #31
seeandsee
Check SIG!
 
seeandsee's Avatar
 
Industry Role:
Join Date: Mar 2006
Location: Europe (Skype: gojkoas)
Posts: 50,945
bad bad google