11-26-2019, 12:19 AM
wankawonk
Confirmed User
 
Join Date: Aug 2015
Posts: 1,017
Y'all ever had problems with google crawl rate?

My sites have millions of pages because they're tube aggregators and CJ tubes

Google, though...Google will ruin my servers, hitting them hundreds of thousands of times a day (times many sites). It's a serious problem because their crawler doesn't follow normal usage patterns. The average user hits my front page or a page that ranks, makes a common search query, and clicks on the same video the last 100 users did -- everything gets served from Redis, no problem, cheap. Google crawls queries users never make and hits videos users never click on, so their crawler never hits the cache. They account for something like 80% of my database load because they never. hit. the cache.
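
For anyone who hasn't run one of these, the caching is the usual cache-aside kind of thing, roughly like this (illustrative sketch only, not my actual code -- the key names, TTL, and the DB helper are made up):

Code:
# Rough cache-aside sketch (illustrative; key names, TTL, and the DB
# helper are placeholders).
import json
import redis

r = redis.Redis(host="localhost", port=6379, db=0)
CACHE_TTL = 3600  # one hour

def run_expensive_db_query(query):
    # stand-in for the real search against the database -- this is the
    # part Googlebot ends up hammering
    return []

def get_search_results(query):
    key = "search:" + query.lower().strip()
    cached = r.get(key)
    if cached is not None:
        # real users mostly repeat the same popular queries, so they land here
        return json.loads(cached)
    # Googlebot crawls queries nobody actually searches, so it always falls
    # through to the database and only warms keys nobody will read again
    results = run_expensive_db_query(query)
    r.setex(key, CACHE_TTL, json.dumps(results))
    return results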

For years I just used Search Console to slow their crawl rate. They have never respected Crawl-delay in robots.txt.
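
For reference, this is all Crawl-delay is -- a couple of lines in robots.txt that Bing and some other crawlers will honor, but Googlebot has never paid attention to (the 10 seconds is just an example value):

Code:
User-agent: *
Crawl-delay: 10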

Lately it's even worse--with the new console I can't set their crawl rate limit anymore.

I've had to block them from parts of my sites just to keep my shit running.
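
One thing that might beat blocking outright is throttling them at the app level -- Google's own docs say serving 503 or 429 responses makes them back off for a while. Something along these lines (simplified sketch, not production code; the budget number, key names, and the Flask setup are just for illustration):

Code:
# Simplified per-crawler throttling sketch (illustrative only).
# Counts Googlebot hits per minute in Redis and returns 503 once it
# goes over budget; Google treats 503/429 as a signal to slow down.
import time
import redis
from flask import Flask, Response, request

app = Flask(__name__)
r = redis.Redis()
GOOGLEBOT_BUDGET_PER_MIN = 300  # made-up number, tune to what the DB can take

@app.before_request
def throttle_googlebot():
    ua = request.headers.get("User-Agent", "")
    if "Googlebot" not in ua:
        return None  # normal users pass through untouched
    window = int(time.time() // 60)
    key = f"googlebot:{window}"
    hits = r.incr(key)
    r.expire(key, 120)  # let old windows fall out of Redis
    if hits > GOOGLEBOT_BUDGET_PER_MIN:
        resp = Response("Over crawl budget, try later", status=503)
        resp.headers["Retry-After"] = "600"
        return resp
    return None

Downside is you're still eating the request, but at least the database survives and you're not deindexing whole sections the way a robots.txt block does.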

Driving me nuts. Anyone struggled with this? Any tips?