08-28-2009, 06:11 AM
rowan
Too lazy to set a custom title
 
Join Date: Mar 2002
Location: Australia
Posts: 17,393
I'm facing this problem myself, as I have a site with millions of pages. Currently, if an IP downloads too many pages within a short period and it's NOT on the whitelist (e.g. Google's IPs), it gets firewalled for a period of time. It's a pretty aggressive approach, but it works (for now).

Most of the people trying to scrape don't bother to hide it, so their 3 fetches a second get picked up pretty quickly.
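
For anyone who wants to try the same idea, here is a rough sketch of the logic in Python. It is not my actual setup: the class name `ScrapeGuard`, the thresholds (`max_hits`, `window`, `ban_seconds`) and the whitelisted address are all made up for illustration, and it only keeps an in-memory ban list where a real deployment would add a firewall rule.

```python
import time
from collections import defaultdict, deque


class ScrapeGuard:
    """Hypothetical per-IP sliding-window limiter with a whitelist and timed bans."""

    def __init__(self, max_hits=30, window=10.0, ban_seconds=600.0, whitelist=()):
        self.max_hits = max_hits          # assumed threshold: hits allowed per window
        self.window = window              # assumed window length in seconds
        self.ban_seconds = ban_seconds    # assumed ban duration in seconds
        self.whitelist = set(whitelist)   # IPs that are never limited (e.g. search engines)
        self.hits = defaultdict(deque)    # ip -> timestamps of recent requests
        self.banned_until = {}            # ip -> time at which the ban expires

    def allow(self, ip, now=None):
        """Return True if the request may proceed, False if the IP is blocked."""
        now = time.monotonic() if now is None else now

        if ip in self.whitelist:
            return True

        # Still banned? Otherwise drop the expired ban.
        until = self.banned_until.get(ip)
        if until is not None:
            if now < until:
                return False
            del self.banned_until[ip]

        # Record this hit and discard hits that fell out of the window.
        recent = self.hits[ip]
        recent.append(now)
        while recent and now - recent[0] > self.window:
            recent.popleft()

        # Too many hits inside the window: ban the IP for a while.
        if len(recent) > self.max_hits:
            self.banned_until[ip] = now + self.ban_seconds
            recent.clear()
            return False
        return True


# Simulate a scraper fetching 3 pages a second against a 5-hits-per-2-seconds limit.
guard = ScrapeGuard(max_hits=5, window=2.0, ban_seconds=60.0,
                    whitelist={"192.0.2.10"})  # placeholder address, not a real crawler IP
results = [guard.allow("203.0.113.7", now=i / 3) for i in range(8)]
```

In this simulation the scraper's first five fetches go through and everything after that is refused until the ban runs out, while the whitelisted address is never counted. Passing `now` explicitly is only there to make the example testable without sleeping.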