07-08-2019, 02:46 AM
thommy
Confirmed User
 
Join Date: Jun 2003
Location: Switzerland / Germany / Thailand
Posts: 5,469
Quote:
Originally Posted by magneto664 View Post
Every day thousands of other bots scan your website: Ahrefs, Majestic, exploit-hunting bots, advert bots, other shit bots. Most of them come preloaded with default directories, file names, or directory paths for the scripts running on your site. If you do not want something to appear on the internet, do not upload it to the internet. Simple.
But these bots are not Google. Nobody will try to sue them.

I know perfectly well how robots.txt works, but the point is that millions of people with an internet presence don't.

If Google crawls something from their site WITHOUT AN EXPLICIT request to do so, one judge or another may see them as a "victim", and they can sue Google for millions.

This is why it would make sense to make robots.txt THE rule for crawling your site, so that sites without a robots.txt are not touched at all.
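
To make that opt-in idea concrete, here is a rough sketch in Python. It is only an illustration of the rule I am proposing, not how Google or any real crawler behaves today. The function name `may_crawl`, the bot name `ExampleBot` and the example.com URLs are made up for the example; the only real API used is `urllib.robotparser` from the standard library.

```python
from typing import Optional
from urllib.robotparser import RobotFileParser


def may_crawl(robots_txt: Optional[str], user_agent: str, url: str) -> bool:
    """Opt-in policy sketch: a site without robots.txt is never crawled.

    robots_txt is the text of the site's robots.txt, or None if the
    site does not have one (hypothetical convention for this example).
    """
    if robots_txt is None:
        # No robots.txt means no explicit permission, so stay away.
        return False

    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # With a robots.txt present, follow its Allow/Disallow rules.
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    rules = "User-agent: *\nDisallow: /private/\n"
    print(may_crawl(None, "ExampleBot", "https://example.com/"))
    print(may_crawl(rules, "ExampleBot", "https://example.com/index.html"))
    print(may_crawl(rules, "ExampleBot", "https://example.com/private/x"))
```

The only difference from the usual behaviour is the first branch: today a missing robots.txt is generally treated as "everything allowed", while here it means "nothing allowed".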




Quote:
If you list in the robots file which files or directories to skip, Google will probably comply. But for other bots it will be a gift.
As I said, if there are no clear rules for that, it will open big doors for lawsuits. And it would not be the others who have to fight them; it would be the one who has the money to pay.
__________________
Open for handpicked publishers and advertisers:
www.trafficfabrik.com