HTTrack Website Copier
Free software offline browser - FORUM
Subject: Spider identification in robot.txt
Author: Jack Hughes
Date: 05/21/2009 11:09
 
As someone who has just had their entire website stolen by somebody using
HTTrack, consuming in excess of 50MB of my bandwidth, I was a little disappointed
to see no mention on the HTTrack website of how I can ban the HTTrack spider
from my site using robots.txt.

Can you please point me to a section of your site where you clearly and
unambiguously document how I might achieve this? 

If you have not documented how I can identify and ban your spider, can you
please create such a page and link to it so that it is easy to find?

If you create tools like these, you have a responsibility to allow webmasters
to control access to their sites. I do appreciate that you are not responsible
for the errant behaviour of your users. You are, however, responsible for
implementing a spider that follows accepted spider etiquette and for
documenting said behaviour.

Kind regards,

Jack
 