HTTrack Website Copier
Free software offline browser - FORUM
Subject: Re: follow robots.txt rules does not work
Author: Xavier Roche
Date: 03/13/2005 10:23
 
> Httrack does not appear to obey robots.txt exclusion
> rules.
> We've tried both WinHttrack and the command line httrack
> with -sN2 option and in neither case does httrack obey the
> robots.txt rules.

HTTrack obeys robot exclusion rules (except the overly
restrictive '/' rule), but does not take into account the
name of the robot ('*', 'foo' or whatever). In the future
I'll change this behaviour to follow only two sets of rules:
the 'catch all' (*) and a specific httrack rule
(an "httrack" entry).
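
To illustrate the planned behaviour, here is a rough Python sketch of that kind of rule selection. It is not HTTrack's actual code: the function names, the sample robots.txt and the simple prefix matching are all assumptions made for the example. It keeps Disallow rules only from the '*' group and from an "httrack" group, and skips the blanket '/' rule.

```python
def select_disallow_rules(robots_txt, agent="httrack"):
    """Collect Disallow prefixes from groups naming '*' or `agent`.

    Hypothetical helper, not part of HTTrack.
    """
    rules = []
    group_agents = []   # user-agents named by the current group
    in_rules = False    # True once the current group has rule lines
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()   # drop comments
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if in_rules:
                # A User-agent line after rule lines starts a new group.
                group_agents = []
                in_rules = False
            group_agents.append(value.lower())
        elif field in ("allow", "disallow"):
            in_rules = True
            applies = "*" in group_agents or agent in group_agents
            # Skip empty values (allow all) and the blanket '/' rule.
            if field == "disallow" and applies and value not in ("", "/"):
                rules.append(value)
    return rules


def is_allowed(path, rules):
    """Simple prefix match; real robots.txt matching has more cases."""
    return not any(path.startswith(prefix) for prefix in rules)


# Made-up robots.txt used only for this example.
SAMPLE = """
User-agent: foo
Disallow: /foo-only/

User-agent: *
Disallow: /private/
Disallow: /

User-agent: httrack
Disallow: /no-mirror/
"""

rules = select_disallow_rules(SAMPLE)
print(rules)
print(is_allowed("/foo-only/page.html", rules))
print(is_allowed("/private/page.html", rules))
```

With this sample, the 'foo' group is ignored entirely, so /foo-only/ stays reachable, while /private/ and /no-mirror/ are excluded.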
 





