We noticed today that we'd downloaded a site despite it
having a robots.txt with
User-agent: *
Disallow: /
and us running HTTrack with -s2. In the log, it says
'robots.txt rules are too restrictive, ignoring /'. Now I
could understand if it did this with -s1, but -s2 claims to
always follow robots.txt. The almost-RFC that defines
robots.txt (http://www.robotstxt.org/wc/norobots-rfc.html)
explicitly gives Disallow: / as an example, and people do
use it, so why is -s2 ignoring it?
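For what it's worth, here's a quick sketch showing that Python's stock urllib.robotparser treats those exact two lines as a blanket disallow for every path, which is what I'd expect any -s2 run to honour (example.com is just a placeholder host):

```python
import urllib.robotparser

# Feed the parser the same two rules the site served. Per the
# robots.txt draft RFC, "Disallow: /" is a prefix match on the
# path, so it matches every URL on the site.
rp = urllib.robotparser.RobotFileParser()
rp.parse(["User-agent: *", "Disallow: /"])

print(rp.can_fetch("*", "http://example.com/"))           # → False
print(rp.can_fetch("*", "http://example.com/some/page"))  # → False
```

So a conforming parser disallows everything here; there's nothing "too restrictive" about it in the sense of being malformed.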
-Lars