HTTrack Website Copier
Free software offline browser - FORUM
Subject: Problem with robots.txt - pls help
Author: Ian
Date: 11/28/2004 13:44
 
Hi there,

Just downloaded HTTrack for Windows, superb!

But I'm having a small problem. I've read the docs and I believe
it's caused by the website's robots.txt.

Basically it is skipping the following paths. Here's the message:

Note: due to www.somesite.com remote robots.txt rules, links
begining with these path will be forbidden: /order/,
/_fpclass/, /_mmDBScripts/, /_private/, /_vti_log/,
/aspnet_client/, /awmData-incMenu/, /awmData-ko/, /classes/,
/downloads/, /IISResourceKit/, /java/, /newnewnew/,
/OLDsite/, /orderJR/, /SCRIPTS/, /W3C/ (see in the options
to disable this)
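
For what it's worth, I assume the site's robots.txt just has a
Disallow entry for each of those paths, something like this
(reconstructed from the message above, I haven't looked at the
actual file):

  User-agent: *
  Disallow: /order/
  Disallow: /_fpclass/
  Disallow: /_mmDBScripts/
  ...and so on for the rest of the paths listed.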

I tried adding a filter under Options ("Add scan rule") to
include all links, but then it also downloads the Macromedia
website, because my site has a link to it...
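
My guess is that the rule I ended up with is equivalent to the
catch-all filter below (I'm not sure of the exact syntax it
generated, so take this as an assumption):

  +*

As far as I understand the filters, +* matches every URL, which
would explain why external links like the Macromedia one get
mirrored too. A rule limited to my own site would presumably be
more like:

  +www.somesite.com/*

but I'm guessing a scan rule alone doesn't override the
robots.txt handling, since the message above points at a
separate option for that.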

Any ideas what I should do?

Thanks in advance

ian
 