HTTrack Website Copier
Free software offline browser - FORUM
Subject: Re: Cannot download folder by robots.txt rules
Author: Xavier Roche
Date: 07/02/2002 20:50
 
> I am using HTTrack v3.20RC3. I can't download a folder of a
> webpage, the hts-log.txt file has lines:
> ...
> Note: due to ... remote robots.txt rules, links begining
> with these path will be forbidden: /folder/ (see in the
> options to disable this)

You can override the robots.txt behaviour using 'Set options' / 'Spider' / 'Spider': 'Do not follow'

But first make sure that you can safely download the sections protected by robots.txt - they often contain large files, numerous CGI scripts and other potentially "dangerous" (bandwidth- and disk-hungry) content. Use transfer limits if necessary (for example, 1 simultaneous connection and a bandwidth limit of 10 KB/s).
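
For the command-line version, roughly the same setup can be expressed with httrack options (a sketch only - the option letters are taken from the httrack manual, so check 'httrack --help' on your version; the URL is just a placeholder):

    httrack "http://www.example.com/folder/" -O ./mirror -s0 -c1 -A10000

Here -s0 means "never follow robots.txt rules", -c1 allows a single simultaneous connection, and -A10000 limits the transfer rate to about 10 KB/s (the value is in bytes per second).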
 