> I am using HTTrack v3.20RC3. I can't download a folder of a
> webpage; the hts-log.txt file has these lines:
> ...
> Note: due to ... remote robots.txt rules, links beginning
> with these paths will be forbidden: /folder/ (see in the
> options to disable this)
You can override the robots.txt behaviour using 'Set
options' / 'Spider' / 'Spider': 'Do not follow'.
But first make sure that you can safely download the
sections protected by robots.txt - they are often large
amounts of data, numerous CGI-generated files and other
files that are potentially "dangerous" in terms of
bandwidth and disk usage. Use bandwidth limits if
necessary (1 simultaneous connection, and a bandwidth
limit of 10 KB/s).
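
If you use the command-line httrack instead of the GUI, the
equivalent options should be roughly as below - treat this as a
sketch: the URL and output path are just placeholders, and you
should double-check the option letters with 'httrack --help' for
your version:

  # -s0     : never follow robots.txt / meta robots rules
  # -c1     : use a single simultaneous connection
  # -A10000 : cap the transfer rate at about 10 KB/s (bytes per second)
  httrack "http://www.example.com/folder/" -O ./mirror -s0 -c1 -A10000

The -c1 / -A10000 combination keeps the mirror slow and polite,
which matters when you are deliberately ignoring the site's
robots.txt.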