HTTrack Website Copier
Free software offline browser - FORUM
Subject: Re: HTTrack won't copy this site's graphic pictures
Author: Xavier Roche
Date: 07/10/2004 14:30
 
> HTTrack won't copy this site's graphic pictures.
> There is a "pic" folder that contains the site's pictures

Just take a look at the hts-log.txt log file:

14:26:40 Info:  Note: due to www.freemasonrywatch.org remote robots.txt rules, links begining with these path will be forbidden: /cgi_bin/, /audio/, /noaudio/, /pics/, /Printfac/, /banners/, /NCDTREE/ (see in the options to disable this)

The webmaster's robots.txt rules ask robots to skip the 
/pics/ folder, and HTTrack follows these rules by default.
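
To see how such rules block the pictures, here is a small 
Python sketch using the standard urllib.robotparser module. 
The robots.txt content is only reconstructed from the log 
line above (the "User-agent: *" line is an assumption), and 
logo.gif is a made-up file name for illustration:

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the hts-log.txt message; the real file may differ.
# "User-agent: *" is an assumption.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cgi_bin/
Disallow: /audio/
Disallow: /noaudio/
Disallow: /pics/
Disallow: /Printfac/
Disallow: /banners/
Disallow: /NCDTREE/
"""

BASE_URL = "http://www.freemasonrywatch.org"


def is_allowed(path, user_agent="HTTrack"):
    """Return True if the reconstructed rules let user_agent fetch path."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, BASE_URL + path)


if __name__ == "__main__":
    # /pics/logo.gif is a hypothetical file name.
    for path in ("/index.html", "/pics/logo.gif"):
        print(path, "allowed" if is_allowed(path) else "forbidden")
```

Any path starting with one of the Disallow prefixes is 
refused, which is why nothing under /pics/ gets mirrored.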

You can bypass this behaviour in HTTrack using the options 
(Set Options / Spider / Spider: no robots.txt rules), *BUT* 
set up reasonable bandwidth limits to avoid overloading the 
server:
Set Options / Limits / Max transfer rate: 10000
Set Options / Flow Control / Number of connections: 2
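
For command-line users, the same settings can be sketched 
as below. This assumes the usual httrack switches (-s0 to 
ignore robots.txt, -A for the maximum transfer rate in 
bytes per second, -c for the number of connections, -O for 
the output folder); please check them against "httrack 
--help" for your version. The function name and the 
./mirror folder are only examples:

```python
import shlex


def build_httrack_command(url, out_dir, max_rate_bytes=10000, connections=2):
    """Build (but do not run) an httrack command line with polite limits."""
    return [
        "httrack",
        url,
        "-O", out_dir,             # output folder (example value)
        "-s0",                     # assumed: do not follow robots.txt rules
        f"-A{max_rate_bytes}",     # assumed: max transfer rate, bytes/second
        f"-c{connections}",        # assumed: number of connections
    ]


if __name__ == "__main__":
    cmd = build_httrack_command("http://www.freemasonrywatch.org/", "./mirror")
    # Print the command so it can be reviewed before running it by hand.
    print(shlex.join(cmd))
```

The command is only printed, not executed, so the limits 
can be checked before starting the mirror.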

I repeat: set up bandwidth limits when downloading large 
amounts of data. It does not matter if the mirror takes 
hours, and the limits let other users keep visiting the 
site without problems.
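
To get a rough idea of what "hours" means, here is a quick 
estimate, assuming the rate limit is in bytes per second 
and a hypothetical 100 MB of pictures (the real size of the 
site is not known here):

```python
def mirror_duration_hours(total_bytes, max_rate_bytes_per_s=10000):
    """Lower bound on the mirror time when the rate limit is the bottleneck."""
    return total_bytes / max_rate_bytes_per_s / 3600


if __name__ == "__main__":
    size = 100 * 1000 * 1000  # hypothetical: 100 MB of data
    print(f"about {mirror_duration_hours(size):.1f} hours")
```

With these assumed numbers the mirror needs a bit under 
three hours, which is perfectly acceptable for an offline 
copy.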

 