HTTrack Website Copier
Free software offline browser - FORUM
Subject: Re: Download with Disallow in Robots.txt file
Author: Alex DeGreg
Date: 08/16/2004 19:04
 
> > I have troubles when I try to download the site
> > <http://www.w3schools.com/>
> > HTTrack reports the next error:
> 
> Yes: by default httrack respects robots.txt rules.
> 
> > How can I configure HTTrack in order to download this 
> > site?
> 
> Change the browser identity in the options (Set Options / 
> Browser ID / Browser "identity") and disable robots.txt 
> (Set Options / Spider / Spider), BUT also set up 
> reasonable download settings (bandwidth AND connections): 
> 
> Set Options / Flow Control / Number of connections: 2
> Set Options / Limits / Max transfer rate: 10000
> 
> I repeat: don't override default robots.txt and/or user-
> agent settings without changing bandwidth and limits 
> settings, or you will risk bandwidth abuse and server 
> slowdown.
> 
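The GUI options above map directly to httrack's command-line flags (`-s` for robots.txt handling, `-F` for the Browser "identity", `-c` for connections, `-A` for the transfer-rate cap). A minimal sketch with the values suggested above; the output directory name and user-agent string are only illustrative:

```shell
# -s0      : never follow robots.txt rules (Spider option)
# -F "..." : set the Browser "identity" (user-agent string)
# -c2      : limit to 2 simultaneous connections
# -A10000  : cap the transfer rate at 10000 bytes/second
httrack "http://www.w3schools.com/" -O ./w3schools-mirror \
    -s0 -F "Mozilla/5.0 (compatible; example)" -c2 -A10000
```

As in the GUI, keep the `-c` and `-A` limits low whenever you override robots.txt, so the mirror stays polite to the server.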

I tried all these settings, but nothing was copied :-(
Any other hints?
Thanks
 






