HTTrack Website Copier
Free software offline browser - FORUM
Subject: Re: Images replaced with HTTP 416 Error HTML page.
Author: GreyWyvern
Date: 03/12/2009 15:05
 
Thanks for the reply! I tried what you suggested, but when I started the
mirror, the spider began obeying all the robots.txt rules even though I had
set the -s0 argument. (The intranet site is an offline copy of our live
website, so the same robots.txt exists on both.)

I had to add a special rule to the robots.txt file:

User-agent: httrack
Disallow:

With that rule in place, it started mirroring everything again. Once the
mirror completed, I found that the same thing had happened: about 10% of the
files, seemingly at random, were overwritten with the 416 HTML message.
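For anyone wanting to sanity-check a rule like the one above, here is a minimal sketch using Python's standard urllib.robotparser. It is not HTTrack's own parser, so it only shows how a standards-following parser reads an empty Disallow line. The "User-agent: *" block and the /private/ path are hypothetical stand-ins for whatever the live robots.txt contains, and allowed() is just an illustrative helper name.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: a general block that blocks /private/ (an assumed
# example path), followed by the httrack-specific rule quoted in the post.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/

User-agent: httrack
Disallow:
"""


def allowed(user_agent, path, robots_txt=ROBOTS_TXT):
    """Return True if user_agent may fetch path under the given rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    # An empty "Disallow:" means nothing is disallowed for that agent.
    print("httrack :", allowed("httrack", "/private/logo.png"))
    print("otherbot:", allowed("otherbot", "/private/logo.png"))
```

Under this parser the httrack block takes precedence over the wildcard block for a matching agent, which is consistent with the mirror picking everything up again once the rule was added.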

How can I disable these "2nd requests" and prevent these files from being
overwritten? :|
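
Separately from the question itself, the damaged files can at least be located after the fact. The following is a rough sketch, assuming the 416 message is saved as an ordinary HTML document under the image's original filename. The extension list, the function names, and the 512-byte probe size are all illustrative choices, not anything HTTrack provides.

```python
import os

# Assumed set of image extensions to check; adjust for the site in question.
IMAGE_EXTENSIONS = {".gif", ".jpg", ".jpeg", ".png"}


def looks_like_html(path, probe_bytes=512):
    """Return True if the start of the file looks like an HTML document."""
    with open(path, "rb") as handle:
        head = handle.read(probe_bytes).lstrip().lower()
    return head.startswith((b"<!doctype html", b"<html", b"<head"))


def find_overwritten_images(mirror_root):
    """List image-named files under mirror_root whose content is HTML."""
    suspects = []
    for folder, _subdirs, files in os.walk(mirror_root):
        for name in files:
            extension = os.path.splitext(name)[1].lower()
            full_path = os.path.join(folder, name)
            if extension in IMAGE_EXTENSIONS and looks_like_html(full_path):
                suspects.append(full_path)
    return sorted(suspects)
```

Calling find_overwritten_images() on the mirror's output directory returns the paths that would need to be fetched again. It does not explain or prevent the second requests.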

Brian
 





