HTTrack Website Copier
Free software offline browser - FORUM
Subject: Re: Unexpected 412/416 error followup
Author: George Langford
Date: 12/26/2015 15:51
 
The website that I'm trying to retrieve into a user-browsable format (i.e. on a
DVD) is very large (about 5000 files and a similar number of links), so I've
had a long process of weeding out my own HTML-coding errors ... now mostly
complete. HTTrack's log file is extremely helpful for locating the miscoded
HTML files.

Now my ISP moved that website to another server right in the middle of the
above debugging process, and I had to endure a lot of propagation errors which
caused HTTrack to try to read the new location while I was saving my changes
to the old location. A few days ago that mess finally cleared up (I could have
made things simpler for myself by changing my local Hosts file, but I feared
that I would make things much worse if I did that incorrectly, and I still had
a lot of HTML-coding errors to fix).

Now the following situation has arisen, which I'll bet you thought had been
put to rest a long time ago. Here is an example of one of the many error
messages (redacted of personal information):

"Warning: Unexpected 412/416 error (Requested Range Not Satisfiable) for
domain name/../../filename.jpg, '/home/username/Websites/website
name/../../filename.html' could not be found on disk"

To make matters worse, all the .jpg files in the retrieved website are now
corrupted text files. In actuality, the original website is working fine, even
after the server change.
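
One way to gauge how many images are affected is to check whether each .jpg
file in the mirror starts with the JPEG signature (the bytes FF D8 FF). The
following is only a sketch: the function names are hypothetical, and a file
that passes this check may still be truncated or damaged further in.

```python
from pathlib import Path

# Every JPEG file begins with these three bytes (start-of-image marker).
JPEG_MAGIC = b"\xff\xd8\xff"


def looks_like_jpeg(head: bytes) -> bool:
    """Return True if the given leading bytes carry the JPEG signature."""
    return head.startswith(JPEG_MAGIC)


def find_suspect_jpegs(mirror_root):
    """List .jpg/.jpeg files under mirror_root that lack the JPEG signature.

    mirror_root is a hypothetical path to the local mirror directory.
    The extension test is case-insensitive, so .JPG is covered too.
    """
    suspects = []
    for path in sorted(Path(mirror_root).rglob("*")):
        if path.is_file() and path.suffix.lower() in (".jpg", ".jpeg"):
            with path.open("rb") as fh:
                head = fh.read(len(JPEG_MAGIC))
            if not looks_like_jpeg(head):
                suspects.append(path)
    return suspects
```

Calling find_suspect_jpegs() on the mirror directory would return the paths of
the files that are probably text or error pages saved under an image name.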

This has probably happened because all the files in the website have been
deleted from the former server and copied to the new server, which has a
completely different server name and IP address, but the same ISP.
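
For context, 416 is the status a server returns when a request asks for a byte
range that starts at or beyond the end of the file. Assuming HTTrack tries to
resume a download from a local copy that is at least as long as the file now on
the server (an assumption about this case, which the log does not confirm), the
decision looks roughly like the sketch below. The function names are
hypothetical, and real servers may also ignore the Range header and send the
whole file with a 200.

```python
def range_is_satisfiable(first_byte_pos: int, resource_length: int) -> bool:
    """Can 'Range: bytes=<first_byte_pos>-' be served for this resource?

    An open-ended byte range is satisfiable only if its first byte
    position lies inside the current length of the resource.
    """
    return 0 <= first_byte_pos < resource_length


def status_for_resume(local_size: int, remote_size: int) -> int:
    """Simplified status code for a resume attempt from local_size bytes."""
    if local_size == 0:
        return 200  # nothing to resume, so a plain GET with no Range header
    if range_is_satisfiable(local_size, remote_size):
        return 206  # Partial Content: the server sends the remaining bytes
    return 416  # Requested Range Not Satisfiable
```

Under this reading, a stale local copy of 5000 bytes checked against a
3000-byte file on the new server would give 416, which is consistent with
starting over from an empty local directory.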

What I have just tried as a cure is to delete the many-times-updated local
website file and start over from scratch. I changed the Scan Rules settings to
include all the file types that are on this website ... previously I had
included only the file types that caused HTTrack to complain, but now I've
added +*.jpg, +*.JPG, +*.gif, +*.txt and so on.

BTW, it's my own webpage and domain, and I'm using WebHTTrack.

WebHTTrack is working fine for me on other webpages that haven't recently been
moved by their ISPs.

Here are the current settings under which WebHTTrack is running:

"webhttrack -q -%i -w http://www.domainname.com/websitename/ -O
"/home/username/Websites/websitename" -n -%P -p7 -N0 -s2 -p7 -D -a -K0 -c4 -%k
-A25000 -F "Mozilla/4.5 (compatible; HTTrack 3.0x; Windows 98)" -%F "<!--
Mirrored from %s%s by HTTrack Website Copier/3.x [XR&CO'2013], %s -->" +*.png
+*.gif +*.jpg +*.JPG +*.pgw +*.pdw +*.psd +*.txt -%s -%u )"

My computer is a Lenovo T420 running the Trisquel 7 GNU/Linux operating
system, with 4GB of RAM and a 1,000GB external USB-connected hard drive. The
retrieval process has been taking about 18 hours to download about 1.8GB of
data.
 