[ Reposted on the forum, as the mailing-list is read-only ]
How can I stop HTTrack from saving 404 Not Found messages?
I'm using WinHTTrack Website Copier 3.30 (+swf) to make an
archival copy of an old version of our corporate website.
I tested the "No error pages" option in the Build options on
a small sample of the website and it worked as anticipated.
As a result, I did not get a separate HTML file for each
error message; instead, the server generated the error
message.
Unfortunately, when I ran HTTrack on the whole website
with "No error pages" selected, HTTrack returned 404 Not
Found HTML pages for the broken links.
The puzzling thing is that I was comparing the same sample
of the website.
Here's an example where the links in the sidebar are coded
incorrectly:
<http://www-oldsite.nlc-bnc.ca/window/types/booke.htm>
When HTTrack tries to get the following links
<http://www-oldsite.nlc-bnc.ca/window/types/window/windowe.htm>
<http://www-oldsite.nlc-bnc.ca/window/types/window/types/seriale.htm>
it creates new subdirectories and files for the copied
Error 404 file.
In both cases the problem is that there is no second
subdirectory called "window". HTTrack creates the
subdirectory "window" with the file "windowe.htm" (which is
a 404 message), plus the new subdirectory "types" and file
"seriale.htm" (which is a 404 message).
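
To illustrate how links like these can end up pointing at a
second "window" directory, here is a minimal sketch of
relative-link resolution using Python's standard library.
The two href values are my assumption (the actual sidebar
markup is not shown here); they are simply relative paths
that would produce the two URLs listed above.

```python
from urllib.parse import urljoin

# Page that contains the incorrectly coded sidebar links.
page = "http://www-oldsite.nlc-bnc.ca/window/types/booke.htm"

# Hypothetical href values: assumed for illustration only,
# not copied from the real page source.
assumed_hrefs = ["window/windowe.htm", "window/types/seriale.htm"]

# A relative href is resolved against the directory of the
# current page (/window/types/), so the path gets appended
# there rather than at the site root.
resolved = [urljoin(page, href) for href in assumed_hrefs]

for url in resolved:
    print(url)
```

If the hrefs had been written relative to the site root
(for example with a leading "/"), they would not resolve
underneath /window/types/.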
The old website has many broken links and HTTrack takes
some of these broken links and recreates the directory
structure with a new 404 Not Found file for each broken
link.
Another related question: Does the Browser ID have any
influence on the error messages being generated and saved?
Any suggestions or ideas would be appreciated.
Thanks,
--Karen