I've been analyzing the new.txt generated from my large
crawl of epanorama and have a question...
Why does www.hut.fi/~then/circuits/covox.zip get listed in
new.txt twice, but with different local file names
(covox.zip, covox-2.zip)? When I checked the saved HTML
files that linked to this ZIP file, they were both
correctly linked to covox.zip (covox-2.zip does not even
exist where new.txt said it 'would'.) So is HTTrack doing
some URL checks/conversions/equivalence checks after these
entries are written in the new.txt? It sure seems like it
in this case. It's not a 'critical' problem (didn't affect
the local copy), but it may complicate troubleshooting in
other cases (as this troubleshooting tool itself may be a
little bit broken.)
Here are the related entries from the new.txt:
-------------------------------------
16:57:45 287/-1 -R-MC- error ('Moved%20Permanently')
text/html 301 html zip
www.hut.fi/~then/circuits/covox.zip
I:/web-archive%20problematic/www.epanorama.net%2020021228/www.hut.fi/_then/circuits/covox.zip
(from www.epanorama.net/circuits/dacs.html)
-------------------------------------
17:13:12 287/-1 ---MC- error ('Moved%20Permanently')
text/html 301 zip html zip
www.hut.fi/~then/circuits/covox.zip
I:/web-archive%20problematic/www.epanorama.net%2020021228/www.hut.fi/_then/circuits/covox-2.zip
(from www.hut.fi/Misc/Electronics/circuits/dacs.html)
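
In case it helps anyone check their own new.txt for the same thing, here is a rough Python sketch that lists every URL logged with more than one local file name. It is only a guess at the layout based on the entries above: it assumes each entry sits on a single line in the real file, and that the URL and the local file are the two whitespace-separated fields immediately before "(from". The function name find_duplicate_urls is my own, not anything from HTTrack.

```python
import sys
from collections import defaultdict


def find_duplicate_urls(lines):
    """Return {url: [local paths]} for URLs logged with several local files.

    Assumption: one entry per line, with the URL and the local file as the
    two whitespace-separated fields right before the "(from" field.
    """
    seen = defaultdict(set)
    for line in lines:
        fields = line.split()
        if "(from" not in fields:
            continue  # header line, or an entry without a referrer
        i = fields.index("(from")
        if i < 2:
            continue  # not enough fields to hold a URL and a local path
        url, local = fields[i - 2], fields[i - 1]
        seen[url].add(local)
    # Keep only the URLs that map to more than one local file name.
    return {url: sorted(paths) for url, paths in seen.items() if len(paths) > 1}


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python find_dupes.py path/to/new.txt
    with open(sys.argv[1], encoding="utf-8", errors="replace") as fh:
        for url, paths in find_duplicate_urls(fh).items():
            print(url)
            for path in paths:
                print("    " + path)
```

Run against the two entries above, this should report www.hut.fi/~then/circuits/covox.zip with both covox.zip and covox-2.zip. It only compares what new.txt says and does not check whether the local files actually exist on disk.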