When httrack fetches HTML pages, it sometimes encounters an HTTP redirect
(301, 'Moved Permanently').
Most often, httrack detects the new URL and proceeds to download it.
Occasionally, though, httrack detects the redirect, even parses the target URL,
and then just stops, for reasons unknown. These hts-log entries are typical:
HTTrack3.44-1+libhtsjava.so.2 launched on Tue, 06 Sep 2011 19:51:06 at
(httrack -N0 -r1 -%! -A1200000 -m1000000 -T -R1 -z -o0 -r1 -s2 -q - -e -w -O ./htt-UZX -%L htt-UZX/redirection_url.txt )
<some lines edited out here! >
19:51:06 Info: engine: start
19:51:06 Info: engine: check-html: primary/primary
19:51:06 Info: engine: preprocess-html: primary/primary
19:51:06 Info: engine: save-name: local name: feedproxy.google.com/~r/Makeuseof/~3/qkDFJzy11tI/index.html -> feedproxy.google.com/_r/Makeuseof/_3/qkDFJzy11tI/index.html
19:51:09 Info: engine: transfer-status: link recorded: feedproxy.google.com/robots.txt ->
19:51:09 Debug: File checked by cache: feedproxy.google.com
19:51:09 Info: engine: transfer-status: link error (301, 'Moved Permanently'): feedproxy.google.com/~r/Makeuseof/~3/qkDFJzy11tI/
19:51:09 Debug: File checked by cache: feedproxy.google.com
19:51:09 Warning: Moved Permanently for feedproxy.google.com/~r/Makeuseof/~3/qkDFJzy11tI/
19:51:09 Warning: File has moved from feedproxy.google.com/~r/Makeuseof/~3/qkDFJzy11tI/ to <http://www.makeuseof.com/tag/5-utorrent-addons/>
19:51:09 Info: engine: check-html: feedproxy.google.com/~r/Makeuseof/~3/qkDFJzy11tI/
19:51:09 Info: engine: preprocess-html: feedproxy.google.com/~r/Makeuseof/~3/qkDFJzy11tI/
19:51:09 Info: engine: postprocess-html: feedproxy.google.com/~r/Makeuseof/~3/qkDFJzy11tI/
19:51:09 Info: No data seems to have been transfered during this session! : restoring previous one!
This error occurs for about 3% of the URLs I fetch with httrack, and the two
"Warning:" lines above are typical of the hts-log output in those cases.
Can you suggest any reason why this might be happening? Can my httrack
options be changed to fix it?
Thanks.
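
As a stopgap while the cause is unknown, the redirect targets that were never fetched can be pulled out of hts-log and retried. Below is a minimal Python sketch of that idea, not a fix for httrack itself. It assumes that every log entry starts with an HH:MM:SS timestamp, that any other line is a wrapped continuation of the previous entry, and that the "File has moved from ... to ..." warning has the wording shown in the log above. The names join_wrapped and find_redirects are made up for this example.

```python
import re

# Assumption: each hts-log entry begins with an HH:MM:SS timestamp; lines that
# do not are treated as wrapped continuations of the previous entry.
ENTRY_START = re.compile(r"^\d{2}:\d{2}:\d{2}\s")

# Assumption: the warning wording matches the log excerpt in this post. The
# target URL may or may not be wrapped in angle brackets.
MOVED = re.compile(
    r"Warning:\s+File has moved from\s+(\S+)\s+to\s+<?([^\s>]+)>?"
)


def join_wrapped(lines):
    """Merge hard-wrapped continuation lines back into whole log entries."""
    entries = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if ENTRY_START.match(line) or not entries:
            entries.append(line)
        else:
            entries[-1] += " " + line
    return entries


def find_redirects(log_text):
    """Return (old_url, new_url) pairs for every 'File has moved' warning."""
    pairs = []
    for entry in join_wrapped(log_text.splitlines()):
        match = MOVED.search(entry)
        if match:
            pairs.append((match.group(1), match.group(2)))
    return pairs


if __name__ == "__main__":
    # Hypothetical path: adjust to wherever httrack wrote its log.
    with open("htt-UZX/hts-log.txt", encoding="utf-8", errors="replace") as fh:
        for old, new in find_redirects(fh.read()):
            print(new)
```

The printed target URLs could be saved one per line and passed back to httrack through the same -%L option used in the command above. That only retries the targets; it does not explain why the redirect was not followed in the first place.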