Sounds like you're running into servers that like to
intermittently drop connections. I had a similar problem
when I reaped <http://members.aol.com/bblum6>. Sometimes
HTTrack would be able to grab the whole HTML page,
sometimes not; it seemed random when it would succeed. I
observed the same behavior in my web browser, IE 6.0, and
figured the problem was with the web server, because
reloading the page a few times tended to get the desired
results.
I guess the thing that remains to be done is tweaking
HTTrack's update mechanism so it doesn't fall victim to
flaky web servers like that members.aol.com server. (By
this I mean that when I used HTTrack's update option, the
update was sometimes counter-productive: HTTrack would
detect 'changes' in some web pages that were not real
changes but errors in the download of the HTML.) Is there
a way to detect the broken/terminated connections?
Detecting them might lead to better logic in the HTTrack
engine about whether or not to replace a previously
downloaded copy of a page.
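To make the idea concrete, here is a rough sketch of the kind of check I have in mind. It is not HTTrack's actual logic, and the function names (looks_truncated, should_replace) are made up for illustration. It assumes the body is the raw bytes as received (before any decompression), so that it can be compared against the Content-Length header, and it falls back to looking for a closing </html> tag when the server sent no usable length.

```python
from typing import Mapping, Optional


def looks_truncated(headers: Mapping[str, str], body: bytes) -> bool:
    """Heuristic: True if the body appears to have been cut off mid-transfer."""
    # Header names are case-insensitive, so normalise them before the lookup.
    lowered = {k.lower(): v for k, v in headers.items()}
    declared = lowered.get("content-length")
    if declared is not None and declared.strip().isdigit():
        # Fewer bytes than the server announced suggests a dropped connection.
        return len(body) < int(declared)
    # No usable Content-Length: for HTML pages, look for the closing tag.
    if "html" in lowered.get("content-type", "").lower():
        return b"</html>" not in body.lower()
    # Nothing to judge by, so give the download the benefit of the doubt.
    return False


def should_replace(old_body: Optional[bytes],
                   headers: Mapping[str, str],
                   new_body: bytes) -> bool:
    """Decide whether a fresh download should overwrite the stored copy."""
    if old_body is None:
        return True  # no previous copy to protect
    if looks_truncated(headers, new_body):
        return False  # keep the old copy rather than a broken page
    return new_body != old_body  # only a real change counts as an update
```

For example, should_replace(b"old page", {"Content-Length": "5000"}, b"<html>") returns False, so the earlier copy survives a broken download. The closing-tag fallback is only a guess: some valid pages omit </html>, so it can produce false alarms.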
Oh, I didn't try the HTTrack file-by-file confirmation
(Q/A) option in this case because the site wasn't all that
big and the broken files were few.