HTTrack sometimes seems to check whether a file exists on
the remote server even when there is no need to download it
(because it is already in the cache). This of course makes
the program slower, especially when it re-reads a
previously written cache: instead of seconds, it needs
hours just to re-read the cache. I really do not know why
it checks anything on the Internet while reading the cache.
Sometimes this unwanted checking also causes serious
problems. A robot on a certain website checks whether you
are downloading only one file at a time. If not, it
reports a "massive download" and blocks you, so you
cannot browse the website at all for some time (say, 3
months).
When I tested HTTrack with this site, I set the proper
option, but the program still did something that got me
detected as a mass downloader and blocked. I am sure the
only possibility is that it tried to check whether a file
exists at the same time as it was downloading another
file. This should never happen! One file should mean one
file, without exceptions and without any needless checking.
Previously I tested another website copier (Teleport Pro)
with the same site, and when I set "only 1 file
simultaneously", all was OK: the robot on the server did not
detect anything suspicious. So I think that the "1
file" option in HTTrack does not work rigorously enough (in
other words, it does not work at all, because if you really
need to limit yourself to a maximum of one file, the limit
REALLY must be obeyed: no other download attempts, no checks
whether files exist, no tests against the remote server etc.
should be permitted).
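To illustrate what a strictly obeyed limit could look like,
here is a minimal Python sketch. It is not HTTrack's code and
says nothing about how HTTrack works internally; FakeServer,
SingleRequestGate, run_demo and the example URLs are
hypothetical names made up for this illustration. The idea is
that existence checks and downloads pass through the same
lock, so the server never sees two requests at once.

```python
import threading
import time


class FakeServer:
    """Hypothetical stand-in for the remote site.

    Like the robot described above, it records the peak number of
    requests that overlapped in time.
    """

    def __init__(self):
        self._lock = threading.Lock()
        self._active = 0
        self.peak = 0

    def handle(self, method, url):
        with self._lock:
            self._active += 1
            self.peak = max(self.peak, self._active)
        time.sleep(0.01)  # simulate network latency
        with self._lock:
            self._active -= 1
        return f"{method} {url}".encode()


class SingleRequestGate:
    """Lets at most one request of ANY kind be in flight."""

    def __init__(self, transport):
        # transport(method, url) is an assumed callable doing the real I/O
        self._transport = transport
        self._gate = threading.Lock()

    def _request(self, method, url):
        # Checks and downloads wait on the same lock, so they cannot overlap.
        with self._gate:
            return self._transport(method, url)

    def check(self, url):
        """Existence check (HEAD); counts against the limit too."""
        return self._request("HEAD", url)

    def download(self, url):
        """Full download (GET)."""
        return self._request("GET", url)


def run_demo():
    """Fire checks and downloads from several threads; return the server's peak."""
    server = FakeServer()
    gate = SingleRequestGate(server.handle)
    urls = [f"http://example.invalid/file{i}.html" for i in range(4)]
    threads = []
    for url in urls:
        threads.append(threading.Thread(target=gate.check, args=(url,)))
        threads.append(threading.Thread(target=gate.download, args=(url,)))
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return server.peak
```

In this sketch run_demo() reports the highest number of
simultaneous requests the fake server observed. Because every
request, including a mere existence check, has to acquire the
same lock, that number cannot exceed one, which is the
behaviour I would expect from a "1 file" option.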
Grzegorz Jagodziński