Problems with checking existing files - HTTrack Website Copier Forum

Subject: Problems with checking existing files

Author: Grzegorz

Date: 12/04/2003 11:32

Sometimes HTTrack seems to check whether a file exists on 
the remote server even when there is no need to download it 
(because it is already in the cache). Of course, this 
makes the program slower, especially when it re-reads a 
previously written cache. Instead of seconds, it needs 
hours just to re-read the cache. I really do not know why 
it checks anything on the Internet while reading the cache.

But sometimes such unwanted checking causes serious 
problems. A robot on a certain WWW site checks whether you are 
downloading only one file at a time. If not, it 
reports "massive download" and blocks you, so you 
cannot browse the website at all for some time (let's say, 
for 3 months).

When I tested HTTrack with this site, I set the proper 
option, but the program did something that got me 
detected as a mass downloader and blocked. I am sure the 
only possibility is that it tried to check whether a file 
exists at the same time as it was downloading another 
file. This should never happen! One file should mean one 
file, without exceptions, without any needless checking.

Previously I tested another website copier (Teleport Pro) 
with the same site, and when I set "only 1 file 
simultaneously", all was OK. The robot on the server could not 
detect anything suspicious. So I think that the option "1 
file" in HTTrack is not enforced strictly enough (in other 
words: it does not work at all... because if you really 
need to limit yourself to one file maximum, it REALLY must 
be obeyed, and no other download attempts, no checks whether 
files exist, no tests on the remote server, etc. should 
be permitted).
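
To make the expected behavior concrete, here is a minimal sketch in Python. It is not HTTrack's code and says nothing about how HTTrack is actually implemented; the names SerializedFetcher, CountingTransport and run_demo are hypothetical, and the transport is a fake that only counts simultaneous requests instead of touching the network. The point is that existence checks (HEAD) and downloads (GET) pass through the same single gate, so the server never sees two requests at once.

```python
import threading
import time


class SerializedFetcher:
    """Hypothetical client: every remote request shares one gate."""

    def __init__(self, transport):
        # transport is an assumed callable: transport(method, url) -> response
        self._transport = transport
        self._gate = threading.Lock()

    def _request(self, method, url):
        # Checks and downloads alike must hold the gate, so at most
        # one request of any kind is in flight at a time.
        with self._gate:
            return self._transport(method, url)

    def exists(self, url):
        return self._request("HEAD", url)

    def download(self, url):
        return self._request("GET", url)


class CountingTransport:
    """Fake server side: records the peak number of simultaneous requests."""

    def __init__(self):
        self._lock = threading.Lock()
        self._active = 0
        self.peak = 0

    def __call__(self, method, url):
        with self._lock:
            self._active += 1
            self.peak = max(self.peak, self._active)
        time.sleep(0.01)  # simulate the time a request spends on the wire
        with self._lock:
            self._active -= 1
        return f"{method} {url}"


def run_demo():
    """Issue checks and downloads from several threads; return the peak seen."""
    transport = CountingTransport()
    fetcher = SerializedFetcher(transport)
    threads = []
    for i in range(4):
        url = f"http://example.invalid/page{i}.html"  # placeholder URL
        threads.append(threading.Thread(target=fetcher.exists, args=(url,)))
        threads.append(threading.Thread(target=fetcher.download, args=(url,)))
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return transport.peak


if __name__ == "__main__":
    print("peak simultaneous requests:", run_demo())
```

If the checks used a separate code path that skipped the gate, the fake transport would record a peak above one, which is the situation the robot on the server would treat as a mass download.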

Grzegorz Jagodziński
