> But if we come to a "border" of 60000..80000
> detected links after several hours of runtime,
> HTTRACK slows down dramatically: It never stops or
> hangs, and downloads of files are fast all the time,
There is clearly a bottleneck somewhere in the code (probably a linear or
quadratic loop) that would explain this phenomenon.
If someone has the necessary tools to investigate, I'll be happy to fix the
problem. In any case, I'll try to set up a real test to track down the problem
when I have some time.
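
To illustrate the kind of bottleneck suspected above, here is a minimal Python sketch. It is NOT HTTrack's actual code; the function names and the example.com URLs are hypothetical. It only shows how a per-link linear scan over the already-known links adds up to quadratic total work, while a hash-based lookup stays roughly linear.

```python
def comparisons_with_list(n_links):
    """Hypothetical dedup check that scans a plain list for every new link."""
    seen = []
    comparisons = 0
    for i in range(n_links):
        url = f"http://example.com/page{i}"  # made-up, all-distinct URLs
        for known in seen:
            comparisons += 1
            if known == url:
                break
        else:
            # No match found after scanning every known link.
            seen.append(url)
    return comparisons


def lookups_with_set(n_links):
    """Same check using a hash set: one lookup per link on average."""
    seen = set()
    lookups = 0
    for i in range(n_links):
        url = f"http://example.com/page{i}"
        lookups += 1
        if url not in seen:
            seen.add(url)
    return lookups


if __name__ == "__main__":
    for n in (1000, 2000, 4000):
        print(n, comparisons_with_list(n), lookups_with_set(n))
```

With n distinct links the list version performs n(n-1)/2 comparisons, so doubling the number of links roughly quadruples the work. At 70000 links that is about 2.45 billion comparisons, which would be consistent with a crawl that keeps running and keeps downloading quickly but spends more and more time on bookkeeping. Whether this is what actually happens in HTTrack can only be confirmed by profiling.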
Regarding "large mirroring projects", a dedicated web crawler is probably a
good choice. HTTrack is better suited to small and medium ones.