> Slow update:
> - Ah yes, that might be it; all pages are Perl files (.pl)
> with query strings. But that still does not explain why
> only one connection is used at a time?
It does: link testing is really slow. Set up a MIME-type
association (pl <-> text/html) and the whole process should
be MUCH faster.
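For reference, this mapping can also be passed on the command line instead of through the MIME-types panel. A hedged sketch (the URL is a placeholder, and the exact option spelling may vary between HTTrack versions, so check your build's docs):

```shell
# Sketch: tell HTTrack to treat .pl files as HTML up front, so it
# parses them directly instead of link-testing each one first.
# (--assume takes ext=mimetype pairs; verify against your version.)
httrack "http://example.com/" --assume pl=text/html
```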
> Losing info in new.* files:
> - Unfortunately I do not cancel the download and let it
> finish, but rather just disconnect at a specific time.
Speeding up the process should limit this issue, I suppose.
> How does httrack decide which files to continue
> downloading?
- html data "checked" are stored in cache ; that is, after
being parsed. Sudden stops will wipe html data downloaded
in memory but not yet checked
- Non-html data are stored in realtime on disk ; and meta-
data are stored once files are "checked". In case of sudden
break, httrack will still be able to "continue" partial
files by sending the current file size to the server. If
the server is smart, it should work.
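The "send the current file size" trick above is just an HTTP Range request: the client asks for bytes starting at the partial file's length and appends whatever comes back. A minimal sketch of the idea (my illustration, not HTTrack's actual code; the URL and path are placeholders):

```python
import os
import urllib.request

def resume_headers(partial_path):
    """Build the Range header that asks the server for only the
    bytes we do not have yet, based on the local file size."""
    offset = os.path.getsize(partial_path) if os.path.exists(partial_path) else 0
    return ({"Range": f"bytes={offset}-"} if offset else {}), offset

def continue_download(url, partial_path):
    """Append the missing tail of a partial file, if the server
    is smart enough to honor Range (206 Partial Content)."""
    headers, offset = resume_headers(partial_path)
    req = urllib.request.Request(url, headers=headers)
    with urllib.request.urlopen(req) as resp:
        # A dumb server replies 200 and resends everything; in that
        # case we must overwrite rather than append.
        mode = "ab" if resp.status == 206 else "wb"
        with open(partial_path, mode) as f:
            f.write(resp.read())
```

If the server ignores Range, the fallback is a full re-download, which matches the "if the server is smart, it should work" caveat above.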
I could improve the system by storing more html data in the
cache in real time -- might be done in a future release.