Hi!
First I have to say your tool is great!
I have done several successful mirrors with HTTrack.
But every time I want to update (with my slow ISDN
connection) it takes very long
when many pages and MANY MANY images are involved.
This happens when the main page has many links to
sub-pages (image thumbnail pages) and those
pages link to the pictures.
As far as I know, the update process of HTTrack works as
follows:
- check for changes of HTML files
- get the HTML files -> check for link changes
- get new HTML files
- check for changed images, etc. (takes a long time,
although it is only an HTTP header request)
- get new files and new files from new HTML links
Well, I think the update process could sometimes be
greatly improved in speed if the check for changed images
on the server were left out. I mean, if the reference to the
image on the HTML page is the same as before, in most cases
I can assume that the image has not changed. Thus, HTTrack
does not need to ask the server whether the file has
changed or not. On my system, even this part of the update
takes a very long time on a site with many pictures
(galleries) where a new gallery is added from time to time.
It would be nice to have an update filter.
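
To make the idea concrete, here is a minimal sketch in Python of
how such an update filter could decide what to ask the server
about. This is only an illustration of the proposal, not how
HTTrack works internally; the function name plan_update and its
parameters are made up for this example.

```python
def plan_update(old_links, new_links, is_html):
    """Sort the links found in the refreshed pages into three groups.

    old_links: URLs referenced by the mirror before the update
    new_links: URLs referenced after the HTML files were refreshed
    is_html:   function that tells whether a URL is an HTML page
    """
    to_check = set()     # HTML pages: always ask the server
    to_download = set()  # newly referenced files: fetch them
    to_skip = set()      # same reference as before: assume unchanged
    for url in new_links:
        if is_html(url):
            to_check.add(url)
        elif url in old_links:
            to_skip.add(url)
        else:
            to_download.add(url)
    return to_check, to_download, to_skip
```

With a gallery site, only the HTML pages and the pictures of a
newly added gallery would cause any traffic; every picture that
is still referenced as before ends up in to_skip.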
For now, here is a little workaround for this problem; it is
faster than the original update mode:
1.) Update the mirror, downloading HTML files first: exclude
all image files (e.g. -*.jpg ...), do not delete old files
(otherwise all image files would be deleted), and wait
until it is finished.
2.) Resume/continue the mirror, now including all files;
here only the new files are downloaded.
And voilà, you have a fully updated mirror without
checking every image.
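
For the command-line version, the two steps might be scripted
roughly as below. This is a sketch under assumptions: I take
--update, --continue, -O and -X0 (keep old files) from the
HTTrack command-line help as I understand it, so please check
them against httrack --help for your version. The image pattern
list and the function names are only examples.

```python
import subprocess

# Example patterns only; extend for the image types on your site.
IMAGE_PATTERNS = ["*.jpg", "*.jpeg", "*.gif", "*.png"]

def two_pass_commands(url, mirror_dir):
    """Build the two httrack command lines of the workaround."""
    exclude = ["-" + pattern for pattern in IMAGE_PATTERNS]
    # Step 1: update, images excluded, old files kept (assumed: -X0).
    pass1 = ["httrack", url, "-O", mirror_dir, "--update", "-X0", *exclude]
    # Step 2: continue the mirror without the image exclusions.
    pass2 = ["httrack", url, "-O", mirror_dir, "--continue"]
    return pass1, pass2

def run_two_pass(url, mirror_dir):
    """Run step 1, wait until it is finished, then run step 2."""
    for command in two_pass_commands(url, mirror_dir):
        subprocess.run(command, check=True)
```

Depending on how your version stores the options of the last
run, step 2 may need the filters reset explicitly, so it is
worth testing on a small mirror first.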
Maybe this is an idea for a faster update mode that is not
absolutely accurate in some rare cases but, I think, much
faster.
Greetings
Mathias