Breaking HTTrack - HTTrack Website Copier Forum

Subject: Breaking HTTrack
Author: Grzegorz
Date: 12/04/2003 11:03
Why does it take SO LONG just to produce HTML files from 
the cache (when the program says "finishing pending 
transfers")? The question is not unreasonable. I can give 
an example.

While I mirror files, I can see status information such 
as "links scanned 197/2827" and "files written 1340". Let's 
say that I must interrupt the program for a while. I 
click "Cancel" once and HTTrack stops looking for new links.

There were 1340 files already written to the cache and only 
197 of them had been scanned for links. The program 
needed ca. one hour (!!!) just to write out the rest of the 
files, i.e. to turn the cached files into real files in the 
project folder. In fact, I have tried projects so big that 
when I needed to interrupt them for a while, it took 3 hours 
or even more!

I have never seen another program which needs 3 hours 
between the "finish" command and actually finishing. And 
if I had killed the program immediately, ALL MY WORK WOULD 
HAVE BEEN LOST! Terrible...

So, my choice was: either to wait one hour (or even three 
hours in the other example) until the program finished the 
project, or to kill the program immediately and lose 
all the results it had collected over many 
hours. I am convinced that this behaviour is a 
very nasty bug in the program.

What is more, I am sure that all programs should have an 
option that lets you save all the results of their work 
within seconds and never within hours! Other 
programs have such an option (e.g. another web mirroring 
tool, Teleport Pro).

I am even convinced that there should be an autosave 
function. That is, when your computer stops responding 
(when it hangs), you should be able to resume 
your work from the point where it stopped (or from as close 
to that point as possible). Most other programs offer 
such a possibility...

In its present form, HTTrack makes no autosave possible. I 
strongly suggest rethinking the design of the cache itself. 
As it is now, for an unknown reason the program cannot save 
its current in-memory state (I mean all the data which 
the program is using at a given moment while it is working) 
and, as a result, it needs many hours to re-read the cache 
instead of simply loading back into memory all the data 
it used previously and continuing work exactly from the 
point where it stopped. Save features in computer 
games work this way, so it should not be very hard to 
implement such a feature in HTTrack.
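
To make the autosave idea concrete, here is a minimal sketch in 
Python. It is only an illustration and says nothing about how 
HTTrack's cache actually works: the function names, the JSON 
layout and the checkpoint file are all hypothetical. It assumes 
the crawler's state can be reduced to a queue of pending URLs 
and a set of URLs already done. The state is written to a 
temporary file and then renamed over the old checkpoint, so a 
hang or crash in the middle of a save should leave the previous 
checkpoint intact.

```python
import json
import os
import tempfile
from collections import deque


def save_checkpoint(path, pending, done):
    """Write the crawl state to `path` via a temp file plus rename.

    Hypothetical state layout: `pending` is an ordered collection of URLs
    still to fetch, `done` is a set of URLs already written to disk.
    """
    state = {"pending": list(pending), "done": sorted(done)}
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file in the same directory so the rename stays
    # on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            json.dump(state, f)
            f.flush()
            os.fsync(f.fileno())  # push the data to disk before renaming
        os.replace(tmp_path, path)  # replaces the old checkpoint in one step
    except BaseException:
        # On any failure, drop the partial temp file and keep the old checkpoint.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


def load_checkpoint(path):
    """Return (pending, done); both are empty if no checkpoint exists yet."""
    if not os.path.exists(path):
        return deque(), set()
    with open(path, encoding="utf-8") as f:
        state = json.load(f)
    return deque(state["pending"]), set(state["done"])
```

A mirroring loop could call save_checkpoint every few seconds or 
every N files, and call load_checkpoint once at startup to 
continue from the last saved state instead of rebuilding it.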

Grzegorz Jagodziński