> If I pause httrack as above and then try to unzip
> the new.zip file, I get:
>
> "End-of-central-directory signature not found"
>
> I can unzip the old.zip file, but not the new.zip
> one. And unfortunately, I need to access the
> original content as it was stored in the new.zip
> file.
>
> Any suggestion as to how to circumvent this
> problem?
I just thought of one possible solution, but I'm wondering if there's an
easier way.
I could wrap httrack inside a script that repeatedly calls it with:
--max-time 60 --continue
In other words, continue the previous crawl for 60 seconds. That way, if the
wrapper script is interrupted, httrack will exit gracefully after a maximum of
60 seconds.
The problem I see with this is that new.txt, old.txt, new.zip and old.zip
will contain at most the URLs that were downloaded in the last two 60-second
crawls. So my wrapper would have to do a bit of work to concatenate those into
cumulative files, say old-cumulative.txt and old-cumulative.zip.
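Something like the following minimal sketch is what I have in mind. The file
names (new.txt, old-cumulative.txt) and the stop condition are assumptions on
my part, based on the behaviour described above, not a tested script:

```shell
#!/bin/sh
# Sketch: re-run httrack in 60-second slices and fold each run's URL
# list into a cumulative file. Assumes httrack writes the per-run list
# to new.txt in the current directory (an assumption, adjust to taste).

accumulate() {
    # Append this run's URL list to the cumulative list, then
    # deduplicate so repeated runs don't inflate the file.
    run_list="$1"
    cumulative="$2"
    [ -f "$run_list" ] || return 0
    cat "$run_list" >> "$cumulative"
    sort -u -o "$cumulative" "$cumulative"
}

# Main loop: continue the previous crawl for 60 seconds at a time,
# accumulating results after each slice, until httrack exits non-zero
# (hypothetical stop condition -- a real script would want something
# more robust, e.g. checking whether new.txt is empty).
if command -v httrack >/dev/null 2>&1; then
    while httrack --max-time 60 --continue; do
        accumulate new.txt old-cumulative.txt
    done
fi
```

The same accumulate step could be applied to the zip files, though merging
zip archives is messier than concatenating text lists.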
Is there an easier way to reach my goal than this? It's not hard to
implement, but it's still a non-negligible amount of work.
Alain