Hi there,
- I have a list of 500 html urls that I'm giving to httrack to mirror.
- After httrack is done I run a script that checks to see if all the urls were
successfully downloaded.
- The www.cisco.com webserver I'm mirroring the urls from does not return a
Last-Modified header.
I'm finding a few urls that weren't downloaded.
I'd like to submit just the few missing urls to httrack for download. Is that
a good approach? Or should I just rerun httrack using the original
filelist?
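
To make the "resubmit only the missing urls" idea concrete, here is a minimal sketch of the check step. It assumes the mirror is laid out as <dest>/<host>/<path>, which may not hold if httrack renames files (for example by MIME type) or if a different structure option is in use. The function names and the missing.txt file name are hypothetical, not part of httrack.

```python
import os
from urllib.parse import urlsplit


def expected_path(url, dest):
    """Map a URL to the file it should occupy under dest.

    ASSUMPTION: the mirror uses a <dest>/<host>/<path> layout and
    directory URLs are saved as index.html.
    """
    parts = urlsplit(url)
    rel = parts.path.lstrip("/")
    if not rel or rel.endswith("/"):
        rel += "index.html"
    return os.path.join(dest, parts.netloc, *rel.split("/"))


def find_missing(urls, dest):
    """Return the URLs whose expected file does not exist under dest."""
    return [u for u in urls if not os.path.isfile(expected_path(u, dest))]


def read_urls(filelist):
    """Read one URL per line, skipping blank lines."""
    with open(filelist) as fh:
        return [line.strip() for line in fh if line.strip()]


def write_urls(urls, out_path):
    """Write one URL per line, in the same format as the original filelist."""
    with open(out_path, "w") as fh:
        fh.writelines(u + "\n" for u in urls)


if __name__ == "__main__":
    # Paths taken from the command in this post; missing.txt is hypothetical.
    urls = read_urls("/var/local/c3cases/httrack//filelist")
    missing = find_missing(urls, "/var/local/httrack/dest/")
    write_urls(missing, "missing.txt")
    print(len(missing), "of", len(urls), "urls look missing")
```

The resulting missing.txt could then be passed to -%L in place of the original filelist. Whether that is better than rerunning the full list is exactly the open question here.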
What would be the best way to go about ensuring I get a complete mirror?
Will the lack of a Last-Modified header impact httrack's ability to cache / update
the mirror?
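
To confirm per url whether the server really omits the header, a small check along these lines might help. It is only a sketch: it assumes the server answers HEAD requests, and the function names are hypothetical.

```python
import urllib.request


def fetch_headers(url, timeout=10):
    """Send a HEAD request and return the response headers as a dict."""
    req = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return dict(resp.headers.items())


def has_last_modified(headers):
    """True if a Last-Modified header is present (names are case-insensitive)."""
    return any(name.lower() == "last-modified" for name in headers)


if __name__ == "__main__":
    url = ("http://www.cisco.com/en/US/tech/tk365/"
           "technologies_tech_note09186a00800945fe.shtml")
    print(has_last_modified(fetch_headers(url)))
```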
The httrack version I'm using is: HTTrack3.44-1-noV6
The httrack command I'm using is:
httrack -r2 -n -z -iC2 -%L /var/local/c3cases/httrack//filelist -O /var/local/httrack/dest/ +*.js -*.pdf
Here's an example of a url that I'm trying to mirror:
<http://www.cisco.com/en/US/tech/tk365/technologies_tech_note09186a00800945fe.shtml>