HTTrack Website Copier
Free software offline browser - FORUM
Subject: Filelist / Re-running httrack
Author: Pete
Date: 09/28/2011 05:35
 
Hi there,
- I have a list of 500 HTML URLs that I'm giving to httrack to mirror.
- After httrack is done, I run a script that checks whether all the URLs were
successfully downloaded.
- The www.cisco.com web server I'm mirroring the URLs from does not return a
Last-Modified header.
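
For reference, the post-run check described in the list above could look roughly like the sketch below. It is not the actual script. It assumes the filelist holds one URL per line and that httrack's default site-structure layout is in use, so a URL is expected at <dest>/<host>/<path>. The helper names expected_path and missing_urls are made up for illustration, and the mapping ignores query strings and any extension renaming httrack may apply.

```python
from pathlib import Path
from urllib.parse import urlsplit


def expected_path(dest: Path, url: str) -> Path:
    """Map a URL to the file it is assumed to occupy under the mirror root."""
    parts = urlsplit(url)
    rel = parts.path.lstrip("/")
    # Assumption: a URL with no file name is stored as index.html.
    if not rel or rel.endswith("/"):
        rel += "index.html"
    return dest / parts.netloc / rel


def missing_urls(filelist: Path, dest: Path) -> list[str]:
    """Return the URLs listed in filelist that have no file under dest."""
    missing = []
    for line in filelist.read_text().splitlines():
        url = line.strip().strip("<>")  # tolerate <...> wrapped URLs
        if not url:
            continue  # skip blank lines
        if not expected_path(dest, url).is_file():
            missing.append(url)
    return missing
```

Writing the returned list to a second file, one URL per line, would give a smaller filelist containing only the URLs that still need to be fetched.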

I'm finding that a few URLs weren't downloaded.

I'd like to submit just the few missing URLs to httrack for download. Is that
a good approach, or should I just rerun httrack with the original filelist?
What would be the best way to ensure I get a complete mirror?
Will the lack of a Last-Modified header affect httrack's ability to cache / update
the mirror?
The httrack version I'm using is: HTTrack3.44-1-noV6
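
To make the Last-Modified question concrete: in plain HTTP, a client can only revalidate a cached copy if the server sent a validator, meaning Last-Modified (used with If-Modified-Since) or ETag (used with If-None-Match). The sketch below only reports which validators a response carries. It says nothing about how httrack itself uses them, and the function names are made up for illustration.

```python
import urllib.request


def cache_validators(headers: dict[str, str]) -> dict[str, str]:
    """Return the response headers usable for a conditional re-fetch."""
    wanted = {"last-modified", "etag"}
    # Header names are case-insensitive, so compare in lower case.
    return {k.lower(): v for k, v in headers.items() if k.lower() in wanted}


def fetch_validators(url: str, timeout: float = 10.0) -> dict[str, str]:
    """Issue a HEAD request and report the validators the server sent."""
    req = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return cache_validators(dict(resp.headers.items()))
```

An empty result from fetch_validators for one of the URLs in the filelist would mean the server gives a client nothing to revalidate against, so an update run may have to re-download that page in full.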

The httrack command I'm using is:
httrack -r2 -n -z -iC2 -%L /var/local/c3cases/httrack//filelist -O /var/local/httrack/dest/ +*.js -*.pdf

Here's an example of a URL that I'm trying to mirror:
<http://www.cisco.com/en/US/tech/tk365/technologies_tech_note09186a00800945fe.shtml>
 