We have a file full of scrape rules that we use on every scrape. Instead of
typing in all of these parameters each time, we just call a file named
ScanRulesFull.txt. The file eventually had to be moved to a new folder, so
some doit.logs still have the old file path in them.
This is an example of an old doit.log:
-O "X:\\egRawScraped\\dublincore.org,X:\\egCache\\dublincore.org" -%S
"X:\\UpdateSW\\HTTrack\\ScanRulesFull.txt" +dublincore.org/* dublincore.org/
-iC2 -O "X:\\egRawScraped\\dublincore.org,X:\\egCache\\dublincore.org" -iC2 -O
"X:\\egRawScraped\\dublincore.org,X:\\egCache\\dublincore.org"
This is one with the new file path:
-O "X:\\egRawScraped\\openlearn.open.ac.uk,X:\\egCache\\openlearn.open.ac.uk"
-%S "X:\\UpdateSW\\HTTrackScanRules\\ScanRulesFull.txt"
+openlearn.open.ac.uk/* openlearn.open.ac.uk/
We're just looking for a way to make the old scrape jobs use the new file
path for ScanRulesFull.txt during an update, instead of reusing the old path
stored in the doit.logs. We (at the WiderNet Project) are working with 1,400
sites, so that's a lot of logs to change by hand, but we can do it if needed.
Is there any way to override what's in the doit.log during an update?
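If there's no built-in way to override it, we could script the manual fix
instead of editing 1,400 logs one at a time. Here's a minimal sketch of what
we had in mind, assuming the doit.log files all sit somewhere under one
parent folder and that the old path appears in them exactly as pasted above
(both assumptions we'd verify against a real log before running anything):

import os

# All of these are assumptions about our own layout; adjust before running.
ROOT = r"X:\egCache"  # top folder to search for doit.log files

# The old and new rule-file paths, exactly as they appear inside the logs.
# Our pasted examples show doubled backslashes; if the logs actually contain
# single backslashes, drop the doubling here.
OLD = r"X:\\UpdateSW\\HTTrack\\ScanRulesFull.txt"
NEW = r"X:\\UpdateSW\\HTTrackScanRules\\ScanRulesFull.txt"

for dirpath, dirnames, filenames in os.walk(ROOT):
    if "doit.log" not in filenames:
        continue
    log = os.path.join(dirpath, "doit.log")
    with open(log, "r", encoding="utf-8", errors="replace") as f:
        text = f.read()
    if OLD in text:
        # Rewrite the log with the new rule-file path substituted in.
        with open(log, "w", encoding="utf-8") as f:
            f.write(text.replace(OLD, NEW))
        print("updated:", log)

We'd do a dry run first (comment out the write and keep just the print) to
sanity-check which logs match before letting it rewrite anything. But we'd
still prefer an option that makes the update ignore the stored path, if one
exists.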
Thanks!