HTTrack Website Copier
Free software offline browser - FORUM
Subject: Re: Updates with outdated doit.log file
Author: Sherry Lochhaas
Date: 03/01/2010 19:33
 
We have a file full of scrape rules that we use on every scrape. Instead of
typing in all of these parameters, we load them from a file named
ScanRulesFull.txt (passed with -%S, as in the examples below). The file
eventually had to be moved to a new folder, so some doit.log files still have
the old file path in them.

This is an example of an old doit.log: 
-O "X:\\egRawScraped\\dublincore.org,X:\\egCache\\dublincore.org" -%S
"X:\\UpdateSW\\HTTrack\\ScanRulesFull.txt" +dublincore.org/* dublincore.org/
-iC2 -O "X:\\egRawScraped\\dublincore.org,X:\\egCache\\dublincore.org" -iC2 -O
"X:\\egRawScraped\\dublincore.org,X:\\egCache\\dublincore.org"

This is one with the new file path: 
-O "X:\\egRawScraped\\openlearn.open.ac.uk,X:\\egCache\\openlearn.open.ac.uk"
-%S "X:\\UpdateSW\\HTTrackScanRules\\ScanRulesFull.txt"
+openlearn.open.ac.uk/* openlearn.open.ac.uk/


We're just looking for a way to make the old scrape jobs use the new path to
our ScanRulesFull.txt file during an update, instead of reusing the old path
stored in their doit.log files. We (at the WiderNet Project) are working with
1400 sites, so that's a lot of logs to change by hand, but we can do that if
needed. Is there any way to override what's in the doit.log during an update?
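
If editing the logs turns out to be the only route, the change could probably
be scripted rather than done by hand. Below is a minimal Python sketch of that
idea, not an HTTrack feature. It assumes the paths are stored in each doit.log
exactly as they appear in the excerpts above (with doubled backslashes), and
that every doit.log to fix sits somewhere under a single root folder. The
function names rewrite_rules_path and update_doit_logs are made up for
illustration.

```python
import os
import shutil

# Assumption: the rules-file paths are stored exactly as in the excerpts
# above, with doubled backslashes. Adjust both values if your logs differ.
OLD_RULES = rb'X:\\UpdateSW\\HTTrack\\ScanRulesFull.txt'
NEW_RULES = rb'X:\\UpdateSW\\HTTrackScanRules\\ScanRulesFull.txt'


def rewrite_rules_path(data, old=OLD_RULES, new=NEW_RULES):
    """Return the log contents with every old rules path replaced."""
    return data.replace(old, new)


def update_doit_logs(root, old=OLD_RULES, new=NEW_RULES):
    """Rewrite each doit.log found under root; return the files changed."""
    changed = []
    for folder, _dirs, files in os.walk(root):
        if "doit.log" not in files:
            continue
        path = os.path.join(folder, "doit.log")
        # Work on raw bytes so nothing else in the log is re-encoded.
        with open(path, "rb") as fh:
            original = fh.read()
        updated = rewrite_rules_path(original, old, new)
        if updated == original:
            continue  # already points at the new path, or no rules file
        # Keep a copy of the untouched log next to it before overwriting.
        shutil.copy2(path, path + ".bak")
        with open(path, "wb") as fh:
            fh.write(updated)
        changed.append(path)
    return changed
```

Calling update_doit_logs on the folder that holds the per-site cache
directories would return the list of logs it changed. Logs that already use
the new path are left alone, so it should be safe to run more than once.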

Thanks!
 