HTTrack Website Copier
Free software offline browser - FORUM
Subject: Update procedure questions
Author: Joao Luzio
Date: 04/07/2005 18:00
Hi all,
  I recently noticed that some updates of a mirror (using
--update) can return considerably more files than links
(on dynamic websites). By links I mean the entries that
appear in the cache files (new.txt).
I got results like 17000 files vs. 1650 links, or 110000 files
vs. 25000 links, although not every update shows this kind of
difference.

After some research, I managed to find out that:

(Calling mirror1 the first mirror process, updateOf1 the 2nd
mirror that updated 'mirror1', updateOf2 the 3rd mirror
process that updated 'updateOf1'.)

1 - If mirror1 went OK but updateOf1 had problems (for
example, network problems), an update of it (updateOf2) will
have more files (and I'm not talking about hts-cache/*).
In new.txt the entries for these URLs appear as added
instead of updated.
2 - In some cases these extra files still exist in the
original website.
3 - Some updates just return more files regardless of
whether there were errors.
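
For anyone who wants to check the added/updated split on their own mirror, here is a rough sketch of how the entries could be tallied. It assumes new.txt is tab-separated and that the status word (added, updated, error) starts the fifth column; that column position, the header check, and the sample rows are my assumptions for illustration, not real HTTrack output.

```python
from collections import Counter

def tally_statuses(lines, status_col=4):
    """Count log entries by the first word of the status column.

    status_col=4 is an assumption about the new.txt layout.
    """
    counts = Counter()
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) <= status_col:
            continue  # skip blank or short lines
        words = fields[status_col].split()
        if not words or fields[0] == "date":
            continue  # skip empty status or an assumed header row
        counts[words[0]] += 1
    return counts

# Hypothetical sample rows, made up for this sketch.
sample = [
    "date\tsize\tflags\tstatuscode\tstatus ('servermsg')\tMIME\tEtag\tURL\tlocalfile",
    "t1\t10/10\t-\t200\tadded ('OK')\ttext/html\t-\thttp://example.com/a\ta.html",
    "t2\t20/20\t-\t200\tupdated ('OK')\ttext/html\t-\thttp://example.com/b\tb.html",
    "t3\t30/30\t-\t200\tadded ('OK')\ttext/html\t-\thttp://example.com/c\tc.html",
    "t4\t0/0\t-\t404\terror ('Not Found')\ttext/html\t-\thttp://example.com/d\td.html",
]

counts = tally_statuses(sample)
for status, n in sorted(counts.items()):
    print(status, n)
```

On a real mirror the lines would come from open("hts-cache/new.txt"), and a large "added" count after an update would match what I describe in point 1.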

So, my questions are:
1 - Is this a bug?
2 - How can I prevent this? (I was thinking along the lines of
going back to the last 'good' mirror as the base for updating
after one update has errors, but that might be overkill for
big mirrors.)
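
To make the 'good mirror' idea concrete, this is roughly what I have in mind: copy the whole mirror directory (hts-cache included) before each update, and put the copy back if the update had errors. The function names and directory names below are hypothetical, and the demo runs on a throwaway directory rather than a real mirror. The full copy is exactly the part that may be too expensive for big mirrors.

```python
import shutil
import tempfile
from pathlib import Path

def snapshot(mirror_dir, backup_dir):
    """Copy the whole mirror to backup_dir, replacing any older copy."""
    backup = Path(backup_dir)
    if backup.exists():
        shutil.rmtree(backup)
    shutil.copytree(mirror_dir, backup)

def restore(mirror_dir, backup_dir):
    """Discard the current mirror and put the saved copy back."""
    mirror = Path(mirror_dir)
    if mirror.exists():
        shutil.rmtree(mirror)
    shutil.copytree(backup_dir, mirror)

# Demo on a temporary directory standing in for a real mirror.
root = Path(tempfile.mkdtemp())
mirror, backup = root / "mirror1", root / "mirror1.good"
mirror.mkdir()
(mirror / "index.html").write_text("good copy")

snapshot(mirror, backup)                              # before running the update
(mirror / "index.html").write_text("broken update")  # simulate a failed update
restore(mirror, backup)                               # roll back to the good mirror

restored_text = (mirror / "index.html").read_text()
print(restored_text)
```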

Thanks in advance,
  João Luzio
