HTTrack Website Copier
Free software offline browser - FORUM
Subject: Re: Only download files not existing locally
Author: Xavier Roche
Date: 10/01/2002 19:15
 
> Here is my situation.
> We use (win)httrack to create static mirrors for otherwise
> dynamic sites.
> One of our projects creates a very large site (around
> 15000-2000 files, 95% originally dynamic html files).
> The complete httrack mirror takes approx 12 hours.
> Of course we only create the mirror after we have concluded
> the dynamic site is 100% correct.
> Sometimes during the mirror the response contains a
> ColdFusion server error message, something like:
> Error processing request...
> Typically indicating server overload.
> Note, this is still a 200 response!
> It is not hard to find these cases in the mirrored site.
> (Simply do a string find in the mirrored tree)
> What I would like to do is delete the mirrored files with
> errors, then update the mirror, without processing all
> correct files again!
> Is it possible to achieve what I would like to do?
> What other tweaks do I possibly need?

Wow, that is quite hard to handle. One solution might be to
patch the hts-cache/new.ndx index file and invalidate the
entries that were incorrectly downloaded, by replacing the
link path with XXXX. Make sure that you do not change the
index size by inserting characters. This is quite tricky to
do, though.
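
To illustrate the idea, here is a rough Python sketch, not a
tested recipe. It assumes that the link path appears as plain
text inside hts-cache/new.ndx, which is not confirmed here, so
check the file and work on a backup copy first. The function
names find_error_pages and mask_in_place are made up for this
example. The first does the "string find" over the mirrored
tree, the second overwrites a link path with X characters of
the same length so the index size stays the same.

```python
import os


def find_error_pages(mirror_root, marker):
    """Return the mirrored files whose content contains `marker`."""
    needle = marker.encode("utf-8")
    hits = []
    for dirpath, dirs, files in os.walk(mirror_root):
        # Skip the cache directory itself; only scan mirrored pages.
        dirs[:] = [d for d in dirs if d != "hts-cache"]
        for name in files:
            path = os.path.join(dirpath, name)
            with open(path, "rb") as fh:
                if needle in fh.read():
                    hits.append(path)
    return sorted(hits)


def mask_in_place(index_path, link_path):
    """Overwrite every occurrence of `link_path` with 'X' bytes.

    The replacement has exactly the same length as the original,
    so the file size never changes. Returns the number of
    occurrences that were masked.
    ASSUMPTION: the index stores the link path as plain text.
    """
    needle = link_path.encode("utf-8")
    if not needle:
        raise ValueError("link_path must not be empty")
    with open(index_path, "rb") as fh:
        data = fh.read()
    count = data.count(needle)
    if count:
        patched = data.replace(needle, b"X" * len(needle))
        if len(patched) != len(data):
            raise RuntimeError("index size would change, aborting")
        with open(index_path, "wb") as fh:
            fh.write(patched)
    return count
```

How a file name on disk maps back to the link path stored in
the index depends on your mirror settings, so that mapping is
left to you: look up each file reported by find_error_pages,
pass the matching link path to mask_in_place, and delete the
bad file before updating the mirror.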

Another way (if server overload is the reason for all the
problems) would be to reduce the load on the server by
setting the maximum number of simultaneous connections to
2 or 3 in httrack.
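
For the command-line version, the connection limit is the -cN
option (long form --sockets=N). As a small sketch, the Python
helper below builds such a command line. The helper name
build_update_command and the mirror path are only examples,
and it assumes that httrack is on your PATH and that the
mirror directory already holds the project's hts-cache.

```python
def build_update_command(mirror_dir, max_connections=2):
    """Build an httrack command that updates an existing mirror
    while limiting the number of simultaneous connections."""
    if max_connections < 1:
        raise ValueError("max_connections must be at least 1")
    return [
        "httrack",
        "--update",                 # update the existing mirror
        "-O", mirror_dir,           # path of the mirror and its cache
        "-c%d" % max_connections,   # number of simultaneous connections
    ]


if __name__ == "__main__":
    # Hypothetical path; replace it with your own project directory.
    print(" ".join(build_update_command("/path/to/mirror", 2)))
```

The resulting list can be passed to subprocess.run(). Fewer
connections make the mirror slower, but should make overload
errors from the server less likely.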

 