HTTrack Website Copier
Free software offline browser - FORUM
Subject: Re: clarification on the update functionality needed
Author: Xavier Roche
Date: 04/17/2013 21:21
 
> > Is there functionality in httrack not to save the
> > html files if they were previously saved?
To be more precise: during an update, the html file has to be revalidated
remotely in all cases (that is, httrack asks the server whether the file has
changed or not).

(The html file is rewritten on disk anyway, because link computation may
change the link layout: for example, foo.html and Foo.html might be different
files, and might be encountered in a different order this time, so that the
local foo.html and foo2.html files end up holding different pages.)
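
To illustrate why crawl order matters for local names, here is a minimal sketch in Python. It is NOT httrack's actual naming code: the function assign_local_names and its rule (lower-case the name, append a counter on collision) are assumptions made up for this example.

```python
import os

def assign_local_names(urls):
    """Give each remote name a local name that is unique ignoring case.

    Hypothetical rule for illustration only: lower-case the name and
    append 2, 3, ... before the extension until it no longer collides.
    """
    taken = set()    # local names already handed out
    mapping = {}     # remote name -> local name
    for url in urls:
        stem, ext = os.path.splitext(url.lower())
        candidate, n = stem + ext, 1
        while candidate in taken:
            n += 1
            candidate = f"{stem}{n}{ext}"
        taken.add(candidate)
        mapping[url] = candidate
    return mapping

# Same two pages, seen in a different order on the second run:
first_run = assign_local_names(["foo.html", "Foo.html"])
second_run = assign_local_names(["Foo.html", "foo.html"])
print(first_run)
print(second_run)
```

In this sketch the page behind foo2.html differs between the two runs, so every html file linking to either page would need its links rewritten even if the remote content is unchanged.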

The real problem is that MANY servers DO NOT BOTHER to honor update requests:
they always return a "this page has been modified" status when httrack asks
"has this page changed since last time?" - and you end up retransmitting all
the data. That's unfortunate, but I cannot do anything against lazy webmasters
:(
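
For reference, the "has this page changed since last time?" question is a standard HTTP conditional GET. The following is a rough Python sketch of the idea, not httrack's code: the helper names conditional_headers and revalidate are invented here, and it assumes you kept the Last-Modified and/or ETag values from the previous download.

```python
import urllib.error
import urllib.request

def conditional_headers(last_modified=None, etag=None):
    """Build the request headers that ask 'changed since last time?'."""
    headers = {}
    if last_modified:
        headers["If-Modified-Since"] = last_modified  # value of the old Last-Modified
    if etag:
        headers["If-None-Match"] = etag               # value of the old ETag
    return headers

def revalidate(url, last_modified=None, etag=None):
    """Return (changed, body); body is None when the server answers 304."""
    request = urllib.request.Request(
        url, headers=conditional_headers(last_modified, etag))
    try:
        with urllib.request.urlopen(request) as response:
            # 200: the server says "modified" and sends the whole page again
            return True, response.read()
    except urllib.error.HTTPError as err:
        if err.code == 304:
            # 304 Not Modified: nothing is retransmitted
            return False, None
        raise
```

A server that handles these headers properly answers 304 with no body when nothing changed. A server that ignores them answers 200 with the full page every time, which is the retransmission problem described above.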

 