HTTrack Website Copier
Free software offline browser - FORUM
Subject: Re: already answered
Author: Xavier Roche
Date: 07/07/2002 22:22
 
> Why is it that the cache stores the full html source and 
> not only header information for updates?
> The answer is given in:
> Re:I can reproduce it
> from Xavier

Right. In short: the HTML files stored locally are 
modified versions of the originals. For example, a link like:
<http://www.foo.fom/~smith/Bar.Html>
will be rewritten as:
_smith/bar.html

Note that the ~ character has been replaced and the uppercase 
characters have been lowercased. Such changes are necessary to 
comply with local filesystem rules (under Windows, for example, 
creating a file whose name contains the ':' character is 
impossible, and the ~ character is problematic on Unix systems), 
but they erase relevant information (the original URL).

That's why the engine has to store the "original" HTML 
somewhere (in the cache), to be able to do updates.
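
To illustrate why the rewritten name cannot be turned back into the original URL, here is a minimal Python sketch. It is not HTTrack's actual naming code: the function local_name and its two rules (replace '~' and ':' with '_', then lowercase) are assumptions made up for this example, modeled only on the link shown above.

```python
from urllib.parse import urlsplit

def local_name(url: str) -> str:
    """Hypothetical, simplified URL-to-local-path mapping (not HTTrack's real rules)."""
    # Keep only the path part, as a relative name.
    path = urlsplit(url).path.lstrip("/")
    # Replace characters that are awkward in local file names.
    for ch in "~:":
        path = path.replace(ch, "_")
    # Fold case, as in the example from the post.
    return path.lower()

a = local_name("http://www.foo.fom/~smith/Bar.Html")
b = local_name("http://www.foo.fom/_smith/bar.html")
print(a)       # _smith/bar.html
print(a == b)  # True: two different URLs end up with the same local name
```

Since two distinct URLs can map to the same local name, the mapping cannot be inverted, which is why an unmodified copy is needed for updates.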

Note that you can safely erase the "old.*" files in the 
hts-cache directory; this will save 30% of the space. You can 
also wipe the whole hts-cache directory when burning a project 
on a CD, or when distributing it.
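
The cleanup of "old.*" files can also be scripted. The following is only a sketch: the helper remove_old_cache_files is a hypothetical name, and the file names old.dat and new.dat in the demo are placeholders created in a temporary directory, not a statement about what a real hts-cache contains.

```python
import tempfile
from pathlib import Path

def remove_old_cache_files(project_dir: str) -> list:
    """Delete files matching old.* in <project_dir>/hts-cache; return their names."""
    cache = Path(project_dir) / "hts-cache"
    removed = []
    if not cache.is_dir():
        return removed  # nothing to do if there is no cache directory
    for f in sorted(cache.glob("old.*")):
        if f.is_file():
            f.unlink()
            removed.append(f.name)
    return removed

# Demo on a throwaway directory, so no real project is touched.
with tempfile.TemporaryDirectory() as project:
    cache = Path(project) / "hts-cache"
    cache.mkdir()
    for name in ("old.dat", "new.dat"):  # placeholder file names
        (cache / name).touch()
    removed = remove_old_cache_files(project)
    remaining = sorted(p.name for p in cache.iterdir())

print(removed)    # ['old.dat']
print(remaining)  # ['new.dat']
```

Try it on a copy of the project first if you are unsure.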

> HTTrack really is a useful tool!

Thanks :) There are some remaining bugs, but I expect to wipe 
them all out soon.
 