> The rules for 'creating .tmp files' and 'overwriting
> still existing files' and the situation of 'having
> corrupted, because unfinished downloaded files' is
> confusing me.
tmp files are internal, as William noticed. They are intermediate files used
for slots (downloads) not yet validated by the cache. However, pure "data"
files should be stored on disk anyway and, upon "continue an interrupted
mirror", be left as is.
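As a rough sketch (not HTTrack's actual code — the `fetch` helper and its
behaviour are hypothetical), the .tmp slot idea looks like this: write into a
temporary file, and only rename it into place once the download completed, so
an interrupted mirror never leaves a half-written "real" file behind:

```python
import os
import urllib.request

def fetch(url, dest):
    """Download url into a .tmp slot; only a completed (validated)
    download is promoted to the real filename (hypothetical helper)."""
    tmp = dest + ".tmp"
    with urllib.request.urlopen(url) as resp, open(tmp, "wb") as out:
        while True:
            chunk = resp.read(64 * 1024)
            if not chunk:
                break
            out.write(chunk)
    # Download finished: atomically replace/create the real file.
    os.replace(tmp, dest)
```

If the process dies mid-transfer, only the .tmp slot is left over, which is
why those files can safely be treated as internal and discarded.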
The problem is with html files: if not validated by the cache, there is no way
to simply leave the files on disk, because their links have to be completely
rewritten.
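To illustrate why html can't just be kept as-is: every link pointing at the
original server has to be mapped to a local path. A minimal sketch (a real
mirror tool parses the HTML properly rather than using a regex; the function
name and mapping are made up for the example):

```python
import re

def rewrite_links(html, url_to_local):
    """Rewrite href/src attributes so a saved page points at the
    local copies instead of the original server (illustrative only)."""
    def repl(m):
        attr, quote, url = m.group(1), m.group(2), m.group(3)
        return "%s=%s%s%s" % (attr, quote, url_to_local.get(url, url), quote)
    return re.sub(r'\b(href|src)=(["\'])(.*?)\2', repl, html)
```

So an html file interrupted mid-download is doubly unusable: incomplete *and*
not yet rewritten.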
> So, why isn't there an option which just compares
> the file structure web<->hd, copies the file if
There is no such option in HTTP, unfortunately. Web crawlers depend on link
analysis, and on the server's ability to reply correctly to refresh requests
(conditional GETs such as If-Modified-Since) without re-downloading everything.
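For the curious, a refresh request in plain standard-library Python looks
roughly like this (the `refresh` helper is hypothetical, but If-Modified-Since
and the 304 Not Modified reply are standard HTTP):

```python
import urllib.request
import urllib.error

def refresh(url, last_modified):
    """Ask the server whether our cached copy is still fresh.
    Returns the new body on 200, or None on 304 Not Modified."""
    req = urllib.request.Request(
        url, headers={"If-Modified-Since": last_modified})
    try:
        with urllib.request.urlopen(req) as resp:
            return resp.read()      # 200: content changed, re-save it
    except urllib.error.HTTPError as e:
        if e.code == 304:
            return None             # 304: keep the file already on disk
        raise
```

Whether this works well depends entirely on the server sending sane
Last-Modified dates and honouring the conditional request.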