HTTrack Website Copier
Free software offline browser - FORUM
Subject: It is a bug - Xavier!
Author: William Roeder
Date: 07/31/2010 19:02
 
> > Anything not already processed at the time of
> > interruption has been lost from the cache
> > and will always be redownloaded.
> 
> But non-redirected files that were downloaded hours
> or days later are not redownloaded.  This only (and
> always) happens with these same files.
> 
> > Dynamic content (content created on the fly)
> > always has a newer timestamp
> > and is always redownloaded.
> 
> But this content is static (mp3, mp4, etc).
> 
> One place where this seems to happen is
> <http://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-013-electromagnetics-and-applications-fall-2005/textbook-with-video-demonstrations/>
Now that you have given a URL, I see the problem. That page links (mp4 and
rm) to
<http://www.archive.org/download/MitOpencoursewareElectromagneticFieldsAndEnergy/>
Instead of using DNS rotation for load distribution, archive.org responds
with a 302 (moved temporarily) redirect to
<http://ia341339.us.archive.org/1/items/MitOpencoursewareElectromagneticFieldsAndEnergy/>
so HTT gets the files from there and stores them under that host.
On the update, archive.org redirects to a different host (ia341338 when I
tried). Since this is a new host, the files are new by definition and are
downloaded again. The original files are purged at the end.
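
To make the mechanism concrete, here is a minimal Python sketch of why a changing redirect target defeats the cache. It is only a model of the behaviour described above, not HTTrack's actual code: the function name and the file name demo.mp4 are hypothetical, and it assumes the cache is keyed by the host that finally served the file.

```python
from urllib.parse import urlsplit

def cache_key_by_final_host(final_url):
    """Hypothetical cache key: host that served the file plus its path."""
    parts = urlsplit(final_url)
    return parts.netloc + parts.path

# First mirror run: archive.org redirected to ia341339.
# (demo.mp4 is a made-up file name used for illustration.)
first_run = cache_key_by_final_host(
    "http://ia341339.us.archive.org/1/items/"
    "MitOpencoursewareElectromagneticFieldsAndEnergy/demo.mp4")

# Update run: the same link now redirects to ia341338.
update_run = cache_key_by_final_host(
    "http://ia341338.us.archive.org/1/items/"
    "MitOpencoursewareElectromagneticFieldsAndEnergy/demo.mp4")

# The keys differ, so the file looks new and is fetched again.
needs_download = update_run != first_run
print(needs_download)
```

The file itself has not changed; only the host part of the key has, which is enough to make it look like a new file.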

Xavier, on 302s please store the file in the original host's directory
structure, not the redirect target's, and on update test against the
original's timestamp to avoid redownloading.
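
A rough sketch of what I am suggesting, in Python. This is an illustration under my own assumptions, not a patch for HTTrack: the function names, the dict used as a cache, the file name demo.mp4, and the timestamp values are all hypothetical.

```python
from urllib.parse import urlsplit

def local_path(original_url):
    """Store under the ORIGINAL host's directory, ignoring any 302 target."""
    parts = urlsplit(original_url)
    return parts.netloc + parts.path

def needs_redownload(cache, original_url, remote_timestamp):
    """Compare the remote timestamp with the one saved for the original URL."""
    stored = cache.get(local_path(original_url))
    return stored is None or remote_timestamp > stored

# Hypothetical cache: local path -> timestamp saved on the first run.
cache = {}
url = ("http://www.archive.org/download/"
       "MitOpencoursewareElectromagneticFieldsAndEnergy/demo.mp4")
cache[local_path(url)] = 1280000000  # made-up timestamp (seconds)

# Whichever ia3413xx host the 302 points to on update, the lookup key
# is still the www.archive.org path, so an unchanged file is skipped.
unchanged = needs_redownload(cache, url, 1280000000)
changed = needs_redownload(cache, url, 1280000500)
print(unchanged, changed)
```

With the key tied to the link as written on the page, the redirect host can change on every update without triggering a new download.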

Until this is fixed, I suggest you add the ia341338 URL to the mirror and
add the filters -www.archive.org/*.mp4 -www.archive.org/*.rm
 