HTTrack Website Copier
Free software offline browser - FORUM
Subject: Re: Redirects getting redownloaded many times
Author: Xavier Roche
Date: 04/07/2004 21:35
 
> I'm doing regular downloads of www.jp.dk, and I notice a
> disturbing effect:  When a page has moved, the 302 headers
> may get downloaded many times without the page itself ever
> appearing.

This is a big issue that I have wanted to fix for some time. I have
finished coding the fix, but it still requires some testing :)
The problem is simple: files such as /foo require an
additional test request, and the results of these tests are not
immediately taken into account. In a "redirect loop", intermediate
states are not saved, leading to new requests if the link is seen again.
I added a cache for all these states, so test requests
should no longer be made twice. Please give your feedback
on this new release.
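
As a rough illustration of the idea, here is a minimal Python sketch. It is not HTTrack's actual code (HTTrack is written in C), and every name in it (RedirectCache, fetch_head, FAKE_SITE, the example.com URLs) is hypothetical. It only shows the principle: the result of each test request, including intermediate redirects, is stored, so seeing the same link again does not trigger a new request.

```python
from typing import Callable, Dict, Optional, Tuple

# Hypothetical shape of a test result: (HTTP status, redirect target or None)
TestResult = Tuple[int, Optional[str]]


class RedirectCache:
    """Stores the result of every test request, including intermediate 3xx states."""

    def __init__(self, fetch_head: Callable[[str], TestResult]) -> None:
        self._fetch_head = fetch_head  # assumed function that performs the real request
        self._states: Dict[str, TestResult] = {}
        self.requests_made = 0  # counts real requests, for demonstration only

    def test(self, url: str) -> TestResult:
        # Only hit the network the first time a URL is tested.
        if url not in self._states:
            self.requests_made += 1
            self._states[url] = self._fetch_head(url)
        return self._states[url]

    def resolve(self, url: str, max_hops: int = 10) -> Optional[str]:
        """Follow redirects; return the final URL, or None on a loop or too many hops."""
        seen = set()
        for _ in range(max_hops):
            if url in seen:
                return None  # redirect loop detected
            seen.add(url)
            status, location = self.test(url)
            if status in (301, 302) and location:
                url = location
            else:
                return url
        return None


# Fake site used instead of real network access (assumption for the demo).
FAKE_SITE: Dict[str, TestResult] = {
    "http://example.com/info": (302, "http://example.com/info/"),
    "http://example.com/info/": (200, None),
    "http://example.com/a": (302, "http://example.com/b"),
    "http://example.com/b": (302, "http://example.com/a"),
}


def fake_fetch(url: str) -> TestResult:
    return FAKE_SITE.get(url, (404, None))


cache = RedirectCache(fake_fetch)
first = cache.resolve("http://example.com/info")
second = cache.resolve("http://example.com/info")  # served entirely from the cache
```

In this sketch the second call to resolve() makes no new requests, because both the 302 and the final 200 were recorded the first time. A loop such as /a -> /b -> /a is also cut short once a URL is seen twice.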

> 4722 <http://www2.jp.dk/info> 302

This shouldn't happen anymore - I hope.

> Shouldn't it be recorded somehow that the redirects have
> been followed?
Yep :)
(See beta-2 currently available)

> P.S. I find it amusing that the first line says essentially
> 'HTTrack launched at <site>'.  Sounds like a cruise missile
> or something:)

Well, as long as it doesn't crash, everything is fine :)
 





