HTTrack Website Copier
Free software offline browser - FORUM
Subject: Re: infinite loop at downloading a webpage
Author: Xavier Roche
Date: 08/15/2003 11:39
 
> When I download my website (it's a dynamic website with an
> ASP/DB back end, very large: around 200,000 files, 7 GB, spread
> over different servers) I run into one big problem.
> The graphics directory seems to be downloaded in a loop
> forever.
>
> G:\lsm_offline_130803b\LSM\lsbu.marketing.agilent.com\graphics\graphics\graphics\graphics\graphics\graphics\graphics\grap
>
> Any idea what I'm doing wrong or how I can solve the problem?

I can't reach lsbu.marketing.agilent.com, but I suppose
that you have the following case:

- a bad link somewhere (like /foo) generates a "false" 404
  page (a regular 200 page with an error message inside)
- inside this broken page, there is a relative link such as
  img src="graphics/"
- following it returns another "broken" 404 page for /foo/graphics
- that page contains the same link to graphics/, which causes httrack to
  follow /foo/graphics/graphics
- ... and so on
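
The loop described above can be reproduced with plain URL resolution. This is only a sketch: the host example.com and the starting path /foo/ are placeholders, not the real site, and the helper name follow_relative_link is made up for illustration. It assumes every "false" 404 page contains the same relative link.

```python
from urllib.parse import urljoin

def follow_relative_link(start_url, link, hops):
    """Resolve the same relative link again and again, each time
    against the URL of the page it was found on."""
    urls = [start_url]
    for _ in range(hops):
        # Each "false" 404 page is served with status 200, so a crawler
        # parses it and resolves the link relative to the new URL.
        urls.append(urljoin(urls[-1], link))
    return urls

# Hypothetical starting point: a bad link that returns a 200 error page.
chain = follow_relative_link("http://example.com/foo/", "graphics/", 3)
for url in chain:
    print(url)
```

Each hop adds one more graphics/ segment, so the path never stops growing as long as the server keeps answering with a regular page.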

To fix that permanently:

Ensure your server responds correctly to errors, that is:
- either with a real "404" error page (this is the only good
  solution - see RFC 2616)
- or with a redirect to a central "404" error page (using a 302 HTTP
  status code with a redirection)

But sending a regular page (200 status code) in case of
error is really dangerous: a crawler cannot tell it apart from a
valid page and keeps following the links inside it.
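
As an illustration of the first option, here is a minimal sketch of a server that answers unknown paths with a real 404 status. The site in question runs ASP, so this Python version is only a stand-in for the idea; KNOWN_PAGES, respond and Handler are hypothetical names.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical site content: path -> page body.
KNOWN_PAGES = {"/": b"<html><body>home</body></html>"}

def respond(path):
    """Return (status, body) for a request path.
    Unknown paths get a real 404, never a 200 page with an error text."""
    body = KNOWN_PAGES.get(path)
    if body is None:
        return 404, b"<html><body>Not Found</body></html>"
    return 200, body

class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        status, body = respond(self.path)
        self.send_response(status)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

if __name__ == "__main__":
    # Serve on localhost:8000 (port chosen arbitrarily for the sketch).
    HTTPServer(("127.0.0.1", 8000), Handler).serve_forever()
```

With a status of 404, a crawler treats the page as an error and does not follow the links inside it, so the graphics/ chain stops at the first bad URL.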

As a temporary fix:

Add the following scan rule (Set options / Scan rules):
-*/graphics/graphics/*
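
To see which URLs such a rule would exclude, the wildcard can be approximated with Python's fnmatch. This is not httrack's own matcher, only a rough sketch that assumes * matches any run of characters, slashes included; the function name is_excluded and the example.com URLs are made up.

```python
from fnmatch import fnmatchcase

# The exclusion pattern from the scan rule, without the leading "-".
EXCLUDE_PATTERN = "*/graphics/graphics/*"

def is_excluded(url):
    """Approximate check: would the scan rule above reject this URL?"""
    return fnmatchcase(url, EXCLUDE_PATTERN)

# Hypothetical URLs for illustration.
print(is_excluded("example.com/foo/graphics/graphics/logo.gif"))
print(is_excluded("example.com/graphics/logo.gif"))
```

The rule only cuts the loop at the second repeated segment; a legitimate /graphics/ directory is still downloaded.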

 