HTTrack Website Copier
Free software offline browser - FORUM
Subject: Re: recursive scanning is a problem
Author: Xavier Roche
Date: 11/29/2002 07:37
 
> Hello. Are there any options to prevent Httrack v3.22-3 
> from downloading the same links and files if some html 
> pages recursively point to each other? Lots of pages I 
> try to download unfortunately link back to a page or file 
> already downloaded, and downloading starts again. I 
> verified they are not dynamic links, just regular html 
> links.

httrack always checks for duplicate files; the check is 
based on the link URLs. In your case, the links must be 
changing in some way (a different timestamp embedded in 
each link, for example):

foo.html?id=1234 -> foo.html?id=5678 -> foo.html?id=9012

Unfortunately, httrack cannot "know" that all these links 
point to the same page, and so it follows them in a loop.

You can either limit the mirror depth, or use filters 
(scan rules) to exclude the duplicate links, depending on 
the site's URL structure; see the sketch below.
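
A sketch, assuming the looping links all carry a "?id=" 
parameter as in the example above (the domain and paths 
here are placeholders, not from the original post):

  httrack "http://www.example.com/" -O "/tmp/mirror" "+*.example.com/*" "-*?id=*" -r6

The "-*?id=*" scan rule skips any link containing "?id=", 
and -r6 caps the recursion depth at 6 as a safety net in 
case some looping links slip past the filter.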

 