Re: recursive scanning is a problem - HTTrack Website Copier Forum

Subject: Re: recursive scanning is a problem

Author: Xavier Roche

Date: 11/29/2002 07:37

> Hello. Are there any options to prevent Httrack v3.22-3 
> from downloading the same links and files if some html 
> pages recursively point to each other? Lots of pages I 
> try to download unfortunately link back to a page or file 
> already downloaded and downloading starts again. I 
> verified they are not dynamic links, just regular html links.

httrack always checks for duplicate files; the check is 
based on the link (the URL) itself, not on the page content. 
In your case the links must differ in some way each time (a 
different timestamp in the link, or something similar); for example:

foo.html?id=1234 -> foo.html?id=5678 -> foo.html?id=9012

Unfortunately, httrack cannot "know" that all these links 
point to the same page, so it treats each one as a new file, 
and this causes a loop.
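
To illustrate the point, here is a minimal sketch in Python of a duplicate check keyed on the link string. It is a toy model, not httrack's actual code: the `links` table and the `crawl` function are made up for the example. A link back to an already-seen URL is skipped, but each new `id` value counts as a new page.

```python
# Hypothetical site: every page links to the "same" page under a new id.
links = {
    "foo.html?id=1234": ["foo.html?id=5678"],
    "foo.html?id=5678": ["foo.html?id=9012"],
    "foo.html?id=9012": ["foo.html?id=1234", "foo.html?id=3456"],
    "foo.html?id=3456": [],
}

def crawl(start, links):
    """Visit every link reachable from start, skipping links already seen."""
    seen = set()
    queue = [start]
    order = []
    while queue:
        url = queue.pop(0)
        if url in seen:  # the duplicate check looks at the link text only
            continue
        seen.add(url)
        order.append(url)
        queue.extend(links.get(url, []))
    return order

fetched = crawl("foo.html?id=1234", links)
print(fetched)
```

The link from `id=9012` back to `id=1234` is recognised as a duplicate, yet the page is still fetched once per distinct id. On a site that generates a fresh id on every request, the list of "new" links never runs out.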

You can either limit the mirror depth, or use filters to 
exclude the duplicate links, depending on the site's URL structure.
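
Both workarounds can be sketched with the same kind of toy crawler in Python. Again this is only an illustration under assumptions, not how httrack implements its depth limit or filters: `crawl_limited`, `endless`, and the pattern syntax (Python's `fnmatch` wildcards) are all made up for the example.

```python
from fnmatch import fnmatch

def crawl_limited(start, get_links, max_depth, exclude=()):
    """Breadth-first crawl with a depth limit and wildcard exclusions."""
    seen = set()
    queue = [(start, 0)]
    order = []
    while queue:
        url, depth = queue.pop(0)
        if url in seen or depth > max_depth:
            continue
        # Skip any link matching an exclusion pattern.
        if any(fnmatch(url, pattern) for pattern in exclude):
            continue
        seen.add(url)
        order.append(url)
        for nxt in get_links(url):
            queue.append((nxt, depth + 1))
    return order

def endless(url):
    """Hypothetical site: foo.html always links to itself with a new id."""
    if url == "index.html":
        return ["foo.html?id=1", "about.html"]
    if url.startswith("foo.html?id="):
        n = int(url.split("=")[1])
        return ["foo.html?id=%d" % (n + 1)]
    return []

# Workaround 1: stop following links beyond a fixed depth.
by_depth = crawl_limited("index.html", endless, max_depth=3)

# Workaround 2: exclude the looping links by pattern
# (in fnmatch, "?" matches any single character and "*" any run).
by_filter = crawl_limited("index.html", endless, max_depth=100,
                          exclude=["foo.html?id=*"])

print(by_depth)
print(by_filter)
```

The depth limit still fetches a few copies of the looping page before it stops, while the filter avoids them entirely but also drops the first copy. Which trade-off is acceptable depends on the site's URL structure.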
