HTTrack Website Copier
Free software offline browser - FORUM
Subject: Re: repeated grabs of same file + filter doesn't work
Author: Xavier Roche
Date: 12/28/2002 17:27
> Problem 1 - the program seems to be grabbing multiple
> instances of certain web pages from the above website,
> namely privacy.shtml, tou.shtml, legalfaq.shtml and some
> other web pages as well.

Different URLs? Like ..privacy.shtml?id=12
and ..privacy.shtml?id=34?
Look in hts-cache/new.txt for references to privacy.shtml -
the URLs should all be different.
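A minimal Python sketch to list them, assuming only that new.txt is a
plain-text index with one entry per line and the URL somewhere among
its tab-separated fields (the exact column layout varies between
HTTrack versions):

    from collections import Counter

    # Count every distinct URL mentioning privacy.shtml in the cache index.
    hits = Counter()
    with open("hts-cache/new.txt", encoding="utf-8", errors="replace") as f:
        for line in f:
            for field in line.split("\t"):
                if "privacy.shtml" in field:
                    hits[field.strip()] += 1

    for url, count in hits.most_common():
        print(count, url)

If each URL appears only once but with different ?id=... parameters,
the mirror is not really duplicating files - the site is generating
distinct URLs for the same page.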
> I have noticed that these web pages
> are pretty much referenced from the bottom of almost every
> page in the website. This is causing extreme delays in the
> full download of the website

Darn.. try avoiding these files (Options / Scan rules:
-*privacy.shtml* -*tou.shtml* -*legalfaq.shtml*), but I'm
still surprised that this problem can occur
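
For reference, the same exclusions can be passed on the command line;
a rough sketch (www.example.com and the ./mirror output directory are
placeholders, not your actual project):

    httrack "http://www.example.com/" -O ./mirror \
        "-*privacy.shtml*" "-*tou.shtml*" "-*legalfaq.shtml*"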

> Problem 2 - Under the 'Scan Rules' of the program, I have
> specified for the program to avoid downloading anything in
> the 'ladder' folder by using the following scan rule:
> However, the program is still downloading the contents of
> the website.

Are you still getting links from the /ladder/ folder??

- Ensure that the exclusion rule (something like -*/ladder/*) is the
last filter, and that no broader filter (like +*) is placed after it -
later rules take priority over earlier ones (see the sketch after this
list)

- Ensure that the same pages are not also reached through another
domain or hostname (like example.com instead of www.example.com),
which your filter pattern would not match

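A sketch of the ordering issue, with hypothetical patterns (the real
rules for your site may differ):

    +*.html -*/ladder/*    (works: the exclusion comes last, so it wins)
    -*/ladder/* +*         (fails: the trailing +* re-includes ladder)
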
> Under the 'Links' tab of the options, I have
> made sure the 'Get non-HTML files related to a link... but
> that doesn't seem to help either. Am I specifying the 
> rule incorrectly or something?
Doesn't seem so. Note that filters always take priority,
except for the "external depth" option (an option that should
never be used, anyway..)
