HTTrack Website Copier
Free software offline browser - FORUM
Subject: Suggestion - Blacklist URL option
Author: TSU
Date: 09/26/2010 00:55
 
I'm surprised this option doesn't already exist, since it should be incredibly
simple to implement.

WebHTTrack seems to have a lot of trouble deciding which links are internal and
which are external when it evaluates a typical aggregating website, such as a
major news site.

For example, if I set the external link depth to zero and the overall link
depth to something like 4, I get only the homepage. If I set the external link
depth to one instead, the mirror takes forever.

It seems to me that if a Blacklist option existed, I could first do a short
test run, note all the external links being downloaded, then manually list
those external links to be excluded and re-run.

WebHTTrack would simply do a quick lookup in the Blacklist and compare each URL
against it, either just before the link is downloaded or while building the
structure of the website to be replicated. Doing the check once, early on,
could save oodles of CPU cycles compared to repeated real-time lookups.

When implementing a Blacklist, it would also be useful to support wildcards
before, in the middle of, and after strings. Sorry I can't give examples
directly in the text, as this forum's editor seems to balk at asterisks; see
the sketch below for the sort of patterns I have in mind.
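
For the wildcard side, a minimal sketch assuming a glob-style match (the
patterns below are hypothetical examples, not real sites to block):

    from fnmatch import fnmatch

    # Hypothetical example patterns showing a wildcard before, in the middle
    # of, and after a string.
    BLACKLIST_PATTERNS = [
        "*.adserver.example.net/*",        # wildcard before: any subdomain
        "http://ads.*.example.com/*",      # wildcard in the middle
        "http://www.example.com/video/*",  # wildcard after: a whole section
    ]

    def is_blacklisted(url, patterns=BLACKLIST_PATTERNS):
        # fnmatch treats "*" as "any sequence of characters", which covers
        # the before/middle/after cases described above.
        return any(fnmatch(url, pattern) for pattern in patterns)

    # e.g. is_blacklisted("http://ads.tracking.example.com/banner.js") -> True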

thx.
Tony
 