HTTrack Website Copier
Free software offline browser - FORUM
Subject: Re: More Spider Behavior Options
Author: Spider
Date: 02/25/2001 11:16
> In your case, what you want to do is something like 
> mirroring several sites, as if you were mirroring 
> the next site at the end of the current one. 

This would be fine if it were only about a limited, 
pre-known number of websites, but for using HTTrack as 
a spider (going everywhere, including beyond the 
original links) it is of no help.

Currently I don't know of any consumer-oriented 
spidering program with serious capabilities. Teleport 
has a version that can be configured for extended 
spidering, but it costs several thousand USD(!).

HTTrack is very nice, but with a large link list 
(>10,000 links) it takes forever to get any result. 
With an option like the one I described in the first 
msg, one would get immediate results and also get all 
the outlinks later, after all the original links have 
been scanned.

Such a capability would be easy to add in a limited 
fashion, as some kind of attribute in the filters 
section; however, maybe not many people need this, so 
the effort could be wasted.

But I think there could be a need for various 
spidering modes? Maybe there could be two options, 
layer-scan and depth-scan: layer-scan as it works now, 
and depth-scan following each branch of the tree to 
its end before starting any other branch. Horizontal 
vs. vertical. Whatever.
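To make the difference concrete, here is a minimal sketch in Python of the two scan orders, using a hypothetical link graph in place of real crawled pages (the names and structure are made up for illustration; this is not HTTrack's actual code):

```python
from collections import deque

# Hypothetical link graph: page -> pages it links to.
LINKS = {
    "a": ["b", "c"],
    "b": ["d"],
    "c": ["e"],
    "d": [],
    "e": [],
}

def crawl(start, depth_first=False):
    """Return the order in which pages would be visited.

    depth_first=False -> layer-scan (breadth-first, one layer at a time)
    depth_first=True  -> depth-scan (follow each branch to its end first)
    """
    frontier = deque([start])
    seen = {start}
    order = []
    while frontier:
        # The only difference between the two modes is which end of the
        # frontier we take from: a queue gives layer-scan, a stack gives
        # depth-scan.
        page = frontier.pop() if depth_first else frontier.popleft()
        order.append(page)
        for link in LINKS[page]:
            if link not in seen:
                seen.add(link)
                frontier.append(link)
    return order

print(crawl("a"))                    # layer-scan: a, b, c, d, e
print(crawl("a", depth_first=True))  # depth-scan: a, c, e, b, d
```

The point being that the same engine could offer both modes with a very small change in how the pending-link list is consumed.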

It would also help when scanning large single sites, 
as going straight to the bottom of huge link trees 
would produce end results faster.

(Options like this would make HTTrack more widely 
usable, broadening its scope from simple website 
mirroring/copying to spidering and finding.)