HTTrack Website Copier
Free software offline browser - FORUM
Subject: More Spider Behavior Options
Author: Spider
Date: 02/25/2001 02:47
 
It'd be nice if there were more options to control the 
behavior of the spidering function. With a large link 
list, it would be better to be able to scan one site 
completely before going on to the next site in the list.

Combined with "go everywhere on the web", this would 
make for a nice behavior: first go to the first link 
on the list, download it fully, and put any external 
links found at the bottom of the spidering stack.

Then go to the next link and do the same. When all the 
original links have been scanned, start on the links 
found under the 1st original site, etc.
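
To make the order described above concrete, here is a rough sketch in 
Python. It is not HTTrack code: get_links is a hypothetical callable 
that returns the links found on a page, and a "site" is assumed to be 
simply the host part of the URL.

```python
from collections import deque
from urllib.parse import urlparse


def site_of(url):
    """Return the host part of a URL, used here as the 'site' identity."""
    return urlparse(url).netloc


def crawl_site_first(seeds, get_links):
    """Yield URLs in site-first order.

    get_links(url) is a hypothetical callable (an assumption of this
    sketch) that returns the list of links found on the page at url.
    """
    pending = deque(dict.fromkeys(seeds))  # start links, originals first
    seen = set(pending)
    while pending:
        start = pending.popleft()
        site = site_of(start)
        local = deque([start])  # pages of the current site still to fetch
        while local:
            url = local.popleft()
            yield url
            for link in get_links(url):
                if link in seen:
                    continue
                seen.add(link)
                if site_of(link) == site:
                    local.append(link)    # same site: finish it first
                else:
                    pending.append(link)  # external: bottom of the list
```

With two start links, every page of the first site comes out before 
the second site is touched, and external links are only visited after 
all the original links are done.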

Without this, with a large link list to scan, there 
may be a damn long wait before anything of interest is 
actually pulled from the web, as the spider scans 1 
level at a time from each link... :(

(Another improvement on this would be a conditional scan, 
i.e. IF found files *.jpg[>20] THEN go deeper on this 
site ELSE go to the next dirtree/site)
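
A rough sketch of that IF/THEN rule in Python, only to illustrate the 
idea. It reads *.jpg[>20] as "more than 20 .jpg files found", which is 
an assumption (it could also be meant as a size limit). The function 
name and parameters are made up for the example and are not HTTrack 
options.

```python
def should_go_deeper(file_urls, suffix=".jpg", min_count=20):
    """Return True when more than min_count of the found files end with suffix.

    Assumption: the rule counts matching files; it does not look at file size.
    """
    matches = sum(1 for url in file_urls if url.lower().endswith(suffix))
    return matches > min_count


def next_action(file_urls):
    """Pick the spider's next step for the current site (hypothetical labels)."""
    return "go deeper" if should_go_deeper(file_urls) else "next site"
```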
 