HTTrack Website Copier
Free software offline browser - FORUM
Subject: not listening to directory rules
Author: Steve
Date: 05/31/2002 05:48
 
Hi,

I was mirroring the site <http://sunsite.anu.edu.au/paftad> 
with the travel options set to "stay on address" and "only 
go down". However, <http://sunsite.anu.edu.au/index>, /pub, 
and other links outside /paftad were followed, even links 
to HTML pages.

I think the cause is that 
<http://sunsite.anu.edu.au/paftad> is actually 
<http://sunsite.anu.edu.au/paftad/index.htm>. httrack seems 
to be seeing /paftad as a top-level resource, and 
therefore thinks it can go anywhere below the top level. 
This isn't the case, and as a result my usual ~20 files 
became a fair few more :)
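
To illustrate what I suspect is happening, here is a minimal Python sketch. It is not HTTrack's code: the function names base_directory and is_allowed_down_only are hypothetical, and the rule that a path without a trailing slash is treated as a file (so its parent directory becomes the base) is only my assumption about the behaviour.

```python
from urllib.parse import urlsplit
import posixpath


def base_directory(start_url: str) -> str:
    """Return the directory part of the start URL's path, ending in '/'.

    Assumption: a path with no trailing slash is treated as a file,
    so its parent directory is used as the base.
    """
    path = urlsplit(start_url).path or "/"
    if path.endswith("/"):
        return path
    return posixpath.dirname(path).rstrip("/") + "/"


def is_allowed_down_only(start_url: str, link: str) -> bool:
    """Accept a link only if it is on the same host and at or below
    the base directory derived from the start URL."""
    start, target = urlsplit(start_url), urlsplit(link)
    if start.netloc != target.netloc:
        return False
    return (target.path or "/").startswith(base_directory(start_url))
```

Under that assumption, a start URL of http://sunsite.anu.edu.au/paftad yields the base directory "/", so /pub and /index pass the "down only" check. Starting from http://sunsite.anu.edu.au/paftad/index.htm yields "/paftad/", and the same links are rejected, which matches what I saw.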

Is this a bug, or should we take more care with the 
starting url? When I fully qualified the starting url, the 
links higher up were correctly ignored. Log output:
13:36:07        Debug:  upper link canceled: sunsite.anu.edu.au/
13:36:07        Debug:  (wizard) cancelled foreign domain link: link sunsite.anu.edu.au/ at sunsite.anu.edu.au/paftad/main.htm

Running httrack v3.16 on Red Hat Linux 7.2.

Options used:
/usr/bin/httrack http://sunsite.anu.edu.au/paftad -O "/pandasworking/30005/20020531" -zfI0A60000c12tx%xb1%sqZ%Hr50M1000000000E172800%Pnp3Da -j -%A standard -%U pandora -#Z -#f
 