Thank you Xavier,
this is a great program and it's only getting better as I'm
learning to use it. However, when I try to download the page
(www.someweb.com/folder/file.html) I still can't get it to
work with the scan rules filter
+geocities.com/*
and it doesn't work with
+*geocities.com*
either. However, it works if I filter with the option
'Include link(s) -> ALL LINKS', or in other words with
+*
but then I obviously get too many files and really have no
filter at all, so I can't do that. Thus I figure it's not a
robots.txt problem either, because it picks up the
geocities.com links fine with +*, which shouldn't happen if
it were a robots.txt issue (and switching the robots.txt
rules off doesn't help either).
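Just to be concrete, the command-line equivalent of what I'm
doing (if I understand the options right; the output folder
here is just a placeholder) would be something like
httrack "http://www.someweb.com/folder/file.html" -O ./mirror -s0 "+geocities.com/*"
where -O is the output folder, -s0 is supposed to switch the
robots.txt rules off, and the last quoted argument is the
scan rule in question.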
I have no filters other than the default filters that come
with HTTrack 3.30, so I don't think that could be it...?? It
just seems that the filtering doesn't understand a link that
doesn't have a leading 'www' part (or similar), but could
that really be the case?
Also, to my understanding I don't need a stricter filter
than +geocities.com/*, because HTTrack will download only
those links targeting pages at geocities.com/. So if there
are a certain number of geocities.com links on
www.someweb.com/folder/file.html, it will download only those
links (plus, of course, their sublinks, since I allow it to
go down). The sites that I want to download have a lot of
links to geocities.com, and I can't, or don't want to, type
out the subfolders separately.
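In other words, what I'd like to avoid is having to list
every subfolder by hand, like
+geocities.com/somefolder/*
+geocities.com/anotherfolder/*
(those folder names are just made up for the example), when
as far as I understand the wildcard a single
+geocities.com/*
should already cover all of them.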
Finally, what do you mean by 'yourhomestead' in
+geocities.com/yourhomestead/* ?
I'll be forever grateful if you can solve this one for me...
Best regards,
Tapio