HTTrack Website Copier
Free software offline browser - FORUM
Subject: Re: One site, multiple hosts
Author: Xavier Roche
Date: 01/15/2002 22:11
 
> If a site is mirrored over multiple hosts (eg:
> www.foo.com and www.bah.com) what would the difference
> be in supplying the two urls at the beginning (eg:
> httrack <http://www.foo.com> <http://www.bah.com> ...
> [plus options, filters, etc]), or supplying one url at
> the beginning and the other as an accepting host filter?

No difference, IF there is at least one link on the
first website which refers to the second (a filter only
authorizes the second host; the engine still has to find
a link leading to it).

In fact, when you enter a URL, the engine does the
following:
- adds the URL to the list of URLs to mirror (the
  stack), exactly as if it had discovered a link on an
  HTML page
- adds a default filter +<URL>* before all other filters
  (BEFORE is important: if you specified '-*' as a filter
  to forbid everything, this will not change anything,
  because in '+<URL>* -*' the last filter takes
  priority)
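
To make the ordering point concrete, here is a minimal Python sketch of "last matching filter wins". It is only an illustration, not HTTrack's actual matcher: the function names are made up, and Python's fnmatch wildcards are assumed to be close enough to the engine's '*' for this purpose.

```python
from fnmatch import fnmatchcase

def with_start_url(start_url, user_filters):
    """Prepend the implicit +<URL>* filter BEFORE the user's filters."""
    return ["+" + start_url + "*"] + list(user_filters)

def is_accepted(url, filters, default=False):
    """Return the verdict of the LAST filter that matches the URL.

    Each filter is '+pattern' (accept) or '-pattern' (refuse).
    'default' is the verdict when no filter matches (an assumption).
    """
    verdict = default
    for rule in filters:
        sign, pattern = rule[0], rule[1:]
        if fnmatchcase(url, pattern):
            verdict = (sign == "+")  # later matches override earlier ones
    return verdict

# '+www.foo.com/* -*': the trailing '-*' overrides the implicit filter.
blocked = with_start_url("www.foo.com/", ["-*"])
print(is_accepted("www.foo.com/index.html", blocked))

# Start URL plus an accepting host filter for the second host.
two_hosts = with_start_url("www.foo.com/", ["+www.bah.com/*"])
print(is_accepted("www.bah.com/page.html", two_hosts))
```

The first call prints False and the second prints True, which is the behaviour described above: the implicit filter sits first, so anything the user writes after it can override it.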

> The reason I ask is that we have just made a website
> unhappy by hitting them a fair bit due to an infinite
> loop we got stuck in (hitting the same page a lot),
> which had a multiple host setup as described above.

Argh.. this may be due to a CGI or something similar. In
this case, adding a filter like
'-www.foo.com/*/cgi-bin/*' is a good idea. Alternatively,
set a depth limit that will block loops (example:
depth=20 - large enough for most sites, and small
enough to limit loops).
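
Why a depth limit stops this kind of loop can be shown with a small Python sketch. Everything in it is hypothetical: the crawl function, the fake CGI that always links to a fresh URL, and the convention that the start page counts as depth 1 (the real engine may count differently).

```python
from collections import deque

def crawl(start, get_links, max_depth):
    """Breadth-first crawl that never follows links past max_depth."""
    seen = {start}
    queue = deque([(start, 1)])  # assumption: the start page is depth 1
    fetched = []
    while queue:
        url, depth = queue.popleft()
        fetched.append(url)
        if depth >= max_depth:
            continue  # deep enough: do not follow this page's links
        for link in get_links(url):
            if link not in seen:
                seen.add(link)
                queue.append((link, depth + 1))
    return fetched

def endless_cgi(url):
    """Hypothetical CGI: every page links to a 'next' page with a new URL,
    so remembering visited URLs alone would never end the crawl."""
    n = int(url.rsplit("=", 1)[1])
    return [f"www.foo.com/cgi-bin/cal?n={n + 1}"]

pages = crawl("www.foo.com/cgi-bin/cal?n=0", endless_cgi, max_depth=20)
print(len(pages))
```

With max_depth=20 the crawl stops after 20 pages even though the fake CGI would produce new URLs forever. A cgi-bin filter avoids those requests entirely; the depth limit only bounds how many are made.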

 