HTTrack Website Copier
Free software offline browser - FORUM
Subject: External sites
Author: Carl
Date: 05/20/2014 22:36
 
I started a new project today to copy a single website/domain, and launched the
mirror. After half an hour I looked and noticed that it was dutifully trying
to mirror/scan pages from several external sites (most notably Wikipedia).

I stopped the scan, set OPTIONS > LIMITS > MAX EXTERNAL SITE DEPTH = 0, and
that seems to have solved the issue. Reading the docs, they clearly state that
HTTrack will "avoid crawling external sites". That is what I expected, but it
clearly was not the default behavior here.
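For anyone driving HTTrack from the command line rather than the GUI, the same limit can be expressed with the external-depth option (`%e` / `--ext-depth`). This is only a sketch; the URL, output directory, and scan-rule filter below are placeholders, not taken from my actual project:

```shell
# Mirror a single site, keeping the crawl from following links off-site.
# --ext-depth=0 sets the external links depth to 0 (same as %e0).
# The "+*.example.com/*" scan rule additionally restricts the crawl to
# the target domain; adjust both to match your own project.
httrack "https://www.example.com/" \
    -O "/path/to/mirror" \
    --ext-depth=0 \
    "+*.example.com/*"
```

The scan rule belt-and-braces the depth limit: even if an option elsewhere re-enables external fetching, the filter still excludes off-domain URLs.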

My installation is routine, and I don't believe I have changed any settings
that would cause HTTrack to overrun its standard operating procedure and begin
crawling external sites.

Can someone point me to settings that I should look at to prevent this in the
future?
 
All articles

Subject              Date
External sites       05/20/2014 22:36
Re: External sites   07/03/2014 02:33