> Options were: can go down, can go outside domain
> (whole web - I wanted it to be able to get pages
> from the gallery), external depth 1 (only the 1st
> page of every external website).
You shouldn't need to change those options at all. Allowing "go everywhere on
the web" basically tells HTTrack to try to download the entire Web...
> Filters: -www.cssbeauty.com/* -cssbeauty.com/* (I
> don't want it to retrieve the whole of cssbeauty,
> which is quite a big site)
> +www.cssbeauty.com/archives/category/* (I want it
> to be able to go to every subcategory, not only
> business; links look like
> <http://www.cssbeauty.com/archives/category/CATEGORY/>).
> +cssbeauty.com/archives/category/* (same reason)
> +*.png +*.gif +*.jpg +*.css +*.js
> -ad.doubleclick.net/* (standard).
You changed the Global Travel Mode, so you are allowing HTTrack to go anywhere,
and your filters do not restrict any domains other than cssbeauty.com.
What you want is something like:
------------------------------
Start URL:
<http://www.cssbeauty.com/archives/category/business/>
Options:
Experts Only > use defaults
Links > Get non-HTML files
Limits > Max External Depth=1
Scan rules:
-*.amazon.com/*
-*.cssbeauty.com/*
+*.cssbeauty.com/archives/category/*
-------------------------------
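
If you use the command-line httrack instead of the GUI, the same setup
should look roughly like this (the output directory is just an example
of mine; -n/--near fetches the non-HTML files, and %e1/--ext-depth=1
sets the external depth, matching the options above):

httrack "http://www.cssbeauty.com/archives/category/business/" \
    -O ./cssbeauty -n %e1 \
    "-*.amazon.com/*" \
    "-*.cssbeauty.com/*" \
    "+*.cssbeauty.com/archives/category/*"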
Setting "Max External Depth" allows the first page of any outside domain to be
captured (means you'll probably get pages linked from ads too). That is
complimented by "Get non-HTML files" which gets required images, CSS, etc from
those pages.
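
Note that the order of the scan rules matters: as far as I know, HTTrack
tests each URL against the rules in sequence and the last matching rule
wins. For example (using a hypothetical "design" subcategory):

http://www.cssbeauty.com/archives/category/design/
  -> matches both -*.cssbeauty.com/* and
     +*.cssbeauty.com/archives/category/*; the + rule comes last,
     so the page is downloaded
http://www.cssbeauty.com/about/
  -> only matches -*.cssbeauty.com/*, so it is skipped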