> You shouldn't need to change those options at all.
> Allowing "go everywhere on the web" will basically
> try to download the Web...
I thought that setting the external depth level to 1 would limit downloading the
whole web to one external level (that being the more important point). But since
httrack with your settings is now downloading external pages even without "go
everywhere on the web" enabled, you were right.
> What you want is something like:
> ------------------------------
> Start URL:
> <http://www.cssbeauty.com/archives/category/business/>
>
>
> Options:
> Experts Only > use defaults
> Links > Get non-HTML files
> Limits > Max External Depth=1
>
> Scan rules:
> -*.amazon.com/*
> -*.cssbeauty.com/*
> +*.cssbeauty.com/archives/category/*
> -------------------------------
>
> Setting "Max External Depth" allows the first page
> of any outside domain to be captured (means you'll
> probably get pages linked from ads too). That is
> complemented by "Get non-HTML files" which gets
> required images, CSS, etc from those pages.
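(As an aside, if I read the docs right, the same setup should be expressible with
the command-line build as roughly the following. The output directory is my own
placeholder, and I'm assuming --ext-depth and --near are the CLI equivalents of
"Max External Depth" and "Get non-HTML files":)
-------------------------------
httrack "http://www.cssbeauty.com/archives/category/business/" \
    -O "./cssbeauty" \
    --ext-depth=1 --near \
    "-*.amazon.com/*" "-*.cssbeauty.com/*" \
    "+*.cssbeauty.com/archives/category/*"
-------------------------------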
...It is working, but it is downloading dozens of MB of MP3s from
audio.cbsnews.com/ (probably linked from some external page; I don't think there
is a link to them from cssbeauty). That directory is at 60 MB already, with many
MP3s still to come. Would adding -* to the filters do the job? According to the
manual, external depth 1 "is overriding all other options (filters and default
engine limiter)", so it should still download one external level, with graphics
(+*.gif and other formats added to the filters),
but skip the MP3s... ? (Sketch of the rule set I mean below.)
Should I add +*.css to the filters?
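For clarity, this is the rule set I have in mind (the extension list is just my
guess, and I'm assuming, if I read the manual right, that later rules take
precedence over earlier ones):
-------------------------------
Scan rules:
-*
+*.cssbeauty.com/archives/category/*
+*.gif
+*.jpg
+*.png
+*.css
-------------------------------
Or, more narrowly, would keeping your original rules and just adding
-*audio.cbsnews.com/* (or -*.mp3) be enough?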