Dear HTTrack Forum members:
I am interested in downloading the articles in the
Washington Post (http://www.washingtonpost.com/wp-dyn/print/)
each day for a period during which I'll be away from
home. I wanted to set up the downloads in advance and
then ask a family member to run them and e-mail the
results to me while I'm away. I'm in the habit of
reading the Post each day, and I didn't want to have
to catch up with scads of issues when I return.
Each daily version of the Post has several sections,
and each section has a home page. On Mondays, for
example, there are basically four sections I'm
interested in -- the main (A) section, Metro, Style,
and Business -- and a home page, more or less, for
each of these sections.
Can I list each of the four home pages in WinHTTrack
and create scan rules that would first tell the
program not to download any linked pages (-*.*) and
then make an exception for HTML pages with the word
"articles" in the URL (+*articles*.html)? (All the
Washington Post articles have URLs of the form
<http://www.washingtonpost.com/wp-dyn/articles/A49148-2001Jun10.html>.)
The Washington Post section home pages contain links
to a lot of other extraneous things -- like classified
ads, banner ads, other section pages that I'm not
interested in (or that are appropriate only for other
days of the week), subscription information, etc. -- and I
would need to eliminate all those items from the
download to make the duration and size of the download
reasonable.
So my idea is to ask HTTrack to download nothing
whatsoever, except for HTML URLs containing the
term "articles".
Would this work? I'm not sure if it would because it's
not given as an example in the help files or user
manual. Also, would these scan rules prevent HTTrack
from downloading the section home pages themselves, on
which the article links are found (because of the
<-*.*> rule)?
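To make the question concrete, here is a minimal Python sketch of how I picture the two rules interacting. It is only a model, not HTTrack's actual matching code: it assumes that rules are checked in the order listed and that the last rule matching a URL decides, and it uses Python's fnmatch wildcards as a stand-in for HTTrack's own pattern syntax. The function name and the classifieds URL are made up for illustration.

```python
from fnmatch import fnmatchcase

# The two scan rules from this post, in the order they would be listed.
RULES = ["-*.*", "+*articles*.html"]


def allowed(url, rules=RULES):
    """Return True or False according to the last rule that matches the URL.

    Returns None when no rule matches at all. "Last match wins" is an
    assumption of this sketch, not a statement about HTTrack internals.
    """
    verdict = None
    for rule in rules:
        sign, pattern = rule[0], rule[1:]
        # fnmatchcase: '*' matches any run of characters, including '/'.
        if fnmatchcase(url, pattern):
            verdict = (sign == "+")
    return verdict


if __name__ == "__main__":
    # Article URL of the form quoted above.
    print(allowed("http://www.washingtonpost.com/wp-dyn/articles/A49148-2001Jun10.html"))
    # Hypothetical non-article URL, invented for this example.
    print(allowed("http://www.washingtonpost.com/wp-dyn/classifieds/index.html"))
```

Under these assumptions the article URL is accepted and the other one is rejected. The sketch does not model whether the starting pages themselves are exempt from the rules, which is the second part of my question.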
I've tried to use several offline browsers over the
years to do this exact job, and although I'm an
extremely experienced computer user, I have never been
successful in setting up one of the offline browsers
to do what I wanted. I seem to be incapable of setting
up any of these programs properly. That's why I'm
asking for assistance on this.
Thanks for any advice you can provide.
Regards, Stu Borman