Thanks so much for your invaluable expert assistance.
Your reply explains to a large extent why I was not
getting the results I expected and why I have been so
frustrated in trying to set up various offline
browsers.
First, I didn't realize that the Washington Post's
robots.txt file was working against me.
Second, as you predicted, I have found that my scan
rules caused many old Washington Post articles to be
downloaded, instead of just that day's new articles.
As you mentioned, this made each download effectively
never-ending, since related articles were fetched
recursively.
I would not have been able to figure out these
problems on my own. In fact, a download such as this
is tricky to set up properly without professional
help. Setting up scan rules properly seems very
similar to programming.
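For example, from what I now understand, a rule set that excludes everything and then re-allows only one day's articles might look something like this (assuming HTTrack-style filter syntax; the date path here is purely illustrative, not the Post's actual URL layout):

```
-*
+www.washingtonpost.com/*/2005/03/14/*
```

The `-*` line excludes all URLs, and the `+` line re-includes only pages whose path matches that day's date, which should stop the recursive downloading of old related articles.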
Thanks again for your assistance. I haven't tried the
new strategies you've taught me yet, but I will soon.
Regards, Stu Borman