Thanks so much for your invaluable expert assistance.
Your reply explains to a large extent why I was not
getting the results I expected and why I have been so
frustrated in trying to set up various offline
browsers.
First, I didn't realize that the Washington Post's
robots.txt file was working against me.
Second, as you predicted, I have found that my scan
rules caused many old Washington Post articles to be
downloaded, instead of just that day's new articles.
As you mentioned, this made each download effectively
never-ending, since related articles were fetched
recursively.
I would not have been able to figure out these
problems on my own. In fact, a download such as this
is tricky to set up properly without professional
help. Setting up scan rules properly seems very
similar to programming.
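For example, from what I now understand, a rule set that excludes everything and then re-allows only one day's articles might look something like this (assuming HTTrack-style filter syntax; the date path here is purely illustrative, not the Post's actual URL layout):

```
-*
+www.washingtonpost.com/*/2005/03/14/*
```

The `-*` line excludes all URLs, and the `+` line re-includes only pages whose path matches that day's date, which should stop the recursive downloading of old related articles.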
Thanks again for your assistance. I haven't tried the
new strategies you've taught me yet, but I will soon.
Regards, Stu Borman