I should have mentioned that I am using Windows 98.
This most recent experiment has been running for some hours now
(since my previous post).
It seems to be the most successful of my experiments so far.
I specified the URL to download as
groups.yahoo.com\group\mygroup\message
and then added an exclude on URLs containing
groups.yahoo.com\group\mygroup\messages
(to suppress those pesky index pages).
I notice that index pages from \messages are still
being written, although not as many as in my other
experiments.
I would still be most grateful for a detailed
explanation of how to make the download do exactly what I
want, namely:
1. only grab HTML posts from \message
2. do not write index pages from \messages
3. ideally, download no pictures (since
they are Yahoo ads), just the text.
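To make the three wishes above concrete, here is a rough Python sketch of the filtering behaviour I am after. It is only an illustration of the logic, not HTTrack filter syntax; the function name should_download, the wildcard patterns, and the list of picture extensions are all my own assumptions.

```python
from fnmatch import fnmatchcase

# Hypothetical wildcard patterns; illustration only, not HTTrack syntax.
WANTED = "groups.yahoo.com/group/mygroup/message/*"   # individual posts
UNWANTED = [
    "groups.yahoo.com/group/mygroup/messages*",       # index pages
    "*.gif", "*.jpg", "*.jpeg", "*.png",              # pictures (assumed ad formats)
]

def should_download(url: str) -> bool:
    """Return True only for individual posts under /message/."""
    # Accept either slash style and compare case-insensitively.
    url = url.replace("\\", "/").lower()
    # Exclusions win over the inclusion rule.
    if any(fnmatchcase(url, pattern) for pattern in UNWANTED):
        return False
    return fnmatchcase(url, WANTED)
```

With these patterns a post such as groups.yahoo.com/group/mygroup/message/101 would be kept, while anything under \messages or ending in a picture extension would be skipped.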
The other thing I notice, which is quite bizarre (but this
is a problem with Yahoo Groups, and not HTTrack), is that
several copies of each post are downloaded, some with a
pure number name (e.g. 1, 127, 1021) but also other files
like 101aef0 (in addition to 101affa, 101, and 101c5b0).
Each of those files looks identical when viewed in the
browser. Does anyone have any idea what those many
strange files are?
Thanks in advance for your help and suggestions.