Great software!
However, I can't figure out how to download all the HTML pages
in a subdirectory, along with all the associated non-HTML files.
The complicating factor appears to be that I either have to
enter the site through an autologin page or use an HTML login
form to set the cookies correctly.
I've spent a couple of evenings reading the documentation and
forums and trying things out, but with no luck.
Here's the whole story:
I have a subscription to rcmicroflight.com, so I'm downloading
their site. I'm trying to be nice about it, so I've reduced
the number of connections and set a bandwidth limit.
However, when I do this, my cookies time out and I don't
get all of the site. Using the update feature appears to
consume as much bandwidth as the initial download, and since
it walks the tree the same way each time, I still don't get
the whole site.
I know that I want all the html pages in (for example)
rcmicroflight.com/aug03, and all the associated images,
which are mostly in rcmicroflight.com/images/aug03.
I've tried the command line below with -r1, but that only gets
the index page of aug03, not the whole directory. With -r4, I
pull down most of the site, far more than the one directory I want.
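If I'm reading the scan-rule documentation correctly, my bare
+aug03/* filter may be part of the problem, since the images live
under a different directory. My best guess at the right rules
(untested, and the fully-qualified filter paths are my assumption
from the docs, not something I've confirmed works) would be:

httrack "http://www.rcmicroflight.com/autologin.asp?ID=XXXXXXXXXXXXXXX" -c1 -b1 -A10000 -r4 -O d:\mirror -* +www.rcmicroflight.com/aug03/* +www.rcmicroflight.com/images/aug03/*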
Do you have any suggestions? Scripting is an option. Thanks
in advance.
httrack "http://www.rcmicroflight.com/autologin.asp?ID=XXXXXXXXXXXXXXX" -c1 -b1 -f0 -X0 -v -r4 -O d:\mirror -A10000 -n -* +aug03/*
Where XXXXXXXXXXXXXXX is a numeric user ID.
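One other idea I've had, based on my (possibly wrong) reading that
HTTrack will pick up a Netscape-format cookies.txt placed in the
project folder: export the session cookie from my browser into
d:\mirror\cookies.txt before starting, so the crawl doesn't depend
on the autologin page staying fresh. Roughly, with placeholder
cookie name and value (not the real ones):

# Netscape HTTP Cookie File (fields are tab-separated:
# domain, include-subdomains, path, secure, expiry, name, value)
.rcmicroflight.com	TRUE	/	FALSE	2145916800	ASPSESSIONID	XXXXXXXX

Would that be a reasonable way to keep the session alive on a slow,
low-bandwidth crawl?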