>I would like to manually log into the website before having HTTrack run its
>scan.
>Any way to do this?
Yes, probably.
> I would assume that the log in state is being carried in a cookie.
Yes, usually.
>But that state information is not being carried from web page to web page and
> instead when each page is visited, it is thinking
> the user is not logging in and is asking for the log
> in information again before showing the actual web
> page.
I'm hazy on how Capture URL works, but I have a feeling that it's more
suitable for websites of the username:password@website.address.com kind.
The way I do logging in is:
- log in using a browser
- export the cookies for the website using anything that lets you get a
Netscape-style cookies.txt - there are many extensions for Firefox if you
search addons.mozilla.org for "export cookies"
- drop the cookies.txt into the project folder, same place that you find
hts-log.txt
- make sure the logout link is not in httrack download scope, or it'll log out
at some point
- start httrack in whatever way you do it, I don't think command-line vs
graphical makes a difference here...
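
If the login still doesn't stick, it may be worth checking that the exported file really is in Netscape format and actually holds cookies for the site. Here is a rough sketch using Python's standard http.cookiejar module; the function name cookies_for_host is just something I made up, and the domain matching is deliberately simplified, so treat it as a sanity check rather than an exact model of what HTTrack does with the file.

```python
import sys
from http.cookiejar import MozillaCookieJar


def cookies_for_host(cookie_file, host):
    """List (name, expired) pairs for cookies in a Netscape cookies.txt that apply to host."""
    jar = MozillaCookieJar()
    # Load everything, including session and expired cookies, so we can report on them.
    # This raises http.cookiejar.LoadError if the file is not in Netscape format.
    jar.load(str(cookie_file), ignore_discard=True, ignore_expires=True)
    matches = []
    for cookie in jar:
        # Simplified matching: exact host, or host is a subdomain of the cookie domain.
        domain = cookie.domain.lstrip(".")
        if host == domain or host.endswith("." + domain):
            matches.append((cookie.name, cookie.is_expired()))
    return matches


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage: python check_cookies.py path/to/cookies.txt www.example.com
    for name, expired in cookies_for_host(sys.argv[1], sys.argv[2]):
        print(name, "EXPIRED" if expired else "ok")
```

If nothing is listed for your host, or the session cookie shows as expired, export the cookies again right after logging in.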