I've been trying to grab a site that uses authentication
via cookies. I can browse the site just fine but when I try
to use HTTrack (even when I'm browsing the site while it's
running) I get 403 forbidden for any files that require
login. I tried moving my cookies, and HTTrack made its own
little cookies.txt file, but that did not work.
I copied that cookies.txt file (and my IE cookie/index.dat)
pretty much everywhere but it still doesn't work. I read on
the forums that clearing your cookies and then using capture
URL might work (I had already tried capture URL and it still
didn't work). So I logged into the site and then
cleared cookies, temp inet files and history. I clicked on
the link I want with capture URL and it caught it (same
link as it had caught before) but now it's returning me to
the login page.
The funny thing is that the main index.html page that is
inaccessible without logging in WAS captured the first time
around (before cookie clear) but anything on that
index.html that was behind auth (pages, gifs, etc...) was
403 forbidden. So it got the first page, kind of. *shrug*
After clearing my cookies it won't even go to that state of
functionality. However, if I get rid of the URL that it
captured and replace it with the actual url of the site
then it will try and capture the index.html page (and
return a 403 forbidden) instead of referring me to the
login page. It *won't* capture that page and deny the
pictures like it used to; it won't even capture the
index.html page now.
I'm confused, and I even replaced the PHPSESSID (I assume
this means it's a PHP form authentication system) with the
one found in my cookie in Temporary Internet Files. That
didn't work so I replaced all the information in the
cookies.txt file made by HTTrack with my real cookie and it
still didn't work.
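For anyone trying to reproduce this, here is a minimal sketch of how a
cookies.txt line for a session cookie could be built. It assumes the
file uses the tab-separated Netscape cookie format (domain, subdomain
flag, path, secure flag, expiry, name, value). The function name, the
domain www.example.com and the session value are placeholders for
illustration, not my real ones.

```python
def netscape_cookie_line(domain, name, value, path="/", secure=False, expires=0):
    """Build one tab-separated line in the Netscape cookies.txt layout.

    Hypothetical helper for illustration only; expires=0 is used here
    as a stand-in for a session cookie with no fixed expiry.
    """
    # A leading dot on the domain conventionally means "also valid for subdomains".
    include_subdomains = "TRUE" if domain.startswith(".") else "FALSE"
    fields = [
        domain,
        include_subdomains,
        path,
        "TRUE" if secure else "FALSE",
        str(int(expires)),
        name,
        value,
    ]
    return "\t".join(fields)


# Placeholder host and session id, not taken from the real site.
line = netscape_cookie_line("www.example.com", "PHPSESSID", "0123456789abcdef")
print(line)
```

If the session is tied to something besides the cookie (for example the
browser's User-Agent or IP address), a correct line like this may still
not be enough on its own.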
This is starting to be frustrating and I'm sure there's an
easy workaround for this. Any help would be greatly
appreciated.
Despite the ranting I do love your program! :D