HTTrack Website Copier
Free software offline browser - FORUM
Subject: Re: cookie.txt info not sent to site
Author: rexx
Date: 06/18/2013 22:41
My apologies ... Somehow I left off the last bit of my post when I pasted it.

Notes: In the interests of privacy, I removed personal info from the log,
i.e. "thesite" won't work.
This is a private torrent tracker site. The page I'm after is filtered to
display only the torrents I'm interested in. They do have various RSS feeds,
but these are limited, and I'd guess they have their reasons for not
making them more customisable.

Although it's pretty crude and not too efficient, as it's cobbled together
after a few hours of learning some commands, I've managed to get the job done
with a shell script using curl + sed. However, I'd still prefer to use httrack
as a more elegant and easily adaptable solution, if it's possible to get past
this problem I'm having.
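
For readers wondering what such a curl + sed step might look like, here is a rough sketch. It is not the actual script: the saved page, the download.php link pattern and the thesite.example host are all made-up placeholders. It only shows sed turning links from a saved page into the "url = ..." lines that curl's -K (config file) option reads.

```shell
#!/bin/sh
# Hypothetical saved listing page; the real markup on the site will differ.
cat > page.html <<'EOF'
<a href="download.php?id=101">First</a>
<a href="details.php?id=101">Info</a>
<a href="download.php?id=102">Second</a>
EOF

base="https://thesite.example/"   # placeholder host, not the real tracker

# Keep only the download links and rewrite each one as a curl config line.
sed -n 's|.*href="\(download\.php?id=[0-9]*\)".*|url = "'"$base"'\1"|p' \
    page.html > urls.txt

cat urls.txt
```

The resulting urls.txt can then be passed to curl with -K urls.txt.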

I've been using webhttrack to get the options right (as per log) and will then
use httrack from the terminal later.


Back to the present ...

Yes I did read all the information available
<> . And re-read it again just

As I see it, it isn't a question of logging in each time. I'm permanently
logged in to the site via Firefox, so I'd be trying to log in while already
logged in. I'd also like to avoid "logging in" to the site a few times
each hour.

It seems to me that each "request" for a page or torrent needs to have the
cookie information as part of the HTTP headers. (I used
<> to see what was happening, so I could get
curl to work.) With curl I also mimic my browser's ID (user agent), as I did
in httrack. See the curl command in my previous post, and this one, which
gets the torrent files:

curl --remote-name-all --cookie "uid=xxxxxxxx;pass=xxxxxxxxxxxxxxxxxxx" \
  -A 'Mozilla/5.0 (X11; Ubuntu; Linux i686; rv:20.0) Gecko/20100101 Firefox/20.0' \
  --compressed -K urls.txt
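
The same two cookies can also be kept in a Netscape-format cookie file, which curl reads when --cookie (-b) is given a file name instead of name=value pairs. The sketch below writes such a file. The thesite.example domain, the far-future expiry and the xxxx values are placeholders. Whether httrack picks up a file like this from the project folder is an assumption I have not verified.

```shell
#!/bin/sh
# Placeholder values: substitute the real domain and cookie values.
domain="thesite.example"
uid="xxxxxxxx"
pass="xxxxxxxxxxxxxxxxxxx"
expiry=2147483647   # far-future Unix timestamp (assumption)

# Netscape cookie file, one cookie per line, seven tab-separated fields:
# domain, include-subdomains flag, path, secure flag, expiry, name, value
{
  printf '# Netscape HTTP Cookie File\n'
  printf '%s\tFALSE\t/\tFALSE\t%s\tuid\t%s\n'  "$domain" "$expiry" "$uid"
  printf '%s\tFALSE\t/\tFALSE\t%s\tpass\t%s\n' "$domain" "$expiry" "$pass"
} > cookies.txt
```

With curl, this file would replace the inline cookie string, e.g. --cookie cookies.txt in the command above.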


I realise that this probably isn't the purpose for which httrack was originally
developed. But it could be really useful used like this, i.e. as a highly
customisable replacement for various RSS feeds, which I can review at leisure
without having to log in to various sites.

Anyway, I was kinda hoping that it was just some setting I'd missed or messed
up that would get httrack "to use the cookie info with all the requests". Also,
I'm pretty new to Linux and all this stuff, so most of the above is either
guesswork or "trial and error".

I will try the "--catchurl" login option just to see if it works, when I have
some time, probably only on the weekend. 