HTTrack Website Copier
Free software offline browser - FORUM
Subject: Authentication and URL queries
Author: xenoglaux
Date: 12/05/2011 21:11
 
Hi, I'm trying to download a single page that serves a lot of content under
various URL queries:

foo.bar.com/page.html?query=something#=somethingelse
(+ many, many queries in similar format)

All the query sub-pages that I want to download are linked from page.html, and
the crawler seems to be able to find them no problem if I put this in the
filter field:

+foo.bar.com/page.html*
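
To illustrate what I expect that filter to cover, here is a rough Python
sketch. It only approximates the trailing `*` wildcard using the standard
`fnmatch` module; it is not HTTrack's actual filter engine, and the
`matches_filter` helper and the `other.html` URL are made up for the example.

```python
from fnmatch import fnmatchcase

def matches_filter(url, pattern="foo.bar.com/page.html*"):
    # Approximation only: "*" stands for any run of characters,
    # which is how I understand the HTTrack filter wildcard.
    return fnmatchcase(url, pattern)

urls = [
    "foo.bar.com/page.html",
    "foo.bar.com/page.html?query=something",
    "foo.bar.com/other.html",  # hypothetical page that should be excluded
]
for url in urls:
    print(url, matches_filter(url))
```

So the filter itself should accept every query variant of page.html, which
matches what I see the crawler doing.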

However, every page on foo.bar.com shows a "This page may contain adult
content!" warning. If you are not logged in, you must click through it to view
the content ("click through" = click a button, which displays the hidden
content without taking you to a different page). If you are logged out and
click through once, I believe it sets a cookie, since I do not have to click
through again in the same session.

I've used the URL capture feature to give HTTrack a cookie to bypass the adult
content warning, but the cookie only seems to work for page.html, not for any
of the query sub-pages. I still see the adult content warning on the query
pages, and the content behind the warning is not saved in the mirror. Clicking
through the warning inside a mirrored page gets me "page not found".

How can I get the cookie to apply to the query sub-pages as well as the
main page, so that I can mirror the content and not the warning?
Thanks.
 