HTTrack Website Copier
Free software offline browser - FORUM
Subject: Re: form based
Author: Fireman
Date: 03/01/2008 21:24
 
> Once you've captured the cookies you should be able
> to start on any page you like.

I'm confused.  I've captured a cookies.txt file, and it's used for
authentication.  That works correctly, and I can start on any page I want.
I'd like to start on the page that has the form I quoted.  When I start
there, I get that page, but only the form itself.  When I browse the saved
page (select one of the two options from the pull-down and then click the
image), it works, but only because it requests the .php file from the
original server, which responds by sending the selected file.

I understand that I won't get the server side .php file itself, but I want to
get both of the two different files that are optionally selectable.  

In an effort to get them, I tried using catchurl: I selected one of the two
files, then clicked on the download image.  This produced an hts-post0 file.
I then downloaded starting right from that file using:

some.place.com/index.php?>postfile:hts-post0

and this generated the single file I wanted, but incorrectly named.
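
For comparison, the same POST can be replayed outside HTTrack.  This is only
a minimal Python sketch: it assumes the form fields are the ones from the
query string I quote further down (com, contractId), that the server accepts
a plain urlencoded body, and that cookies.txt is in the Netscape format that
http.cookiejar.MozillaCookieJar can read.  The helper names are made up for
illustration and are not part of HTTrack.

```python
import http.cookiejar
import urllib.request
from urllib.parse import urlencode


def build_post_request(url, fields):
    """Build (but do not send) a form-style POST request."""
    body = urlencode(fields).encode("ascii")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


def opener_with_cookies(cookie_file):
    """Return an opener that sends the cookies stored in cookie_file.

    Assumption: the file is in Netscape cookies.txt format; load() raises
    an error if it is not.
    """
    jar = http.cookiejar.MozillaCookieJar(cookie_file)
    jar.load(ignore_discard=True, ignore_expires=True)
    return urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))


# Field values are taken from the example query string; adjust as needed.
request = build_post_request(
    "http://some.place.com/index.php",
    {"com": "doit", "contractId": "1"},
)
```

Sending it would then be opener_with_cookies("cookies.txt").open(request),
with the response body written to disk under whatever name is wanted.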

Perhaps this will be clearer.  Here is the hts-ioinfo.txt data: first the
request for the .php file, then the response.

[0] request for some.place.com/index.php?>postfile:hts-post0:
<<< POST /index.php HTTP/1.1


[0] response for some.place.com/index.php?>postfile:hts-post0:
code=200
>>> Content-disposition: attachment; filename="06106.rar"
>>> Content-Transfer-Encoding: binary
>>> Accept-Ranges: bytes
>>> Content-Type: rar


I expected it to honor the Content-disposition header and save the file as
06106.rar.  Instead, it comes in as index989ab.html.
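
To make the expectation concrete, this is roughly the naming rule I had in
mind, written as a small Python sketch using the standard email package to
parse the header.  It is not how HTTrack works internally; the function name
and the fallback argument are hypothetical.

```python
import os
from email.message import Message


def filename_from_content_disposition(header_value, fallback):
    """Return the filename parameter of a Content-Disposition value.

    Falls back to `fallback` when the header is absent or carries no
    filename parameter.
    """
    if not header_value:
        return fallback
    msg = Message()
    msg["Content-Disposition"] = header_value
    name = msg.get_filename()
    if not name:
        return fallback
    # Drop any directory part so the server cannot choose the save location.
    name = os.path.basename(name.replace("\\", "/"))
    return name or fallback


# Header value copied from the hts-ioinfo.txt excerpt above.
saved_as = filename_from_content_disposition(
    'attachment; filename="06106.rar"', "index989ab.html"
)
```

With the header from the log, saved_as is 06106.rar; without a filename
parameter the generated name is kept.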

> You can't download some.place.com/index.php from a
> server. php, asp, cgi etc are server side files. 
> All you can get from a server is html.  Thus the
> extension rename. If you rename them then windows
> won't know how to open the file and links from one
> page to another won't work anymore.

All I really want are the two files.  I don't mind renaming them; that's
minor.  What I need is for httrack to parse the starting page and request
site/index.php twice, once for each of the two pull-down options on the form
I posted: once as site/index.php?ID=1 and once as site/index.php?ID=2 (it's
really more like:

site/index.php?index.php?com=doit&contractId=1&x=86&y=12)

I'm not sure if it can do that.  I think I've got everything configured
correctly, but it seems to be ignoring the form and the requests for index.php
that occur if the image is clicked.  
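
What I'm hoping for could be sketched like this in Python: read the form,
list the values of the pull-down, and build one URL per option.  The sample
form below is a hypothetical stand-in (not the real page), the field names
com and contractId come from the query string above, and the sketch assumes
the server accepts the fields as a GET query string, which I have not
confirmed.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode

# Hypothetical stand-in for the real form on the starting page.
SAMPLE_FORM = """
<form action="index.php" method="post">
  <input type="hidden" name="com" value="doit">
  <select name="contractId">
    <option value="1">First file</option>
    <option value="2">Second file</option>
  </select>
  <input type="image" src="download.gif">
</form>
"""


class SelectOptionCollector(HTMLParser):
    """Collect the option values of every named <select> element."""

    def __init__(self):
        super().__init__()
        self.current_select = None
        self.options = {}  # select name -> list of option values

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "select":
            self.current_select = attrs.get("name")
            if self.current_select is not None:
                self.options.setdefault(self.current_select, [])
        elif tag == "option" and self.current_select is not None:
            value = attrs.get("value")
            if value is not None:
                self.options[self.current_select].append(value)

    def handle_endtag(self, tag):
        if tag == "select":
            self.current_select = None


def urls_for_select(form_html, action_url, select_name, fixed_fields):
    """Return one URL per option of the named pull-down."""
    collector = SelectOptionCollector()
    collector.feed(form_html)
    urls = []
    for value in collector.options.get(select_name, []):
        fields = dict(fixed_fields)
        fields[select_name] = value
        urls.append(action_url + "?" + urlencode(fields))
    return urls


urls = urls_for_select(
    SAMPLE_FORM, "http://some.place.com/index.php", "contractId", {"com": "doit"}
)
```

Each resulting URL could then be given to httrack as a separate start
address, provided the server really does answer GET requests for them.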

> site/index.php?ID=1 and site/index.php?ID=2 usually
> are two separate distinct pages so httrack renames
> them indexHHHH.htm where HHHH is a unique hex code.

That explains the naming, but I only get that result when I start httrack at
site/index.php?ID=1 with an hts-post0 file that I captured with catchurl
from the download link for that file.  Can httrack parse the form itself to
retrieve the two files?  Thanks for the info so far.

 