>>>>At 06:53 AM on 26-06-2003, the honorable Xavier Roche
said:
>
> You can try :
>
> 1. to use "use http/1.0 requests" option (--http10) to
> avoid HEAD requests (a GET request will be used instead
> with no body read)
Thanks for the quick reply (and great software!). Yes, I
tried this (and MANY other things ~smile~) before asking.
HTTrack still gave the file an .HTML extension. At the end
of this message I've included the hts-ioinfo.txt file that
is generated when forcing HTTP/1.0, in case you have any
other comments. That file shows a GET being sent, followed
by the POST from the postfile. The GET comes back as
text/html, while the POST comes back as application/msword.
So it seems that if I could tell HTTrack either to skip that
GET or to ignore its result, all would be well... but of
course I'm not sure about that!
> 2. force .exe to be pdf: see MIME Types (--assume)
> exe <-> application/x-pdf
>
Unfortunately, I can't do this, or at least it would be a
very big pain. The actual download operation reads many
such pages: some really are HTML with links in them, some
are PDF, and some are DOC (MS Word) files. All of the
pages use the exact same URL (...\state_register.exe).
The HTML pages link to their subpages through embedded
query strings, again on that same URL.
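For illustration only, here is a rough Python sketch of the behaviour described above: picking the saved file's extension from the Content-Type of the response actually received, rather than from the ".exe" in the URL. This is not HTTrack code; the table `EXTENSION_BY_TYPE`, the function `extension_for`, and the ".bin" fallback are made-up names and choices, and the table only lists the types mentioned in this thread.

```python
# Hypothetical lookup table; only the types discussed in this thread.
EXTENSION_BY_TYPE = {
    "text/html": ".html",
    "application/pdf": ".pdf",
    "application/x-pdf": ".pdf",
    "application/msword": ".doc",
}


def extension_for(content_type, default=".bin"):
    """Return a file extension for a Content-Type header value."""
    # Drop parameters such as "; charset=ISO-8859-1" and normalise case.
    media_type = content_type.split(";", 1)[0].strip().lower()
    # Unknown types fall back to the (assumed) default extension.
    return EXTENSION_BY_TYPE.get(media_type, default)


if __name__ == "__main__":
    # The two Content-Type values that appear in the log below.
    print(extension_for("text/html; charset=ISO-8859-1"))
    print(extension_for("application/msword"))
```

With a rule like this, the same URL could end up as .html, .pdf or .doc depending on what the server sends back for each POST.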
====== START OF hts-ioinfo.txt FILE ======
[0] request for www.scstatehouse.net/cgi-bin/state_register.exe?>postfile:E:/Web%20Sites/test/hts-post2:
<<< GET /cgi-bin/state_register.exe HTTP/1.0
<<< Connection: close
<<< Host: www.scstatehouse.net
<<< User-Agent: Mozilla/4.5 (compatible; MSIE 4.01; Windows NT)
<<< Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, image/svg+xml, */*
<<< Accept-Language: en, *
<<< Accept-Charset: iso-8859-1, iso-8859-*, utf-8
[0] response for www.scstatehouse.net/cgi-bin/state_register.exe?>postfile:E:/Web%20Sites/test/hts-post2:
code=200
>>> HTTP/1.1 200 OK
>>> Date: Thu, 26 Jun 2003 16:33:59 GMT
>>> Server: Apache/2.0.43 (Win32)
>>> Content-Length: 17465
>>> Connection: close
>>> Content-Type: text/html; charset=ISO-8859-1
[1] request for www.scstatehouse.net/cgi-bin/state_register.exe?>postfile:E:/Web%20Sites/test/hts-post2:
<<< POST /cgi-bin/state_register.exe HTTP/1.1
<<< Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, image/svg+xml, application/msword, */*
<<< Accept-Charset: iso-8859-1, iso-8859-*, utf-8
<<< Accept-Encoding: gzip, deflate, compress, identity
<<< Accept-Language: en, *
<<< Content-Type: application/x-www-form-urlencoded
<<< User-Agent: Mozilla/5.0 (compatible; Customised HTTP Client; Windows NT)
<<< Content-length: 64
<<< Host: www.scstatehouse.net
<<< Connection: Keep-Alive
<<<
first=FILE&usercode=mortkas&password=3259&years=2002&file=sr27-5
[1] response for www.scstatehouse.net/cgi-bin/state_register.exe?>postfile:E:/Web%20Sites/test/hts-post2:
code=200
>>> HTTP/1.1 200 OK
>>> Date: Thu, 26 Jun 2003 16:34:03 GMT
>>> Server: Apache/2.0.43 (Win32)
>>> Keep-Alive: timeout=15, max=100
>>> Connection: Keep-Alive
>>> Transfer-Encoding: chunked
>>> Content-Type: application/msword
====== END OF hts-ioinfo.txt FILE ======
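To make the mismatch in the log above easier to see, a small Python sketch like the following could pull out the request method and the response Content-Type for each numbered transaction. It assumes the file looks like the excerpt above ("[n] request for" / "[n] response for" headers, "<<<" for lines sent, ">>>" for lines received); the function name `content_types` is made up, and other HTTrack versions may format this file differently.

```python
import re

# Matches the "[n] request for ..." / "[n] response for ..." header lines.
HEADER = re.compile(r"\[(\d+)\] (request|response) for ")
# Matches the request line, e.g. "<<< GET /path HTTP/1.0".
REQUEST_LINE = re.compile(r"<<< (GET|POST|HEAD) ")


def content_types(log_text):
    """Return a list of (index, method, response content type) tuples."""
    methods = {}
    types = {}
    current = None
    for line in log_text.splitlines():
        header = HEADER.match(line)
        if header:
            current = int(header.group(1))
            continue
        if current is None:
            continue
        request = REQUEST_LINE.match(line)
        if request:
            methods.setdefault(current, request.group(1))
        elif line.startswith(">>> Content-Type:"):
            # Only ">>>" (received) lines count; the request's own
            # Content-Type header starts with "<<<" and is ignored.
            types[current] = line.split(":", 1)[1].strip()
    return [(i, methods.get(i), types.get(i)) for i in sorted(methods)]


if __name__ == "__main__":
    with open("hts-ioinfo.txt", encoding="latin-1") as handle:
        for index, method, ctype in content_types(handle.read()):
            print(index, method, ctype)
```

For the log above, this kind of summary would pair the GET with text/html and the POST with application/msword, which is the difference that seems to decide the file extension.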