Hello,
I am having a problem with HTTrack version 3.23 naming
files as .HTML rather than according to their Content-Type
(usually .PDF or .DOC). I've read the earlier
messages on the forum about this, but my problem seems
different.
When downloading a URL using POST, the Content-Type is
reported incorrectly as "text/html" by the server in
response to a HEAD request, but when the actual POST
request is made the server does respond with a correct
Content-Type of "application/msword" (for example).
Sadly, it seems HTTrack is only using the response to the
HEAD request and thus giving the file an .HTML extension.
Is it possible for HTTrack to use the response from the
POST instead of the HEAD if Content-Type differs between
the two?
For example, using this URL:
http://www.scstatehouse.net/cgi-bin/state_register.exe?>postfile:hts-post0
Where the hts-post0 file looks like this:
============================
POST /cgi-bin/state_register.exe HTTP/1.1
Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, image/svg+xml, */*
Accept-Charset: iso-8859-1, iso-8859-*, utf-8
Accept-Encoding: gzip, deflate, compress, identity
Accept-Language: en, *
Content-Type: application/x-www-form-urlencoded
User-Agent: Mozilla/5.0 (compatible; Customised HTTP Client; Windows NT)
Content-length: 64
Host: www.scstatehouse.net
Connection: Keep-Alive
first=FILE&usercode=mortkas&password=3259&years=2002&file=sr27-5
====== END OF hts-post0 FILE ======
The hts-ioinfo.txt file looks like this:
===========================
[0] request for www.scstatehouse.net/cgi-bin/state_register.exe?>postfile:E:/Web%20Sites/test/hts-post2:
<<< HEAD /cgi-bin/state_register.exe HTTP/1.1
<<< Connection: Keep-Alive
<<< Host: www.scstatehouse.net
<<< User-Agent: Mozilla/4.5 (compatible; MSIE 4.01; Windows NT)
<<< Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, image/svg+xml, */*
<<< Accept-Language: en, *
<<< Accept-Charset: iso-8859-1, iso-8859-*, utf-8
<<< Accept-Encoding: gzip, deflate, compress, identity
[0] response for www.scstatehouse.net/cgi-bin/state_register.exe?>postfile:E:/Web%20Sites/test/hts-post2:
code=200
>>> HTTP/1.1 200 OK
>>> Date: Thu, 26 Jun 2003 00:18:34 GMT
>>> Server: Apache/2.0.43 (Win32)
>>> Keep-Alive: timeout=15, max=100
>>> Connection: Keep-Alive
>>> Content-Type: text/html; charset=ISO-8859-1
[0] request for www.scstatehouse.net/cgi-bin/state_register.exe?>postfile:E:/Web%20Sites/test/hts-post2:
<<< POST /cgi-bin/state_register.exe HTTP/1.1
<<< Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, image/svg+xml, */*
<<< Accept-Charset: iso-8859-1, iso-8859-*, utf-8
<<< Accept-Encoding: gzip, deflate, compress, identity
<<< Accept-Language: en, *
<<< Content-Type: application/x-www-form-urlencoded
<<< User-Agent: Mozilla/5.0 (compatible; Customised HTTP Client; Windows NT)
<<< Content-length: 64
<<< Host: www.scstatehouse.net
<<< Connection: Keep-Alive
<<<
first=FILE&usercode=mortkas&password=3259&years=2002&file=sr27-5
[0] response for www.scstatehouse.net/cgi-bin/state_register.exe?>postfile:E:/Web%20Sites/test/hts-post2:
code=200
>>> HTTP/1.1 200 OK
>>> Date: Thu, 26 Jun 2003 00:18:35 GMT
>>> Server: Apache/2.0.43 (Win32)
>>> Keep-Alive: timeout=15, max=99
>>> Connection: Keep-Alive
>>> Transfer-Encoding: chunked
>>> Content-Type: application/msword
====== END OF hts-ioinfo.txt FILE ======
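To make the request concrete, here is a rough sketch in Python of the behaviour I have in mind. It is not HTTrack's actual code: the function names and the small extension table are my own assumptions, used only to illustrate preferring the Content-Type of the real (POST) response over the one from the HEAD probe when the two differ.

```python
# Hypothetical illustration only; these names do not come from HTTrack.

# Small, assumed mapping from media type to file extension.
EXTENSIONS = {
    "text/html": ".html",
    "application/msword": ".doc",
    "application/pdf": ".pdf",
}


def parse_media_type(content_type):
    """Return the bare media type, dropping parameters such as charset."""
    return content_type.split(";", 1)[0].strip().lower()


def choose_extension(head_content_type, body_content_type):
    """Pick an extension, preferring the Content-Type of the actual response.

    head_content_type: Content-Type from the HEAD probe (may be None).
    body_content_type: Content-Type from the real GET/POST response (may be None).
    """
    head_type = parse_media_type(head_content_type) if head_content_type else None
    body_type = parse_media_type(body_content_type) if body_content_type else None
    # The real response wins whenever it is available; the HEAD value is a fallback.
    chosen = body_type or head_type
    # Unknown or missing types fall back to .html in this sketch.
    return EXTENSIONS.get(chosen, ".html")
```

With the headers from the log above, choose_extension("text/html; charset=ISO-8859-1", "application/msword") would give ".doc" rather than ".html".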