HTTrack Website Copier
Free software offline browser - FORUM
Subject: Re: Content-type from server not being used
Author: john
Date: 06/26/2003 18:55
 
At 06:53 AM on 26-06-2003, the honorable Xavier Roche said:
> 
> You can try :
> 
> 1. to use "use http/1.0 requests" option (--http10) to
> avoid HEAD requests (a GET request will be used instead
> with no body read)

Thanks for the quick reply (and great software!). Yes, I
tried this (and MANY other things ~smile~) before asking.
HTTrack still gave the file an .HTML extension. At the end
of this message, I've included the hts-ioinfo.txt file that
is generated when forcing HTTP/1.0, in case you have any
other comments. In that file a GET is sent first, followed
by the POST from the postfile. The GET response comes back
as text/html, while the POST response is application/msword.
It seems that if I could just tell HTTrack not to send that
GET, or to ignore its result, then all would be well... but
of course I'm not sure about that!

> 2. force .exe to be pdf: see MIME Types (--assume)
> exe <-> application/x-pdf
> 

Unfortunately, I can't do this... at least, it would be a
very big pain. The actual download operation reads many
such pages: some really are HTML with links in them, some
are PDF and some are DOC (MS Word) files. All the pages use
the exact same URL (...\state_register.exe). The HTML pages
use embedded query strings, again with the same URL, to
link to subpages.
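To make the idea concrete, here is a minimal sketch of what I would like to happen: pick the extension per response from its Content-Type header instead of from the shared URL. This is not an HTTrack feature or API. The helper name `extension_for`, the mapping table and the `.bin` fallback are all hypothetical, and the PDF media types listed are an assumption, since I have not shown a PDF response here.

```python
# Hypothetical helper, not part of HTTrack: choose a file extension
# from the Content-Type header of the response that carried the body.
EXTENSIONS = {
    "text/html": ".html",
    "application/msword": ".doc",
    # Assumed PDF media types; the server's real value is not shown in this post.
    "application/pdf": ".pdf",
    "application/x-pdf": ".pdf",
}


def extension_for(content_type, default=".bin"):
    """Map a Content-Type header value to a file extension."""
    # Drop parameters such as "; charset=ISO-8859-1" and normalise case.
    media_type = content_type.split(";", 1)[0].strip().lower()
    return EXTENSIONS.get(media_type, default)


if __name__ == "__main__":
    # The two Content-Type values that appear in the log in this post.
    print(extension_for("text/html; charset=ISO-8859-1"))
    print(extension_for("application/msword"))
```

With a rule like this, the same state_register.exe URL could be saved under different extensions depending on what the server actually returned.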

====== START OF hts-ioinfo.txt FILE ======

[0] request for www.scstatehouse.net/cgi-bin/state_register.exe?>postfile:E:/Web%20Sites/test/hts-post2:
<<< GET /cgi-bin/state_register.exe HTTP/1.0
<<< Connection: close
<<< Host: www.scstatehouse.net
<<< User-Agent: Mozilla/4.5 (compatible; MSIE 4.01; Windows NT)
<<< Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, image/svg+xml, */*
<<< Accept-Language: en, *
<<< Accept-Charset: iso-8859-1, iso-8859-*, utf-8


[0] response for www.scstatehouse.net/cgi-bin/state_register.exe?>postfile:E:/Web%20Sites/test/hts-post2:
code=200
>>> HTTP/1.1 200 OK
>>> Date: Thu, 26 Jun 2003 16:33:59 GMT
>>> Server: Apache/2.0.43 (Win32)
>>> Content-Length: 17465
>>> Connection: close
>>> Content-Type: text/html; charset=ISO-8859-1


[1] request for www.scstatehouse.net/cgi-bin/state_register.exe?>postfile:E:/Web%20Sites/test/hts-post2:
<<< POST /cgi-bin/state_register.exe HTTP/1.1
<<< Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, image/svg+xml, application/msword, */*
<<< Accept-Charset: iso-8859-1, iso-8859-*, utf-8
<<< Accept-Encoding: gzip, deflate, compress, identity
<<< Accept-Language: en, *
<<< Content-Type: application/x-www-form-urlencoded
<<< User-Agent: Mozilla/5.0 (compatible; Customised HTTP Client; Windows NT)
<<< Content-length: 64
<<< Host: www.scstatehouse.net
<<< Connection: Keep-Alive

<<< first=FILE&usercode=mortkas&password=3259&years=2002&file=sr27-5

[1] response for www.scstatehouse.net/cgi-bin/state_register.exe?>postfile:E:/Web%20Sites/test/hts-post2:
code=200
>>> HTTP/1.1 200 OK
>>> Date: Thu, 26 Jun 2003 16:34:03 GMT
>>> Server: Apache/2.0.43 (Win32)
>>> Keep-Alive: timeout=15, max=100
>>> Connection: Keep-Alive
>>> Transfer-Encoding: chunked
>>> Content-Type: application/msword

====== END OF hts-ioinfo.txt FILE ======
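
For anyone comparing the two responses, the following is a small sketch that pulls the Content-Type of each response block out of a log laid out like the one above. It assumes that blocks start with "[n] request for" or "[n] response for" and that response headers are prefixed with ">>> ", as in this dump; the function name `response_content_types` is hypothetical and the sample text is an abridged copy of the log, not a new capture.

```python
import re


def response_content_types(log_text):
    """Return (transaction index, Content-Type) for each response block."""
    results = []
    current = None  # index of the response block being read, if any
    for line in log_text.splitlines():
        block = re.match(r"\[(\d+)\] (request|response) for ", line)
        if block:
            # Only response blocks are of interest; request blocks reset the state.
            is_response = block.group(2) == "response"
            current = int(block.group(1)) if is_response else None
            continue
        header = re.match(r">>> Content-Type:\s*(.+)", line, re.IGNORECASE)
        if header and current is not None:
            results.append((current, header.group(1).strip()))
    return results


# Abridged copy of the log above (URLs shortened, most headers omitted).
SAMPLE = """\
[0] response for www.scstatehouse.net/cgi-bin/state_register.exe
code=200
>>> HTTP/1.1 200 OK
>>> Content-Type: text/html; charset=ISO-8859-1

[1] request for www.scstatehouse.net/cgi-bin/state_register.exe
<<< POST /cgi-bin/state_register.exe HTTP/1.1
<<< Content-Type: application/x-www-form-urlencoded

[1] response for www.scstatehouse.net/cgi-bin/state_register.exe
code=200
>>> HTTP/1.1 200 OK
>>> Content-Type: application/msword
"""

if __name__ == "__main__":
    for index, content_type in response_content_types(SAMPLE):
        print(index, content_type)
```

Run against the log above, this lists text/html for transaction 0 (the GET) and application/msword for transaction 1 (the POST), which is the mismatch described in this post.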
 