HTTrack Website Copier
Free software offline browser - FORUM
Subject: Re: How to retrieve database generated image tags?
Author: Xavier Roche
Date: 05/01/2003 15:21
> I have a site that I'm trying to get some pages of.
> Unfortunately the image tags are not referring directly to 
> the image but rather seem to have to round trip to the 
> database for the image information. When I 'Save Image 
> As' within the browser the image is called 'image' and 
> appears to be a .jpg file.
> Here is an example of the image tags.
> <img 
> category117/product838/image/?> size=300x300&helper=1049354555.01'
> border='0'>
> So how can I get the images to be copied correctly?
Hmm. First problem: the URL ends with "/", which is 
generally used for the index page of a folder. httrack 
therefore assumes that the link is an HTML file.

You have to disable this behaviour by setting
'Set options' / 'Spider' / 'Check document type' to 'If 

But this is a rather strange idea.
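
To make the first problem concrete, here is a minimal sketch of the naming decision. This is not httrack's actual code: the function `local_name`, the table `EXT_BY_TYPE` and the host `example.com` are made up for the illustration. Without a known document type, a URL ending in "/" is treated as a folder index; once the type is known, the type decides the name.

```python
import posixpath
from urllib.parse import urlsplit

# Small illustrative table; a real mirroring tool knows many more types.
EXT_BY_TYPE = {
    "text/html": ".html",
    "image/jpeg": ".jpg",
    "image/png": ".png",
    "image/gif": ".gif",
}


def local_name(url, content_type=None):
    """Pick a local file name for `url` (hypothetical helper, not httrack's)."""
    path = urlsplit(url).path
    if content_type in EXT_BY_TYPE:
        # The document type is known, so trust it over the shape of the URL.
        base = posixpath.basename(path.rstrip("/")) or "index"
        stem, _old_ext = posixpath.splitext(base)
        return stem + EXT_BY_TYPE[content_type]
    if path == "" or path.endswith("/"):
        # No type information: a trailing "/" looks like a folder index.
        return "index.html"
    return posixpath.basename(path)


# Hypothetical URL shaped like the one in the question.
url = "http://example.com/category117/product838/image/?size=300x300"
print(local_name(url))                # guessed from the URL alone
print(local_name(url, "image/jpeg"))  # named from the reported type
```

The point of the sketch is only that the guess from the URL and the name derived from the real type can disagree, which is why the document type check matters here.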

The second problem is that the CGI behind it is dumb and is 
not able to answer "HEAD" requests properly, which is 
totally incompatible with RFC 2616. httrack uses HEAD 
requests to detect the document type before naming the 
file, so this is a problem:


HTTP/1.0 200 OK
Server: Zope/(Zope 2.4.3 (source release, python 2.1, 
linux2), python 2.1.1, linux2) ZServer/1.1b1
Date: Thu, 01 May 2003 15:25:06 GMT
Ms-Author-Via: DAV
Content-Type: application/octet-stream
Accept-Ranges: none
Connection: close
Etag: ts51802647.5
Content-Length: 151013
Last-Modified: Thu, 01 May 2003 15:24:07 GMT
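
If you want to check a server for this kind of mismatch yourself, a rough sketch follows. It assumes plain HTTP (no HTTPS, no redirects) and simply compares the Content-Type reported for HEAD with the one reported for GET on the same URL. The helper names `content_type` and `head_agrees_with_get` are invented for this example and have nothing to do with httrack's internals.

```python
import http.client
from urllib.parse import urlsplit


def content_type(url, method):
    """Return the media type the server reports for `method` on `url`.

    Plain HTTP only; returns None if no Content-Type header is sent.
    """
    parts = urlsplit(url)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    conn = http.client.HTTPConnection(parts.hostname, parts.port or 80, timeout=10)
    try:
        conn.request(method, path)
        resp = conn.getresponse()
        resp.read()  # drain the body (empty for HEAD)
        value = resp.getheader("Content-Type")
    finally:
        conn.close()
    if value is None:
        return None
    # Drop parameters such as "; charset=..." and normalise case.
    return value.split(";")[0].strip().lower()


def head_agrees_with_get(url):
    """True if HEAD and GET report the same media type for `url`."""
    return content_type(url, "HEAD") == content_type(url, "GET")
```

A server that behaves correctly should give the same answer for both methods; if `head_agrees_with_get(...)` returns False for an image URL, type detection based on HEAD is likely to go wrong for that site.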

Fortunately, there is a hack for that:
'Set options' / 'Spider' / 'Force old HTTP/1.0 requests'

With these two settings adjusted, it should now work.