> I cannot use httrack for a purpose that would be the
> most useful, e.g. to return documents from a
> documentary database on the web.
> In the database there are generated index files
> containing links (to documents) like this:
> <http://www.somewebsite.com/docs/pdfgate.cgi?id=2&filename=somefilename.pdf>
Try enabling 'Force old HTTP/1.0 requests (no 1.1)' in 'Set
options'/'Spider'; this may give you the correct file types.
Also make sure that you have not defined any MIME type
associations (such as cgi -> text/html).
Besides, if you want to use some information from
the query string, such as the 'filename' parameter, go
to 'Set options'/'Build' and select 'User-defined
structure'. Then select 'Options' in this subtab, and
type in as filesystem mask:
%h%p/%n%[filename].%t
This will name all documents using the embedded
parameter, such as
<http://www.somewebsite.com/docs/pdfgate.cgi?id=2&filename=somefilename.pdf>
->
C:\My Web Sites\foobar\www.somewebsite.com\docs\pdfgatesomefilename.pdf.pdf
Yes, the .pdf.pdf is ugly, and you could use:
%h%p/%n%[filename]
But in this case some html files will be badly named,
because their names no longer get a file type appended.
(You cannot select specific masks for specific files
yet; this may be added in the future.)
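
To see why the first mask gives .pdf.pdf and the second one does not, here is a rough Python sketch of the substitution. It is not HTTrack's own code: expand_mask is a made-up helper, it only handles the %h, %p, %n, %t and %[param] fields used above, and the file type is passed in by hand (HTTrack works it out from the server response).

```python
import posixpath
import re
from urllib.parse import parse_qs, urlsplit


def expand_mask(mask, url, file_type):
    """Expand a small subset of a user-defined structure mask (illustrative only)."""
    parts = urlsplit(url)
    query = parse_qs(parts.query)
    directory, filename = posixpath.split(parts.path)
    fields = {
        "h": parts.netloc,                       # host name
        "p": directory.rstrip("/"),              # path without trailing slash
        "n": posixpath.splitext(filename)[0],    # file name without its type
        "t": file_type,                          # file type, supplied by the caller
    }

    def substitute(match):
        # %[param] -> value of that query-string parameter (empty if absent)
        if match.group(1) is not None:
            return query.get(match.group(1), [""])[0]
        # %h, %p, %n, %t -> the matching field
        return fields[match.group(2)]

    return re.sub(r"%\[([^\]]+)\]|%([hpnt])", substitute, mask)


url = "http://www.somewebsite.com/docs/pdfgate.cgi?id=2&filename=somefilename.pdf"
print(expand_mask("%h%p/%n%[filename].%t", url, "pdf"))
# www.somewebsite.com/docs/pdfgatesomefilename.pdf.pdf
print(expand_mask("%h%p/%n%[filename]", url, "pdf"))
# www.somewebsite.com/docs/pdfgatesomefilename.pdf
```

With the second mask, a page that has no 'filename' parameter ends up with no file type at all, which is the bad naming of html files mentioned above.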