HTTrack Website Copier
Free software offline browser - FORUM
Subject: PDF files are saved with title instead of filename
Author: Lucio Fassio
Date: 01/30/2007 13:17
 
I have been using HTTrack since September 2006 to index a web site with MS Index
Server. I have found it very helpful, and it replaced the spider I used before.

Everything worked well until recently, but now PDF files are saved and
referenced in the mirror copy under the document's original file name (its
title) instead of the actual file name (the one under which the document is
stored on the web server).

It seems that this problem did not happen before 3.40-2.

I enclose a row from \hts-cache\new.txt that explains the problem:

17:08:37	172630/172630	U-----	200	untouched ('OK')	application/pdf
date:Fri,%2020%20Oct%202006%2011:05:20%20GMT
<http://www.comune.aosta.it/download/file/325.pdf>
c:/websites/inva.comuneaosta.index/ComuneAosta/www.comune.aosta.it/download/file/Regolamento%20ICI.pdf
(from
<http://www.comune.aosta.it/it/comune/atti_ufficiali/regolamenti_comunali/>)
 
As a result, the reference to the file is wrong, because the search engine
returns:
<http://www.comune.aosta.it/download/file/Regolamento%20ICI.pdf>

while it should be:
<http://www.comune.aosta.it/download/file/325.pdf>
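Until there is a fix, one possible workaround on the indexing side would be to use new.txt itself to translate the saved name back to the original URL. The following is only a minimal sketch, not something HTTrack provides: it assumes each row of new.txt is a single tab-separated line with the URL in the 8th column and the local file in the 9th (the order seen in the row above), and that any angle brackets around the URL are not significant. The function names parse_new_txt_row and build_url_map are hypothetical.

```python
from urllib.parse import unquote


def parse_new_txt_row(row):
    """Return (original_url, local_path) for one new.txt row, or None.

    Assumption: columns are tab-separated in the order
    time, size, flags, status code, status, MIME, date/etag, URL, local file.
    """
    fields = [f.strip() for f in row.rstrip("\n").split("\t")]
    if len(fields) < 9:
        return None
    url = fields[7].strip("<>")  # drop angle brackets if present
    if not url.startswith("http"):
        return None  # header line or unexpected row
    local_path = unquote(fields[8])  # "Regolamento%20ICI.pdf" -> "Regolamento ICI.pdf"
    return url, local_path


def build_url_map(rows):
    """Map each saved local path to the URL it was downloaded from."""
    mapping = {}
    for row in rows:
        parsed = parse_new_txt_row(row)
        if parsed is not None:
            url, local_path = parsed
            mapping[local_path] = url
    return mapping


# Usage with the row quoted above, joined back into one tab-separated line.
sample = "\t".join([
    "17:08:37", "172630/172630", "U-----", "200", "untouched ('OK')",
    "application/pdf", "date:Fri,%2020%20Oct%202006%2011:05:20%20GMT",
    "http://www.comune.aosta.it/download/file/325.pdf",
    "c:/websites/inva.comuneaosta.index/ComuneAosta/www.comune.aosta.it"
    "/download/file/Regolamento%20ICI.pdf",
    "(from http://www.comune.aosta.it/it/comune/atti_ufficiali/regolamenti_comunali/)",
])
for local_path, url in build_url_map([sample]).items():
    print(local_path, "->", url)
```

The search front end could then look up the path of each hit in this map and link to the original URL instead of the renamed copy.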

The same problem occurs with other document types (doc, xls, ...).

Could you suggest a workaround or a fix?
Thanks a lot.
 




