HTTrack Website Copier
Free software offline browser - FORUM
Subject: Re: Unable to handle getfile.php type webpage
Author: Jason Leinbach
Date: 08/28/2005 21:07
 
Yup. It's the spider option itself that makes proper file naming problematic. 
An archiver has to process and rewrite every link and file on every page - in
theory, you could add lots of subroutines to try to extract the "right"
filenames, but that would be hard, slow, and never 100% reliable anyway. The
primary goal of HTTrack is to recreate a browsable website on the local
machine, not to retrieve all directory structures intact, so I suspect this is
very low on the to-do list. wget might handle it if the permissions on the
image directory allow direct access, but that's unlikely. I suspect you're
currently out of luck if you want *both* automated crawling *and* unaltered
filenames.
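
For what it's worth, here's a rough sketch (Python, purely illustrative) of the
kind of heuristic such a subroutine would need: ask the server for a
Content-Disposition header, then fall back to guessing a filename-looking query
parameter. The URL, parameter names, and header behaviour are all assumptions -
plenty of getfile.php scripts provide neither hint, which is exactly why this
can never be 100% reliable.

import re
import urllib.request
from urllib.parse import urlparse, parse_qs

def guess_filename(url):
    # 1. Ask the server: a Content-Disposition header is the most reliable hint.
    req = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(req) as resp:
            cd = resp.headers.get("Content-Disposition", "")
            m = re.search(r'filename="?([^";]+)"?', cd)
            if m:
                return m.group(1)
    except Exception:
        pass  # many scripts reject HEAD requests; fall through to guessing

    # 2. Fall back to the query string: ?file=photo.jpg, ?name=..., etc.
    #    (parameter names here are guesses, not a standard)
    qs = parse_qs(urlparse(url).query)
    for key in ("file", "filename", "name", "path"):
        if key in qs and "." in qs[key][0]:
            return qs[key][0].rsplit("/", 1)[-1]

    # 3. Give up: no reliable name is recoverable from the URL alone.
    return None

print(guess_filename("http://example.com/getfile.php?file=photo.jpg"))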
 