I am trying to use httrack from a Linux command line to mirror a dynamic
website managed with Drupal. For directory index pages, Drupal omits the file
name (usually something like "index.html") and generates a URL of
the form:
<http://domain/path/dirname>
By default, httrack writes the page to something like
.../path/dirname.html
When I try to browse the mirrored site using the URL
<http://mirror/path/dirname>
the web server expects to find .../path/dirname/index.html, not
.../path/dirname.html, and a "Page Not Found" error results.
That httrack behaviour can be overridden with
-N "%h%p/%n/index.html"
and directory index pages work fine.
However, if Drupal generates a URL containing a filename, such as
<http://domain/path/dirname/index.html>, httrack (with the above -N option)
creates
.../path/dirname/index/index.html
which fails. Other URLs containing filenames at the end are also broken.
Is there some way to have httrack detect a URL whose last component contains
a "." and cancel the effect of the -N option for that URL? Thanks.
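
To make the rule I am after concrete, here is a minimal Python sketch of the
URL-to-file mapping I would like. It is not something httrack provides; the
function name desired_local_path is hypothetical, and treating any last
component that contains a "." as a filename is my own assumption.

```python
import posixpath
from urllib.parse import urlsplit


def desired_local_path(url: str) -> str:
    """Return the mirror path wanted for a URL (hypothetical helper)."""
    parts = urlsplit(url)
    path = parts.path.rstrip("/")
    last = posixpath.basename(path)
    if "." in last:
        # Assumption: a "." in the last component means it is a filename,
        # so the path is kept unchanged.
        return parts.netloc + path
    # Otherwise treat the URL as a directory index page.
    return parts.netloc + path + "/index.html"


print(desired_local_path("http://domain/path/dirname"))
print(desired_local_path("http://domain/path/dirname/index.html"))
```

Both calls should give domain/path/dirname/index.html, instead of the
.../path/dirname/index/index.html that the -N option above produces for the
second URL.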