> I am trying to copy a site with lots of files. It is using
> the directory browsing function of a web server, and it
> gives me pages with headings so I can sort the page by file
> name, file size, file date, etc.
> The problem is that every page generates several index.html
> files, each one corresponding to a different sort option.
> This is generating a lot of overhead and unnecessary files,
> since I am only interested in the meat, i.e., the files in
> the directories.
> How can I avoid this?
Use scan rules, such as:
-www.example.com/*listing?key=* +www.example.com/*listing?key=size
(here, all links of the form www.example.com/..listing?key=..
will be forbidden, except those with key=size)
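For example, assuming the httrack command-line client is being used (the start URL and output directory below are placeholders), the same filters can be given after the URL on the command line:

  # Mirror the site, skipping every sorted variant of the listing
  # pages except the key=size one (placeholder URL and output path):
  httrack "http://www.example.com/somedir/" -O ./mirror \
      "-www.example.com/*listing?key=*" \
      "+www.example.com/*listing?key=size"

Quoting the filters prevents the shell from expanding the * and ? characters before httrack sees them.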