> I have been trying to download a 'wiki' as well as
> several forum websites. In all cases the download
> seems endless, with multiple copies of the same
> file(s) being created.
Unfortunately, HTTrack cannot "guess" that the links are actually the same.
You cannot "collate" links with HTTrack either - but you can exclude the
extra links from the download using scan rules.
> It seems that the "index.php?title=X" part leads
> Httrack to create separate html files. Is there any
> way by using either filters, options or both, to
> force HTTrack to make only one copy of each file it
> finds, rather than multiples? Thanks in advance.
It seems that you are crawling diffs, history pages, etc. - which really are
different content.
You can, however, exclude them, for example with the following scan rules
(Options / Scan Rules):
-*action=* -*diff=*
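
If you are running HTTrack from the command line instead of the GUI, the same
filters can be appended after the start URL. This is only a sketch - the wiki
URL and output directory below are placeholders to be replaced with your own:

  httrack "http://example.com/wiki/index.php" -O ./mirror "-*action=*" "-*diff=*"

Quoting the filters keeps the shell from expanding the * wildcards before
HTTrack sees them.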