Hi,
I tried to mirror a dynamically generated page with links like
www.example.com/script?opt1=foo&opt2=bar&origin=3450004&session=Vgl9gs
Only $opt1 and $opt2 influence the content; $origin and $session are
irrelevant for the mirror.
Filenames are created with -N dir%p/%[opt1::_::]%[opt2::_::]%n.%t
Due to the crazy navigation, many links point to the same content with a
different $origin, and every page is generated with a new $session. So the
mirror ends up with incremented filenames like foo_bar_script-2.html etc.,
which have the same content apart from some timestamps and, of course, yet
more links with a new $session.
This unnecessarily bloats the mirror.
It would be nice if there were an option like
--ignore-parameter=origin,session
that strips the listed parameters before links are compared.
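
To make the idea concrete, here is a rough sketch in Python of the kind of filtering I mean. This is not HTTrack code, and the function name canonical_url is made up for illustration; the parameter names origin and session come from the example URL above. It just drops the ignored query parameters so that two links differing only in those parameters compare as equal:

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Parameters assumed to be irrelevant for the content (from the example URL).
IGNORED = {"origin", "session"}

def canonical_url(url, ignored=IGNORED):
    """Return the URL with the ignored query parameters removed.

    Hypothetical helper: illustrates the requested filtering step only.
    """
    parts = urlsplit(url)
    # Keep the remaining parameters in their original order.
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in ignored
    ]
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(kept), parts.fragment)
    )

a = canonical_url("http://www.example.com/script?opt1=foo&opt2=bar&origin=3450004&session=Vgl9gs")
b = canonical_url("http://www.example.com/script?opt1=foo&opt2=bar&origin=12&session=Xyz123")
same_page = (a == b)  # both links reduce to the same canonical URL
```

With a step like this applied before the link comparison, both links would be treated as one page and only fetched once.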
Or is there a way to turn off filename incrementing, so that the download is
skipped if the file already exists? It should really avoid redownloading,
because the files can be rather big.
I found some other posts about this issue and added a comment, but it is a
rather old thread that no longer comes up, so I am posting this again.
<http://forum.httrack.com/readmsg/6881/6843/index.html>
<http://forum.httrack.com/readmsg/16643/16493/index.html>
Thanks for the cool software.