HTTrack Website Copier
Free software offline browser - FORUM
Subject: request: dynpage ignore parameter function
Author: steppa
Date: 08/20/2007 05:12
 
Hi,

I tried to mirror a dynamically generated page with links like
www.example.com/script?opt1=foo&opt2=bar&origin=3450004&session=Vgl9gs
Only $opt1 and $opt2 influence the content; $origin and $session are
irrelevant for the mirror.
Filenames are created with -N dir%p/%[opt1::_::]%[opt2::_::]%n.%t

Due to crazy navigation, links point to the same content with different
$origin values, and every page is generated with a new $session. Hence the
mirror ends up with incremented filenames like foo_bar_script-2.html etc.,
which have the same content apart from some timestamps and, of course, links
that again carry a new $session. This unnecessarily bloats the mirror.

It would be nice if there were an option like
--ignore-parameter=origin,session
that filters these parameters out before link comparison.
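
To illustrate what I mean by filtering before link comparison, here is a rough Python sketch (not HTTrack code; the function name strip_params and the IGNORED set are made up for this example). It drops the irrelevant query parameters so that two links to the same content compare as equal:

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical list of parameters that do not influence the content.
IGNORED = {"origin", "session"}

def strip_params(url, ignored=IGNORED):
    """Return the URL with the ignored query parameters removed."""
    parts = urlsplit(url)
    kept = [(key, value)
            for key, value in parse_qsl(parts.query, keep_blank_values=True)
            if key not in ignored]
    return urlunsplit(parts._replace(query=urlencode(kept)))

# Two links that differ only in origin and session.
a = strip_params("http://www.example.com/script?opt1=foo&opt2=bar&origin=3450004&session=Vgl9gs")
b = strip_params("http://www.example.com/script?opt1=foo&opt2=bar&origin=99&session=Xy12ab")

# After stripping, both reduce to the same link, so only one download is needed.
print(a == b)
```

With a comparison like this, the second link would be recognized as already seen instead of producing foo_bar_script-2.html.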

Or is there a way to turn off filename incrementation, so that the download
is skipped if the file already exists? It should really avoid redownloading
because the files can be rather big.

I found some other posts about this issue and wrote a comment, but it is a
rather old thread that doesn't come up, so I am posting this again.
<http://forum.httrack.com/readmsg/6881/6843/index.html>
<http://forum.httrack.com/readmsg/16643/16493/index.html>


Thanks for the cool software.
 