> > i'm just having a problem downloading a huge dynamic web site
> > it's in jsp and i try to rename all the files to html
> > extensions with a 'user define structure' like
> > %h%p/%n_%[node]_%[lang].html where node and lang are HTTP
> > params to the jsp.
> > My problem is that sometimes some other params appear in
> > the URLs like 'refresh=...', to display the same content,
> > and it seems that HTTrack duplicates the file with a -1 -2 .. extension
>
> That's perfectly normal: httrack finds a name collision,
> and is obliged to rename the local file(s)
>
> > Someone has an idea either to remove these HTTP params
> > from the downloaded website or to refresh the whole
> > downloaded website ?
>
> I'm not sure I understand what you want to do exactly:
>
> 3. do you want to consider the links with 'refresh=..'
> identical to those without this parameter (that is,
> this parameter is useless) ?
> Then, this is not yet possible without some coding (maybe
> by hacking the "check-link" callback using some C coding)
>
Hi all,
I am sorry to reply to an old post, but I have the same problem that was
discussed in it, and I don't know whether it has been automated in the
meantime (the original post is from 2003, three years ago).
I would like to back up a dynamic website that has some significant params,
but there is also a "jsessionid" parameter that varies from request to request
and is useless. This is exactly the third case that Xavier described in the
old post. An example page from the site I am interested in (a tourism
website) is:
<http://www.turismocastillayleon.com/cm/turcyl/tkContent;jsessionid=7F8E7761C59E3832D7DE80DE09761AFC?idContent=111&locale=es_ES&textOnly=false>
Does anyone know if there is a simple way (i.e., without recompiling) to
do this? I haven't found any "omit param" option or similar, but I think
it would be very useful for these "difficult" cases.
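
To make clear what I mean by "omit param", here is a rough sketch of the
URL normalization I have in mind. It is plain Python, not HTTrack code: the
function name normalize_url and the IGNORED_PARAMS set are my own inventions
for illustration, and I am only assuming that ";jsessionid=..." and
"refresh=..." are the parts that do not affect the content.

```python
import re
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical set of query parameters assumed not to change the page content.
IGNORED_PARAMS = {"refresh"}


def normalize_url(url, ignored=IGNORED_PARAMS):
    """Return url without the ;jsessionid=... path parameter and without
    the ignored query parameters, so equivalent links compare equal."""
    scheme, netloc, path, query, _fragment = urlsplit(url)
    # The session id is carried in the path, before the "?", not in the query.
    path = re.sub(r";jsessionid=[^;/?#]*", "", path, flags=re.IGNORECASE)
    # Keep the remaining parameters in their original order.
    kept = [(k, v) for k, v in parse_qsl(query, keep_blank_values=True)
            if k not in ignored]
    # The fragment is dropped too, since it never reaches the server.
    return urlunsplit((scheme, netloc, path, urlencode(kept), ""))


if __name__ == "__main__":
    print(normalize_url(
        "http://www.turismocastillayleon.com/cm/turcyl/tkContent"
        ";jsessionid=7F8E7761C59E3832D7DE80DE09761AFC"
        "?idContent=111&locale=es_ES&textOnly=false"))
```

With something like this applied before links are compared, two URLs that
differ only in the session id (or in "refresh") would map to the same local
file instead of producing -1, -2 copies.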
Thanks in advance,
Enriquevagu