HTTrack Website Copier
Free software offline browser - FORUM
Subject: Re: Duplicate Images
Author: William Roeder
Date: 07/13/2007 04:14
 
> I'm getting tens of thousands of duplicate images
> because the website has chosen for some reason (log
> analysis maybe?) to include a query string on their
> images, i.e.:
> 
>    example.com/foo.gif?x=000000
>    example.com/foo.gif?x=000001
> 
> Is it possible to generically say that if an image
> file has a parameter that you only download one?  It
> seems like this would need to be a heuristic
> optimization that would need to be built into
> HTTrack...

The problem is that the query string is part of the URL, so there is no way to
tell whether foo.gif?x=0 and foo.gif?x=1 return the same image without
downloading both of them.
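
If the copies have already been downloaded, one workaround is to compare them by content afterwards. Below is a minimal sketch in Python, not an HTTrack feature: it hashes every file under a mirror directory and groups the ones whose bytes are identical. The function names and the directory layout are assumptions made for illustration, and nothing here depends on how HTTrack names the saved files.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 65536) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(root: Path) -> list[list[Path]]:
    """Group the files under root whose contents are byte-identical.

    Only groups with more than one member are returned. Paths are
    visited in sorted order, so the first entry of each group is stable.
    """
    by_digest = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if path.is_file():
            by_digest[file_digest(path)].append(path)
    return [paths for paths in by_digest.values() if len(paths) > 1]


if __name__ == "__main__":
    import sys

    # Hypothetical usage: python find_dupes.py /path/to/mirror
    for group in find_duplicates(Path(sys.argv[1])):
        print("identical:", ", ".join(str(p) for p in group))
```

This only reports the duplicates; deleting them would also mean fixing the links in the saved pages, so it is left out of the sketch.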
 