Well, I started a new project, used
<http://www.ukdfd.co.uk/ukdfddata/showcat.php?cat=all&page=1&what=allfields&name=Cheryl%20Hodgson&name=Cheryl%20Hodgson&mcats=all>
as the starting URL, and added these scan rules:
+*.png +*.gif +*.jpg +*.css +*.js -ad.doubleclick.net/*
-mime:application/foobar
+*/*name=Cheryl%20Hodgson*
The last rule is there to pick up URLs with her name in them (meaning the 16
links across the bottom of the results page).
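If it helps to see what those rules accept and reject, here is a rough Python sketch. It is only an approximation: it uses Python's fnmatch wildcards rather than HTTrack's own matcher, it assumes the last matching rule wins (which is how I understand HTTrack orders its filters), and it skips the -mime: rule because that one filters on content type, not on the URL. The verdict function is just a made-up helper for illustration.

```python
from fnmatch import fnmatchcase

# Scan rules copied from the post. The -mime: rule is omitted on purpose,
# since it depends on the content type and not on the URL text.
RULES = [
    "+*.png", "+*.gif", "+*.jpg", "+*.css", "+*.js",
    "-ad.doubleclick.net/*",
    "+*/*name=Cheryl%20Hodgson*",
]


def verdict(url, rules=RULES):
    """Return '+', '-' or None (no rule matched).

    Assumption: rules later in the list override earlier ones.
    """
    target = url.split("://", 1)[-1]  # compare without the http:// prefix
    result = None
    for rule in rules:
        sign, pattern = rule[0], rule[1:]
        if fnmatchcase(target, pattern):
            result = sign  # keep the most recent match
    return result


if __name__ == "__main__":
    start = ("http://www.ukdfd.co.uk/ukdfddata/showcat.php?cat=all&page=1"
             "&what=allfields&name=Cheryl%20Hodgson&name=Cheryl%20Hodgson"
             "&mcats=all")
    print(verdict(start))
    print(verdict("http://ad.doubleclick.net/banner.gif"))
```

In this sketch the starting URL is accepted by the name rule, while a .gif on ad.doubleclick.net is accepted by +*.gif first and then rejected by the later -ad.doubleclick.net/* rule.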
I stopped it after about 10 megs or so and it seemed to have done its job. It
even got the item description pages for what was on display.
You may need to add more scan rules to get rid of any undesirable pages or
widen things so you get a mini-site type of mirror.
But give that a try and let us know.