I am familiar with using a text file to build a URL list for mirroring
sequentially numbered pages, e.g.
www.shoesizes.com/shoeboxes/500/fashion.html
www.shoesizes.com/shoeboxes/501/fashion.html
www.shoesizes.com/shoeboxes/502/fashion.html
but is there a syntax rule that would enable me to mirror a site that has
pages of the form:
www.shoesizes.com/shoeboxes/@@@@/fashion.html
The four characters @@@@ stand for various combinations of alphanumeric
characters, for example 023e and 053d. I don't know how to create a text file
containing every possible combination (and it would be enormous), and I know
by trial and error that only a small fraction of the possible pages
actually exist. So I am looking for a way to query the site automatically
and retrieve only the pages that do exist.
Thanks for your help
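
To make the brute-force idea concrete, here is a rough Python sketch of what I have in mind. It is only an illustration under several assumptions: that the four characters are limited to digits and lowercase letters (36^4 = 1,679,616 candidates), that the site is reachable over http://, and that a missing page answers a HEAD request with an error status. The names candidate_urls, page_exists and existing_urls are made up for this example.

```python
import itertools
import string
import urllib.request

# Assumption: codes use digits and lowercase letters only (36 symbols).
ALPHABET = string.digits + string.ascii_lowercase
# URL pattern from the question; the "http://" scheme is an assumption.
TEMPLATE = "http://www.shoesizes.com/shoeboxes/{code}/fashion.html"


def candidate_urls(template=TEMPLATE, alphabet=ALPHABET, length=4):
    """Yield one URL for every possible code of the given length."""
    for chars in itertools.product(alphabet, repeat=length):
        yield template.format(code="".join(chars))


def page_exists(url, timeout=10):
    """Return True if a HEAD request for url ends in a 2xx status."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return 200 <= response.status < 300
    except OSError:
        # HTTP errors (404 etc.), connection failures and timeouts
        # are all subclasses of OSError.
        return False


def existing_urls(urls, exists=page_exists):
    """Yield only those URLs for which exists(url) is true."""
    for url in urls:
        if exists(url):
            yield url


# Hypothetical usage: write the pages that exist to a URL list file.
# with open("urls.txt", "w") as out:
#     for url in existing_urls(candidate_urls()):
#         out.write(url + "\n")
```

The resulting file could then be used as the URL list in the same way as for the numbered pages. Probing well over a million URLs one by one would be slow and would put a lot of load on the server, which is why I am hoping there is a better way.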