For reference, here are the files I am trying to get; the only way to reach
them is through several links:
prefix.asite.com.au/a/b/c/[all-something, like a list]
prefix.asite.com.au/a/b/c/[a number]
asite.com.au/d/b/c/e/[name.pdf] - These PDFs are what I want
[ ] marks a part of the path that changes depending on which PDF I want to
access.
> There is no magic here. What part of forbidden
> wasn't clear?
It was clear to me; I was just spelling it out for clarity. I know most
websites don't allow directory access anyway.
> Either through the links or you guess the URLs:
> Number sequences: How to mirror only files/URLs
> using a certain ID/number range -
> <http://httrack.kauler.com/help/URL_number_sequences>
That may or may not help (I might try to implement it). The problem is that
the numbered links aren't the final destination, and it doesn't help with the
change in domain from prefix.asite.com.au to asite.com.au.
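To make the number-sequence idea concrete, here is a rough Python sketch of what generating the numbered URLs would look like. The host and path are the placeholders from above, and the range 1 to 5 is made up purely for illustration. Note that this only produces the intermediate pages; each one would still have to be fetched and parsed to find the PDF it points to.

```python
def numbered_urls(base, start, stop):
    """Return base + n for every n from start to stop, inclusive."""
    return [f"{base}{n}" for n in range(start, stop + 1)]


# Placeholder host and path from this post; the range 1..5 is hypothetical.
urls = numbered_urls("http://prefix.asite.com.au/a/b/c/", 1, 5)
for url in urls:
    print(url)
```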
> Filters do just that, they don't enable magic. If
> you want one file type: -* +*.html +*.XXX
And stop talking about magic. I'm asking precisely because filters don't
'do magic', and I know this is definitely going to be tricky (as said in the
subject of this thread!).
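For what it's worth, this is how I understand a filter list like the one quoted, written as a small Python model. It is only my simplified reading, not HTTrack's actual matcher: I am assuming rules are checked in order, that a later matching rule overrides an earlier one, and that * matches any run of characters. The function name and sample URLs are made up.

```python
from fnmatch import fnmatchcase


def allowed(url, rules, default=True):
    """Simplified model of +/- filter rules: the last matching rule wins."""
    verdict = default
    for rule in rules:
        sign, pattern = rule[0], rule[1:]
        if fnmatchcase(url, pattern):
            verdict = (sign == "+")
    return verdict


# Exclude everything, then re-allow HTML pages and PDFs.
rules = ["-*", "+*.html", "+*.pdf"]

print(allowed("asite.com.au/d/b/c/e/name.pdf", rules))
print(allowed("asite.com.au/images/logo.png", rules))
```

Under this model the PDF is accepted and the image is rejected, but nothing here follows links or crosses from one host to the other, which is the part I am stuck on.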
I read somewhere that to get the PDFs it will need the HTML pages that link
to them anyway (it spiders through these links).
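As a sketch of what that spidering step amounts to, here is a small Python example that pulls PDF links out of one HTML page using only the standard library. The sample HTML is invented, and the class name is just something I picked; the real pages will look different. The point is that the PDF link can live on a different host than the page that contains it.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class PdfLinkParser(HTMLParser):
    """Collect absolute URLs of <a href="..."> links that end in .pdf."""

    def __init__(self, page_url):
        super().__init__()
        self.page_url = page_url
        self.pdf_links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value and value.lower().endswith(".pdf"):
                # Resolve relative links against the page they came from.
                self.pdf_links.append(urljoin(self.page_url, value))


# Invented sample page; the hosts are the placeholders used in this post.
sample_html = (
    '<a href="http://asite.com.au/d/b/c/e/name.pdf">PDF</a> '
    '<a href="/a/b/c/2">next</a>'
)
parser = PdfLinkParser("http://prefix.asite.com.au/a/b/c/1")
parser.feed(sample_html)
print(parser.pdf_links)
```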
I might as well reveal the website address if this is still not clear enough.
Jason