> > Unlike most of you, I am trying to delimit rather
> > than limit my spidering
>
> :p
>
> > -C0N1003s0K%e9999r9999zI0b1nBe
>
> One big piece of advice: do not use depth settings if possible (the
> e and r options); use scan rules instead. Scan rules are much
> more reliable, and should not cause as many problems as
> depth settings can.
>
> Example 1: download two sites with cross links, but nothing
> more:
> -* +www.example.com/index.html +www.anotherexample.com/*
>
> Example 2: same as #1, but download only the /articles/
> section of site 2
> -* +www.example.com/index.html
> +www.anotherexample.com/articles/*
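>
> For instance (just a sketch; the -O output path is a placeholder), rules
> like those in Example 2 plug into a full command line roughly as:
>
>   httrack "http://www.example.com/index.html" -O "./mirror" \
>     "-*" "+www.example.com/index.html" "+www.anotherexample.com/articles/*"
>
> (The filters are quoted so the shell does not expand the * itself.)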
>
>
The problem is that I'm trying to download an indefinite and indeterminate
number of sites, so I don't know what scan rules to use. I'm essentially
looking for an infinite crawl, so I thought -e would be what I wanted.
In fact, I thought that the -e option would be exactly the thing that
would defeat limits on cross-site travel....
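
(For reference, the depth-based approach I have in mind boils down to
something roughly like the following, with the start URL and output path
as placeholders, and assuming the external-depth option (%e in my option
string quoted above) is what I mean by -e:

  httrack "http://www.example.com/" -O "./mirror" -r9999 -%e9999

i.e. a huge mirror depth plus a huge external-links depth, in the hope
that the crawl keeps following links off-site indefinitely.)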