> I'm having some difficulty with a site capture I'm
> running. Although I've set the external mirroring
> depth to 0 (and also tried leaving it unset, and
> setting it to 1 while debugging), crawls of this
> specific site end up trying to capture Wikipedia in
> addition to the target site. I believe I've isolated
> the link that causes it, and I've posted a demo site
> showing the problem at <http://stempac.net>. The
> offending link is the one on "Page Two" to Wikimedia
> Commons, under the picture of the courthouse.
Did you leave the Scan Rules (Options / Scan Rules) at their default,
"+*.png +*.gif +*.jpg +*.css +*.js"?