XR, thanks, but it still doesn't work.
Here are my settings and results:
action=download(no "questions")
url=http://www.newsisfree.com/sources/bycat/1
options (mostly default). . .
rules=<as you specified>
links=attempt to detect all (also tried the opposite)
build=no external pages
spider=no robots rules (also tried Force HTTP1)
browser=IE6, no footer
log,etc=make index (also tried opposite)
experts=go down, same address, relative-URI,
store html (tried with and without "first")
Click Next, Finish.
I see immediately, in the list of files being processed:
www.newsisfree.com/sources/bycat/1 (as expected)
but also:
www.newsisfree.com/sources/bycat/0
www.newsisfree.com/sources/bycat/
www.newsisfree.com/sources/bycat
So then I cancel.
There are MANY files in the /bycat/ directory,
and I want to avoid accessing any of them except /bycat/1.
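To be explicit about the behaviour I am after, here is a rough sketch in Python. It is not HTTrack syntax and says nothing about how HTTrack works internally; `should_fetch`, `WANTED_HOST` and `WANTED_PATH` are names I made up for illustration. It just checks the four URLs from the file list against "exactly /sources/bycat/1 and nothing else".

```python
from urllib.parse import urlsplit

# The one page I want; everything else under /bycat/ should be skipped.
WANTED_HOST = "www.newsisfree.com"
WANTED_PATH = "/sources/bycat/1"


def should_fetch(url: str) -> bool:
    """Return True only for the exact wanted page (hypothetical helper)."""
    if "://" not in url:
        # The file list shows URLs without a scheme, so add one for parsing.
        url = "http://" + url
    parts = urlsplit(url)
    return parts.hostname == WANTED_HOST and parts.path == WANTED_PATH


# The URLs that showed up in the list of files being processed.
seen = [
    "www.newsisfree.com/sources/bycat/1",
    "www.newsisfree.com/sources/bycat/0",
    "www.newsisfree.com/sources/bycat/",
    "www.newsisfree.com/sources/bycat",
]

kept = [u for u in seen if should_fetch(u)]
print(kept)
```

Only the first entry should survive; the other three are the ones I do not want touched.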
How can I do this? Thanks again, R.