> +*.png +*.gif +*.jpg +*.css +*.js
> -ad.doubleclick.net/* -mime:application/foobar
> +*.asp
Adding +*.asp allows mirroring .asp files from ANYWHERE, potentially
mirroring the entire internet.
Don't use filters this way. If you want everything, just use the near flag
(get non-html files related).
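To see why an unscoped +*.asp is so broad, here is a minimal Python sketch. It is only an approximation of filter matching, not HTTrack's actual engine: it assumes shell-style wildcards (via `fnmatchcase`) and that the last matching rule wins. The function `is_accepted`, the host `www.unrelated-example.org`, and the scoped rule `+www.w3schools.com/*.asp` are hypothetical and used for illustration only.

```python
from fnmatch import fnmatchcase


def is_accepted(url, rules, default=False):
    """Simplified filter check: the last matching rule decides.

    Assumption: '+pattern' accepts, '-pattern' rejects, and '*' matches
    any run of characters, including '/'. This is a sketch, not HTTrack.
    """
    verdict = default
    for rule in rules:
        sign, pattern = rule[0], rule[1:]
        if fnmatchcase(url, pattern):
            verdict = (sign == "+")
    return verdict


# Rules as posted (the mime filter is left out of this sketch).
broad = ["+*.png", "-ad.doubleclick.net/*", "+*.asp"]
# Hypothetical variant that ties the .asp rule to one host.
scoped = ["+*.png", "-ad.doubleclick.net/*", "+www.w3schools.com/*.asp"]

foreign = "www.unrelated-example.org/forum/index.asp"
local = "www.w3schools.com/html/default.asp"

print(is_accepted(foreign, broad))   # unscoped rule also accepts other hosts
print(is_accepted(foreign, scoped))  # host-scoped rule does not
print(is_accepted(local, scoped))    # pages on the target host still match
```

The point is only that a pattern with no host part says nothing about where the file lives, so every .asp link on every site qualifies.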
> These are the rules; all the other settings are
> default. However, I cannot mirror
> "www.w3schools.com"; many of its pages are asp.
ASP is irrelevant. You cannot get server-side files from the public side of
a web server, only the HTML output and related files.
> However, this site can be
> mirrored: http://www.w3school.com.cn/ Its pages
> are asp also.
That's because that site doesn't have a robots.txt blocking /images etc.,
which you would have known had you looked at the log file.
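As a rough illustration of how a robots.txt rule keeps a crawler out of a directory, the sketch below uses Python's standard `urllib.robotparser`. The robots.txt content and the `www.example.com` URLs are made up for the example; they are not the actual rules of either site, and the user-agent string "HTTrack" is just a placeholder.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that blocks an images directory for all crawlers.
robots_txt = """\
User-agent: *
Disallow: /images/
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# Ask whether a crawler may fetch an image and an ordinary page.
image_allowed = parser.can_fetch(
    "HTTrack", "http://www.example.com/images/logo.png")
page_allowed = parser.can_fetch(
    "HTTrack", "http://www.example.com/html/default.asp")

print(image_allowed, page_allowed)
```

With rules like these, the pages themselves can be fetched while everything under /images/ is skipped, which matches the kind of partial mirror described above.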