| > <http://www.[sourceweb].com>
If you start from the top of the site, the engine is, by
default, authorized to crawl everywhere on this
website. Use strict filters (-* first, then +...), such as:
-* +www.[sourceweb].com/whateveryouwant/* +*.png
+*.gif +*.jpg +*.css +*.js +*.pdf +*.doc
Then you may want to use --update (for updating an existing
mirror), which may be better in this case. Also, depth=2 may NOT
be sufficient (3 may be better if you have N+1
links?)
Warning: the cache is absolutely necessary for
updates, so don't use the --cache=0 option.
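
To make the advice above concrete, here is a minimal sketch of how the command line could be put together. The helper name build_httrack_args, its parameters, and the output directory are hypothetical (they are not part of HTTrack); the sketch assumes the usual httrack options -O (output path), --depth=N and --update, and it leaves the cache at its default instead of disabling it.

```python
from typing import List


def build_httrack_args(start_url: str, allowed_prefix: str, out_dir: str,
                       depth: int = 3, update: bool = False) -> List[str]:
    """Assemble an httrack argument list (hypothetical helper, not part of HTTrack)."""
    # File types to fetch even when they live outside the allowed path.
    extra_types = ["png", "gif", "jpg", "css", "js", "pdf", "doc"]
    args = ["httrack", start_url, "-O", out_dir]
    # Strict filters: exclude everything first, then re-allow what is wanted.
    args.append("-*")
    args.append("+" + allowed_prefix + "*")
    args.extend("+*." + ext for ext in extra_types)
    # Link depth from the start page (assumed default here: 3).
    args.append("--depth=" + str(depth))
    # Only ask for an update when a previous mirror (and its cache) exists.
    if update:
        args.append("--update")
    return args


# Placeholders copied from the post; replace them with the real site and path.
args = build_httrack_args("http://www.[sourceweb].com",
                          "www.[sourceweb].com/whateveryouwant/",
                          "./mirror")
print(" ".join(args))
```

Handing the list to subprocess.run(args) rather than typing it into a shell avoids having to quote the * characters; on a shell command line each filter would normally need quotes. Pass update=True for later runs against the same output directory.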
| |