I've tried a number of filters and switches, but httrack still crawls
www.gringoes.com.br, and I'd like to keep it to www.gringoes.com.
Here's the command I entered:
httrack http://www.gringoes.com/forum/default.asp -O
/Users/rsmcgown/Desktop/gringoes-forum -#L10000000 -d -S -D -s2 %e0 -%A
asp=text/html -v "-*" "-*.gringos.com*" "-*gringoes.com.br*" "-*reply*"
"-*terms*" "-*member_profile*" "-*new_reply_form*" "-*active_topics*"
"-*calendar*" "-*search*" "-*RSS*" "-*printer_friendly*" "-*pm_buddy*"
"-*pm_new*" "-*login*" "-*report_post*" "+*forum_topics.asp*"
"+*forum_posts.asp*"
Any ideas how I can prevent crawling outside www.gringoes.com/forum/?
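
One way to reason about the filter list is sketched below. It assumes HTTrack checks the +/- filters in order and lets the last matching rule decide (that is my reading of the filter documentation, so treat it as an assumption). The `verdict` helper and the `TID=1` query string are made up for illustration, and shell `case` globbing only approximates HTTrack's `*` wildcard.

```shell
#!/bin/sh
# Sketch only: evaluate an ordered list of +/- wildcard filters against one
# URL, ASSUMING the last matching rule decides the outcome.
# Usage: verdict URL FILTER...   (prints accept, reject, or no-match)
verdict() {
    url=$1
    shift
    result="no-match"
    for rule in "$@"; do
        sign=${rule%"${rule#?}"}   # first character of the rule: + or -
        pattern=${rule#?}          # remainder is the wildcard pattern
        case $url in
            $pattern)
                # A later match overwrites any earlier decision.
                if [ "$sign" = "+" ]; then
                    result="accept"
                else
                    result="reject"
                fi
                ;;
        esac
    done
    echo "$result"
}

# Trailing "+" rule as in the command above: the .com.br URL is re-accepted.
verdict 'www.gringoes.com.br/forum/forum_posts.asp?TID=1' \
    '-*' '-*gringoes.com.br*' '+*forum_posts.asp*'          # prints accept

# Hypothetical host-anchored "+" rule: the .com.br URL stays excluded.
verdict 'www.gringoes.com.br/forum/forum_posts.asp?TID=1' \
    '-*' '+www.gringoes.com/forum/*forum_posts.asp*'        # prints reject

# The same anchored rule still accepts the wanted host.
verdict 'www.gringoes.com/forum/forum_posts.asp?TID=1' \
    '-*' '+www.gringoes.com/forum/*forum_posts.asp*'        # prints accept
```

If that assumption holds, the trailing "+*forum_posts.asp*" would re-accept .com.br URLs that the earlier "-*gringoes.com.br*" excluded, and anchoring the + rules to the host, as in the last two calls, might avoid that.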