HTTrack Website Copier
Free software offline browser - FORUM
Subject: Having trouble restricting the crawl
Author: Sean
Date: 04/09/2015 18:07
 
I've tried a number of filters and switches, but httrack still crawls
www.gringoes.com.br, and I'd like to keep it to www.gringoes.com.
Here's the command I entered:
httrack http://www.gringoes.com/forum/default.asp -O
/Users/rsmcgown/Desktop/gringoes-forum -#L10000000 -d -S -D -s2 %e0 -%A
asp=text/html -v "-*" "-*.gringos.com*" "-*gringoes.com.br*" "-*reply*"
"-*terms*" "-*member_profile*" "-*new_reply_form*" "-*active_topics*"
"-*calendar*" "-*search*" "-*RSS*" "-*printer_friendly*" "-*pm_buddy*"
"-*pm_new*" "-*login*" "-*report_post*" "+*forum_topics.asp*"
"+*forum_posts.asp*"

Any ideas how I can prevent crawling outside www.gringoes.com/forum/?
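
A note on filter order that may bear on the question: the HTTrack filter documentation describes scan rules as being evaluated from first to last, with a later matching rule taking priority over an earlier one. The sketch below only imitates that "last match wins" idea with ordinary shell globs. It is not httrack's real matcher, the `verdict` helper is hypothetical, and the sample URL is made up for illustration. Under those assumptions, it suggests that a trailing `+*forum_topics.asp*` rule could re-admit pages on gringoes.com.br that an earlier `-*gringoes.com.br*` rule excluded.

```shell
#!/bin/sh
# Rough emulation of "the last matching filter wins" using shell globs.
# Assumption: only the '*' wildcard is imitated; httrack's own matcher
# supports more syntax and is not reproduced here.
# verdict URL RULE... prints "+" (accepted), "-" (refused), or "default".
verdict() {
    url=$1; shift
    result="default"                  # no rule matched yet
    for rule in "$@"; do
        sign=${rule%"${rule#?}"}      # first character of the rule: + or -
        pattern=${rule#?}             # remainder of the rule is the glob
        case $url in
            $pattern) result=$sign ;; # a later match overrides an earlier one
        esac
    done
    printf '%s\n' "$result"
}

# Hypothetical URL on the unwanted host (path invented for illustration).
url='www.gringoes.com.br/forum/forum_topics.asp?FID=1'

# Exclusion first, inclusion last: the inclusion is the last match.
verdict "$url" "-*" "-*gringoes.com.br*" "+*forum_topics.asp*"   # prints +

# Inclusion first, exclusion last: the exclusion is the last match.
verdict "$url" "-*" "+*forum_topics.asp*" "-*gringoes.com.br*"   # prints -
```

If httrack does behave this way here, moving the `-*gringoes.com.br*` rule after the `+` rules, or making the `+` rules specific to the wanted host (for example `+www.gringoes.com/forum/*forum_topics.asp*`), would be worth trying. Neither variant has been verified against the actual site.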
 