HTTrack Website Copier
Free software offline browser - FORUM
Subject: Re: HTTrack seems to be spidering whole site
Author: Charles
Date: 11/29/2013 02:59
 
If you haven't found a solution yet, below are some example rules that you
could set up in a similar way, adjusting them to your needs.

+*.css +*.js -ad.doubleclick.net/* -mime:application/foobar
+*.gif +*.jpg +*.jpeg +*.png +*.bmp
-http://www.forum/*
+http://www.forum/*.jpeg
+http://www.forum/*.jpg
+http://www.forum/*.png
+http://www.forum/*.gif
+http://www.forum/*.bmp
+http://www.forum/*.css*
+http://www.forum.com/forum/saveme*

And if you do want the JavaScript from this forum, add:

+http://www.forum/*.js


In the above, the rule -http://www.forum/* says you don't want anything
mirrored from this URL. HTTrack reads the rules from first to last and gives
later rules priority, so the rules after it override it and say what you do
want mirrored from that URL.
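
To see the override behaviour in isolation, here is a rough Python sketch of "the last matching rule wins". It is not HTTrack's own matcher: it assumes the scheme can be ignored, uses Python's fnmatch wildcards as a stand-in for HTTrack's, skips mime: rules, and the is_allowed helper and its default value are made up for illustration.

```python
from fnmatch import fnmatchcase


def strip_scheme(text):
    # Assumption: compare patterns and URLs without the leading scheme.
    for scheme in ("http://", "https://"):
        if text.startswith(scheme):
            return text[len(scheme):]
    return text


def is_allowed(url, rules, default=True):
    """Return the verdict of the last rule that matches url (hypothetical helper)."""
    verdict = default  # placeholder; the real default depends on HTTrack's scan options
    target = strip_scheme(url)
    for rule in rules:
        sign, pattern = rule[0], strip_scheme(rule[1:])
        if pattern.startswith("mime:"):
            continue  # MIME rules need the response headers; skipped in this sketch
        if fnmatchcase(target, pattern):
            verdict = (sign == "+")  # a later match overwrites an earlier one
    return verdict


RULES = [
    "-http://www.forum/*",
    "+http://www.forum/*.jpg",
    "+http://www.forum/*.css*",
]

print(is_allowed("http://www.forum/viewtopic.php?t=1", RULES))  # False
print(is_allowed("http://www.forum/img/logo.jpg", RULES))       # True
```

The topic page only matches the blanket exclusion, so it is dropped. The image matches the exclusion first and then the later +*.jpg rule, which wins.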

Also, if this forum has "next page" and "previous page" links on its pages,
you might need to fine-tune your rules to exclude them if you don't need those
links. The same goes for "print view", "email to a friend", etc., links that
are sometimes hidden in drop-down menus on these forums.
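
Before running a long mirror, it can help to try candidate exclusion patterns against a few links copied from the forum. The Python sketch below does that. The patterns and sample URLs are purely hypothetical (the real query strings depend on the forum software), and fnmatch wildcards are only an approximation of HTTrack's.

```python
from fnmatch import fnmatchcase

# Hypothetical exclusion patterns; replace with what your forum's links really look like.
EXCLUDE = ["*action=print*", "*action=email*", "*&start=*"]


def excluded(url, patterns=EXCLUDE):
    # True when any candidate exclusion pattern matches the URL.
    return any(fnmatchcase(url, pattern) for pattern in patterns)


# Hypothetical sample links copied from a forum page.
samples = [
    "http://www.forum/viewtopic.php?t=1",
    "http://www.forum/viewtopic.php?t=1&start=20",
    "http://www.forum/viewtopic.php?t=1&action=print",
]

for url in samples:
    print(("skip " if excluded(url) else "keep ") + url)

# The same patterns written as scan rules to paste into HTTrack.
print(" ".join("-" + pattern for pattern in EXCLUDE))
```

Once the skip/keep list looks right, the last line gives the matching exclusion rules in the same -pattern form used above.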


Good Luck




 




