Re: Scanrule wildcard clarification - HTTrack Website Copier Forum

Subject: Re: Scanrule wildcard clarification

Author: Cliff Missen

Date: 07/05/2009 00:20

OK. To clarify...

When mirroring a site, I want to retrieve incidental files from other sites,
such as logos, CSS, JavaScript, seals of approval, etc., that are referenced by
the HTML pages I'm retrieving from the mirrored site.  However, I do not want
to retrieve substantial files from those other sites, like PDFs, Word docs,
PowerPoint files, video, etc.

At the same time, I do not want ANYTHING, even incidentals, from sites like
google-analytics.com or yahoo.com.

So I use the following rules...

-*
+*.jpg
+*.gif
+*.css
+*.js
+*.png
+*.png
-*google-analytics*
-*paypal.com*
-*akamaitech.com*
-*google.com*

Am I on the right track here?
-- Cliff
