HTTrack Website Copier
Free software offline browser - FORUM
Subject: Re: Scanrule wildcard clarification
Author: Cliff Missen
Date: 07/05/2009 00:20
 
OK. To clarify...

When mirroring a site, I want to retrieve incidental files from other sites,
like linked logos, CSS, JavaScript, seals of approval, etc., that appear in
the HTML pages I'm retrieving from the mirror site.  However, I do not want to
retrieve things of significance, like PDFs, Word docs, PowerPoint files, video,
etc.

At the same time, I do not want ANYTHING, even incidentals, from sites like
google-analytics.com or yahoo.com.

So I use the following rules...

-*
+*.jpg 
+*.gif 
+*.css 
+*.js 
+*.png
-*google-analytics*
-*paypal.com*
-*akamaitech.com*
-*google.com*
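
As a rough sanity check of how a list like this should behave, here is a minimal Python sketch. It assumes, as I understand the HTTrack filter documentation, that rules are read top to bottom and that a later matching rule overrides an earlier one. Everything else is an assumption for illustration: the `is_allowed` helper is hypothetical, Python's `fnmatch` only approximates HTTrack's own wildcard syntax, the URLs are matched without the protocol prefix, and the example hostnames are made up. It does not model anything HTTrack does outside the listed rules.

```python
from fnmatch import fnmatchcase

# The scan rules from this post, in the same order.
RULES = [
    "-*",
    "+*.jpg", "+*.gif", "+*.css", "+*.js", "+*.png",
    "-*google-analytics*", "-*paypal.com*", "-*akamaitech.com*", "-*google.com*",
]


def is_allowed(url, rules=RULES, default=True):
    """Hypothetical helper: return the verdict of the LAST rule that matches url.

    Assumption: '+' means accept, '-' means reject, and '*' matches any run
    of characters (including '/'), which is how fnmatch treats it.
    """
    verdict = default
    for rule in rules:
        sign, pattern = rule[0], rule[1:]
        # Compare case-insensitively; this is a simplification.
        if fnmatchcase(url.lower(), pattern.lower()):
            verdict = (sign == "+")
    return verdict


if __name__ == "__main__":
    # Made-up URLs, written without "http://".
    samples = [
        "cdn.example.org/img/logo.png",
        "cdn.example.org/docs/report.pdf",
        "www.google-analytics.com/ga.js",
        "www.google.com/images/logo.gif",
    ]
    for url in samples:
        print(url, "->", "accept" if is_allowed(url) else "reject")
```

Under these assumptions the off-site logo is accepted, the PDF is rejected by the leading `-*`, and the two Google URLs are rejected because the `-*google...*` rules come after the `+*.js` and `+*.gif` rules. Moving the exclusions above the `+` rules would flip those last two results in this model, which is why they sit at the end of the list.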

Am I on the right track here?
-- Cliff

 

