HTTrack Website Copier
Free software offline browser - FORUM
Subject: Re: Scanrule wildcard clarification
Author: William Roeder
Date: 06/29/2009 17:21
 
> I'm using wildcards to exclude a site, say Google,
> from my mirror.  
By default httrack only mirrors the starting site, unless the external depth
setting or your filters allow other sites.

> However, I want to include incidental js, jpg, gif,
> png, and css files from other sites.
Use the "get non-html files" option (-n, --near).

> I'm using the following recipe...
> 
> --can-go-down
This is already the default.
> --stay-on-same-address
This is already the default.
> -*
> +*.jpg +*.gif +*.css +*.js +*.png
Without +mime:text/html you'll get only one page: -* rejects every other
HTML page, so httrack has no further links to follow.
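
To see why the recipe stops after one page, here is a rough Python sketch of how such a rule list could be evaluated. It is only an approximation: the helper name allowed() and the example URLs are made up for illustration, the wildcards are matched with Python's fnmatch rather than httrack's own matcher, and it assumes that the last matching rule decides.

```python
from fnmatch import fnmatchcase

# Hypothetical helper, not part of httrack: approximates scan rules with
# shell-style wildcards. Assumption: the last matching rule wins.
def allowed(url, rules, default=True):
    verdict = default
    for rule in rules:
        sign, pattern = rule[0], rule[1:]
        if fnmatchcase(url, pattern):
            verdict = (sign == "+")
    return verdict

# The filters from the quoted recipe.
RULES = ["-*", "+*.jpg", "+*.gif", "+*.css", "+*.js", "+*.png"]

# Example URLs are invented for this sketch.
for url in ["www.example.com/about.html",
            "www.example.com/img/logo.png",
            "cdn.example.net/site.css"]:
    print(url, allowed(url, RULES))
```

In this sketch the .html link is rejected while the .png and .css links are accepted, so nothing beyond the start page would be crawled.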

> Then I add...
> -*google.com*
> But I still get garbage files from Google.  
Given the filters you listed above, you should only be getting jpg/gif/js...

> Is there a difference between ?...
> -*google.com*
This could match a site, a directory, or a file.
> and
> -*google.com
This can only match a filename.
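
The difference between the two patterns can be illustrated with a small Python sketch. It uses Python's fnmatch as a stand-in for httrack's wildcard matching, which is an assumption, and the URLs are invented examples.

```python
from fnmatch import fnmatchcase

# Invented example URLs; fnmatch only approximates httrack's matcher.
URLS = [
    "www.google.com/",
    "www.google.com/images/logo.gif",
    "www.example.com/files/google.com",
]

def matching(pattern):
    """Return the example URLs that the wildcard pattern matches."""
    return [u for u in URLS if fnmatchcase(u, pattern)]

print("*google.com*", matching("*google.com*"))
print("*google.com ", matching("*google.com"))
```

With a trailing *, the pattern matches anything containing google.com, including every path on that host. Without it, the URL has to end in google.com, so in this sketch only the last example matches.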

 