HTTrack Website Copier
Free software offline browser - FORUM
Subject: Re: Need help with Scan Rules
Author: William Roeder
Date: 10/06/2009 17:05
 
> I've been trying to get my Scan Rules just right for
> about an hour now and I'm having no luck.

> An example of one JPEG is:
> 
> http://baka-images.com/forum/gallery/16/453-051009100004.jpeg
> 
> +*/forum/gallery/*.jpg
If the files are .jpeg and your filter is for .jpg, the filter does nothing.

> Furthermore, there are thumbnail images all prefixed
> with thumb_. So I have this now:
> 
> -*thumb_*.jpg
> +*/forum/gallery/*.jpg

This says: don't get any thumb JPGs, except for those in forum/gallery. Order
is important: when several rules match a link, the later rule wins.

If you want no JPGs except those in the gallery, and no thumbs even there:
-*.jpg +*/forum/gallery/*.jpg -*thumb_*.jpg
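
To see why the order matters, here is a rough Python sketch of "the last matching rule wins". It is only a model of the behaviour described above, not HTTrack's own code: the allowed() helper, the default=True fallback, and the example file names are made up for illustration, and fnmatch wildcards stand in for HTTrack's.

```python
from fnmatch import fnmatchcase

def allowed(url, rules, default=True):
    """Return the verdict of the LAST rule whose pattern matches url.

    Hypothetical helper: rules is a space-separated string such as
    "-*.jpg +*/forum/gallery/*.jpg". default=True is an assumption that
    stands for "download it unless a rule says otherwise".
    """
    verdict = default
    for rule in rules.split():
        sign, pattern = rule[0], rule[1:]
        if fnmatchcase(url, pattern):
            verdict = (sign == "+")  # a later match overrides an earlier one
    return verdict

# The order from the question, and the reordered rules.
question_rules = "-*thumb_*.jpg +*/forum/gallery/*.jpg"
reordered_rules = "-*.jpg +*/forum/gallery/*.jpg -*thumb_*.jpg"

# Made-up example links (not taken from the real site).
thumb = "baka-images.com/forum/gallery/16/thumb_example.jpg"
full = "baka-images.com/forum/gallery/16/example.jpg"
other = "baka-images.com/banners/logo.jpg"

for name, url in [("thumb", thumb), ("full", full), ("other", other)]:
    print(name, allowed(url, question_rules), allowed(url, reordered_rules))
```

With the order from the question, the thumb is excluded by the first rule and then re-included by the gallery rule. With the reordered rules, the thumb exclusion comes last, so it sticks.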

> However, this does not work. It downloads the thumbs
> anyway and everything else (html, xml, everything).
By default httrack tries to download everything. You didn't say anything about
other files.

> How can I adjust the site rules so that I only get
> the JPG files? I don't want the thumb_*.jpg files,
> and I don't want any HTML, XML, PHP, or other

You MUST allow httrack to get the html files, or how can it possibly find the
image filenames?
You want nothing but the HTML and the non-thumb JPGs (use .jpeg in the patterns if that is the files' actual extension):
-* +mime:text/html +*/forum/gallery/*.jpg -*thumb_*.jpg
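
A quick way to sanity-check a rule set like this before a long crawl is to model it in a few lines of Python. This is a simplified sketch, not how HTTrack is implemented: the check() helper is hypothetical, treating a mime: rule as a plain match on the content type is an assumption, and the index.php and thumb_ links are invented. Only the .jpeg image URL comes from the question.

```python
from fnmatch import fnmatchcase

def check(url, content_type, rules, default=True):
    """Last matching rule wins; 'mime:' rules are compared with the
    content type, all other rules with the URL (simplified model)."""
    verdict = default
    for rule in rules.split():
        sign, pattern = rule[0], rule[1:]
        if pattern.startswith("mime:"):
            hit = fnmatchcase(content_type, pattern[len("mime:"):])
        else:
            hit = fnmatchcase(url, pattern)
        if hit:
            verdict = (sign == "+")
    return verdict

# The rules as written above, and the same rules spelled with .jpeg.
rules_jpg = "-* +mime:text/html +*/forum/gallery/*.jpg -*thumb_*.jpg"
rules_jpeg = "-* +mime:text/html +*/forum/gallery/*.jpeg -*thumb_*.jpeg"

page = ("baka-images.com/forum/index.php", "text/html")  # invented page
image = ("baka-images.com/forum/gallery/16/453-051009100004.jpeg", "image/jpeg")
thumb = ("baka-images.com/forum/gallery/16/thumb_453-051009100004.jpeg", "image/jpeg")  # invented

for label, (url, ctype) in [("page", page), ("image", image), ("thumb", thumb)]:
    print(label, check(url, ctype, rules_jpg), check(url, ctype, rules_jpeg))
```

In this model the HTML page is accepted by both rule sets, but the .jpeg image from the question is only accepted once the patterns say .jpeg, which is the extension mismatch pointed out at the top of this reply.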
 