HTTrack Website Copier
Free software offline browser - FORUM
Subject: Re: Command Line Spidering
Author: William Roeder
Date: 03/25/2008 19:34
 
>  httrack <http://www.websitetospider.com/page.cfm>
> -O "./spidereddata" "-*" "+.cfm" "+.htm" "+-html"
> "+*websitetospider.com/listings.cfm/listing/*"  -r6

You're missing the asterisks on the cfm/htm filters: "+.cfm" should be "+*.cfm", and likewise for the others.
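A quick way to see why the asterisk matters, using plain shell glob matching as a stand-in. This is only an approximation of how httrack compares a filter against a URL, and the `matches` helper is made up for the illustration:

```shell
# Rough illustration with shell glob matching, NOT httrack's own matcher.
# matches PATTERN URL -> prints "yes" if the whole URL matches PATTERN.
matches() {
  case "$2" in
    $1) echo yes ;;
    *)  echo no ;;
  esac
}

url="www.websitetospider.com/page.cfm"
matches ".cfm" "$url"    # no wildcard: would have to equal the whole string
matches "*.cfm" "$url"   # leading asterisk: anything ending in .cfm
```

The first call prints "no" and the second prints "yes", which is the behaviour to expect from a filter that has no wildcard in front of the extension.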

> httrack will follow the links that don't match my
> patterns to find the other ones. I really want it to

It won't. httrack only finds links on pages it actually downloads. The best you can
do is spider all the html and only download images etc. on the pages you want:
-* +*/listing/* +*.cfm +*.htm*
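For reference, a rough sketch of the original command with those filters swapped in. The URL, output directory and -r6 depth are copied from the quoted command, not verified. It only prints the command so it can be checked before running:

```shell
# Sketch only: the quoted command with the suggested filters.
# URL, -O path and -r6 come from the original post.
set -- "http://www.websitetospider.com/page.cfm" \
  -O "./spidereddata" \
  "-*" "+*/listing/*" "+*.cfm" "+*.htm*" \
  -r6

# Dry run: show the command. Replace "echo httrack" with "httrack" to run it.
echo httrack "$@"
```

Keep the filters quoted, otherwise the shell may expand the asterisks against local files before httrack ever sees them.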

The alternative is to start httrack on a */listing/* page; then
-* +*/listing/* will work.
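A sketch of that second approach. The start URL is a placeholder: "EXAMPLE" stands for whatever a real listing page is called on the site, which isn't given in this thread. The output directory and -r6 are carried over from the original command:

```shell
# Hypothetical start page: "EXAMPLE" is a placeholder, not a real listing.
START_URL="http://www.websitetospider.com/listings.cfm/listing/EXAMPLE"

# Start on a listing page and allow only listing URLs.
set -- "$START_URL" -O "./spidereddata" "-*" "+*/listing/*" -r6

# Dry run: show the command. Replace "echo httrack" with "httrack" to run it.
echo httrack "$@"
```

This only reaches listings that are linked from other listing pages, since nothing else gets downloaded.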
 

