HTTrack Website Copier
Free software offline browser - FORUM
Subject: Re: rush.co.uk rip help
Author: William Roeder
Date: 07/30/2009 05:32
 
> Note, for example, the missing Flash titles and the racoon
> logo, which is referenced in the HTML file as:
> 
> ../../www.rush.co.uk/sites/default/files/images/Racoon_Logo.jpg

<http://rush.co.uk/robots.txt> includes:
Disallow: /scripts/
Disallow: /sites/
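
If you want to confirm that it is these rules blocking the logo, here is a rough check with Python's standard urllib.robotparser. It only knows about the two Disallow lines quoted above; the "User-agent: *" line and the helper name is_allowed are my own assumptions, not something taken from the site.

```python
from urllib.robotparser import RobotFileParser

# The two Disallow rules quoted from robots.txt above.
# Assumption: they sit under a "User-agent: *" group.
ROBOTS_LINES = [
    "User-agent: *",
    "Disallow: /scripts/",
    "Disallow: /sites/",
]

# The logo path from the quoted post, written as an absolute URL.
LOGO_URL = "http://www.rush.co.uk/sites/default/files/images/Racoon_Logo.jpg"


def is_allowed(url, agent="*"):
    """Return True if the quoted rules let `agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_LINES)
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    print("logo allowed:", is_allowed(LOGO_URL))
    print("front page allowed:", is_allowed("http://www.rush.co.uk/index.html"))
```

Anything under /sites/ fails the check, which would explain why a mirror that obeys robots.txt never downloads the logo.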

You'll have to override robots.txt (Options -> Spider) to fetch the paths blocked by that first
section of disallows. You should probably then exclude the paths in its bottom section with filters:
-*/admin/* -*/comment/reply/* -*/contact/*
-*/logout/* -*/node/add/* -*/search/* -*/user/register/*
-*/user/password/* -*/user/login/* -*?q=*
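
If you run httrack from the command line instead of the GUI, the same setup can be sketched as below. This only builds and prints the command, it does not run it. The start URL and output directory are placeholders of mine, and I am assuming -s0 ("never follow robots.txt") is the command-line counterpart of the spider option, so check it against httrack --help for your version.

```python
import shlex

# Assumptions: start URL and output directory are illustrative placeholders.
START_URL = "http://www.rush.co.uk/"
OUTPUT_DIR = "./rush-mirror"

# Exclusion filters copied from the list above.
EXCLUDE_FILTERS = [
    "-*/admin/*", "-*/comment/reply/*", "-*/contact/*",
    "-*/logout/*", "-*/node/add/*", "-*/search/*", "-*/user/register/*",
    "-*/user/password/*", "-*/user/login/*", "-*?q=*",
]


def build_httrack_command(url, output_dir, filters):
    """Build an httrack argument list that ignores robots.txt and applies filters."""
    # -O sets the mirror directory; -s0 tells httrack never to follow robots.txt.
    return ["httrack", url, "-O", output_dir, "-s0", *filters]


if __name__ == "__main__":
    command = build_httrack_command(START_URL, OUTPUT_DIR, EXCLUDE_FILTERS)
    # shlex.join quotes each filter so the shell does not expand the wildcards.
    print(shlex.join(command))
```

Printing through shlex.join matters because the filters contain * and ?, which an unquoted shell would try to expand as file globs.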
 