HTTrack Website Copier
Free software offline browser - FORUM
Subject: Re: Disabling robots.txt file
Author: Rebekah
Date: 04/26/2004 00:08
 
> Scan rules depend on what you find in robots.txt, and the
> site you want to mirror.
> Which site is it and which folder do you need ?
I'm not sure what you mean by 'what you find in robots.txt'. 
I was trying to download www.learningpage.com but couldn't 
get it to work. The next time, I watched the screen 
closely as I started the download, and 'error =robots.txt' 
flashed up on the screen and then disappeared. The other 
times I had tried to download the site, I had been 
distracted from the screen and had missed this.

So I disabled robots.txt and have managed to download most 
of the site. I have run into the %pdf message on some of 
the files, though. I see someone else has had the same 
problem, but I have been unable to find out whether a 
solution was ever posted.
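
For anyone else who hits the robots.txt error described above, here is a rough sketch of why a crawler that honours robots.txt can refuse to fetch anything. It uses Python's standard urllib.robotparser rather than HTTrack itself, and the robots.txt content, the "HTTrack" user-agent string and the example.com URL are all made up for illustration. They are not the real rules of any site mentioned in this thread.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content (an assumption for illustration only,
# NOT the actual file served by any real site).
EXAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "http://www.example.com/index.html"  # placeholder URL
    # A blanket "Disallow: /" blocks every path for every user agent.
    print(is_allowed(EXAMPLE_ROBOTS_TXT, "HTTrack", url))
    # An empty robots.txt has no rules, so nothing is blocked.
    print(is_allowed("", "HTTrack", url))
```

If a site's robots.txt looks like the blanket rule above, a mirroring tool that obeys it will stop almost immediately, which would match the error flashing past at the start. Telling the tool to ignore robots.txt (in HTTrack I believe this is the robots setting in the spider options, or -s0 on the command line) skips this check entirely, so it is worth using with care.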

 






