HTTrack Website Copier
Free software offline browser - FORUM
Subject: Re: Why do you allow robots.txt to be overridden?
Author: Ben
Date: 01/13/2006 12:23
 
Just to put a quick 2p in. 

1. The main reasons I use httrack are (a) it has the best
non-JavaScript-dependent link rewriting that I have found, and (b) it can
override robots.txt. Both of these matter because I work for a digital
archive: we do long-term preservation of websites *with permission*. These are
often websites that are no longer maintained: the webmaster is long gone and
no-one can remember the password to get in and change robots.txt for our
benefit. If we heeded it, we'd only get a fraction of the content that
*they want* us to archive.

2. If httrack took this feature out, anyone who really wanted to rip you off
would find another way. Removing it would only inconvenience legitimate
archiving operations such as www.ndad.nationalarchives.gov.uk and
www.webarchive.org.uk.

Most governments recognise that preservation takes precedence over copyright,
so they grant certain archives and libraries exemptions from normal copyright
restrictions.

If someone rips you off without permission, you are of course entitled to take
court action against them. Don't blame httrack: if someone breaks into your
house using a hammer, you don't try to ban the sale of hammers.

 