> By not respecting robots.txt and allowing an unlimited
> number of connections when copying sites, your software
> has succeeded in bringing our server to a halt on Sunday.
This is not the default behaviour, and can only be turned
on by the user. Contact the user's administrator to stop
this abuse.
> This has cost thousands of dollars in man-hours to remedy
> the problem and get the server back online.
Please read:
<http://www.httrack.com/html/abuse.html#WEBMASTERS>
> Steps must be taken to remedy this situation. Bandwidth on
> most web servers is not free, and the cost to site owners
> could run into millions of dollars due to the use of
> software such as yours.
The default options in httrack, again, are conservative
enough to avoid any bandwidth abuse (a maximum of 25 KB/s,
4 connections, robots.txt respected, user-agent declared).
Unfortunately, a minority of (l)users will always misuse
the tools they have and cause bandwidth or load problems.
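For reference, a polite command-line invocation that stays
within these limits could look like the sketch below (the
URL and output directory are placeholders; check the option
list of your httrack version):

  httrack "http://www.example.com/" -O /tmp/mirror -A25000 -c4 -s2

Here -A25000 caps the transfer rate at 25,000 bytes per
second, -c4 limits the mirror to 4 simultaneous connections,
and -s2 forces robots.txt rules to be followed.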
This is a problem for all offline browsers and FTP
downloaders: with aggressive options turned on, you can
easily clobber the remote server's bandwidth.
But here again, the problem is caused only by a minority of
bad users: most of httrack's users do respect web servers,
and rely on offline browsers to access websites. They often
do not have a permanent connection, or only an expensive
one, or cannot get any connection at all and must view
websites on offline CD-ROM media.
Filtering out all offline browser tools would solve the
problem, but it would also harm all of these users: don't
forget that.
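For webmasters who decide the trade-off is worth it anyway,
filtering is usually done by matching the declared
user-agent. A minimal Apache mod_rewrite sketch (the pattern
is only an example, and it will not catch clients that spoof
their user-agent string):

  RewriteEngine On
  RewriteCond %{HTTP_USER_AGENT} HTTrack [NC]
  RewriteRule .* - [F]

Note that this only stops clients that honestly declare
themselves, which is exactly what httrack's default
configuration does.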