HTTrack Website Copier
Free software offline browser - FORUM
Subject: Re: defending a site, bandwidth leeching
Author: Xavier Roche
Date: 01/22/2005 09:52
 
HTTrack is not the "best" tool to leech images or other
kinds of material. It is mainly used to back up sites and
make archives of live content. The default settings are
generally fine (a maximum of 25KB/s, a spider signature in
the User-Agent, and robots.txt followed by default), but a
minority of bad users can easily clobber a website, just as
they can clobber an ftp site with a 10-threaded ftp leecher.
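
For reference, these defaults roughly correspond to the
following command line (flag spellings quoted from memory of
the httrack manual, so check them against your version;
example.com is a placeholder):

  # polite mirror: 25KB/s cap, obey robots.txt
  httrack http://www.example.com/ -O /tmp/mirror -A25000 -s2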

The <http://www.httrack.com/html/abuse.html#WEBMASTERS>
page contains some hints and advice; I would suggest the
following:

- no javascript/form hacks
- a robots.txt that prevents spidering of large sections
(images, for example)
- hidden links that point to these sections ("fake" images)
and automatically ban the incoming IP for a given period of
time (1 hour, for example); a minimal sketch follows below
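
A minimal sketch of the last two points (the paths
"/images/" and "/traps/" and the file name are made up;
adapt them to your own layout):

  # robots.txt -- well-behaved spiders (HTTrack included,
  # by default) will stay out of these sections
  User-agent: *
  Disallow: /images/
  Disallow: /traps/

  <!-- hidden link placed in a normal page: the empty anchor
       renders nothing for visitors, but crawlers that ignore
       robots.txt will follow it -->
  <a href="/traps/fake-image.gif"></a>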

The abuse FAQ contains an example script (and the script
can be hidden using an Apache rewriting rule, or a php4
module rule, so that it appears under a "folder name").
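
The FAQ script itself is not reproduced here, but the
general idea looks roughly like this (a hypothetical sketch,
not the FAQ code; mod_rewrite must be enabled, and the file
names and the /tmp ban list location are made-up examples):

  # .htaccess -- make the trap script answer under a
  # "folder name" so it does not look like a script
  RewriteEngine On
  RewriteRule ^traps/ /bot-trap.php [L]

  <?php
  // bot-trap.php -- reached only through the hidden link:
  // record the client IP with a timestamp, then refuse it.
  $banfile = '/tmp/banned-ips.txt';   // made-up location
  $fp = fopen($banfile, 'a');
  if ($fp) {
      fwrite($fp, $_SERVER['REMOTE_ADDR'] . ' ' . time() . "\n");
      fclose($fp);
  }
  header('HTTP/1.0 403 Forbidden');
  exit('Forbidden.');
  ?>

  <?php
  // ban-check.php -- include (e.g. via auto_prepend_file) at
  // the top of real pages: refuse IPs trapped less than one
  // hour (3600 seconds) ago.
  $banfile = '/tmp/banned-ips.txt';
  if (is_readable($banfile)) {
      foreach (file($banfile) as $line) {
          $line = trim($line);
          if ($line == '') continue;
          list($ip, $when) = explode(' ', $line);
          if ($ip == $_SERVER['REMOTE_ADDR']
              && time() - (int)$when < 3600) {
              header('HTTP/1.0 403 Forbidden');
              exit('Temporarily banned.');
          }
      }
  }
  ?>

The ban file should be cleaned up from time to time (a cron
job is enough), otherwise it grows forever.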

 