> but HTTrack could use a little bit more elegance,
> like Google does, for instance.
Google (and other search engines) can very easily spread
their hits by interleaving the millions of pages they have
to fetch. For example, if a complete refresh cycle takes 1
week and your website has 7 pages, a search engine will
fetch only 1 page per day from your site.
But HTTrack users would be rather frustrated to wait 1 week
for a 7-page website mirror, not to mention having to keep
the PC running overnight (an extreme and lousy example, I
admit).
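For what it's worth, the arithmetic behind that kind of
spreading is trivial. Here is a minimal sketch in Python; it
is not Google's or HTTrack's actual code, and the function
name and figures are only illustrative:

    from datetime import timedelta

    def fetch_interval(num_pages: int, refresh_period: timedelta) -> timedelta:
        """Delay to leave between two successive page fetches so that
        the whole site is re-crawled exactly once per refresh period."""
        return refresh_period / num_pages

    # 7 pages spread over a 1-week refresh cycle -> one fetch per day
    print(fetch_interval(7, timedelta(weeks=1)))   # 1 day, 0:00:00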
Well, actually HTTrack does implement default limits to keep
beginners from overloading bandwidth:
- 25 KB/s maximum transfer rate
- robots.txt rules and META robots tags obeyed
The 25 KB/s cap is the most important default feature, and
it is generally the best way to avoid overloading a site.
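To give an idea of how such a cap works, here is a minimal
rate-limiting sketch in Python. It is not HTTrack's actual
implementation (HTTrack is written in C); the function name,
chunk size and approach are assumptions made for illustration:

    import time
    import urllib.request

    MAX_RATE = 25 * 1024      # 25 KB/s, matching HTTrack's default cap
    CHUNK = 4096              # bytes read per iteration (arbitrary choice)

    def throttled_download(url, dest, max_rate=MAX_RATE):
        """Download url to dest, keeping the average transfer rate
        at or below max_rate bytes per second."""
        start = time.monotonic()
        received = 0
        with urllib.request.urlopen(url) as resp, open(dest, "wb") as out:
            while True:
                data = resp.read(CHUNK)
                if not data:
                    break
                out.write(data)
                received += len(data)
                # Time the transfer *should* have taken at the allowed rate;
                # if we are ahead of schedule, sleep until we fall back under it.
                expected = received / max_rate
                elapsed = time.monotonic() - start
                if expected > elapsed:
                    time.sleep(expected - elapsed)

The idea is simply to pause after each chunk whenever the
transfer is running faster than the allowed average rate.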
Of course, users can override these limits and abuse
websites. But this is unfortunately beyond my control, and
is everyone's own responsibility. If someone is clobbering
your bandwidth, the best course is to report the incident to
the corresponding network postmaster, who will take action
to stop the problem.