> What robots.txt lines are recognized and honored by
> httrack, including extensions?
No extensions are recognized -- only basic robots rules (i.e. the original
Netscape specification) are honored.
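For illustration, the original specification only defines User-agent and
Disallow records, so a minimal honored file would look like the first block
below, while later extensions (Allow, Crawl-delay, wildcards) would be
ignored -- the paths here are hypothetical:

    # Honored (original specification):
    User-agent: *
    Disallow: /cgi-bin/

    # Ignored by httrack (later extensions):
    Allow: /cgi-bin/public/
    Crawl-delay: 10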
For the rate limiter, the default is 25KB/s, and it cannot easily be
increased beyond 100KB/s. You may see isolated peaks due to TCP buffering,
but the average rate should be respected if the limits have not been
overridden.
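For reference, a sketch of how the cap is controlled on the command line:
httrack's -A option takes a rate in bytes per second, and
--disable-security-limits lifts the built-in ceiling. Exact defaults may
vary between versions, and the URL is a placeholder:

    # Default: transfers capped at 25,000 bytes/s
    httrack "http://example.com/" -O ./mirror

    # Raise the cap to 100KB/s (-A takes bytes per second)
    httrack "http://example.com/" -O ./mirror -A100000

    # Beyond the built-in ceiling, the safety limits must be lifted explicitly
    httrack "http://example.com/" -O ./mirror --disable-security-limits -A250000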
Even this rate may cause some issues, however; you can disallow
CPU-aggressive pages in robots.txt by default, or, in this case, temporarily
blacklist the bad citizen that is causing these slowdowns.
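As an illustration of both suggestions -- the paths are hypothetical, and
the "HTTrack" token assumes the client matches robots.txt records against
its reported agent name:

    # Keep all crawlers off CPU-aggressive pages by default
    User-agent: *
    Disallow: /cgi-bin/heavy-search

    # Temporarily blacklist the bad citizen entirely
    User-agent: HTTrack
    Disallow: /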