Thanks for the reply! I tried this as you suggested, but when I started the
mirror, the spider began following all the robots.txt rules even though I had
set the -s0 argument. (The intranet site is an offline copy of our live
website, so the same robots.txt exists in both.)
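For reference, the command I've been running looks roughly like this (the URL
and output path here are just placeholders, not the real ones):
httrack "http://intranet.example.com/" -O "/mirrors/intranet" -s0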
I had to add a special rule to the robots.txt file:
User-agent: httrack
Disallow:
so it would start mirroring everything again. Once the mirror completed, I
found that the same thing had happened: about 10% of the files, seemingly at
random, were overwritten with the 416 HTML error message.
How can I disable these "2nd requests" and prevent these files from being
overwritten? :|
Brian