I can't download a new site I'm having built from the development server, using HTTrack or a couple of other downloading spiders; I just get a 503 "Service Unavailable" error.
Presumably the dev site is set up to forbid spiders (the developer is on holiday!). The pages load fine in a browser. I've set HTTrack to ignore robots.txt and to use a couple of different user-agent strings, but I still get the same error.
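For reference, this is roughly the command I've been running (a sketch only; the URL and user-agent string are placeholders, not the real dev site):

    # -s0: never obey robots.txt; -F: spoof the user-agent string;
    # -r1: mirror depth 1, i.e. just the single page; -O: output dir
    httrack "http://dev.example.com/newpage.html" -O ./mirror \
        -s0 \
        -F "Mozilla/5.0 (Windows NT 10.0; Win64; x64) Firefox/115.0" \
        -r1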
How can I make it work? Apart from robots.txt, how does the server know it's a spider and not a browser? I'm only trying to download a single page, so it isn't bandwidth throttling or anything like that.
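One thing I could try (assuming curl is available) is comparing a bare request against one that sends full browser-like headers, to see whether the block is keyed on something beyond the user-agent, such as a missing Accept-Language or Referer header; again, the URL below is a placeholder:

    # Bare request: presumably returns the 503
    curl -sI "http://dev.example.com/newpage.html" | head -1
    # Browser-like request: if this returns 200, the server is
    # checking headers beyond the user-agent string
    curl -sI "http://dev.example.com/newpage.html" \
        -A "Mozilla/5.0 (Windows NT 10.0; Win64; x64) Firefox/115.0" \
        -H "Accept: text/html,application/xhtml+xml" \
        -H "Accept-Language: en-GB,en;q=0.9" \
        -e "http://dev.example.com/" | head -1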
Cheers
Andy