HTTrack Website Copier
Free software offline browser - FORUM
Subject: Re: How to get rid of anti-robots protection
Author: Martin Katz, Ph.D.
Date: 02/17/2009 09:52
 
I have a similar problem in WinHTTrack. It is probably not my "browser
identity", which is set to
"Mozilla/4.78 [en] (Windows NT 5.0; U)".
When I try to mirror the site, I get a .gif file with the words "stop wasting
bandwidth".

I want to follow robots.txt in general. I think the site is rejecting me
because I request robots.txt while telling it that I am using a browser.
 
Can I include a filter "-*site-name.com/robots.txt"? Would it be better to use
"googlebot" as my browser identity?
Thanks in advance for any advice.
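
To make the "browser identity" question concrete, here is a minimal sketch of how robots.txt rules can differ depending on the user-agent name a client announces. It uses Python's standard urllib.robotparser, not HTTrack itself, and says nothing about how HTTrack or the site in question actually behaves. The robots.txt content, the example.com URLs, and the helper name allowed() are all hypothetical, made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content (an assumption, not taken from any real site):
# Googlebot may fetch everything except /private/, everyone else is shut out.
SAMPLE_ROBOTS = """\
User-agent: Googlebot
Disallow: /private/

User-agent: *
Disallow: /
"""


def allowed(user_agent, url, robots_text=SAMPLE_ROBOTS):
    """Return True if robots_text lets user_agent fetch url (hypothetical helper)."""
    parser = RobotFileParser()
    # parse() takes the file's lines directly, so no network request is made.
    parser.parse(robots_text.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    browser_identity = "Mozilla/4.78 [en] (Windows NT 5.0; U)"
    for agent in (browser_identity, "Googlebot"):
        verdict = allowed(agent, "http://example.com/index.html")
        print(agent, "->", "allowed" if verdict else "disallowed")
```

Under these assumed rules the browser-style identity falls into the catch-all "*" group, while "Googlebot" gets its own group. A server can also apply checks that robots.txt never mentions, so this only illustrates the rule matching, not the cause of the .gif response.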
 






