HTTrack Website Copier
Free software offline browser - FORUM
Subject: Re: robots.txt
Author: twj
Date: 11/13/2015 19:11
 
> A robots.txt file is a convention used by sites to
> tell spiders (like Google and HTTrack) what files
> they want to allow or forbid them from spidering.
> 
> You can tell HTTrack to ignore the robots.txt file
> if you have a legitimate reason to spider the site.

Thanks Mike... We own the site and our host won't help us. I have tried many
forms of exclusion, but nothing has worked.

The pcglobal.ca/index.txt file reads:
User-agent: *
Disallow: /
 