HTTrack Website Copier
Free software offline browser - FORUM
Subject: Re: How to completely ignore 'robots.txt'?
Author: The Dro
Date: 05/23/2015 22:13
 
You just need to pass the option that tells HTTrack to ignore robots.txt. If
you're running HTTrack from the command line, type:

httrack "http://website.com/directory-you-want-to-rip" -O "/path-to-save-files" -%v -s0

The "-s0" option tells HTTrack never to follow the rules in robots.txt. Keep
the whole command on one line, and quote the URL rather than wrapping it in
angle brackets, which the shell would treat as redirection.
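
For anyone who wants to check the flags before starting a long download, here is a rough sketch of a dry-run helper that only prints the command instead of running it. The function name mirror_cmd is made up for illustration, and the URL and output path are the placeholders from the command above. The flag descriptions in the comments are my reading of httrack's help text (which, as far as I know, also lists --robots=0 as the long form of -s0), so check "httrack --help" on your version.

```shell
#!/bin/sh
# mirror_cmd is a hypothetical helper, not part of HTTrack.
# It prints the command line so it can be reviewed before running it.
mirror_cmd() {
    url=$1   # site or directory to mirror (placeholder)
    out=$2   # directory where the mirror is saved (placeholder)
    # -O   : path for the mirror, log files and cache
    # -%v  : show downloaded file names on screen as they arrive
    # -s0  : never follow robots.txt or meta robots tags
    printf 'httrack "%s" -O "%s" -%%v -s0\n' "$url" "$out"
}

# Dry run: prints the command, does not contact the site.
mirror_cmd "http://website.com/directory-you-want-to-rip" "/path-to-save-files"
```

If the printed line looks right, paste it into the shell to start the mirror.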
 





