HTTrack Website Copier
Free software offline browser - FORUM
Subject: robots.txt
Author: Cat
Date: 05/06/2012 15:25
 
Hi. I tried downloading this website, but robots.txt seems to be blocking me.
Even when I tell HTTrack to ignore the robots.txt rules in the options, I
still can't download the website.

Here's the log with default settings:

HTTrack3.45-4+htsswf+htsjava launched on Sun, 06 May 2012 13:22:28 at
<http://www.spelljammer.com/> +*.png +*.gif +*.jpg +*.css +*.js
-ad.doubleclick.net/* -mime:application/foobar
(winhttrack -qwC2%Ps2u1%s%uN0%I0p3DaK0H0%kf2A25000%f#f -F "Mozilla/4.5
(compatible; HTTrack 3.0x; Windows 98)" -%F "<!-- Mirrored from %s%s by
HTTrack Website Copier/3.x [XR&CO'2010], %s -->" -%l "en, en, *"
<http://www.spelljammer.com/> -O1 "C:\Web Sites\Spelljammer" +*.png +*.gif
+*.jpg +*.css +*.js -ad.doubleclick.net/* -mime:application/foobar )
Information, Warnings and Errors reported for this mirror:
note: the hts-log.txt file, and hts-cache folder, may contain sensitive
information,
 such as username/password authentication for websites mirrored in this
project
 do not share these files/folders if you want these information to remain
private
13:22:29 Warning:  File not parsed, looks like binary:
www.spelljammer.com/robots.txt
HTTrack Website Copier/3.45-4 mirror complete in 1 seconds : 2 links scanned,
1 files written (859 bytes overall) [1392 bytes received at 1392 bytes/sec],
1306 bytes transfered using HTTP compression in 2 files, ratio 67%
(No errors, 1 warnings, 0 messages)




Here's the log when I let the program ignore the robots.txt rules:

HTTrack3.45-4+htsswf+htsjava launched on Sun, 06 May 2012 13:23:50 at
<http://www.spelljammer.com/> +*.png +*.gif +*.jpg +*.css +*.js
-ad.doubleclick.net/* -mime:application/foobar
(winhttrack -qiC2%Ps0u1%s%uN0%I0p3DaK0H0%kf2A25000%f#f -F "Mozilla/4.5
(compatible; HTTrack 3.0x; Windows 98)" -%F "<!-- Mirrored from %s%s by
HTTrack Website Copier/3.x [XR&CO'2010], %s -->" -%l "en, en, *"
<http://www.spelljammer.com/> -O1 "C:\Web Sites\Spelljammer" +*.png +*.gif
+*.jpg +*.css +*.js -ad.doubleclick.net/* -mime:application/foobar )
Information, Warnings and Errors reported for this mirror:
note: the hts-log.txt file, and hts-cache folder, may contain sensitive
information,
 such as username/password authentication for websites mirrored in this
project
 do not share these files/folders if you want these information to remain
private
13:23:50 Info:  engine: transfer-status: link updated: www.spelljammer.com/ ->
C:/Web Sites/Spelljammer/www.spelljammer.com/index.html
No files purged
HTTrack Website Copier/3.45-4 mirror complete in 1 seconds : 1 links scanned,
1 files written (859 bytes overall), 1 files updated [693 bytes received at
693 bytes/sec], 859 bytes transfered using HTTP compression in 1 files, ratio
51%
(No errors, 0 warnings, 1 messages)
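
In case it helps with diagnosis, the option strings in the two command lines
above seem to differ in only two places. Here is a minimal Python sketch that
compares them character by character (the helper name option_diffs is my own,
not part of HTTrack):

```python
# Option strings copied from the two winhttrack command lines in the logs above.
first_run = "qwC2%Ps2u1%s%uN0%I0p3DaK0H0%kf2A25000%f#f"
second_run = "qiC2%Ps0u1%s%uN0%I0p3DaK0H0%kf2A25000%f#f"


def option_diffs(a, b):
    """Return (index, char_in_a, char_in_b) for every position where a and b differ.

    Hypothetical helper for illustration; it only handles equal-length strings.
    """
    if len(a) != len(b):
        raise ValueError("option strings must have the same length")
    return [(i, x, y) for i, (x, y) in enumerate(zip(a, b)) if x != y]


diffs = option_diffs(first_run, second_run)
for index, old, new in diffs:
    print(f"position {index}: {old!r} -> {new!r}")
```

If I read the HTTrack options correctly (this is my assumption, not something
the logs confirm), s2 versus s0 is the robots.txt setting (always follow versus
never follow), and w versus i means the second run continued from the cache of
the first run instead of starting a fresh mirror.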



Why doesn't it download the website?
 