HTTrack Website Copier
Free software offline browser - FORUM
Subject: filters + robots.txt on command line
Author: Andreas
Date: 04/26/2004 09:13

I don't understand how the -s option in httrack 3.30 (command
line) works. For example, I want to mirror a single page: I only want
to get this page and nothing else; I need error pages, and I want to get
the pictures in this page. If I call
httrack -x -%e0 -r1 -n ''
I don't get the pictures, because they are excluded by
Info:   Note: due to remote robots.txt rules,
links begining with these path will be forbidden: (...)
/wwwroot/gif_01/, /wwwroot/gif_02/ (see in the options to
disable this)

So I have to set -s0 or -s1 (and I don't understand the
difference, by the way):
httrack -x -%e0 -r1 -n -s0 ''
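The manpage only gives a terse description of -sN, which doesn't really help me understand when "sometimes" applies:

```shell
# From the httrack manpage, as far as I can tell:
#   -s0  never follow robots.txt / meta robots rules
#   -s1  follow them "sometimes"
#   -s2  always follow them
# So -s0 should be the one that ignores robots.txt entirely:
httrack -s0 ...
```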
But here the pictures are still missing - and no errors or infos
in the log.
This gets the pictures - but I don't want the -r2 option:
httrack -x -%e0 -r2 -n -s0 ''
So I extend the command with some positive filters instead of
the -r2 option. Here is just one filter, for one of the pictures:
httrack -x -%e0 -r1 -n -s0 ''
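To be clear about the filter syntax I am using - something like this, where example.com/page.html is only a placeholder for my real URL, and the + pattern stands for one of the picture paths from the robots.txt note above:

```shell
# Placeholders only: substitute the real page URL and picture path.
# The '+...' argument is a positive filter that should re-include
# files matching the pattern even at depth -r1.
httrack -x -%e0 -r1 -n -s0 'http://example.com/page.html' '+*/wwwroot/gif_01/*'
```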

But I still don't get any pictures! Now I have no idea
what else to try. Can you help? I put the log below.

Thanks, Andreas


cat hts-log.txt
HTTrack3.30 launched on Mon, 26 Apr 2004 09:40:42 at
(httrack -x -%e0 -r1 -n -s0
+ )

Information, Warnings and Errors reported for this mirror:
note:   the hts-log.txt file, and hts-cache folder, may
contain sensitive information,
        such as username/password authentication for
websites mirrored in this project
        do not share these files/folders if you want these
information to remain private

HTTrack mirror complete in 0 seconds : 1 links scanned, 1
files written (8313 bytes overall) [8488 bytes received at
8488 bytes/sec]
(No errors, 0 warnings, 0 messages)
