Hi,
another question about filters
(tried with HTTrack on Windows).
I am trying to download these URIs
(they are listed in root-test.txt):
<http://root.cz/print.phtml?id=780>
<http://root.cz/print.phtml?id=1023>
<http://root.cz/print.phtml?id=1058>
(In fact there will be more URIs from another
server, but both show the same problem, so
for testing purposes these 3 URIs are fine.)
I want to download these pages and all files
shown on or linked from them
(*.ps, *gz, *.tex, *.jpg, *.jpeg, *.png ...),
but not the links to other HTML pages, which on this
site are phtml files (index*, clanek*).
So I thought it would be easier to exclude the files
I don't want than to specify every file type I want
to download.
(Specifying which files I want worked fine:
httrack --list "c:\websites\root-test.txt"
-W --depth=4 --ext-depth=0
-O "c:\websites\root-pokus\01" -%v
-* +root.cz/*jpg +root.cz/*jpeg +root.cz/*gif
.... the same for all possible/wanted filetypes
--assume phtml=text/html)
So I tried this (each command should be on one line):
httrack --list "c:\websites\root-test.txt"
-W --depth=4
--ext-depth=0
-O "c:\websites\root-pokus\02" -%v
-* -root.cz/index.* -root.cz/clanek.*
-root.cz/reklama.* +root.cz/*
--assume phtml=text/html
and this:
httrack --list "c:\websites\root-test.txt"
-W --depth=4 --ext-depth=0
-O "c:\websites\root-pokus\yy" -%v
-* -root.cz/index.phtml?oblast=*
-root.cz/clanek.phtml?id=* -root.cz/reklama.*
+root.cz/*
--assume phtml=text/html
I assumed that (taking the first example):
-* ---> excludes all files
-root.cz/index.* ---> excludes all files beginning
with index
...
+root.cz/* ---> includes all other files
But it doesn't work: it still downloads all
index.phtml?oblast=* and clanek.phtml?id=* files.
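A possible explanation, offered as an assumption rather than a confirmed diagnosis: HTTrack's filter documentation describes a rule defined later in the list as taking priority over earlier ones. If that applies here, the trailing +root.cz/* would override the exclusions written before it. The small Python model below illustrates the idea. It uses fnmatch as a stand-in for HTTrack's own pattern matcher (which it is not), and the function name and the sample URLs are made up for illustration.

```python
from fnmatch import fnmatchcase


def last_match_wins(rules, url):
    """Return True/False from the LAST rule whose pattern matches url.

    Each rule is '+pattern' (accept) or '-pattern' (reject).
    Returns None when no rule matches at all (assumption: the
    crawler would then fall back to its default behaviour).
    """
    verdict = None
    for rule in rules:
        accept, pattern = rule[0] == "+", rule[1:]
        if fnmatchcase(url, pattern):
            verdict = accept  # a later match overrides an earlier one
    return verdict


# Rule order as in the second command of this post.
original = ["-*", "-root.cz/index.*", "-root.cz/clanek.*",
            "-root.cz/reklama.*", "+root.cz/*"]

# Same rules, with the broad include moved before the exclusions.
reordered = ["-*", "+root.cz/*", "-root.cz/index.*",
             "-root.cz/clanek.*", "-root.cz/reklama.*"]

# Hypothetical sample URLs (scheme stripped, as in the filters).
samples = ["root.cz/index.phtml?oblast=1",
           "root.cz/clanek.phtml?id=780",
           "root.cz/print.phtml?id=780",
           "root.cz/obrazek.png"]

for url in samples:
    print(url, last_match_wins(original, url), last_match_wins(reordered, url))
```

If this model is right, putting +root.cz/* directly after -* and listing the exclusions last (-* +root.cz/* -root.cz/index.* -root.cz/clanek.* -root.cz/reklama.*) should keep the index and clanek pages out while still accepting everything else on root.cz.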
Is there a way to make this work by
excluding files? (I really don't want to specify every possible
file type that can exist (except phtml ;-)))
(And what about the -n option? It isn't clear to me what
this option is actually used for.)
Thanks
B z F