Hi,
Really great program. I just have three questions
about filters (I'm using the Linux version):
1. When I want to download jpg and jpeg files, will the filter pattern
'+*jpe*g'
and, for htm and html files,
'*html*'
work?
2. I want to download files whose URI ends with
e.g. 10, 20, 1003, 1289, 1345
from a URI like this:
<http://foo.com/print.phtml?id=1234>
Is it possible to do this with one command, or is it
necessary to use a separate command for each? (And is it possible to specify a range,
e.g. 12-100,
or even a mix: 5, 8, 12-100, 128, 1006-1152?)
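For illustration, the mixed list in question 2 could be expanded into individual URLs outside the program. This is only a minimal Python sketch of that idea, not a feature of the program: `expand_ids`, `build_urls` and `BASE` are hypothetical names, and the foo.com address is the placeholder used above.

```python
# Placeholder base URL taken from the question; replace with the real site.
BASE = "http://foo.com/print.phtml?id="

def expand_ids(spec):
    """Expand a spec such as '5, 8, 12-100' into a list of integers."""
    ids = []
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas
        if "-" in part:
            lo, hi = (int(x) for x in part.split("-", 1))
            ids.extend(range(lo, hi + 1))  # ranges are inclusive
        else:
            ids.append(int(part))
    return ids

def build_urls(spec, base=BASE):
    """Return one URL per id described by spec."""
    return [base + str(i) for i in expand_ids(spec)]

if __name__ == "__main__":
    # Print one URL per line for the mixed list from the question.
    for url in build_urls("5, 8, 12-100, 128, 1006-1152"):
        print(url)
```

The printed list could be saved to a text file, one URL per line, and then handed to the downloader in whatever way it accepts a list of start URLs.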
3. On a page such as
<http://root.cz/index.html>
there are articles and discussions
(clanek = article):
Art.1011 (http://root.cz/clanek.phtml?id=1011)
Disc.1011
<http://root.cz/forum/diskuse.php3?clanek=1011&vlakno=0&stav=0&vse=Zobrazit+v%B9e>
Art.1010 (http://root.cz/clanek.phtml?id=1010)
Disc.1010
<http://root.cz/forum/diskuse.php3?clanek=1010&vlakno=0&stav=0&vse=Zobrazit+v%B9e>
("end" of index.html)
Their printer-friendly versions are:
articles
<http://root.cz/print.phtml?id=1011>
<http://root.cz/print.phtml?id=1010>
("print" instead of "clanek")
discussions
<http://root.cz/forum/diskuse.php3?clanek=1011&vlakno=0&stav=0&vse=Zobrazit+v%B9e&print=1>
<http://root.cz/forum/diskuse.php3?clanek=1010&vlakno=0&stav=0&vse=Zobrazit+v%B9e&print=1>
("&print=1" is appended to the end of the URI)
However, they aren't linked from index.html (they are linked
from Art.1011 (http://root.cz/clanek.phtml?id=1011)).
Is it possible to download only the index.html
file and only the printer-friendly pages, with images
and other wanted data? I tried, but without success: it was only possible
to do it by downloading Art.10xy too.
(Yes, there is probably a way: download index.html,
extract the URIs with some bash scripting, replace
clanek with print (probably with sed), and feed the result
back to httrack. But that is the hard way.)
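The manual workaround described above (extract the links from index.html, rewrite them, feed them back) might look roughly like this in Python rather than bash and sed. It is a sketch under assumptions: links are taken to appear as double-quoted `href="..."` attributes, and `printer_friendly_urls` is a hypothetical helper name. A blanket replacement of "clanek" with "print" would also hit the `clanek=` parameter in the discussion URLs, so the sketch rewrites only the file name for articles and appends "&print=1" for discussions.

```python
import html
import re
from urllib.parse import urljoin

def printer_friendly_urls(index_html, base_url):
    """Return printer-friendly URLs for article and discussion links.

    Assumes links are written as href="..." with double quotes.
    """
    urls = []
    for href in re.findall(r'href="([^"]+)"', index_html):
        # Decode entities such as &amp; and resolve relative links.
        full = urljoin(base_url, html.unescape(href))
        if "/clanek.phtml?" in full:
            # Article: "print" instead of "clanek" in the file name only.
            urls.append(full.replace("/clanek.phtml?", "/print.phtml?", 1))
        elif "/forum/diskuse.php3?" in full:
            # Discussion: append the print flag to the query string.
            urls.append(full + "&print=1")
    return urls

if __name__ == "__main__":
    sample = (
        '<a href="clanek.phtml?id=1011">Art.1011</a> '
        '<a href="/forum/diskuse.php3?clanek=1011&amp;vlakno=0">Disc.1011</a>'
    )
    for url in printer_friendly_urls(sample, "http://root.cz/index.html"):
        print(url)
```

The resulting list, one URL per line, would then be what gets fed back to httrack.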
Thanks
BzF