I downloaded the Linux tar.gz archive for httrack 3.44.1 and compiled the
sources. I am using the following command line:
httrack --skeleton -d --search-index -%q --assume asp=text/html,aspx=text/html,jsp=text/html,do=text/html -%F "Mirrored [from host %s [file %s [at %s]]]" -A100000000 -%c25 -T30 -V "grep --ignore-case --files-with-matches --file=keywords.txt \"\$0\" >> matches.txt" -%v -z -%L Seeds.txt -O downloads
The file Seeds.txt contains a simple list of starting URLs (no filters). After
a while, I noticed that httrack had downloaded several images, PDF files, and
even ZIP files. I thought the --skeleton option restricted httrack to downloading
HTML files only. Am I doing something wrong here, or is it a bug in httrack?