Hi, newbie here.
I'm not new to mirroring websites; I've used w3mir and wget before. I'm
running httrack 3.48-24 on Devuan (Debian-based) Linux.
The problem is that httrack appears to be downloading most of the web. The
command I am running is:
httrack --mirror --connection-per-second=2 --sockets=8 --timeout=45 \
    --retries=12 --host-control=0 --extended-parsing=true --near --structure=0 \
    --generate-errors --cookies=0 --check-type=1 --keep-alive --urlhack \
    --protocol=4 --cache=0 --display -p7 example.com
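For context, my reading of the scope-relevant flags, going by the man page
(the rest looks like plain connection/transfer tuning):

# --near (-n): also grab non-HTML files "near" a page, even when they live
#   on another host (hotlinked images, CSS, JS)
# --extended-parsing=true (%P): try to find links everywhere, including
#   inside JavaScript, which widens what gets queued
# --host-control=0 (-H0): never abandon a host on timeouts/errors; as I
#   understand it this is transfer control, not scope control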
What I expected was behavior roughly equivalent to:
wget -NEpkrl5 --wait 1 -t 3 --timeout 45 example.com
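(Spelling out those bundled short options, since they are exactly the
behavior I was after:

wget --timestamping --adjust-extension --page-requisites --convert-links \
    --recursive --level=5 --wait 1 --tries 3 --timeout 45 example.com

i.e. a 5-level recursive mirror that, as far as I know, never leaves the
example.com host unless you also pass -H/--span-hosts.)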
What I get instead is that httrack downloads parts of Wikipedia and Twitter,
and it looks like it will pull down the whole web if I let it.
Even when I do not pass --near, httrack still fetches Twitter.
What's going on here?
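For what it's worth, the workaround I'm about to try is pinning the scope
with scan rules (untested; if I'm reading the httrack filter docs right,
later rules override earlier ones, so this should allow only example.com):

httrack --mirror example.com "-*" "+*.example.com/*"

Is that the intended fix, or should the defaults already keep httrack on
example.com?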