Hi, newbie here.
I'm not new to mirroring websites; I've used w3mir and wget before. I'm
running httrack 3.48-24 on Devuan (Debian-based) Linux.
The problem is that httrack appears to be downloading most of the web. The
command I am running is:
httrack --mirror --connection-per-second=2 --sockets=8 --timeout=45 \
    --retries=12 --host-control=0 --extended-parsing=true --near --structure=0 \
    --generate-errors --cookies=0 --check-type=1 --keep-alive --urlhack \
    --protocol=4 --cache=0 --display -p7 example.com
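For context, my reading of the scope-relevant flags, going by the man page
(the rest looks like plain connection/transfer tuning):

# --near (-n): also grab non-HTML files "near" a page, even when they live
#   on another host (hotlinked images, CSS, JS)
# --extended-parsing=true (%P): try to find links everywhere, including
#   inside JavaScript, which widens what gets queued
# --host-control=0 (-H0): never abandon a host on timeouts/errors; as I
#   understand it this is transfer control, not scope control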
What I expected was behavior roughly equivalent to:
wget -NEpkrl5 --wait 1 -t 3 --timeout 45 example.com
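(Spelling out those bundled short options, since they are exactly the
behavior I was after:

wget --timestamping --adjust-extension --page-requisites --convert-links \
    --recursive --level=5 --wait 1 --tries 3 --timeout 45 example.com

i.e. a 5-level recursive mirror that, as far as I know, never leaves the
example.com host unless you also pass -H/--span-hosts.)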
What I get instead is that httrack downloads parts of Wikipedia and Twitter,
and it looks like it will pull down the whole web if I let it.
Even when I do not pass --near, httrack still fetches Twitter.
What's going on here?
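For what it's worth, the workaround I'm about to try is pinning the scope
with scan rules (untested; if I'm reading the httrack filter docs right,
later rules override earlier ones, so this should allow only example.com):

httrack --mirror example.com "-*" "+*.example.com/*"

Is that the intended fix, or should the defaults already keep httrack on
example.com?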