I found that if I limit the speed and number of connections so as not to tax
the archive.org server, I get good results.
However, I have a question about using more than one web address (URL) in the
project. This isn't really specific to archive.org, but it relates to what I
am doing, because part of the downloaded site links back to archive.org
instead of that section of the website being downloaded:
When I specify multiple URLs, am I going to get multiple copies of the
website's pages from archive.org? Or will HTTRACK figure out that there are
multiple links to the same page and adjust accordingly, so that I get one copy
of each page?
I'm hoping that if I pick up the URLs that get bounced back to archive.org and
add them to the Web Address URLs list, HTTRACK will be clever enough to
eliminate the duplicates.
I'm working on this, so if I don't get an answer before I figure it out via
my testing, I'll come back and post the answer.