Hello,
We have started a web archiving project at the
University of Groningen.
For this project we have to download and archive the
websites of Dutch political parties. To perform the
download we use HTTrack version 3.05 (Unix version).
With this Unix version we have run into a problem
we can't explain:
HTTrack sometimes seems to treat internal links as
external links for no obvious reason.
I will try to explain our problem:
In the original page index.html of the site
<http://www.cda.nl/> a part looks like this:
<map name="link_top">
<area shape="rect" coords="0,0,34,15"
href="/articles/22/" alt="Partij">
<area shape="rect" coords="35,0,114,15"
href="/articles/43/" alt="Tweede Kamer">
<area shape="rect" coords="115,0,189,15"
href="/articles/23/" alt="Eerste Kamer">
<area shape="rect" coords="190,0,264,15"
href="/articles/41/" alt="Eurodelegatie">
<area shape="rect" coords="265,0,294,15"
href=http://www.cdja.nl/ alt="CDJA">
</map>
<map name="link_bottom">
<area shape="rect" coords="0,0,109,13"
href="/articles/354/" alt="Wetenschappelijk Instituut">
<area shape="rect" coords="110,0,202,13"
href="/articles/1/" alt="Scholingsinstituut (SI)">
<area shape="rect" coords="203,0,294,13"
href="/articles/16/" alt="Bestuurdersvereniging">
<area shape="rect" coords="0,14,64,27"
href="/articles/18/" alt="CDA-Vrouwen">
<area shape="rect" coords="65,14,144,27"
href="/afdelingen/" alt="CDA-afdelingen">
<area shape="rect" coords="145,14,294,27"
href="/articles/20/" alt="Internationale samenwerking
(EFS)">
<area shape="rect" coords="0,28,84,41"
href="/articles/11/" alt="Netwerk Migranten">
<area shape="rect" coords="85,28,169,41"
href="/articles/732/" alt="Ouderenplatform">
<area shape="rect" coords="170,28,254,41"
href="/articles/729/" alt="Dertigersgroepen">
<area shape="rect" coords="255,28,294,41"
href="/articles/728/" alt="Overigen">
</map>
But in the downloaded copy, most of these links have
been rewritten as external links:
<map name="link_top">
<area shape="rect" coords="0,0,34,15"
href="../external.html?link=www.cda.nl/articles/22/"
alt="Partij">
<area shape="rect" coords="35,0,114,15"
href="../external.html?link=www.cda.nl/articles/43/"
alt="Tweede Kamer">
<area shape="rect" coords="115,0,189,15"
href="../external.html?link=www.cda.nl/articles/23/"
alt="Eerste Kamer">
<area shape="rect" coords="190,0,264,15"
href="../external.html?link=www.cda.nl/articles/41/"
alt="Eurodelegatie">
<area shape="rect" coords="265,0,294,15"
href="../external.html?link=www.cdja.nl/" alt="CDJA">
</map>
<map name="link_bottom">
<area shape="rect" coords="0,0,109,13"
href="../external.html?link=www.cda.nl/articles/354/"
alt="Wetenschappelijk Instituut">
<area shape="rect" coords="110,0,202,13"
href="../external.html?link=www.cda.nl/articles/1/"
alt="Scholingsinstituut (SI)">
<area shape="rect" coords="203,0,294,13"
href="../external.html?link=www.cda.nl/articles/16/"
alt="Bestuurdersvereniging">
<area shape="rect" coords="0,14,64,27"
href="../external.html?link=www.cda.nl/articles/18/"
alt="CDA-Vrouwen">
<area shape="rect" coords="65,14,144,27"
href="afdelingen/index.html" alt="CDA-afdelingen">
<area shape="rect" coords="145,14,294,27"
href="../external.html?link=www.cda.nl/articles/20/"
alt="Internationale samenwerking (EFS)">
<area shape="rect" coords="0,28,84,41"
href="../external.html?link=www.cda.nl/articles/11/"
alt="Netwerk Migranten">
<area shape="rect" coords="85,28,169,41"
href="../external.html?link=www.cda.nl/articles/732/"
alt="Ouderenplatform">
<area shape="rect" coords="170,28,254,41"
href="../external.html?link=www.cda.nl/articles/729/"
alt="Dertigersgroepen">
<area shape="rect" coords="255,28,294,41"
href="../external.html?link=www.cda.nl/articles/728/"
alt="Overigen">
</map>
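In case it helps with reproducing this, here is a minimal Python sketch for listing the affected links in a downloaded page. It is not part of HTTrack; the function name `misclassified_links` and the `sample` string are made up for illustration, and the pattern only assumes the `external.html?link=` form shown in the excerpt above.

```python
import re

# Matches the placeholder links seen in the downloaded copy above,
# e.g. href="../external.html?link=www.cda.nl/articles/22/"
EXTERNAL_LINK = re.compile(r'href="[^"]*external\.html\?link=([^"]+)"')

def misclassified_links(html, site_host):
    """Return targets rewritten as external although they are on site_host."""
    targets = EXTERNAL_LINK.findall(html)
    # The text before the first "/" in the link parameter is the host name.
    return [t for t in targets if t.split("/", 1)[0] == site_host]

# Hypothetical three-line sample taken from the excerpt above.
sample = '''
<area href="../external.html?link=www.cda.nl/articles/22/" alt="Partij">
<area href="afdelingen/index.html" alt="CDA-afdelingen">
<area href="../external.html?link=www.cdja.nl/" alt="CDJA">
'''

print(misclassified_links(sample, "www.cda.nl"))
```

For the sample, only the `www.cda.nl/articles/22/` link is reported: the `afdelingen` link was rewritten correctly, and the `www.cdja.nl` link really is external.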
The problem seems to occur only in the Unix version;
we can't reproduce it in the Windows version of
HTTrack.
Any help would be much appreciated.
If you need more information about this strange
problem, don't hesitate to contact me.
Thanks in advance,
Henk Druiven
University Library Groningen