I am making a mirror of a site, then making a second mirror of the same site
and comparing the two result sets for differences.
However, my first scrape, say 'a', will present a link's href as
'../somefolder/index.html'. Then my second scrape, 'b', will present the exact
same link with the href as '../somefolder.html'.
In the mirror where the href is '../somefolder/index.html', there does not
appear to be a '../somefolder.html' file. In the mirror where that file does
exist, the link points to it instead of to the index.html.
I have no idea why httrack, using the same options on the same site, would
build the link structure differently.
Command:
/usr/bin/httrack http://website -O /path/to/output/
This does not appear to happen on all sites, but on the sites where it does
happen, it happens consistently.
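To make the two result sets comparable in the meantime, one option would be to normalise both href forms to a common key before diffing. This is only a minimal sketch in Python: `normalize_href` is a hypothetical helper of my own, not anything httrack provides, and it assumes the only difference between the two mirrors is the 'folder/index.html' versus 'folder.html' naming.

```python
import posixpath


def normalize_href(href: str) -> str:
    """Map 'dir/index.html' and 'dir.html' to the same key, 'dir'.

    Hypothetical helper for comparing two mirrors; not part of httrack.
    """
    # Drop any fragment or query string so only the path is compared.
    path = href.split("#", 1)[0].split("?", 1)[0]

    head, tail = posixpath.split(path)
    if tail == "index.html" and head:
        # '../somefolder/index.html' -> '../somefolder'
        return head

    root, ext = posixpath.splitext(path)
    if ext == ".html":
        # '../somefolder.html' -> '../somefolder'
        return root

    # Anything else (images, stylesheets, ...) is left unchanged.
    return path


if __name__ == "__main__":
    a = normalize_href("../somefolder/index.html")
    b = normalize_href("../somefolder.html")
    print(a == b)
```

Note that this would also treat a genuinely separate 'foo.html' and 'foo/index.html' on the same site as the same page, so it hides the difference rather than explaining it.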
Can anyone shed any light on this?