I am having the same problem downloading sites from the
Wayback Machine (web.archive.org). I think a few problems
are coming into play that would make the above filter
not work:
1) HTTrack seems to get confused by the URLs. It sees the
http:// in the middle of the URL and treats it as the
beginning of the URL. As a result, all of the pages come in
with bad relative URLs, and only the current site is indexed
(not the copy that is actually on the Wayback Machine). In
other words, it tries to visit the domain listed in the
middle of the URL instead of fetching it from the Wayback
Machine.
2) It also seems to miss everything if you don't allow it to
go both up and down the URL hierarchy (this may be something
the Wayback Machine does to block basic bots).
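
To make point 1 concrete, here is a rough Python sketch of the
URL shape I mean. It assumes archive URLs look like
http://web.archive.org/web/<timestamp>/<original URL>; the
example.com address, the timestamp, and the helper name
split_wayback_url are made up for illustration. I don't know
how HTTrack actually parses URLs, so the "naive" line only
mimics the behavior I think I am seeing.

```python
import re

# Assumed archive URL shape:
#   http(s)://web.archive.org/web/<timestamp>[modifier]/<original URL>
# The optional modifier covers suffixes such as "id_" or "im_".
WAYBACK_RE = re.compile(
    r"^https?://web\.archive\.org/web/"
    r"(?P<timestamp>\d{1,14})(?:[a-z]{2}_)?/"
    r"(?P<original>.+)$"
)


def split_wayback_url(url):
    """Return (timestamp, original_url), or None if url is not an archive URL."""
    match = WAYBACK_RE.match(url)
    if match is None:
        return None
    return match.group("timestamp"), match.group("original")


# Placeholder URL, not a real capture.
archived = "http://web.archive.org/web/20050101000000/http://www.example.com/page.html"

# Treating the last "http://" as the start of the URL drops the archive prefix,
# which is the misreading described in point 1.
naive = archived[archived.rfind("http://"):]

print(naive)                        # http://www.example.com/page.html
print(split_wayback_url(archived))  # ('20050101000000', 'http://www.example.com/page.html')
```

The point is that the whole string, archive prefix included,
has to be treated as one URL; the embedded address is only
the tail end of its path.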
I have been playing with this for the last 5 hours with no
luck! Please help! Thanks