Re: Getting a site through an expired domain - HTTrack Website Copier Forum

Subject: Re: Getting a site through an expired domain

Author: Xavier Roche

Date: 06/27/2004 10:09

> The problem is that the HTML source of the site contains 
> absolute links to the subpages, e.g. 
> <http://www.sitetoarchive/page1.html> instead of just 
> page1.html.

This potentially means that the page was not downloaded (if 
so, use scan rules, such as +www.sitetoarchive/*), or that some 
JavaScript was involved (in which case search & replace is the 
only solution).

> I need an (advanced) option in HTTrack that allows the 
> substitution of each link that contains 
> <http://www.sitetoarchive> with say 
> <http://10.10.10.10/sitedirectory>.

On Linux/Unix, a simple line such as:
find . -type f -name "*.html" -exec sh -c "cat {} | sed -e 's/www\.sitetoarchive/10.10.10.10/g' > _tmp && mv -f _tmp {}" \;

should do the trick.
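
For a variant that also adds the target directory asked for above and copes with spaces in file names, here is a minimal sketch. The function name rewrite_links and the temporary file suffix are made up for illustration, and it assumes the links use plain http:// and that the mirror lives under the directory passed as the first argument. Try it on a copy of the mirror first.

```shell
#!/bin/sh
# Hypothetical helper (not part of HTTrack): rewrite absolute links in a mirror.
# Usage: rewrite_links DIR
# Replaces http://www.sitetoarchive with http://10.10.10.10/sitedirectory
# in every *.html file below DIR.
rewrite_links() {
    dir=${1:-.}
    find "$dir" -type f -name '*.html' | while IFS= read -r file; do
        # Write next to the original so mv stays on the same filesystem.
        tmp="$file.tmp.$$"
        # '|' is used as the sed delimiter so the slashes in the URLs
        # need no escaping; the dots in the old host name are escaped.
        sed -e 's|http://www\.sitetoarchive|http://10.10.10.10/sitedirectory|g' \
            "$file" > "$tmp" && mv -f "$tmp" "$file"
    done
}
```

Calling rewrite_links . from the top of the mirror processes every HTML file below the current directory. File names containing newlines are not handled by this loop.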
