HTTrack Website Copier
Free software offline browser - FORUM
Subject: Re: Getting a site through an expired domain
Author: Xavier Roche
Date: 06/27/2004 10:09
 
> The problem is that the HTML source of the site contains 
> absolute links to the subpages, e.g. 
> <http://www.sitetoarchive/page1.html> instead of just 
> page1.html.

It most likely means that the page was not downloaded (if 
so, use scan rules, such as +www.sitetoarchive/*), or that 
some JavaScript was involved (in that case, search & replace 
is the only solution).
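For reference, scan rules can also be supplied as trailing filter arguments on the HTTrack command line; a minimal sketch (the output directory `./mirror` is a placeholder, and the domain is taken from the quoted question):

```shell
# Mirror the site, explicitly allowing everything under the domain.
# "+www.sitetoarchive/*" is the scan rule; -O sets the output path.
httrack "http://www.sitetoarchive/" -O ./mirror "+www.sitetoarchive/*"
```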

> I need an (advanced) option in HTTrack that allows the 
> substitution of each link that contains 
> <http://www.sitetoarchive> with, say, 
> <http://10.10.10.10/sitedirectory>.

On Linux/Unix, a simple line such as:

find . -type f -name "*.html" -exec sh -c "sed -e 's/www.sitetoarchive/10.10.10.10/g' {} > _tmp && mv -f _tmp {}" \;

should do the trick.
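The same approach covers the full substitution asked for above (rewriting to http://10.10.10.10/sitedirectory); a minimal sketch, assuming GNU sed, whose -i flag edits files in place (the sample page and scratch directory are only for demonstration):

```shell
# Demo in a scratch directory so no real files are touched.
cd "$(mktemp -d)"
printf '%s\n' '<a href="http://www.sitetoarchive/page1.html">home</a>' > page1.html

# Rewrite every absolute link in all saved HTML files.
# '|' is used as the sed delimiter to avoid escaping the slashes in
# the replacement; -i (GNU sed) modifies the files in place.
find . -type f -name "*.html" \
    -exec sed -i 's|http://www\.sitetoarchive|http://10.10.10.10/sitedirectory|g' {} \;

cat page1.html
```

This avoids the temporary-file shuffle of the one-liner above, at the cost of requiring GNU sed (BSD/macOS sed needs `-i ''` instead).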


 