HTTrack Website Copier
Free software offline browser - FORUM
Subject: Re: recovering a mirror with new.dat
Author: Xavier Roche
Date: 09/18/2002 07:22
 
[sorry for the delay - my dsl connection was dead for a few 
hours]

>in 2000 i mirrored a web site with winhttrack and now i
>need to prove that the pages did not have a disclaimer on
>them.  i only have the new.dat and related files from the
>cache, all others have been erased.  can i recover my
>mirror just from the cache files?  help please, tomorrow i
>need to show a printed document of at least one of the
>pages in question!

The cache (new.ndx + new.dat) normally stores all raw html 
data, untouched.

So, potentially, yes: you should be able to recover the html 
files (or any data that httrack saw as html), but not 
the "binary" files (gif files, zip files and so on).
Note: before doing anything, I suggest making a backup of 
all these files.
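
If you prefer to script the backup, here is a minimal sketch in Python. The function name and the timestamped folder name are my own choices for illustration; nothing here is part of httrack. It only copies files and never touches the original cache.

```python
import shutil
import time
from pathlib import Path


def backup_cache(cache_dir, backup_root):
    """Copy the whole cache folder into a new timestamped folder; return its path."""
    src = Path(cache_dir)
    # The cache is expected to hold at least the index and the data file.
    missing = [name for name in ("new.ndx", "new.dat") if not (src / name).is_file()]
    if missing:
        raise FileNotFoundError(f"missing cache files in {src}: {missing}")
    stamp = time.strftime("%Y%m%d-%H%M%S")
    dest = Path(backup_root) / f"{src.name}-backup-{stamp}"
    # copytree refuses to write into an existing folder, so nothing is overwritten.
    shutil.copytree(src, dest)
    return dest
```

Call it with the path of your old hts-cache folder and any folder outside the httrack projects directory, and keep the copy until the recovery is finished.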

1. The simple way (I hope it will work)

- Make a backup copy of the complete old project ; the 
operations below may DELETE it if you don't do something 
right
- Launch WinHTTrack Website Copier (recent release, such as 
3.20) and create a new project by entering a new name (for 
example recovery)
- After you click on the FIRST "next" button, WinHTTrack 
will immediately create the corresponding folder in your 
httrack projects, AND an EMPTY hts-cache
- Copy your old project cache into this new project dir, 
replacing the empty hts-cache directory (the goal is 
to "fill" the hts-cache folder)
- Change the WinHTTrack action to "* Continue interrupted 
download" (NOT "update" or "mirror")
- Type in the EXACT URL (I say: EXACT URL ; for example 
www.foo.com/bar/ and NOT www.foo.com/bar) of the desired 
page to be recovered
- Go in Set Options / Limits and set Maximum Mirroring 
Depth to 0 (don't crawl)
- Go in Set Options / Experts Only and set Rewrite Links 
to "Original / Original" so that nothing will be touched in 
the html links
- Go in Set Options / Browser ID and set the HTML FOOTER 
to "(none)" so that nothing will be added in the html page
- Start the mirror (click twice on NEXT)
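
The cache-copy step in the list above can also be done with a small script. This is only a sketch: the function name and the "hts-cache.replaced" folder name are hypothetical, and it assumes the new project folder contains a directory named hts-cache, as described above. It copies the old cache (never moves it) and sets the empty cache aside instead of deleting it.

```python
import shutil
from pathlib import Path


def seed_cache(old_cache, new_project):
    """Put a copy of old_cache in place of the hts-cache folder of new_project."""
    old_cache = Path(old_cache)
    target = Path(new_project) / "hts-cache"
    if not (old_cache / "new.dat").is_file():
        raise FileNotFoundError(f"no new.dat found in {old_cache}")
    if target.exists():
        # Keep the freshly created cache under another name rather than deleting it.
        target.rename(target.with_name("hts-cache.replaced"))
    # Copy, not move: the old cache stays where it was.
    shutil.copytree(old_cache, target)
    return target
```

Run it after the first "next" click and before starting the mirror, then continue with the remaining steps in WinHTTrack.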

If you're lucky (no redirect page or javascript crap) the 
operation will take 1 or 2 seconds. You will find in the 
new project directory the desired original html page, 
untouched, as it was when you mirrored it last time.

If not, contact me (roche@httrack.com), I'll see what can 
be done.
 