HTTrack Website Copier
Free software offline browser - FORUM
Subject: Re: Forcing httrack to retrieve non-standard objects
Author: Xavier Roche
Date: 07/30/2002 20:38
 
> We want to retrieve a whole website that is based on 
> Oracle Portal. The elements in the tool are called 
> '.show' objects and are relative URLs that are well 
> formed inside a JavaScript call like this one:
> Any idea? What are we doing wrong? Why doesn't httrack 
> retrieve those elements?

Because there is no "real" JavaScript engine embedded; 
only guesses and simple parser heuristics to detect links 
in active code like document.write() or foo.src='foo.gif'. 
httrack cannot "understand" functions or complex 
expressions, and cannot evaluate all the functions embedded 
in the HTML page.
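
To make the limitation concrete, here is a toy sketch of pattern-based link guessing. It is NOT httrack's actual parser; the function name guessLinks and the single regular expression are assumptions made up for illustration. It only shows why a literal assignment is easy to spot while a computed URL is not.

```javascript
// Toy illustration only: this is not httrack's real parser.
// guessLinks is a hypothetical helper that looks for one pattern:
// a string literal assigned directly to a .src or .href property.
function guessLinks(source) {
  const pattern = /\.(?:src|href)\s*=\s*(["'])([^"']+)\1\s*;/g;
  const links = [];
  let match;
  while ((match = pattern.exec(source)) !== null) {
    links.push(match[2]); // the text between the quotes
  }
  return links;
}

// A literal URL is found by the pattern.
const literal = "foo.src = 'foo.gif';";
// A URL built from an expression is missed: nothing evaluates base or name().
const computed = "foo.src = base + name() + '.show';";

console.log(guessLinks(literal));  // one link: foo.gif
console.log(guessLinks(computed)); // no links
```

Without running the script there is no way to know what base + name() produces, which is why links built inside function calls are not followed.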

You can, however, "force" httrack to capture files in two 
ways:

- Invisible unused tags containing either 'src' or 'href' 
attributes:

<x-httrack-load src="foo.gif">
<x-httrack-load src="bar/foobar.html">
<x-httrack-load src="http://www.example.com/bar.zip">

- Inactive javascript functions:

<script language="javascript">
<!--

function x_httrack_load() {
  foo = "foo.gif";
  foo = "bar/foobar.html";
  foo = "http://www.example.com/bar.zip";
}

// -->
</script>

If that adds too much overhead to your pages, you can also 
reference an intermediate page using

<x-httrack-load src="httracklinks.html">

and put all the links to be downloaded in httracklinks.html.
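
If the list of files is long, the intermediate page can be generated by a small script. The sketch below is one possible way to do it, assuming the page only needs ordinary href links; the helper names escapeAttr and buildLinksPage and the page title are hypothetical, not part of httrack.

```javascript
// Escape characters that are unsafe inside a double-quoted HTML attribute.
function escapeAttr(value) {
  return value
    .replace(/&/g, "&amp;")
    .replace(/"/g, "&quot;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Hypothetical helper: build the text of a links page from a list of URLs.
// Each URL becomes a plain <a href="..."> link on its own line.
function buildLinksPage(urls) {
  const anchors = urls.map((url) => {
    const safe = escapeAttr(url);
    return `<a href="${safe}">${safe}</a><br>`;
  });
  return [
    "<html>",
    "<head><title>Links for HTTrack</title></head>",
    "<body>",
    ...anchors,
    "</body>",
    "</html>",
    "",
  ].join("\n");
}

// Same example files as in the snippets above.
const page = buildLinksPage([
  "foo.gif",
  "bar/foobar.html",
  "http://www.example.com/bar.zip",
]);
console.log(page);
```

Save the printed output as httracklinks.html next to the page that carries the x-httrack-load tag, so the relative links resolve against the same directory.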

 