> I think this site is a good example of how easy it is to
> make "interpreting" Javascript really complicated for a
> mirror program like httrack. The page defines a JS
> function launchit, which is mainly a wrapper for
> window.open(). Hence, httrack's Javascript parser will
> fail, if it simply searches for the string 'window.open'.
Exactly. The JavaScript parser is really basic: it is an
automaton that extracts every string from the code while
recognizing JavaScript comment zones and string zones.
Strings in "blessed" locations (such as foo.open() or
foo.src=..) are analyzed, and the others are generally left
as is, except for obvious cases (foo.bar("foo.gif") will
trigger a URL fetch). Several other cases (limited
document.write() sections) are also parsed, but you can
very easily fool the engine by adding some "+" or
additional functions.
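
To make that concrete, here is a rough sketch of this kind of automaton. It is not httrack's actual code (httrack is written in C); the names extract_strings, find_links, BLESSED and LOOKS_LIKE_FILE are made up for illustration, and the "blessed" patterns are only my guess at the general idea. It skips comments, collects string literals together with the code in front of them, and keeps a literal only if it sits in a blessed location or obviously looks like a file name.

```python
import re

# Hypothetical "blessed" contexts: a string directly after .open( or .src=
BLESSED = re.compile(r"(\.open\s*\(|\.src\s*=)\s*$")
# Hypothetical "obvious case": the literal alone looks like a file name
LOOKS_LIKE_FILE = re.compile(r"^\w[\w./:-]*\.(gif|jpe?g|png|html?)$", re.I)


def extract_strings(js):
    """Return (code_before, literal) pairs for string literals outside comments."""
    out = []
    code = []          # code characters seen since the previous literal
    i, n = 0, len(js)
    while i < n:
        two = js[i:i + 2]
        if two == "//":                       # line comment: skip to end of line
            j = js.find("\n", i)
            i = n if j < 0 else j + 1
        elif two == "/*":                     # block comment: skip to closing */
            j = js.find("*/", i + 2)
            i = n if j < 0 else j + 2
        elif js[i] in "'\"":                  # string literal
            quote = js[i]
            j = i + 1
            buf = []
            while j < n and js[j] != quote:
                if js[j] == "\\" and j + 1 < n:
                    j += 1                    # keep the escaped char as-is (simplified)
                buf.append(js[j])
                j += 1
            out.append(("".join(code), "".join(buf)))
            code = []
            i = j + 1
        else:
            code.append(js[i])
            i += 1
    return out


def find_links(js):
    """Keep literals in blessed locations or that look like file names."""
    return [lit for before, lit in extract_strings(js)
            if BLESSED.search(before) or LOOKS_LIKE_FILE.search(lit)]


print(find_links('window.open("popup.html");'))
print(find_links('function launchit(u){ window.open(u); } '
                 'launchit("pa" + "ge" + 1 + ".html");'))
```

The first call finds popup.html; the second one finds nothing, because no literal sits next to window.open() and none of the fragments looks like a file name on its own. Regex literals and template strings are ignored here to keep the sketch short.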
The only way would be a complete JavaScript analysis (not
just an interpreter, since "entropy" can be introduced by
user interaction, timestamps and so on).