HTTrack Website Copier
Free software offline browser - FORUM
Subject: Re: dirty javascripting
Author: Xavier Roche
Date: 07/01/2002 21:46
 
> I just want to ask one question - how come HTTrack 
> can't handle dirty javascripting like, for example, IE 
> does!? Is there no solution for that, or what!?
Browsers like IE or Netscape have a javascript engine 
embedded, and can execute scripts, register functions, and 
so on. This requires a huge amount of code, and a fair 
amount of CPU and memory.

HTTrack only attempts to patch javascript code by detecting 
specific cases (like foo.src="bar.gif"). The use of functions 
and complex expressions prevents httrack from properly 
detecting the URLs that are generated. To handle those, I 
would have to embed a javascript engine into httrack, too.
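
For example (a hypothetical sketch, not HTTrack's actual 
matching logic; img and pickImage are made-up names), both 
assignments below set an image URL, but only the first one is 
a pattern a source-level scanner can spot and rewrite:

// Patchable: the URL is a plain string literal
var img = document.getElementById("photo");
img.src = "images/bar.gif";

// Not patchable: the URL only exists once pickImage() runs
function pickImage(n) { return "images/bar" + n + ".gif"; }
img.src = pickImage(2);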

Embedding an engine would mean thousands of lines of code, 
and some VERY hard work (because the engine would have to 
execute javascript, PLUS "guess" all the URLs generated, for 
example, by mouse clicks). Imagine the number of lines in 
httrack, and multiply it by 2 or 3.. with dirtier code. This 
would be impossible for a single person to maintain (I guess 
it would require a team of at least 3 or 4 folks).

Anyway, even with that solution, I doubt it would be 
sufficient: code like
foo.write(a+b+'.gif');
with a='http://' and b='www.foo.com/bar'
would be impossible to patch, even if detecting it were 
possible.
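
To make that concrete (assuming foo.write stands in for 
something like document.write), the full URL never appears 
anywhere in the source; it only exists as a runtime value:

var a = 'http://';
var b = 'www.foo.com/bar';
document.write('<img src="' + a + b + '.gif">');
// At runtime this emits http://www.foo.com/bar.gif, but no
// string literal in the source contains that URL, so there is
// nothing a source-level patcher could rewrite to point at a
// local copy.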

Okay, so we'd have to execute javascript code, 
detect "hidden" URLs, AND "understand" the way the code is 
designed, and work out a strategy to patch it.. at that 
point, my computer would be smarter than me, and would be 
able to answer all the questions on this forum by itself :)

 