> i just want to ask one question - how come that HTTrack
> can't handle dirty javascripting as for example IE!? is
> there no solution for that or what!?
Browsers like IE or Netscape have a javascript engine embedded, and can execute scripts, register functions, and so on. This requires a huge amount of code, and some CPU/memory.
HTTrack only attempts to patch javascript code, by detecting specific cases (like foo.src="bar.gif"). The use of functions and complex expressions prevents httrack from properly detecting the generated URLs. To handle those, I would have to embed a javascript engine into httrack, too.
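To illustrate the difference, here is a small javascript sketch (all the names in it are made up for the example): a scanner that only matches literal patterns can rewrite the first assignment, but not the computed one.

    var foo = new Image();

    // Detectable: the URL appears verbatim in the source, so a
    // mirroring tool can fetch bar.gif and rewrite the literal
    // to point at the local copy.
    foo.src = "bar.gif";

    // Not detectable without executing the script: the final URL
    // ("img/logo.gif") never appears as a literal anywhere.
    var dir = "img/";
    var name = "logo";
    foo.src = dir + name + ".gif";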
This would mean thousands of lines of code, and some VERY hard work (because the engine would have to execute the javascript, PLUS "guess" all the URLs generated, for example, by mouse clicks). Imagine the number of lines in httrack, multiplied by 2 or 3... with dirtier code. This would be impossible for a single person to maintain (I guess it would require a team of at least 3 or 4 folks).
Anyway, even with this solution, I doubt it would be sufficient: code like
foo.write(a+b+'.gif');
with a='http://' and b='www.foo.com/bar'
would be impossible to patch, even if detection were possible, because the final URL only exists at runtime, once the strings have been concatenated.
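As a sketch of why (treating the post's foo.write as a thin wrapper around document.write, which is an assumption):

    var foo = { write: function (s) { document.write(s); } };
    var a = 'http://';
    var b = 'www.foo.com/bar';

    // "http://www.foo.com/bar.gif" only comes into existence here,
    // at runtime; the source never contains it as a literal.
    foo.write(a + b + '.gif');

    // And even knowing the URL, patching is ambiguous: rewriting
    // a or b would break other uses of the same variables.
    foo.write(a + 'www.example.com/index.html');

Even a full engine would only tell you what the page did on one run; it would not tell you how to rewrite the source safely.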
Okay, so we'd have to execute the javascript code, detect the "hidden" URLs, AND "understand" the way the code is designed, then elaborate a strategy to patch it... at that point, my computer would be smarter than me, and would be able to answer all the questions on this forum by itself :)