HTTrack Website Copier
Free software offline browser - FORUM
Subject: Re: improved parser, when ?
Author: Xavier Roche
Date: 09/15/2013 10:04
 
> I'm just asking when we get an improved text parser
> to ''break'' those bloody web pages whom javascripts
> protect them so good ?
This is not as simple as it looks - you do not just need an "improved parser",
you need a real parser with static analysis of the code behind the page.
Executing the javascript is already a complex task, and yet it would not be
sufficient (not all execution paths would be run, and you would still have to
solve the link rewrite issue).
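
To make the problem concrete, here is a minimal sketch in Python. It is not HTTrack's actual parser, and the page content, the LinkCollector class and the extract_links helper are all made up for illustration. It shows a plain text-level link extractor finding the static link but missing the one that the script assembles at run time:

```python
from html.parser import HTMLParser

# Hypothetical page: one static link, one link built by javascript.
# The dynamic URL depends on a runtime condition, so even executing the
# script once would reveal only one of the two possible paths.
PAGE = """
<html><body>
<a href="/static/page1.html">static link</a>
<script>
  var base = "/gallery/";
  var id = (navigator.userAgent.indexOf("Mobile") >= 0) ? "m" : "d";
  document.write('<a href="' + base + id + '/index.html">dynamic link</a>');
</script>
</body></html>
"""


class LinkCollector(HTMLParser):
    """Collects href values of <a> tags found in the markup itself."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(html):
    """Return the links a purely static parser can see."""
    collector = LinkCollector()
    collector.feed(html)
    collector.close()
    return collector.links


links = extract_links(PAGE)
print(links)
```

The script body is treated as opaque text, so only /static/page1.html is collected. Recovering /gallery/m/index.html and /gallery/d/index.html would require understanding both branches of the code, and the mirrored copy would then still need that script rewritten so that it points to the local files.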

I'm afraid "improving" the parser will be extremely hard - I'm aware of the
limits of the current one, but unfortunately I do not have any magic solution
yet.
 

