> I tried them, but they return only the link (absolute or
> relative), not the tag in which it was declared.
OK, I did not understand your problem.
>
> So I got all links, all css, html pictures ...
> How can I tell the difference between HTML declared by
> an iframe or frame tag and HTML declared by an a href?
> I want to get only the inline objects needed to display
> a page.
Xavier has already announced a new callback that provides
what you need. If you need a solution urgently, you
could use the preprocess-html and postprocess-html
callbacks. In preprocess-html, you can "hide" the
URLs you don't want to be processed further, e.g., by
replacing them with a base64-encoded string; after the
page has been processed by httrack, you can restore
the original data in postprocess-html. With Python, this
is quite an easy job, provided the HTML data you need to
parse is halfway valid ;)
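
A rough sketch of the hide/restore idea in Python. It assumes
you only want to mask the href of <a> tags and leave frame,
iframe, img and link URLs alone. The function names and the
MARKER prefix are made up for illustration, and how you hook
the two functions into the preprocess-html and
postprocess-html callbacks is not shown here. Whether httrack
really skips the masked values depends on its parser, so try
it on one page first.

```python
import base64
import re

# Hypothetical prefix used to recognise masked URLs later on;
# any string that cannot occur in your real pages will do.
MARKER = "x-hidden-b64:"

# href="..." or href='...' inside <a ...> tags only, so src
# attributes of frame/iframe/img tags are not touched.
A_HREF = re.compile(
    r'(<a\b[^>]*?\bhref\s*=\s*)(["\'])(.*?)\2',
    re.IGNORECASE | re.DOTALL,
)

# A masked value: the marker followed by URL-safe base64 text.
HIDDEN = re.compile(re.escape(MARKER) + r'([A-Za-z0-9_\-=]+)')


def hide_anchor_links(html: str) -> str:
    """Meant for preprocess-html: mask every <a href> value."""
    def mask(match):
        prefix, quote, url = match.group(1), match.group(2), match.group(3)
        if url.startswith(MARKER):
            return match.group(0)  # already masked, leave as is
        token = base64.urlsafe_b64encode(url.encode("utf-8")).decode("ascii")
        return f"{prefix}{quote}{MARKER}{token}{quote}"
    return A_HREF.sub(mask, html)


def restore_anchor_links(html: str) -> str:
    """Meant for postprocess-html: put the original URLs back."""
    def unmask(match):
        return base64.urlsafe_b64decode(match.group(1)).decode("utf-8")
    return HIDDEN.sub(unmask, html)
```

Running a page through hide_anchor_links and then through
restore_anchor_links should give back the original text, as
long as the page has not been changed in between. A regular
expression is enough only for halfway valid HTML; for messy
pages a real HTML parser would be the safer choice.
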
Abel