ckteunissen - at - telfort - dot - nl
Feature request
My website is <http://www.cleonis.nl>
In the source code of my web pages, the alt= attribute of some images
contains the string &lt;br> between words.
(Here I used the "escape" &lt; rather than the character itself. Force of
habit; many forums parse HTML elements. In plain text: <br>)
(This line break element comes from duplicating the image caption as the
content of the alt attribute. That turned out to be a bad idea, and I'm moving
away from that.)
When trying to create an HTTrack mirror of my site, I noticed that with that
string present anywhere in the alt= attribute, HTTrack won't include the
image in the mirrored website.
I'm aware that the content of the alt attribute should be plain text anyway;
so I know it's _my_ problem, not an HTTrack bug.
Still, I'm surprised.
I wonder, why does the content of the alt= attribute matter at all? It seems
to me that to fetch an image, HTTrack only needs to parse the src= attribute.
It seems to me HTTrack would be more robust if the content of the alt=
attribute were excluded from parsing.
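
To illustrate what I mean, here is a minimal sketch in Python. It is not HTTrack's actual parser (HTTrack is written in C, and I have not looked at its source); the class name, function name, and sample markup are made up for this example. It uses the standard library's html.parser to pick up only the src= value of each img tag, so whatever is inside alt= never affects which image gets found.

```python
from html.parser import HTMLParser


class ImgSrcCollector(HTMLParser):
    """Collects src values from <img> tags and ignores all other attributes."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        # Only <img> tags are of interest here.
        if tag != "img":
            return
        for name, value in attrs:
            # The alt= attribute (and anything else) is skipped on purpose.
            if name == "src" and value:
                self.sources.append(value)


def collect_image_sources(html):
    """Return the list of src values found in the given HTML text."""
    collector = ImgSrcCollector()
    collector.feed(html)
    collector.close()
    return collector.sources


# Hypothetical markup resembling my pages: a <br> inside the quoted alt text.
sample = '<p><img src="pics/orbit.png" alt="Orbit of a planet<br>seen from above"></p>'
print(collect_image_sources(sample))
```

The point of the sketch is only that a quoted attribute value can be treated as opaque text; a real mirroring tool also has to follow links, stylesheets, and scripts, which this does not attempt.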