> I wonder, why does the content of the alt= attribute
> matter at all? It seems to me that for HTTrack it
> suffices to parse the src= attribute only. It seems
> to me HTTrack performance would be more foolproof if
> the alt= attribute is excluded from being parsed.
Did you have extended parsing turned on (attempt to detect all)?
The parser may be very simplistic: it simply skips the initial quote if present
and stops at the trailing quote, a space, or an angle bracket.
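
To make that concrete, here is a rough sketch of the kind of scan I mean. This is not HTTrack's actual code (HTTrack is written in C, and I have not checked the source); the function name naive_attr_value and the exact set of stop characters are my own assumptions, taken only from the description above.

```python
# Hypothetical illustration only; not taken from the HTTrack source.
STOP_CHARS = "\"' <>"  # assumed: a quote, a space, or an angle bracket ends the value


def naive_attr_value(text, start):
    """Return the attribute value that begins at text[start].

    Skips one leading quote if present, then reads until the first
    quote, space, or angle bracket (or the end of the text).
    """
    i = start
    # Skip the initial quote, if there is one.
    if i < len(text) and text[i] in "\"'":
        i += 1
    begin = i
    # Advance until a stop character is reached.
    while i < len(text) and text[i] not in STOP_CHARS:
        i += 1
    return text[begin:i]


if __name__ == "__main__":
    tag = '<img src="a.gif" alt="x">'
    # Start scanning just after 'src='.
    print(naive_attr_value(tag, tag.index("src=") + 4))
```

A scan like this never checks which attribute it is reading, so it behaves the same on the text after alt= as on the text after src=. It would also cut a quoted value short at the first space.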