Reply to Xavier:
>>>>>>
Humm, this is weird, because httrack does not handle
lines at all, but only the whole stream. I suspect some
nasty transfer error or update bug, maybe. Did you attempt
to edit the file with some editor? This might be the
cause.
<<<<<<
No, I tried editing a different file to restore missing
bits that WinHTTrack had dropped from the mirror, but not
this one. I was careful to choose for the "before and
after" sample a file that was exactly as downloaded by
WinHTTrack.
I forgot to say before that this truncation is evident in
many files mirrored from the Dolmetsch site, so I doubt
very much that this is a freak transmission error, which
would be more random. In particular, the header lines in
*all* the music dictionary pages were truncated and all at
the *same* place (except for page B where it occurs a
little further on).
It may well be an update bug though; I did run an update
on the mirrored site after the initial download. I'll
experiment some more and get back to you again on this.
>>>>>>
I tried to mirror
<http://www.dolmetsch.com/defsg.htm>
and the page looked ok with 3.33-rc6 afais
<<<<<<
Is that the same as the Windows version? Have you tried it
with WinHTTrack 3.32-2 (the one I am using)? And in your
test, were the links in the mirrored file edited by
WinHTTrack? (If you only tried to mirror a single file, not
a whole site or a whole set of files from a site, it may not
have converted any links to refer to the localised
structure. In that case, I would expect no change from the
original.)
>>>>>>
Well, httrack does not change anything, actually: if the
page was LF convention it is still LF convention. Only
relevant links are patched on the fly - the rest of the
data is ok.
<<<<<<
I have some more info for you regarding the line
termination protocols and what WinHTTrack is doing: I have
examined these files in a hex editor. You are right in
saying that the original files on the Dolmetsch site
follow the (Unix/Mac) single LF convention. However in the
mirrored copies, WinHTTrack has *added* a CR-LF pair after
each occurrence of a single LF in the original. I am
pleased to see that in doing this, the Windows version is
following the DOS convention for line termination (as
expected by Windows), but if it is going to make such
changes, wouldn't it be better for it to *replace* each
single LF with a CR-LF, rather than add a CR-LF to each
single LF?
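For anyone who wants to check this without a hex editor, here is a minimal Python sketch of the kind of count I did by eye. The function names and the sample bytes are my own illustration, not anything taken from HTTrack; it only shows how the "add a CR-LF after each LF" pattern differs from a clean "replace LF with CR-LF" conversion.

```python
import re

# A lone LF is an LF that is not the second half of a CR-LF pair.
LONE_LF = re.compile(rb"(?<!\r)\n")
# A lone CR is a CR that is not the first half of a CR-LF pair.
LONE_CR = re.compile(rb"\r(?!\n)")


def count_line_endings(data: bytes) -> dict:
    """Count each kind of line terminator found in raw file bytes."""
    return {
        "crlf": data.count(b"\r\n"),
        "lone_lf": len(LONE_LF.findall(data)),
        "lone_cr": len(LONE_CR.findall(data)),
        # LF immediately followed by CR-LF: the "added, not replaced" pattern.
        "lf_then_crlf": data.count(b"\n\r\n"),
    }


def replace_lf_with_crlf(data: bytes) -> bytes:
    """What a clean conversion to the DOS convention would produce."""
    return LONE_LF.sub(b"\r\n", data)


def append_crlf_after_lf(data: bytes) -> bytes:
    """Hypothetical model of the behaviour seen in the mirrored copies."""
    return LONE_LF.sub(b"\n\r\n", data)


# Made-up sample standing in for an original LF-terminated page.
original = b"<html>\n<head>\n"
clean = replace_lf_with_crlf(original)
observed = append_crlf_after_lf(original)

for name, blob in (("original", original), ("clean", clean), ("observed", observed)):
    print(name, count_line_endings(blob))
```

To run it on a real file, read the file in binary mode (`open(path, "rb").read()`) and pass the bytes to `count_line_endings`. A non-zero `lf_then_crlf` count in the mirrored copy, with none in the original, would confirm that the CR-LF pairs were added rather than substituted.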
At any rate, it seems clear that WinHTTrack *does* take
notice of line breaks and also makes changes to them. This
seems to contradict what you have said about HTTrack not
changing anything except links and also about how it
handles lines. This reinforces my suspicion that there is
some link between this behaviour and the line truncation
bug.
BTW, I am using WinHTTrack 3.32-2 under Windows XP Pro,
ver 2002, on a 731 MHz Pentium 3. My internet connection
is dial-up using a USB V.90 56K Modem.