HTTrack Website Copier
Free software offline browser - FORUM
Subject: Re: Turkish characters
Author: Xavier Roche
Date: 04/08/2003 23:15
 
> > When I download a Turkish website I get question
> > marks '?' instead of Turkish characters such as ğüçş

Problem detected: many (most) of the pages are UCS-2 Unicode (that
is, raw 16-bit data), which is strongly discouraged on the Internet.
You would do better to use UTF-8, which is more compatible and more
portable: many newer characters cannot be represented in UCS-2 at
all, and besides, UTF-8 is now the de facto standard for XML and
HTML.
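
To make the difference concrete, here is a small Python sketch (my own
illustration, not something HTTrack does). It encodes the Turkish
letters both ways, shows a character that does not fit in a single
16-bit unit, and reproduces the '?' symptom by forcing the text into
a charset that lacks those letters.

```python
text = "ğüçş"

# UTF-8: plain ASCII stays one byte; these Turkish letters take two each.
utf8 = text.encode("utf-8")
# UTF-16 little-endian (a superset of UCS-2): one 16-bit unit per letter.
utf16 = text.encode("utf-16-le")
print(len(utf8), len(utf16))

# A character outside the Basic Multilingual Plane does not fit in one
# 16-bit unit, so plain UCS-2 cannot hold it; UTF-16 uses a surrogate pair.
beyond_bmp = "\U0001F600"
units = len(beyond_bmp.encode("utf-16-le")) // 2
print(units)

# Forcing the text into a charset without these letters loses them.
lossy = text.encode("ascii", errors="replace")
print(lossy)
```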

Currently HTTrack does not properly recognize all UCS-2 pages. I
will improve the UCS-2 detection in the next release, but this won't
fix the problem: non-ASCII characters will be lost anyway.
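
For readers wondering what such detection can look like, here is a
rough sketch in Python. It is only an assumption about one possible
approach (byte-order mark first, then a zero-byte heuristic), not the
detection HTTrack actually uses; the function name guess_utf16 is
hypothetical.

```python
import codecs
from typing import Optional


def guess_utf16(data: bytes) -> Optional[str]:
    """Return a codec name if data looks like UTF-16/UCS-2, else None."""
    # A byte-order mark is the most reliable hint when it is present.
    if data.startswith(codecs.BOM_UTF16_LE):
        return "utf-16-le"
    if data.startswith(codecs.BOM_UTF16_BE):
        return "utf-16-be"

    # No BOM: in mostly-ASCII markup, one byte of every 16-bit unit is zero.
    sample = data[:256]
    if len(sample) < 4:
        return None
    even_zeros = sample[0::2].count(0)  # high bytes if big-endian
    odd_zeros = sample[1::2].count(0)   # high bytes if little-endian
    half = len(sample) // 2
    if odd_zeros > 0.8 * half and even_zeros == 0:
        return "utf-16-le"
    if even_zeros > 0.8 * half and odd_zeros == 0:
        return "utf-16-be"
    return None
```

A heuristic like this can misfire on pages whose text is mostly
non-Latin, so it is best treated as a fallback for when no BOM and no
charset declaration are available.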

You can generally save the pages as "UTF-8" easily with most tools;
in that case the "charset" has to be declared like this:

<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
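
If you prefer to script the conversion, it can be sketched in Python
as below. This is a minimal example under assumptions: the source
page is UTF-16 (ideally with a byte-order mark), the helper name
to_utf8_html is hypothetical, and the regular expression only
rewrites the first charset declaration it finds.

```python
import codecs
import re


def to_utf8_html(raw: bytes, source_encoding: str = "utf-16") -> bytes:
    """Decode a UTF-16/UCS-2 page and return it re-encoded as UTF-8."""
    # The "utf-16" codec reads the BOM to pick the byte order and drops it.
    html = raw.decode(source_encoding)
    # Point the first charset declaration at utf-8, keeping any quote.
    html = re.sub(
        r"(charset\s*=\s*[\"']?)[\w-]+",
        r"\g<1>utf-8",
        html,
        count=1,
        flags=re.IGNORECASE,
    )
    return html.encode("utf-8")


# Usage: build a small UTF-16 page in memory and convert it.
page = ('<meta http-equiv="Content-Type" '
        'content="text/html; charset=utf-16"><p>ğüçş</p>')
raw = codecs.BOM_UTF16_LE + page.encode("utf-16-le")
converted = to_utf8_html(raw)
print(converted.decode("utf-8"))
```

If the page has no charset declaration at all, this sketch leaves the
markup unchanged, and the meta line shown above still has to be added
by hand.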
 

