HTTrack Website Copier
Free software offline browser - FORUM
Subject: Defaults, unquoted links, UTF-to-ISO conversion
Author: Joe Forster/STA
Date: 03/10/2004 15:34
 
Dear Xavier,

When I found WinHTTrack, I deleted Teleport Pro within a few 
days, :-) as WinHTTrack is an excellent piece of software! 
However, if I may, I'd like to note a few problems I have 
run into so far...

I'd like to be able to store the default settings, those 
that I have once set in the "Set options" submenu. Now I 
have to set the options again and again, although I have 
found out long ago what the best settings are for my needs.

I found that the program is unable to mirror links that are 
not inside quotation marks, like (A HREF=blabla)blabla(/A) 
(angle brackets replaced with round parentheses...). 
(Certainly, this is against the standards, still, I'd like 
to be able to mirror such sites!) Perhaps, a solution would 
be to accept such links and detect the end of the target URL 
by searching for the first closing angle bracket after the 
equals sign.
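
To illustrate the idea, here is a minimal sketch in Python. It is not HTTrack code; the names HREF_RE and extract_hrefs are hypothetical. It assumes an unquoted target ends at the first whitespace or closing angle bracket, which is slightly stricter than stopping only at the bracket, since other attributes may follow the link.

```python
import re

# Hypothetical pattern (not taken from HTTrack): accepts HREF values that are
# double-quoted, single-quoted, or not quoted at all.
HREF_RE = re.compile(
    r'''<a\b[^>]*?\bhref\s*=\s*(?:"([^"]*)"|'([^']*)'|([^\s>]+))''',
    re.IGNORECASE,
)


def extract_hrefs(html):
    """Return the link targets of anchor tags, whether quoted or unquoted."""
    targets = []
    for match in HREF_RE.finditer(html):
        # Exactly one of the three groups matched; the other two are None.
        targets.append(next(g for g in match.groups() if g is not None))
    return targets


if __name__ == "__main__":
    print(extract_hrefs("<A HREF=blabla>blabla</A>"))
```

A real parser would also have to cope with comments, scripts and other tags that carry links (IMG, FRAME and so on), so this only shows the end-of-URL detection.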

Today, I found a (Hungarian) site that uses UTF-8 encoding. 
(Again, what a stupid thing to do, instead of using the 
proper ISO-8859-x code page; ISO-8859-2 in Hungary!) All 
links, including the names of downloadable files (JPEG 
pictures, Excel spreadsheets, Word documents) are encoded 
with UTF-8 and some names _do_ include national accented 
characters. The result is a catastrophe: now I have files on 
my hard disk that I cannot open, copy, rename or delete 
using their original, long names (only the short names) as 
they contain UTF-8-style weird double characters (e.g. 0xC3 
0xA1 for á, a with a single accent). Also, many of these 
mirrored files contain a standard "Error 404" page; 
apparently, the program couldn't download them properly. It 
would be nice if the program could convert UTF-8 encoded 
characters into an ISO-8859-x code page specified by the 
user.
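
The conversion I have in mind could look roughly like the following Python sketch. Again, this is not HTTrack code: the function names and the default code page are assumptions for illustration. Characters that do not exist in the chosen code page are replaced with "?" here; a real implementation might prefer some other fallback.

```python
from urllib.parse import unquote_to_bytes


def utf8_to_codepage(raw, codepage="iso-8859-2"):
    """Re-encode UTF-8 bytes into a user-specified single-byte code page."""
    text = raw.decode("utf-8")  # raises UnicodeDecodeError on invalid UTF-8
    # Characters missing from the target code page become "?".
    return text.encode(codepage, errors="replace")


def local_name_from_url(url_name, codepage="iso-8859-2"):
    """Convert a percent-encoded UTF-8 name from a link into code page bytes."""
    return utf8_to_codepage(unquote_to_bytes(url_name), codepage)


if __name__ == "__main__":
    # 0xC3 0xA1 (UTF-8 for a with acute accent) maps to 0xE1 in ISO-8859-2.
    print(utf8_to_codepage(b"\xc3\xa1"))
    print(local_name_from_url("%C3%A1rak.xls"))
```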

Thank you for your reply in advance and I wish you and your 
co-programmers further good luck with your project.

Joe Forster/STA
sta@c64.org
 