I had the stable version, and everything worked fine except for links with
non-English characters in href and src attributes. To my pleasant surprise, the
History log says that version 3.47-17+ "Fixed: URL-encoding issue within URI",
which is exactly what I needed :)
So I downloaded the latest version, 3.47-27, and tried it out.
As a test, I tried to mirror the site of a local flower shop in Israel, which
has Hebrew URL links inside.
Here's my command line:
httrack -u1 -%p -b0 --mirror --robots=0 --update -F "Mozilla/5.0 (Windows NT
6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/28.0.1500.72
Safari/537.36" -%l "he,en-US;q=0.8,en;q=0.6,en-GB;q=0.4"
http://www.zer4u.co.il/ -O "/vagrant/websites/zer4u" "+*.zer4u.co.il/*"
"-mime:application/*" -v
As soon as httrack started seeing these links, it freaked out with errors like
these (this is just a sample of the log):
12:32:25 Warning: engine: warning: serialize error for
www.zer4u.co.il/××ª× ×ת_×××× to
/vagrant/websites/zer4u_118/www.zer4u.co.il/×ª× ×ת_××××.html.tmp:
open error (directory exists, file does not exist): Protocol error
12:32:25 Warning: engine: warning: serialize error for
www.zer4u.co.il/××ר××_×©× to
/vagrant/websites/zer4u_118/www.zer4u.co.il/×ר××_ש×.html.tmp: open
error (directory exists, file does not exist): Protocol error
12:32:25 Warning: engine: warning: serialize error for
www.zer4u.co.il/××ª× ×ת_××××ת_×ת to
/vagrant/websites/zer4u_118/www.zer4u.co.il/×ª× ×ת_××××ת_×ת.html.tmp:
open error (directory exists, file does not exist): Protocol error
12:32:29ww.zer4uError: ×"Error when decompressing" (-1) at link
www.zer4u.co.il/××ר××_×©× (from www.zer4u.co.il/)
12:32:29 Error: Unable to save file
/vagrant/websites/zer4u_118/www.zer4u.co.il/×ר××_ש×.html: Protocol
error
and so on...
(note the gibberish characters in the URLs)
I guess it's something related to the URL-encoding fix; otherwise
I can't explain it.
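For illustration, here is a small Python sketch (not HTTrack code) of how this kind of gibberish can arise. It assumes the log is showing the raw UTF-8 bytes of the Hebrew path interpreted as Latin-1; the sample segment is hypothetical and not taken from the site. Every Hebrew letter is two bytes in UTF-8, the first always 0xD7, and 0xD7 is "×" in Latin-1, which would match the repeated "×" in the log above. The sketch also prints the percent-encoded form of the same segment.

```python
from urllib.parse import quote, unquote

# Hypothetical Hebrew path segment, written with escapes (not from the site).
segment = "\u05d6\u05e8_\u05e9\u05dc"

# Raw UTF-8 bytes of the segment: each Hebrew letter is 0xD7 plus one more byte.
utf8_bytes = segment.encode("utf-8")

# Assumption: the bytes are displayed as Latin-1, so 0xD7 shows up as "×".
mojibake = utf8_bytes.decode("latin-1")

# Percent-encoded form of the same segment (UTF-8 bytes, "_" left untouched).
encoded = quote(segment)

print(repr(mojibake))
print(encoded)

# Both transformations are reversible, so no information is lost by either one.
assert mojibake.encode("latin-1").decode("utf-8") == segment
assert unquote(encoded) == segment
```

If that assumption holds, the names in the log are not random corruption but the original UTF-8 bytes shown in the wrong charset, which may help narrow down where the file open fails.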
I'm running Linux version 3.2.0-0.bpo.2-686-pae (Debian 3.2.20-1~bpo60+1).
Is this really a bug? Any suggestions?
Thanks!