I previously reported that HTTrack insisted on
downloading and treating binary files as HTML.
This included parsing the file (looking for links),
changing the extensions, changing the local index.html
to match the changed filename, and who knows what
else.
This was in spite of the original web site's pages
being correct: they showed the links as .DLL, .CAB,
etc., and I was able to download the files manually
with no problems.
And HTTrack correctly showed the link name in its
display. That correct name just didn't make it to the
rest of HTTrack.
I tried several 3.x versions of HTTrack and even an
old v2.3.
All of them had the same problems with the web site.
(Plus, all of them insisted on 'hammering' the web site
with a massive amount of activity and useless link
checking that often consumed more bandwidth than the
downloads themselves, and often got me blocked by the
web site's firewall for several hours.)
I looked around for a better web sucker and found
several. Some I didn't like, others didn't work, etc.
But I ended up trying GetLeft (on SourceForge), and I
can definitely say it *IS* able to download the web
site correctly, including all those files that
HTTrack screwed up.
Maybe it's because it's smarter internally and can
handle a wider array of web sites. Or maybe it's
because it's stupid and just blindly accepts the web
site and data as they are, without trying to
second-guess them. It doesn't try to be clever.
Whatever the reason, the fact that it actually *WORKS*
makes it a better web sucker than WinHTTrack.
Plus, as a bonus, it doesn't "hammer" the site with
constant activity. It easily and quickly parses the
index.html, calmly downloads the files, and goes on to
the next one. None of the constant hammering that HTTrack
does. None of the useless link 'pre-checking', either.