i would imagine that the same code that HTTrack uses to
save websites could also be used as a link checker - to
check that all the links on a website work, up to some
depth... so that's my first suggestion - a link checker be
incorporated into HTTrack, sorta.
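
to make the link checker idea a bit more concrete, here is a rough python sketch. it is not how HTTrack works internally. `check_links`, `LinkExtractor` and the `fetch` callback are names made up for this example, and `fetch(url)` is assumed to return the page text, or None when the link is dead.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag


class LinkExtractor(HTMLParser):
    """Collects href/src attribute values from a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.links.append(value)


def check_links(start_url, fetch, max_depth=2):
    """Breadth-first walk from start_url; returns the URLs that failed.

    fetch is a hypothetical callback: fetch(url) -> text, or None if broken.
    Pages found at max_depth are still fetched, but their links are not followed.
    """
    seen = {start_url}
    broken = []
    queue = [(start_url, 0)]
    while queue:
        url, depth = queue.pop(0)
        body = fetch(url)
        if body is None:
            broken.append(url)
            continue
        if depth >= max_depth:
            continue
        parser = LinkExtractor()
        parser.feed(body)
        for link in parser.links:
            # resolve relative links and drop #fragments
            target = urldefrag(urljoin(url, link)).url
            if target not in seen:
                seen.add(target)
                queue.append((target, depth + 1))
    return broken
```

a real checker would also want to stay on the original site and skip things like mailto: links; this sketch leaves that to whatever `fetch` does.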
my second suggestion is that perhaps HTTrack could
rename files based on the document's title, or a
portion of the title, or some part of text on the homepage
itself. the changed name could be saved in some sort of
table, and then, whenever the original name is found in an
html file, it's changed to the new name - the name in the
table. this would probably require a second pass, though -
a pass after all the files had been saved, and the table
had been generated.
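
the table-plus-second-pass idea could look something like this in python. it is only a sketch: the function names are invented, `files` is assumed to be a dict of saved file name -> html text, and the replacement is a naive text substitution that does not check whether the old name really sits inside a link.

```python
import re


def title_to_name(html, fallback):
    """Build a file name from the <title> text, or keep the fallback name."""
    m = re.search(r"<title>(.*?)</title>", html, re.IGNORECASE | re.DOTALL)
    slug = re.sub(r"[^a-z0-9]+", "-", m.group(1).lower()).strip("-") if m else ""
    return slug[:40] + ".html" if slug else fallback


def build_rename_table(files):
    """First pass: map each original name to its new, title-based name."""
    table, used = {}, set()
    for name, html in files.items():
        new = title_to_name(html, name)
        if new in used:  # two pages with the same title: keep the old name
            new = name
        used.add(new)
        table[name] = new
    return table


def apply_rename_table(files, table):
    """Second pass: rename the files and rewrite the old names inside them."""
    if not table:
        return dict(files)
    # longest names first, so a short name never matches inside a longer one
    names = sorted(table, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(n) for n in names))
    return {
        table[name]: pattern.sub(lambda m: table[m.group(0)], html)
        for name, html in files.items()
    }
```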
my third suggestion is that perhaps HTTrack could perform
alterations on the html as it is saving it. for
example... say some homepage contains code to display an
ad. there would be a table, an img command, and perhaps
some simple javascript. when archiving a page that's
hosted on, say, geocities, you'll be getting ads that
aren't part of the original code. while i don't think
HTTrack could figure out which parts of the code to remove,
the user could. the user could figure it out by looking at
one or maybe two pages, and then paste that into some text
area within some window of HTTrack, and then, as HTTrack is
downloading each file, it would delete that portion of the
file. or rather, it would delete that portion of the file
as the download of that file was complete.
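
here is roughly what i mean by deleting a pasted snippet, as a small python sketch. `make_stripper` is a made-up name, and the only cleverness is an assumption of mine: runs of whitespace in the pasted text are allowed to match any whitespace in the page, so differences in line wrapping don't stop the match.

```python
import re


def make_stripper(snippet):
    """Return a function that deletes every occurrence of snippet from a page.

    Whitespace in the snippet matches any run of whitespace in the page;
    everything else must match exactly.
    """
    parts = [re.escape(p) for p in snippet.split()]
    pattern = re.compile(r"\s+".join(parts))
    return lambda html: pattern.sub("", html)
```

the returned function would be called once per file, after that file has finished downloading. ads that change from page to page (different banner urls, say) would need something smarter than this.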
also, internet explorer can save homepages in a "web
archive" mht format... this format saves an individual page and
all the images on that page into one file that is viewable
by ie (maybe by other browsers, too... i dunno). if this
is an open format, perhaps it could be incorporated into
HTTrack as a feature that can be enabled, but that is
disabled by default?
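
as far as i can tell, mht files are ordinary MIME multipart/related messages (the MHTML format described in RFC 2557), so a one-file archive can be sketched with python's standard email package. `build_mht` and its parameters are invented for this example, and i have not checked that internet explorer accepts exactly this output.

```python
import mimetypes
from email import encoders
from email.mime.base import MIMEBase
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def build_mht(page_url, html, images):
    """Bundle one page and its images into a multipart/related message.

    images is assumed to map each absolute image URL to its raw bytes.
    """
    archive = MIMEMultipart("related")
    archive["Subject"] = page_url

    root = MIMEText(html, "html", "utf-8")
    root["Content-Location"] = page_url
    archive.attach(root)

    for url, data in images.items():
        # guess the content type from the URL's extension
        ctype = mimetypes.guess_type(url)[0] or "application/octet-stream"
        part = MIMEBase(*ctype.split("/", 1))
        part.set_payload(data)
        encoders.encode_base64(part)
        part["Content-Location"] = url
        archive.attach(part)

    return archive.as_string()
```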
finally, usenet messages bundle images / attachments within
them with uuencode... would it be possible to do this with
html pages, as well? i would think it would be easy enough
to try, but... i never have, heh, and am too lazy to :)
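
one way to try it, sketched in python: note that this swaps uuencode for base64 `data:` urls (RFC 2397), since those are what browsers can read inside an img tag. `inline_images` is a made-up name, `images` is assumed to map each src value to the image bytes, and only double-quoted src attributes are handled.

```python
import base64
import mimetypes
import re


def inline_images(html, images):
    """Replace <img src="..."> values with base64 data: URLs when the bytes are known."""

    def to_data_uri(match):
        src = match.group(2)
        if src not in images:
            return match.group(0)  # leave unknown images untouched
        ctype = mimetypes.guess_type(src)[0] or "application/octet-stream"
        payload = base64.b64encode(images[src]).decode("ascii")
        return f"{match.group(1)}data:{ctype};base64,{payload}{match.group(3)}"

    return re.sub(r'(<img\b[^>]*?\bsrc=")([^"]*)(")', to_data_uri, html,
                  flags=re.IGNORECASE)
```

older browsers may not support data: urls at all, and the encoded page ends up roughly a third bigger than the separate image files.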
-----------
<http://www.frostjedi.com/phpbb>