When I was planning something like a search engine running
on my PC some years ago, I wrote something similar to a
website copier, though of course not as good as HTTrack. But
I think it would be easy to let HTTrack become something
like a search engine:
Let the user type some expression (a word, or something
more complex like a boolean phrase). When receiving
an HTML file, HTTrack would only save and process it if it
matches the expression. That would avoid downloading pages
that don't match what the user is interested in (e.g.
a page about dogs might link to many other dog pages
but also to a page about cats, and the surfing dog
owner doesn't like cats...).
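
To make the filter idea concrete, here is a minimal Python sketch of how a page could be tested against a boolean phrase. It is only an illustration, not how HTTrack works internally: the names `page_words`, `matches` and `should_save` are made up for this example, and the AND / OR / NOT syntax (uppercase keywords, optional parentheses) is an assumption about what such an expression might look like.

```python
import re
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects the text content of an HTML document (tags are dropped)."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def page_words(html):
    """Return the set of lowercase words found in the page text."""
    parser = _TextExtractor()
    parser.feed(html)
    return set(re.findall(r"\w+", " ".join(parser.chunks).lower()))


def matches(expression, words):
    """Evaluate a phrase such as 'dog AND NOT cat' against a word set.

    Hypothetical syntax: uppercase AND, OR, NOT and parentheses.
    Precedence is NOT, then AND, then OR.
    """
    tokens = re.findall(r"\(|\)|\w+", expression)
    pos = 0

    def parse_or():
        nonlocal pos
        value = parse_and()
        while pos < len(tokens) and tokens[pos] == "OR":
            pos += 1
            rhs = parse_and()  # always parse, so the position advances
            value = value or rhs
        return value

    def parse_and():
        nonlocal pos
        value = parse_not()
        while pos < len(tokens) and tokens[pos] == "AND":
            pos += 1
            rhs = parse_not()
            value = value and rhs
        return value

    def parse_not():
        nonlocal pos
        if tokens[pos] == "NOT":
            pos += 1
            return not parse_not()
        if tokens[pos] == "(":
            pos += 1
            value = parse_or()
            pos += 1  # skip the closing parenthesis
            return value
        word = tokens[pos].lower()
        pos += 1
        return word in words

    return parse_or()


def should_save(html, expression):
    """Decide whether a downloaded page fits the user's expression."""
    return matches(expression, page_words(html))
```

With this sketch, `should_save("<p>My dog likes bones</p>", "dog AND NOT cat")` accepts the page, while a page that also mentions a cat would be rejected. It only matches whole words, so "dogs" does not count as "dog".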
When defining a high value for the external link depth (so
that the whole web will be searched) and forbidding
internal links (so that only the main page or index page
will be downloaded), HTTrack would become a personal web
search engine.
If you think this is easy to do and useful, think about
this, which is perhaps more difficult: there may be pages
that do not actually contain the word typed in by the
user, but contain links to other pages that do. Such pages
should be processed but not saved.
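
A rough Python sketch of the "process but don't save" rule might look like the following. It is a simplification under stated assumptions: since you cannot know whether a linked page matches until you fetch it, this version follows links from every page up to a depth limit and saves only the matching ones. The `crawl` function, its `fetch` and `is_match` callbacks, and the tiny in-memory `web` dictionary are all hypothetical names for illustration and have nothing to do with HTTrack's real code.

```python
import re
from collections import deque

# Naive link extraction; good enough for a sketch, not for real-world HTML.
LINK_RE = re.compile(r'href="([^"]+)"', re.IGNORECASE)


def crawl(start_url, fetch, is_match, max_depth=2):
    """Breadth-first crawl that saves only matching pages.

    fetch(url) returns the HTML of a page, or None if it is unavailable.
    is_match(html) decides whether a page is kept.
    Both callbacks are supplied by the caller.
    """
    saved = {}
    seen = {start_url}
    queue = deque([(start_url, 0)])
    while queue:
        url, depth = queue.popleft()
        html = fetch(url)
        if html is None:
            continue
        if is_match(html):
            saved[url] = html  # matching page: keep a copy
        # Matching or not, the page is still processed for its links.
        if depth < max_depth:
            for link in LINK_RE.findall(html):
                if link not in seen:
                    seen.add(link)
                    queue.append((link, depth + 1))
    return saved


# A made-up three-page "web" so the example runs without a network.
web = {
    "hub": '<a href="page1">pets</a> <a href="page2">more pets</a>',
    "page1": "All about my dog",
    "page2": "All about my cat",
}
saved = crawl("hub", web.get, lambda html: "dog" in html)
```

Here "hub" does not contain the word itself, so it is read for its links but not kept, and `saved` ends up holding only "page1".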
And ANOTHER suggestion:
There should be an option the user can choose that
makes HTTrack convert internet links (= links to files
that are not stored locally) to plain text, so the user
won't click them. Or it could insert some very small symbol
before or after the link which tells the user: "Warning, it's
not stored on your disk!". Making the link italic with the
<i></i> tag would also be good.
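
The three variants above (plain text, a small symbol, italics) could be sketched in Python roughly like this. It is only an illustration: `mark_remote_links`, its `mode` values and the `*` marker are made up for this example, and the regular expression handles only simple `<a href="...">` anchors with double-quoted attributes, which real pages often don't stick to.

```python
import re

# Matches simple anchors like <a href="url">text</a> (double quotes only).
ANCHOR_RE = re.compile(
    r'<a\b[^>]*\bhref="([^"]*)"[^>]*>(.*?)</a>',
    re.IGNORECASE | re.DOTALL,
)


def mark_remote_links(html, local_urls, mode="italic"):
    """Rewrite links whose target is not in local_urls.

    mode "text":   replace the link with its plain text
    mode "symbol": put a small '*' marker in front of the link
    mode "italic": wrap the link in <i></i>
    """
    if mode not in ("text", "symbol", "italic"):
        raise ValueError("unknown mode: " + mode)

    def rewrite(match):
        href, text = match.group(1), match.group(2)
        if href in local_urls:
            return match.group(0)  # stored locally: leave untouched
        if mode == "text":
            return text
        if mode == "symbol":
            return "*" + match.group(0)
        return "<i>" + match.group(0) + "</i>"

    return ANCHOR_RE.sub(rewrite, html)


page = '<a href="index.html">home</a> <a href="http://example.com/">away</a>'
local = {"index.html"}
as_text = mark_remote_links(page, local, mode="text")
as_italic = mark_remote_links(page, local, mode="italic")
```

In this example the link to `index.html` stays as it is, while the link to `http://example.com/` becomes plain text in `as_text` and an italic link in `as_italic`.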
PS:
It's a cool thing that there are freeware and open-source
programmers out there. I don't want to imagine the
computer world without them!