> When I was planning something like a search engine running
> on my PC some years ago, I did something similar to a
> website copier, but of course not as good as httrack. But
> I think it would be easy to let httrack become something
> like a search engine:
Well.. the feature is a bit complex to implement, and most
users wouldn't be able to use it IMHO. But you can also
override the wizard system to fit your needs:
See the callbacks-example.c example and the httrack
programming page (http://www.httrack.com/html/plug.html).
The check-html callback may be useful:
"Called when a document (which may not be an html document)
is to be parsed. The html address points to the document
data, of length len. The url_adresse and url_fichier are the
address and URI of the file being processed.
Return value: 1 if the parsing can be processed, 0 if the
file must be skipped without being parsed"
Callback signature:
int (*myfunction)(char* html, int len, char* url_adresse, char* url_fichier);
And callback plug:
httrack --wrapper check-html=mycallbackfile:myregexpfunction
This will require some coding, but using the regex library
and the example provided, it should be feasible. The only
detail to solve would be how to pass the search string
to the function, maybe through a file in /var written
during the script initialization.
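To give an idea, here is a rough, untested-against-httrack sketch of such a callback. It uses the signature quoted above and the POSIX regex functions (regcomp/regexec). The names load_pattern, document_matches and PATTERN_FILE are made up for this example, and the pattern file location is only an assumption. The headers shipped with your httrack version may declare the callback differently, so check callbacks-example.c first.

```c
#include <regex.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical pattern file location; this is NOT an httrack convention. */
#define PATTERN_FILE "/var/tmp/httrack-search-pattern"

static regex_t search_re;        /* compiled search pattern */
static int search_re_ready = 0;  /* 1 once search_re holds a valid pattern */

/* Read the first line of `path` and compile it as an extended,
   case-insensitive regex. Returns 0 on success, -1 on any failure. */
int load_pattern(const char *path) {
    char line[256];
    FILE *fp = fopen(path, "r");
    if (fp == NULL)
        return -1;
    if (fgets(line, sizeof line, fp) == NULL) {
        fclose(fp);
        return -1;
    }
    fclose(fp);
    line[strcspn(line, "\r\n")] = '\0';  /* strip the trailing newline */
    if (search_re_ready) {
        regfree(&search_re);
        search_re_ready = 0;
    }
    if (regcomp(&search_re, line, REG_EXTENDED | REG_NOSUB | REG_ICASE) != 0)
        return -1;
    search_re_ready = 1;
    return 0;
}

/* Returns 1 if the loaded pattern matches the document data, 0 otherwise. */
int document_matches(const char *data, int len) {
    char *copy;
    int found;
    if (!search_re_ready || data == NULL || len <= 0)
        return 0;
    /* regexec() needs a NUL-terminated string, but the callback only gets
       a buffer and a length, so work on a terminated copy. Data after an
       embedded NUL byte is not searched. */
    copy = malloc((size_t)len + 1);
    if (copy == NULL)
        return 0;
    memcpy(copy, data, (size_t)len);
    copy[len] = '\0';
    found = (regexec(&search_re, copy, 0, NULL, 0) == 0);
    free(copy);
    return found;
}

/* check-html callback, using the signature quoted in this post.
   Always returns 1 so that the document is still parsed normally. */
int myregexpfunction(char *html, int len, char *url_adresse, char *url_fichier) {
    if (!search_re_ready)
        load_pattern(PATTERN_FILE);  /* lazy load; failure means no matches */
    if (document_matches(html, len))
        fprintf(stderr, "match: %s%s\n", url_adresse, url_fichier);
    return 1;
}
```

The function would then be plugged in with the --wrapper line shown above. Hits are only printed to stderr here; writing them to an index file instead would be closer to a real search engine.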
> And again ANOTHER suggestion:
> There should be an option the user can choose that
> makes httrack convert internet links (= links to files
> that are not stored locally) to plain text so the user
> won't click them. Or insert some very small symbol before
> or after the link which tells him: 'Warning, it's
> not stored on your disk!'. Making the link italic with the
> <i></i> code would also be good.
This can be done using the "No external pages" option:
  x  replace external html links by error pages (--replace-external)
You can then replace the small html wrapper with any other
file to fit your needs.
> It's a cool thing that there are freeware and open-source
> programmers out there. I don't want to imagine the
> computer world without them!
Humm.. most of the biggest software companies in this world
would LOVE to crush open source. We (open source
developers) are still able to code for the community,
but this may not last forever. Think about the idiotic
patents that clueless companies are buying every day without
even bothering about prior art.. they have the money to
threaten most of us.