HTTrack Website Copier
Free software offline browser - FORUM
Subject: Re: Suggestion: something like a persona search engine
Author: Xavier Roche
Date: 06/09/2003 12:21
 
> When I was planning something like a search engine running 
> on my PC some years ago, I did something similar to a 
> website copier, but of course not as good as HTTrack. But 
> I think it would be easy to let HTTrack become something 
> like a search engine:

Well.. the feature is a bit complex to implement, and most 
users wouldn't be able to use it IMHO. But you can also 
override the wizard system to fit your needs:

See the callbacks-example.c example, and the httrack 
programming page (http://www.httrack.com/html/plug.html)

The check-html callback may be useful:
"Called when a document (which may not be an html document) 
is to be parsed. The html address points to the document 
data, of length len. The url_adresse and url_fichier are the 
address and URI of the file being processed.
Return value: 1 if the parsing can be processed, 0 if the 
file must be skipped without being parsed"

Callback signature:
int (* myfunction)(char* html, int len, char* url_adresse, char* url_fichier);

And callback plug:
httrack --wrapper check-html=mycallbackfile:myregexpfunction

This will require some coding, but using the regex library 
and the example provided, it should be feasible. The only 
detail to solve would be how to pass the string to search 
for to the function, maybe through a /var file during the 
script initialization.
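
A minimal sketch of such a callback, assuming the POSIX regex library (regex.h) is the "regex library" meant above. It uses the callback signature quoted earlier; the pattern file path and the helpers document_matches and read_pattern are hypothetical names chosen for illustration, not part of HTTrack:

```c
#include <regex.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical location of the file holding the search pattern. */
#define PATTERN_FILE "/var/tmp/httrack-pattern.txt"

/* Returns 1 if the first len bytes of html match the POSIX extended
   regex in pattern (case-insensitive), 0 otherwise or on any error.
   Matching stops at the first NUL byte inside the data, if any. */
static int document_matches(const char *html, int len, const char *pattern) {
  regex_t re;
  char *copy;
  int found;

  if (html == NULL || len < 0 || pattern == NULL)
    return 0;
  if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB | REG_ICASE) != 0)
    return 0;

  /* The document data is not assumed to be NUL-terminated: copy it. */
  copy = malloc((size_t)len + 1);
  if (copy == NULL) {
    regfree(&re);
    return 0;
  }
  memcpy(copy, html, (size_t)len);
  copy[len] = '\0';

  found = (regexec(&re, copy, 0, NULL, 0) == 0);
  free(copy);
  regfree(&re);
  return found;
}

/* Reads the first line of the file at path into buf.
   Returns 1 if a non-empty pattern was read, 0 otherwise. */
static int read_pattern(const char *path, char *buf, size_t size) {
  FILE *fp = fopen(path, "r");
  if (fp == NULL)
    return 0;
  if (fgets(buf, (int)size, fp) == NULL) {
    fclose(fp);
    return 0;
  }
  fclose(fp);
  buf[strcspn(buf, "\r\n")] = '\0'; /* strip the trailing newline */
  return buf[0] != '\0';
}

/* check-html callback: 1 = parse the document, 0 = skip it. */
int myregexpfunction(char *html, int len, char *url_adresse, char *url_fichier) {
  char pattern[256];
  (void)url_adresse; /* unused in this sketch */
  (void)url_fichier;
  if (!read_pattern(PATTERN_FILE, pattern, sizeof pattern))
    return 1; /* no pattern available: parse everything */
  return document_matches(html, len, pattern);
}
```

Here the pattern is re-read on every call, which keeps the sketch short; reading it once at startup would be cheaper. Whether returning 0 for non-matching documents is the right behaviour for a search tool is a design choice: a variant could always return 1 and simply log url_adresse and url_fichier for each match.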

> And again ANOTHER suggestion: 
> There should be an option the user can choose which 
> makes HTTrack convert internet links (= links to files 
> that are not stored locally) to plain text so the user 
> won't click them. Or insert some very small symbol before 
> or after the link which tells him: 'Warning, it's 
> not stored on your disk!'. Making the link italic with the 
> <i></i> code would also be good.

This can be done using the "No external pages" option:
  x  replace external html links by error pages (--replace-external)

You can then replace the small html wrapper by any other 
file to fit your needs

> It's a cool thing that there are freeware and open-source 
> programmers out there. I don't want to imagine the 
> computer world without them!

Humm.. most of the biggest software companies in this world 
would LOVE to crush open source. We (open source 
developers) are still able to code for the community, 
but this may not last forever. Think about the idiotic 
patents that clueless companies are buying every day without 
even bothering about prior art.. they have the money to 
threaten most of us.
 

