Hi, I started using HTTrack quite recently. I first
downloaded 3.06 and, using the httracklib.* and example.*
files provided in the lib dir, created a simple program
that links to HTTrack. After downloading 3.07, I noticed
that the lib dir has changed and that the httracklib.*
files have been removed. Those files contained a bit of
documentation on the callback functions, which I found
useful.

Anyway, here is a list of the callback functions. I've
tried to trace them a bit to see what they do. I'm not
sure about these descriptions; perhaps Xavier Roche or
someone else more familiar with HTTrack could check them
when time allows. For what they are worth, here they are:

Functions to call at start-up:
------------------------------
Name:    hts_init
Purpose: initialisation function; must be called first
Params:  None
Return:  int
           0 = failure
           1 = success

Name:    hts_main
Purpose: main entry point
Params:  int argc (number of arguments)
         char** argv (arguments)
Return:  int
           -1 = failure
            0 = success, or the user interrupted the mirror

Name:    htswrap_add
Purpose: adds a callback (wrapper) function; must be called
         after hts_init() but before hts_main()
Params:  char* name (special callback identifier, see below)
         void* fct (address of callback function)
Return:  int
           0 = failure
           1 = success
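
To make the calling order concrete, here is a minimal sketch of the start-up sequence as I understand it from the descriptions above. The three prototypes are taken from this list only; the stub bodies, the USE_REAL_HTTRACK macro, and the names on_end and run_mirror are my own assumptions so that the sketch compiles without the library. Treat it as an illustration, not as the official way to embed HTTrack.

```c
#include <stdio.h>

#ifndef USE_REAL_HTTRACK   /* hypothetical macro, not part of HTTrack */
/*
 * Stand-in stubs so this sketch compiles on its own. They only mimic
 * the return conventions listed above. When linking against the real
 * library, define USE_REAL_HTTRACK and include the header shipped
 * with your HTTrack version instead.
 */
static int registered = 0;   /* number of callbacks the stub accepted */

int hts_init(void) { return 1; }

int htswrap_add(char *name, void *fct)
{
    if (name == NULL || fct == NULL)
        return 0;            /* 0 = failure */
    registered++;
    return 1;                /* 1 = success */
}

int hts_main(int argc, char **argv)
{
    (void)argc;
    (void)argv;
    return 0;                /* 0 = success */
}
#endif

/* Hypothetical "end" callback: no parameters, returns 1 to be safe. */
static int on_end(void)
{
    return 1;
}

/* Runs the three start-up calls in order; returns -1 on any failure. */
int run_mirror(int argc, char **argv)
{
    if (hts_init() != 1) {
        fprintf(stderr, "hts_init failed\n");
        return -1;
    }
    /* Function-to-void* cast: htswrap_add takes a void* per the list
       above. Common in practice, though not strictly ISO C. */
    if (htswrap_add("end", (void *)on_end) != 1) {
        fprintf(stderr, "htswrap_add failed\n");
        return -1;
    }
    return hts_main(argc, argv);
}
```

A caller would pass the same argc/argv it would give to the httrack command line, for example from its own main().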
List of callback functions
--------------------------
Their 'name' is the special identifier which must be passed
to htswrap_add().
------------------------------------------------------------
Name:         "init"
Purpose:      Called during HTTrack initialization
Params:       None
Valid Return: Nothing

Name:         "query", "query2", "query3"
Purpose:      Called to get an answer to a question
Params:       char* question
Valid Return: char* response, which can be ""
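
As a small illustration of the "query" entry, here is a sketch of a callback that logs the question and always answers with an empty string. The name my_query is my own, and I do not know who owns the returned buffer, so the sketch returns a static one; that is an assumption on my part.

```c
#include <stdio.h>

/* Hypothetical "query" callback: log the question, answer "". */
char *my_query(char *question)
{
    /* Static buffer so the pointer stays valid after we return.
       Assumption: the engine does not try to free it. */
    static char empty[] = "";

    if (question != NULL)
        fprintf(stderr, "HTTrack asks: %s\n", question);
    return empty;
}
```

Going by the list above, the same function could presumably be registered under "query", "query2" and "query3", since all three take a question and return a response.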

Name:         "start"
Purpose:      Called before HTTrack starts the mirror
Params:       httrackp* opt
              (all the options for this session, see 'htsopt.h')
Valid Return: int
                0 = HTTrack should end
                1 = HTTrack should continue (default behaviour)

Name:         "change-options"
Purpose:      Called when options have been changed by HTTrack
Params:       httrackp* opt
              (all the options for this session, see 'htsopt.h')
Valid Return: int 1 (ignored as far as I can tell, but to be
              safe use 1)

Name:         "link-detected"
Purpose:      Called when a link is detected
Params:       char* link (the text of the 'href=' attribute)
              Links are usually relative, so this text will
              likely be something like 'filename.ext' or
              'subdir/filename.ext'.
Valid Return: int
                1 = process the link
                0 = ignore the link

Name:         "check-link"
Purpose:      Called to test if a link should be accepted or
              refused
Params:       char* host_name (eg: www.foo.com)
              char* filename (eg: /index.html)
              int current_status
                 0 = link should be accepted
                 1 = link should be refused
                -1 = the engine has no opinion
                     (by default, the link will be refused)
Valid Return: int
                 0 = process the link
                 1 = skip the link
                -1 = no opinion (by default, current status
                     will be kept)
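
Here is a sketch of what a "check-link" callback could look like, using the parameters and return values listed above. The function names and the rule itself (refuse anything ending in ".zip") are just examples I made up.

```c
#include <stddef.h>
#include <string.h>

/* Returns 1 when `s` ends with `suffix` (case-sensitive), else 0. */
static int ends_with(const char *s, const char *suffix)
{
    size_t ls = strlen(s);
    size_t lx = strlen(suffix);
    return ls >= lx && strcmp(s + ls - lx, suffix) == 0;
}

/* Hypothetical "check-link" callback: refuse .zip files and leave
   every other decision to the engine. */
int my_check_link(char *host_name, char *filename, int current_status)
{
    (void)host_name;       /* unused in this example */
    (void)current_status;  /* unused: we never override an accept */

    if (filename != NULL && ends_with(filename, ".zip"))
        return 1;          /* 1 = skip the link */
    return -1;             /* -1 = no opinion, current status is kept */
}
```

Returning -1 for everything the callback does not care about seems the safest choice, since it leaves the engine's own filters in charge.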

Name:         "check-html"
Purpose:      Called to check if an HTML file should be parsed
              after download
Params:       char* buffer_html (address of the HTML buffer)
              int buffer_html_size (size of this buffer in bytes)
              char* host_name (eg: www.foo.com)
              char* filename (eg: /index.html)
Valid Return: int
                0 = do not parse this file
                1 = parse this file (default behaviour)
Notes:        This function is also called for the primary URL,
              before downloading. In this case, it is passed
              the URL in 'buffer_html', the word "primary" in
              'host_name' and "/primary" in 'filename'.
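
The special case for the primary URL is easy to miss, so here is a sketch of a "check-html" callback that handles it first. The name my_check_html, the MAX_PARSE_BYTES limit and the size rule are my own examples; only the parameters and return values come from the list above.

```c
#include <string.h>

/* Hypothetical limit: do not parse HTML files larger than 1 MiB. */
#define MAX_PARSE_BYTES (1024 * 1024)

/* Hypothetical "check-html" callback. */
int my_check_html(char *buffer_html, int buffer_html_size,
                  char *host_name, char *filename)
{
    (void)buffer_html;  /* content is not inspected in this example */

    /* Primary URL call: buffer_html holds the URL, not HTML, so do
       not apply any content rule and let the engine continue. */
    if (host_name != NULL && filename != NULL &&
        strcmp(host_name, "primary") == 0 &&
        strcmp(filename, "/primary") == 0)
        return 1;

    if (buffer_html_size > MAX_PARSE_BYTES)
        return 0;       /* 0 = do not parse this file */
    return 1;           /* 1 = parse this file (default behaviour) */
}
```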

Name:         "save-file"
Purpose:      Called when a file is about to be saved
Params:       char* filename
              (path to the local file, starting with the prefix
              given with the -O command line option; if the
              given prefix was relative, this name will be
              relative too)
Valid Return: int
                0 = don't save the file
                1 = save the file (default behaviour)
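
A "save-file" callback is a handy place to log what the mirror writes to disk. The sketch below only counts and prints the file names and always returns 1, so it does not change what gets saved. The names my_save_file and saved_files are my own.

```c
#include <stdio.h>

/* Number of files seen by the callback so far. */
static unsigned long saved_files = 0;

/* Hypothetical "save-file" callback: log the local path, then let
   HTTrack save the file as usual. */
int my_save_file(char *filename)
{
    saved_files++;
    fprintf(stderr, "saving #%lu: %s\n", saved_files,
            filename != NULL ? filename : "(null)");
    return 1;   /* 1 = save the file (default behaviour) */
}
```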

Name:         "loop"
Purpose:      Called during a download loop, after every chunk
              of bytes
Params:       many, many
Valid Return: int
                0 = HTTrack should end
                1 = HTTrack should continue

Name:         "pause"
Purpose:      Called to wait for the lock file to be deleted
Params:       char* lockfile
Valid Return: Nothing

Name:         "end"
Purpose:      Called after HTTrack ends the mirror
Params:       None
Valid Return: int 1 (ignored as far as I can tell, but to be
              safe use 1)

Name:         "free"
Purpose:      Called during HTTrack termination
Params:       None
Valid Return: Nothing