> There are many other useful things in the structures - ask
> me if necessary.
There's some weirdness about the statuscode field in htsblk
as passed to receive-headers. It doesn't seem to reflect
the status code found in the headers (and seen by other
parts of HTTrack). I have as the first statement of my
receive-headers plugin the following:
    get_header(char* buff, char* adr, char* fil, char* referer_adr,
               char* referer_fil, htsblk* incoming) {
      printf("\nget_header %s%s: (%d) %s\n",
             adr, fil, incoming->statuscode, incoming->location);
Yet when I point this at <http://shasta.cs.uiuc.edu/~lrclause>
with -r1, I get these printouts:
get_header shasta.cs.uiuc.edu/~lrclause: (-5)
* shasta.cs.uiuc.edu/~lrclause (270 bytes) - 301
get_header shasta.cs.uiuc.edu/~lrclause/: (0)
* shasta.cs.uiuc.edu/~lrclause/ (319 bytes) - OK
get_header shasta.cs.uiuc.edu/robots.txt: (-5)
1/3: shasta.cs.uiuc.edu/robots.txt (279 bytes) - 404
get_header shasta.cs.uiuc.edu/~lrclause/: (-5)
Done.shasta.cs.uiuc.edu/~lrclause/ (8623 bytes) - OK
Thanks for using HTTrack!
Where are these status codes defined, and shouldn't they be
the code returned by the server? It's good that I get the
redirects, but I wish I could recognize them in there.
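
One possible workaround is for the plugin to read the status line itself rather than rely on incoming->statuscode. The following is only a minimal sketch, assuming (unverified) that buff holds the raw response headers beginning with the HTTP status line. parse_http_status and is_redirect are hypothetical helpers written for illustration, not HTTrack functions.

```c
#include <stdio.h>

/* Hypothetical helper (not part of HTTrack): extract the numeric status
 * from a raw HTTP status line such as "HTTP/1.1 301 Moved Permanently".
 * Returns the status code, or -1 if the text does not begin with a
 * well-formed status line. */
int parse_http_status(const char *headers)
{
    int major = 0, minor = 0, code = 0;

    if (headers == NULL)
        return -1;
    /* Expect "HTTP/<major>.<minor> <code>" at the very start. */
    if (sscanf(headers, "HTTP/%d.%d %d", &major, &minor, &code) != 3)
        return -1;
    /* Reject values outside the range HTTP uses for status codes. */
    if (code < 100 || code > 599)
        return -1;
    return code;
}

/* Hypothetical helper: nonzero for 3xx (redirect) responses. */
int is_redirect(int code)
{
    return code >= 300 && code <= 399;
}
```

Under that assumption, calling is_redirect(parse_http_status(buff)) at the top of get_header would flag the 301 case regardless of what incoming->statuscode contains.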
For the curious, I've placed the plugin at
<http://shasta.cs.uiuc.edu/~lrclause/tmp/arc.c>
Note that it depends on GLib, since I had HTTrack's own hash
table crash on me.
-Lars