> There's some weirdness about the statuscode field in htsblk
> as passed to receive-headers. It doesn't seem to reflect
> the status code found in the headers (and seen by other
> parts of HTTrack).
Ah, yes. Some transformations are applied:
200 => 206 for bogus responses with 'content-range'
203 => 200 because it is equivalent and I did not want multiple tests
416 => 304 for bogus 'Requested Range Not Satisfiable' responses when the file is actually complete after a request using Content-range: bytes */<size>
406 => 200 (OK is good: links inside)
206 => 200 for partial files (200 is good)
"brain fcked request" => 200 because brainfcked servers exist :(
304 => 200 because, as we have a cache, we can simulate "fake" 200 responses (so that we don't have to bother with the update process)
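To make the list above easier to follow, here is a minimal sketch of the remapping as a single C function. It is not HTTrack's actual code: the name `normalize_status` and both flag parameters are hypothetical, and it assumes each rule is applied once to the raw code without chaining (the list does not say whether rules are chained). The "brain fcked request" case is left out because its trigger condition is not specified.

```c
#include <stdbool.h>

/* Hypothetical helper, NOT part of HTTrack: maps a raw HTTP status
 * code to the value described in the list above.
 * Assumption: each rule is applied once, to the raw code only. */
static int normalize_status(int raw, bool has_content_range,
                            bool file_complete) {
    switch (raw) {
    case 200: /* bogus 200 that carries a content-range header */
        return has_content_range ? 206 : 200;
    case 203: /* treated the same as 200 */
        return 200;
    case 206: /* partial file: 200 is good */
        return 200;
    case 304: /* cache lets us simulate a "fake" 200 */
        return 200;
    case 406: /* OK is good */
        return 200;
    case 416: /* bogus "not satisfiable" when the file is complete */
        return file_complete ? 304 : 416;
    default:  /* everything else passes through unchanged */
        return raw;
    }
}
```

With this sketch, `normalize_status(301, false, false)` stays 301, which matches the redirect seen in the log below.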
There are also negative codes (-1 .. -10) for timeouts and
various other errors (see the 'msg' member of the htsblk
structure, which holds the error message string).
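As a rough illustration of how a caller might tell these internal errors apart from real HTTP codes, here is a small sketch. The struct `demo_result` is a hypothetical stand-in that only mirrors the two htsblk members named above (`statuscode` and `msg`); the real structure has more fields, and its exact layout is not shown here.

```c
#include <stddef.h>

/* Hypothetical stand-in for the two htsblk members mentioned above.
 * The real htsblk layout is NOT reproduced here. */
typedef struct {
    int statuscode; /* HTTP code, or a negative internal error code */
    char msg[80];   /* error message string */
} demo_result;

/* Returns the error message for negative (internal) codes,
 * or NULL when statuscode is a regular, non-negative value. */
static const char *failure_reason(const demo_result *r) {
    if (r->statuscode < 0)
        return r->msg;
    return NULL;
}
```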
> get_header shasta.cs.uiuc.edu/~lrclause: (-5)
> * shasta.cs.uiuc.edu/~lrclause (270 bytes) - 301
Argh - forgot to say that the receive-header callback is
called **BEFORE** the headers are processed - therefore
most fields are irrelevant at this stage!
> Note that it depends on GLib, since I had HTTracks own hash
> table crash on me.
Darn! Were you using the htsinthash.* functions?