> I had seen ETags mentioned previously, but no solid
> documentation on how to use them with httrack.
Well, the mechanism isn't documented in httrack, because it
is already described in RFC 2616
(http://www.ietf.org/rfc/rfc2616.txt).
The idea is to provide a mechanism for checking the
"freshness" of a resource (identified by its full URL,
INCLUDING the query string) by means of a specific response
header sent by the server, "ETag". This header lets the
server send a string that FULLY identifies the resource
content. It means that, given a URL and an ETag value, a
server is able to tell whether the resource is up-to-date
or not.
The client, when requesting a refresh, will issue a regular
HTTP request, giving an "If-None-Match" field with the
previous server ETag string. The server will then reply
either with a "Not Modified" status (304), or with an "OK"
status (200) with the full content, if the cached content
is not up-to-date.
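
To make the exchange concrete, here is a minimal client-side
sketch in Python using only the standard urllib module. The
function names (build_refresh_request, refresh) are made up
for illustration, and this is not how httrack itself is
implemented; it only shows the header being sent back and
the two possible outcomes.

```python
import urllib.error
import urllib.request


def build_refresh_request(url, cached_etag):
    """Build a GET request that carries the previously received ETag."""
    request = urllib.request.Request(url)
    if cached_etag is not None:
        # The ETag string is sent back exactly as the server gave it.
        request.add_header("If-None-Match", cached_etag)
    return request


def refresh(url, cached_etag, cached_body):
    """Return (etag, body), reusing cached_body if the server answers 304."""
    request = build_refresh_request(url, cached_etag)
    try:
        with urllib.request.urlopen(request) as response:
            # 200 OK: full content, possibly with a new ETag.
            return response.headers.get("ETag"), response.read()
    except urllib.error.HTTPError as error:
        if error.code == 304:
            # 304 Not Modified: the cached copy is still current.
            return cached_etag, cached_body
        raise
```

Note that urllib reports a 304 reply as an HTTPError, which
is why the "Not Modified" case is handled in the except
branch.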
It is up to the server to generate the ETag string, AND to
check the freshness of the contents.
An MD5 checksum of the full output data can be used, but
any other solution is possible (a checksum of all variables
used to generate the page, for example).
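
As a rough server-side sketch of both options, again in
Python: the helper names (etag_from_body, etag_from_variables,
respond) and the plain-dict representation of headers are
assumptions made for the example, not part of any particular
server. The variables variant assumes string keys so that
they can be sorted.

```python
import hashlib


def etag_from_body(body: bytes) -> str:
    """ETag built from an MD5 checksum of the full output data."""
    return '"' + hashlib.md5(body).hexdigest() + '"'


def etag_from_variables(variables: dict) -> str:
    """ETag built from the variables used to generate the page."""
    # Sort the items so the same variables always give the same string.
    canonical = repr(sorted(variables.items())).encode("utf-8")
    return '"' + hashlib.md5(canonical).hexdigest() + '"'


def respond(request_headers: dict, body: bytes):
    """Return (status, headers, body) for a possibly conditional request."""
    etag = etag_from_body(body)
    if request_headers.get("If-None-Match") == etag:
        # The client's copy is current: no content is sent.
        return 304, {"ETag": etag}, b""
    # No ETag, or a stale one: send the full content.
    return 200, {"ETag": etag}, body
```

The variables-based ETag has the advantage that the server
can answer a 304 without generating the page at all, whereas
the body checksum requires producing the full output first.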
However, depending on the server, this is not always simple
to set up.