I recently had a situation where a site I had mirrored experienced a server
issue while I was updating the mirror, and the great majority of the image
content on that site was replaced with an "image not found" placeholder image.
Many sites (as a best practice) use globally unique addresses for any image
content. As such, once an image file has been downloaded, it never needs to be
downloaded (or updated) again, no matter what the ETag or anything else says.
In my experience, updating images that have already been downloaded produces
undesired results more often than the "desired" behavior (i.e. images are
lost because the image was deleted, temporarily unavailable, etc).
Is there any chance that an update mode or a flag could be created to forbid
overwriting certain types of files (ideally specified by a URL mask or a MIME
type mask) if they already exist? There are very few sites I am aware of (if
any) where the image at a given URL actually changes, so re-downloading images
usually wastes bandwidth and potentially results in data loss as mentioned
above.
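
To make the request concrete, here is a rough sketch in Python of the check I have in mind. It is not based on any existing option in any mirroring tool; the names `URL_MASKS`, `MIME_MASKS` and `should_skip_download` are made up for illustration, and the mask syntax is just shell-style wildcards.

```python
import fnmatch
import os

# Hypothetical masks for illustration only; these are not options of any real tool.
URL_MASKS = ["*.jpg", "*.jpeg", "*.png", "*.gif"]
MIME_MASKS = ["image/*"]


def should_skip_download(url, local_path, mime_type=None,
                         url_masks=URL_MASKS, mime_masks=MIME_MASKS):
    """Return True if local_path already exists and the URL or MIME type
    matches a protected mask, meaning the existing copy should be kept."""
    if not os.path.exists(local_path):
        return False  # nothing on disk to protect, so download normally

    # Compare case-insensitively by lowercasing the URL before matching.
    url_lower = url.lower()
    if any(fnmatch.fnmatchcase(url_lower, mask) for mask in url_masks):
        return True

    if mime_type is not None:
        # Drop parameters such as "; charset=..." before matching.
        bare_type = mime_type.split(";")[0].strip().lower()
        if any(fnmatch.fnmatchcase(bare_type, mask) for mask in mime_masks):
            return True

    return False
```

The idea is that the updater would call something like this before re-fetching a file, and skip the request entirely (ignoring the ETag and modification date) whenever it returns True. URLs with query strings would only be caught by the MIME mask in this sketch, and that requires the type to be known from the previous download.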
I would love to hear any thoughts on this issue (even if it's that I shouldn't
hold my breath for such a feature ever being implemented) :P