A lot of interesting things have been going on lately on the Windows Azure MVP list, and I'll try to pick the best ones I can share and turn them into posts.
During an Azure bootcamp, a fellow Windows Azure MVP raised a very interesting question: "What happens if someone is updating a BLOB while a request comes in to serve that BLOB?"
The answer came from Steve Marx pretty quickly and I’m just quoting his email:
“The bottom line is that a client should never receive corrupt data due to changing content. This is true both from blob storage directly and from the CDN.
The way this works is:
· Changes to block blobs (put blob, put block list) are atomic, in that there’s never a blob that has only partial new content.
· Reading a blob all at once is atomic, in that we don’t respond with data that’s a mix of new and old content.
· When reading a blob with range requests, each request is atomic, but you could always end up with corrupt data if you request different ranges at different times and stitch them together. Using ETags (or If-Unmodified-Since) should protect you from this. (Requests after the content changed would fail with “condition not met,” and you’d know to start over.)
Only the last point is particularly relevant for the CDN, and it reads from blob storage and sends to clients in ways that obey the same HTTP semantics (so ETags and If-Unmodified-Since work).
For a client to end up with corrupt data, it would have to be behaving badly… i.e., requesting data in chunks but not using HTTP headers to guarantee it’s still reading the same blob. I think this would be a rare situation. (Browsers, media players, etc. should all do this properly.)
Of course, updates to a blob don’t mean the content is immediately changed in the CDN, so it’s certainly possible to get old data due to caching. It should just never be corrupt data due to mixing old and new content.”
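The range-read pattern Steve describes can be sketched in code: fetch the blob in chunks, pin every subsequent request to the ETag returned by the first response, and restart from scratch if the server answers "condition not met." This is a minimal illustration, not the real Azure SDK; the `FakeBlobStore` class below is a hypothetical stand-in for HTTP range requests against Blob storage, with `PreconditionFailed` playing the role of an HTTP 412 response to a failed `If-Match` check.

```python
class PreconditionFailed(Exception):
    """Stand-in for HTTP 412: If-Match did not match the current ETag."""


class FakeBlobStore:
    """Hypothetical blob store mimicking ranged GETs with an If-Match header."""

    def __init__(self, content: bytes, etag: str):
        self.content = content
        self.etag = etag

    def get_range(self, start: int, end: int, if_match=None):
        # Reject the read if the caller pinned an ETag that no longer matches,
        # just as Blob storage would with "condition not met".
        if if_match is not None and if_match != self.etag:
            raise PreconditionFailed("condition not met")
        return self.content[start:end], self.etag


def download(blob: FakeBlobStore, chunk_size: int = 4) -> bytes:
    """Read a blob in ranges, guaranteeing the chunks all come from one version."""
    while True:  # restart loop: begin again if the blob changed underneath us
        parts, etag, offset = [], None, 0
        try:
            while True:
                # First request has no ETag yet; every later request pins it.
                data, etag = blob.get_range(offset, offset + chunk_size,
                                            if_match=etag)
                if not data:
                    return b"".join(parts)
                parts.append(data)
                offset += len(data)
        except PreconditionFailed:
            continue  # ETag changed mid-read: discard partial data and retry


if __name__ == "__main__":
    blob = FakeBlobStore(b"hello azure blob", etag='"v1"')
    assert download(blob) == b"hello azure blob"
```

A "badly behaved" client, in Steve's terms, is simply one that skips the `if_match` argument on the later requests: it would happily stitch together chunks from two different versions of the blob.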
So, as you can see from Steve's reply, there is no way to receive corrupt data from Blob storage or the CDN; at worst you get stale data while the CDN cache catches up, which is more than some other vendors can say.