Cloud computing – Most common concerns and my thoughts

Every time something new emerges in the IT market, people fall into three distinct categories: early adopters, the late majority and skeptics. Each group has its own reasons and its own concerns about when, why and whether it is going to adopt the new technology, either completely or in a hybrid mode.

All of them have some things in common. They share the same concerns about security, billing, taxes, availability, latency and probably some others.

Concerns and my thoughts.

Billing is something that can easily be solved and explained by clever, good marketing. Unfortunately, there is no such thing as local billing. Cloud computing services are truly global, so the same billing model, the same prices and the same billing methods are used for every customer in the name of a fair and consistent service. This has to change. For some markets a service can be really cheap, but for others it can be really expensive. Increasing the price in some countries and decreasing it in others can make the service fairer and easier to adopt. Using the credit card to identify the country is a good method, but there is a problem. It’s called taxes.

Taxes are a way for a government to make money. In many countries, Greece being one of them, having a credit card with a decent limit is a privilege. Unfortunately, I mean it in a bad way. Interest is quite high, and with such an unstable tax policy you can never be sure that there won’t be extra fees to pay sooner or later. I guess this is not only our problem but a problem for some other countries too, mostly emerging markets. Providing another way of paying the monthly fees for service usage can easily overcome this.

Security. Oh yes, security. Countless questions during presentations and chats are about security. Tons of “what if”. Yes, it’s a big issue. But too much skepticism is never good. I believe people are not really worried about security issues like data leakage or theft. They are worried because they feel they somehow lose control of their information. At least this is what they believe. The idea that their data is not stored on their own hardware but somewhere else, and not even in the same country, terrifies them. I’m not sure if there is anything that can be done to subdue this concern, but at least there can be some localized data centers, for example for banks, where regulatory laws demand that data be stored in the same country, if not on premises owned by the bank. A private cloud could probably meet those regulations.

Latency. That’s an easy one. The principle is the same as with security: my data is over there, and there might be significant latency until I get a response. Yes, there is a delay. No, it’s not that big, probably somewhere between 60 and 100 ms. For applications that are not real time, this is really, really low. You can even play shoot ’em up games with 100 ms latency. The only thing we can do is require a decent DSL line from our customers in case our application, locally installed, is accessing a cloud service. Picking the right region to deploy our application can also have a significant impact on latency.

Availability. People are worried about their data not being available when it is most needed. The farther away their data is, the more points of failure there are: their internet line, their ISP’s line, a ship cutting some cables 4000 km away. Most, if not all, cloud service providers promise 3 or 4 “nines” of uptime and availability, but there are a lot of examples of services failing because of unpredicted code or human errors (e.g. Google). Other companies have proved more trustworthy and more reliable.
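To put those “nines” into perspective, here is a small back-of-the-envelope sketch. It assumes a 365-day year, and the class and method names are made up for this example. Three nines (99.9%) leave room for roughly 8.8 hours of downtime per year, while four nines (99.99%) leave about 53 minutes.

```csharp
using System;

public static class UptimeBudget
{
    // Maximum downtime per year that still satisfies the given availability.
    public static TimeSpan AllowedDowntimePerYear(double availabilityPercent)
    {
        double hoursPerYear = 365 * 24; // 8760 hours, leap years ignored
        double downFraction = (100.0 - availabilityPercent) / 100.0;
        return TimeSpan.FromHours(hoursPerYear * downFraction);
    }

    public static void Main()
    {
        foreach (double sla in new[] { 99.9, 99.99 })
        {
            TimeSpan budget = AllowedDowntimePerYear(sla);
            Console.WriteLine("{0}% uptime allows about {1:F1} minutes of downtime per year",
                sla, budget.TotalMinutes);
        }
    }
}
```

Keep in mind that this budget only covers the provider’s side. Your own internet line and your ISP are not part of it.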


Concluding this post, I want to make something clear. I’m not part of those distinct groups of people. I started playing with cloud computing services right after Amazon removed the beta label from its AWS service, back in 2008 (April, if I recall correctly), with Windows Azure following at PDC ’08. I had my first token back then and started playing with it. I’ve seen Windows Azure shape up and change within those two years into something amazing and really groundbreaking. Windows Azure can successfully lower or even eliminate your concerns in some of the matters discussed above, but there is room for improvement and there always will be. I’m going to dig a little deeper into those matters and try to provide more concrete answers and thoughts.

Thank you for reading so far,


Windows Azure Table Storage – Is backup necessary?

Yes, it is.

Table storage has multiple replicas and guarantees uptime and availability, but for business continuity reasons you also have to be protected from possible failures in your own application. A business logic error can harm the integrity of your data, and the damaged data then ends up on every Windows Azure Storage replica too. You have to be protected from those scenarios, so having a backup plan is necessary.

There is a really nice project available on Codeplex for this purpose:


Windows Azure – Did you know.. that you can set your OS Version?

Windows Azure configuration files support an osVersion attribute where you can set which version of the Windows Azure OS should run your service.

This feature doesn’t make much sense at the moment, as there is only one version (WA-GUEST-OS-1.0_200912-01), but in the future it’s going to be very handy.

You can learn more about it here.
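As a rough illustration, the sketch below embeds a made-up service configuration and reads the osVersion attribute back with LINQ to XML. Only the attribute name and the version string come from this post. The service name, the role name and the helper class are hypothetical, and the namespace is the one I believe the configuration schema uses, so double check it against your own .cscfg file.

```csharp
using System;
using System.Xml.Linq;

public static class OsVersionReader
{
    // Hypothetical configuration: only osVersion and its value are from the post.
    public const string SampleConfig = @"<?xml version=""1.0""?>
<ServiceConfiguration serviceName=""MyService""
    osVersion=""WA-GUEST-OS-1.0_200912-01""
    xmlns=""http://schemas.microsoft.com/ServiceHosting/2008/10/ServiceConfiguration"">
  <Role name=""WorkerRole1"">
    <Instances count=""2"" />
  </Role>
</ServiceConfiguration>";

    // Returns the osVersion attribute of the root element, or null if it is absent.
    public static string ReadOsVersion(string configXml)
    {
        XElement root = XDocument.Parse(configXml).Root;
        XAttribute attribute = root.Attribute("osVersion");
        return attribute == null ? null : attribute.Value;
    }

    public static void Main()
    {
        Console.WriteLine(ReadOsVersion(SampleConfig));
    }
}
```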


Windows Azure Training Kit – December update is out

The Windows Azure training kit is the best starting point if you want to get involved in Azure development. It helps you understand the basics of Windows Azure, its components and the whereabouts of the service.

December’s release includes some updates and samples from PDC 09 so don’t miss it.

You can download the kit from here –>


Windows Azure – Dynamically scaling your application

When you have your service running on Windows Azure, the last thing you want is to check your monitoring data every now and then and decide by hand whether specific actions are necessary. You want the service to be, to some degree, self-managing and to decide on its own which actions should take place to satisfy a monitoring alert. In this post I’m not going to use the Service Management API to increase or decrease the number of instances; instead I’m going to log a warning. In a future post I’m going to use the API in combination with this logging message, so consider this the first of a series of posts.

The most common scenario is to dynamically increase or decrease the number of VM instances so that we can process more messages as our Queues fill up. Because there is no out-of-the-box solution from Windows Azure, you have to create your own “logic”, a decision mechanism if you like, which executes some steps and brings the service to a state that satisfies your condition. A number of companies have announced that their monitoring/health software is going to support Windows Azure. You can find more information about that if you search the internet, or visit the Partners section of the Windows Azure Portal.

In the code below I’m monitoring the messages inside a Queue at every role cycle:

/* Queue names must be all lowercase */
CloudQueue cloudQueue = cloudQueueClient.GetQueueReference("calculatep");

cloudQueue.CreateIfNotExist();
cloudQueue.FetchAttributes();

/* Call this method to calculate your WorkLoad */
CalculateWorkLoad(cloudQueue.ApproximateMessageCount);

and this is the code inside CalculateWorkLoad:

public void CalculateWorkLoad(int? messages)
{
    /* No message count available yet, so there is nothing to decide */
    if (messages == null)
        return;

    /* If there are messages, find the average of messages
       available every X seconds
       X = the ThreadSleep time, in my case every 5 seconds */
    average = messages.Value / (threadsleep / 1000);

    DecideIncDecOfInstances(average);
}

Note that if you want accurate values for the queue’s properties, you have to call FetchAttributes() first.

There is nothing fancy in my code. I’m just finding an average workload (number of messages in my Queue) every 5 seconds and passing this value to DecideIncDecOfInstances(). Here is the code:

public void DecideIncDecOfInstances(int average)
{
    int instances = 2;

    /* If my average is above 1000 */
    if (average > 1000)
        OneForEveryThousand(average, ref instances);

    WarnWeNeedMoreVM(instances);
}

OneForEveryThousand increases the default number of instances, which is two (2), by one (1) for every thousand (1000) messages in the Queue’s average count.

This is the final part of my code, WarnWeNeedMoreVM, which logs how many VMs we need whenever that number differs from the default.
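The body of OneForEveryThousand isn’t listed in this post, so here is a minimal sketch of what such a method could look like, based only on the description above. The wrapping class and the sample numbers are made up for the example.

```csharp
using System;

public static class ScalingRule
{
    // Hypothetical body: adds one instance for every full thousand
    // messages in the average count.
    public static void OneForEveryThousand(int average, ref int instances)
    {
        instances += average / 1000; // integer division drops the remainder
    }

    public static void Main()
    {
        int instances = 2;            // default instance count used in this post
        OneForEveryThousand(3500, ref instances);
        Console.WriteLine(instances); // 2 + 3 = 5
    }
}
```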

public void WarnWeNeedMoreVM(int instances)
{
    if (instances == 2) return;

    Trace.WriteLine(String.Format("WARNING: Instances Count should be {0} on this {1} Role!",
        instances, RoleEnvironment.CurrentRoleInstance.Role.Name), "Information");
}

In my next post in this series, I’m going to use the newly released Service Management API to upload a new configuration file which increases or decreases the number of VM instances in my role(s) dynamically. Stay tuned!


Patterns: Windows Azure – In-Place upgrades

In a previous post I described what a VIP Swap is and how you can use it as an updating method to avoid service disruption. That method doesn’t apply to all possible scenarios. Most of the time, if not always, during protocol updates or schema changes you’ll need to upgrade your service while it’s still running, chunk by chunk and without any downtime or disruption. By In-Place, I mean upgrades during which both versions (old and new) are running side by side. In order to better understand the process below, you should read my “Upgrade domains” post, which describes in detail what upgrade domains are, how they affect your application, how you can configure the number of domains etc.

To avoid service disruption and outage, Windows Azure upgrades your application domain by domain (upgrade domain, that is). This results in a state where your Upgrade Domain 0 (UD0) is running a newer version of your client/service/what_have_you while UD1, UD2 etc. are still running the older version. The best approach is a two-step upgrade.

Let’s call our old protocol version V1 and our new version V2. At this point, you should consider introducing a new client version called 1.5, which is a hybrid. This version understands both protocols, but it always uses protocol V1 by default and only responds in protocol V2 if the request is in V2. You can now start pushing your upgrade, either through the Service Management API or through the Windows Azure Developer Portal, to completely automate the procedure. By the end of this process you’ll have achieved a seamless upgrade of your service without any disruption, and all of your clients will have been upgraded to the hybrid. As soon as this first step is done and all of your domains are running version 1.5, you can proceed to step two (2).

In your second step you’ll repeat the same process, but this time your version 2 clients will use protocol V2 by default. Remember, your 1.5 clients DO understand protocol V2 and respond to it properly when they are called with it. To put it simply, this time you’re deploying version 2 of your client, which uses version 2 of your protocol only. The old legacy code for version 1 is removed completely. Once your upgrade domains complete the second step, all your roles will be using version 2 of your protocol, again without any service disruption or downtime.
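To make the hybrid idea a bit more concrete, here is a minimal sketch of how a version 1.5 client could behave. All of the type and member names are invented for the example and the message itself is just a string, so treat it as an illustration of the rule and not as real service code.

```csharp
using System;

public enum ProtocolVersion { V1, V2 }

public class Message
{
    public ProtocolVersion Version { get; set; }
    public string Body { get; set; }
}

// Hypothetical "version 1.5" client: it understands both protocols.
public static class HybridClient
{
    // Anything the hybrid starts on its own goes out as V1, the default.
    public static Message CreateRequest(string body)
    {
        return new Message { Version = ProtocolVersion.V1, Body = body };
    }

    // Replies mirror the protocol of the incoming request, so V2 is only
    // used when the caller already speaks V2.
    public static Message Reply(Message request)
    {
        return new Message { Version = request.Version, Body = "ack:" + request.Body };
    }

    public static void Main()
    {
        Message fromOldDomain = new Message { Version = ProtocolVersion.V1, Body = "ping" };
        Message fromNewDomain = new Message { Version = ProtocolVersion.V2, Body = "ping" };

        Console.WriteLine(Reply(fromOldDomain).Version);    // V1
        Console.WriteLine(Reply(fromNewDomain).Version);    // V2
        Console.WriteLine(CreateRequest("hello").Version);  // V1
    }
}
```

In step two, the only change in this sketch would be that CreateRequest starts using V2 and the V1 branch goes away.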

Schema changes follow a similar approach, but I’ll write a separate post about them and actually put some code in it to demonstrate that behavior.


Windows Azure – What is an upgrade domain?

Windows Azure automatically divides your role instances into “logical” domains called upgrade domains. During an upgrade, Azure updates these domains one by one. This behavior is by design, to avoid nasty situations. One of the latest feature additions on the platform is the ability to notify your role instances of “environment” changes, instances being added or removed being the most common. In such a case, all your roles get a notification of the change. Imagine having 50 or 60 role instances that all get notified at once and start doing various things to react to the change. It would be a complete disaster for your service.
Source: MSDN

The way to address this problem is upgrade domains. As I said, during an upgrade Windows Azure updates them one by one, and only the role instances associated with a specific domain get notified of the changes taking place. Only a small number of your role instances will get notified and react. The rest will remain intact, providing a seamless upgrade experience with no service disruption or downtime.

Source: MSDN

You have no control over how Windows Azure divides your instances and roles into upgrade domains. It’s a completely automated procedure that happens in the background. There are two ways to perform an upgrade on a domain: using the Service Management API or the Windows Azure Developer Portal. On the Developer Portal there are two more options, automatic and manual. If you select automatic, Windows Azure will upgrade your domains one after the other without you having to worry about what is going on. If you select manual, you’ll have to upgrade all of your domains yourself, one by one.
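The effect of walking the domains one by one can be shown with a toy simulation. Everything in it is an assumption made for the example: the round-robin spread of instances, the five domains and all the names. How Windows Azure really assigns instances is not something you get to see.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class UpgradeWalk
{
    // Assumption: instances are spread round-robin over the upgrade domains.
    public static Dictionary<int, List<int>> AssignDomains(int instanceCount, int domainCount)
    {
        var domains = new Dictionary<int, List<int>>();
        for (int d = 0; d < domainCount; d++)
            domains[d] = new List<int>();
        for (int i = 0; i < instanceCount; i++)
            domains[i % domainCount].Add(i);
        return domains;
    }

    // Walks the domains in order and returns the largest number of
    // instances that are being upgraded (and notified) at the same time.
    public static int MaxInstancesTouchedAtOnce(int instanceCount, int domainCount)
    {
        var domains = AssignDomains(instanceCount, domainCount);
        int max = 0;
        foreach (var domain in domains.OrderBy(d => d.Key))
        {
            Console.WriteLine("Upgrading UD{0}: {1} instance(s) busy, {2} still serving",
                domain.Key, domain.Value.Count, instanceCount - domain.Value.Count);
            max = Math.Max(max, domain.Value.Count);
        }
        return max;
    }

    public static void Main()
    {
        // 50 instances over 5 domains: never more than 10 are touched at once.
        Console.WriteLine(MaxInstancesTouchedAtOnce(50, 5));
    }
}
```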

This is some of the magic the Windows Azure operating system and the Windows Azure platform perform to provide scalability, availability and high reliability for your service.


Windows Azure Table Storage and concurrency

Today, during my presentation at Microsoft DevDays “Make Web not War”, I got a pretty nice question about concurrency, and I left it somewhat blurry and without a straight answer. Sorry, but we were changing subjects so fast that I missed it, and I only realized it on my way back.

The answer is yes, there is concurrency control. If you examine a record in your table storage you’ll see that there is a Timestamp field, and you will also come across the so-called “ETag”. They are not the same thing; I mention both just to avoid confusion of terms. Windows Azure uses this version information to apply optimistic concurrency on your data. When you retrieve a record from the table, change a value and then call “UpdateObject”, Windows Azure will check whether the value on your object is still the same as the one on the table, and if it is, the update goes through just fine. If it isn’t, it means someone else changed the record in the meantime and you’ll get an Exception which you have to handle. One possible solution is to retrieve the object again, update your values and push it back. The final approach to concurrency is absolutely up to the developer and varies between different types of applications.
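The “retrieve again and push it back” approach can be sketched without touching the real storage at all. The code below uses a small in-memory stand-in for a table, so every type in it is hypothetical and only the pattern itself (compare the version, reject stale writes, re-read and retry) reflects what is described above.

```csharp
using System;
using System.Collections.Generic;

public class Entity
{
    public string Key;
    public int Value;
    public long ETag; // version marker, bumped on every successful update
}

// In-memory stand-in for a table; not the real storage client.
public class InMemoryTable
{
    private readonly Dictionary<string, Entity> rows = new Dictionary<string, Entity>();

    public void Insert(string key, int value)
    {
        rows[key] = new Entity { Key = key, Value = value, ETag = 1 };
    }

    // Returns a copy so callers cannot change the stored row directly.
    public Entity Get(string key)
    {
        Entity e = rows[key];
        return new Entity { Key = e.Key, Value = e.Value, ETag = e.ETag };
    }

    // The update only succeeds if the caller still holds the current ETag.
    public void Update(Entity changed)
    {
        Entity stored = rows[changed.Key];
        if (stored.ETag != changed.ETag)
            throw new InvalidOperationException("ETag mismatch: the row was changed by someone else");
        rows[changed.Key] = new Entity { Key = changed.Key, Value = changed.Value, ETag = stored.ETag + 1 };
    }
}

public static class ConcurrencyDemo
{
    // Retrieve, change, push back. On a conflict, retrieve again and retry.
    // Returns the number of attempts that were needed.
    public static int AddWithRetry(InMemoryTable table, string key, int delta, int maxAttempts)
    {
        for (int attempt = 1; attempt <= maxAttempts; attempt++)
        {
            Entity e = table.Get(key);
            e.Value += delta;
            try { table.Update(e); return attempt; }
            catch (InvalidOperationException) { /* stale copy: loop and re-read */ }
        }
        throw new InvalidOperationException("Gave up after " + maxAttempts + " attempts");
    }

    public static void Main()
    {
        var table = new InMemoryTable();
        table.Insert("row1", 10);

        Entity stale = table.Get("row1");  // first reader keeps an old copy
        AddWithRetry(table, "row1", 5, 3); // second writer updates the row

        stale.Value = 99;
        try { table.Update(stale); }       // rejected: the ETag no longer matches
        catch (InvalidOperationException ex) { Console.WriteLine(ex.Message); }

        Console.WriteLine(table.Get("row1").Value); // 15
    }
}
```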

As I mentioned during my presentation, there are a lot of different approaches to handling concurrency on Windows Azure Table Storage. There is a very nice video in the “How to” section on MSDN about Windows Azure Table Storage concurrency which can certainly give you some ideas.


Windows Azure – From an architect and cost (?) perspective

It’s been a while since Windows Azure caught the attention of a broader audience, raising all kinds of questions, from the simplest “How do I access my table storage” to the most complex and generic “How do I optimize my code to pay less”. Although there is a straight answer to the first one, that’s not the case with the second. Designing an application has always been a fairly complex task when it comes to high scale enterprise solutions, or even for mid-sized businesses.

There are a few things to consider when you’re trying to “migrate” your application to the Azure platform.

You should keep in mind that you literally pay for your mistakes. As long as you’re not on a developer account, where by the way you still “pay” but in a different way, you are on a pay as you go model for various things, like transactions, compute hours, bandwidth and of course storage. Oh, and database size on SQL Azure. So for every mistake that you make, let’s say un-optimized code producing more messages/transactions than necessary, you pay.

What’s a Compute hour? Well, it’s the number of instances multiplied by “consumed” (service) hours. And what’s a consumed (service) hour? It’s a simple medium size Azure instance having an average of 70% CPU peak, on a 1.60 GHz CPU with 1.75 GB of RAM and 155 GB of non-persistent data. That means it’s not “uptime” as many of you think it is. So, when it comes to compute hours, you should measure how many hours your application consumes and optimize your code to consume less. Disable any roles, either Worker or Web, when you don’t need them; it will cost you less.
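Taking the definition above at face value, the arithmetic is simple enough to sketch. The hourly rate below is a placeholder I made up for the example, as are the class and method names, so plug in the numbers from your own bill.

```csharp
using System;

public static class ComputeCost
{
    // Compute hours = number of instances x consumed (service) hours each.
    public static double ComputeHours(int instances, double serviceHoursEach)
    {
        return instances * serviceHoursEach;
    }

    // ratePerHour is a hypothetical price, not an official one.
    public static decimal Cost(int instances, double serviceHoursEach, decimal ratePerHour)
    {
        return (decimal)ComputeHours(instances, serviceHoursEach) * ratePerHour;
    }

    public static void Main()
    {
        decimal rate = 0.12m; // placeholder rate per compute hour

        // Three instances consuming 720 hours each: 2160 compute hours.
        Console.WriteLine(ComputeHours(3, 720));
        Console.WriteLine(Cost(3, 720, rate));

        // Dropping one instance you don't need: 1440 compute hours.
        Console.WriteLine(ComputeHours(2, 720));
        Console.WriteLine(Cost(2, 720, rate));
    }
}
```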

SQL Azure as your database server. There are some catches here too. You should consider re-designing and re-writing the parts where your application is using BLOBs, especially when they are big ones, to use Windows Azure Blob storage. There are two reasons for that. First, there are only 2 editions of SQL Azure available as of now, with a 1 GB and a 10 GB database size limit. You can pretty easily hit the limit when you have large amounts of BLOBs (images, documents etc). The second reason is that you don’t take advantage of the Azure CDN (Content Delivery Network), which means all your clients are served by the server(s) hosting your database, even if this is slower because of network latency. If you use the Azure CDN, your content is distributed to various key points all around the world (Europe, Asia, USA etc) and your clients are served by the fastest available server near their location. And that’s not all. Azure storage uses REST, which means your content can also be cached by various proxies, increasing performance even more.

Any thoughts?


Windows Azure – Some more name changes and new stuff

It’s been an interesting week for Windows Azure. Last Friday (13th of November) the latest Windows Azure Tools were released and, strangely, the installer was marked as a 1.0 release, although I’m not sure it really is one.

  • Project Dallas has been announced. It is a marketplace where vendors can publish their Azure services for others to consume and start making money out of them.
  • SQL Azure Data Sync Beta 1 was released. You can use it to synchronize cached data on your offline client with SQL Azure in the cloud, just like using any other synchronization adapter/manager in the Microsoft Sync Framework. In order to install it and mess around with the samples you need the Microsoft Sync Framework 2.0 SDK.
  • .NET Services was renamed to Windows Azure AppFabric to match terms with Windows Server AppFabric, which combines the projects formerly named Velocity and Dublin into one single product. .NET Services consists of Access Control (claims-based authentication) and a Service Bus to communicate with other applications.
  • New APIs were released to provision and manage Windows Azure services.
  • There have been various enhancements to the platform and the Web UI, such as showing how many seconds it takes to switch from Staging to Production, and a TCO Calculator has been introduced on the main website.

More to come, stay tuned.