My Faith in Amazon Web Services Has Been Renewed

It wasn’t long ago that I began looking into Amazon S3, an Amazon-powered storage engine that is part of the larger Amazon Web Services product line. While it didn’t necessarily apply to my needs, I quickly came to appreciate Amazon’s other web solutions such as EC2, which had gained a reputation for running on a solid, stable, and scalable network. These are the exact features that have made Amazon Web Services a go-to choice for large and demanding websites since its conception in 2002. And for quite some time Amazon has done an excellent job of maintaining this reputation, gaining the trust of large sites and services such as Reddit, Foursquare, Heroku, and countless others.

Towards the end of last month, though, I woke up to find that a great many websites were down. It turned out Amazon Web Services had experienced an issue in its Virginia data center that caused widespread outages for companies and services both small and large. As the downtime continued to grow and quickly exceeded Amazon’s initial estimates, it became perfectly evident that the issue was much larger than Amazon had first relayed. Now, a tad more than a week after all of the issues were fully resolved, I admit that I too began to second-guess Amazon Web Services, even going as far as to question the data center’s ability to accommodate the needs of its users.

Of course, I’m not the only one who began to question Amazon Web Services. Observers and users around the globe found themselves questioning Amazon Web Services for the first time, and I believe it’s safe to say that on that April morning Amazon’s reputation took quite a substantial hit. Today Amazon has opened up about the incident by giving a complete run-down of exactly why the data center encountered crippling outages, and has even gone as far as to automatically compensate those using affected services (regardless of whether they were personally affected or not) with a ten-day service credit equal to their usage at the time of the outage.

First and foremost, I think it speaks volumes (no pun towards the downed EBS storage backend intended) of Amazon to offer its users a credit for the downtime. Having said this, I had simply assumed that users were protected by Amazon’s Service Level Agreement (SLA) of 99.95% uptime and would be adequately compensated. However, it soon came out that the SLA didn’t actually offer a guarantee for the Elastic Block Store (EBS) backend that experienced the issues, but only for the EC2 computation platform. With this in mind, Amazon was never (legally) obligated to give anyone refunds or credits of any form, but opted to do so anyway of its own free will.

Sure, some organizations will see the ten-day credit as being nowhere near equal to the headaches and lost traffic that they suffered because of Amazon’s downtime. Nonetheless, this is definitely better than nothing at all. In principle it shows an admission of fault on Amazon’s part, and more importantly it illustrates that the company is genuinely apologetic about the entire situation.

But really, even though I can appreciate the fact that Amazon is crediting customers, I honestly appreciate their statement and explanation more than anything else. At more than 5,500 words, the open letter on Amazon’s AWS website outlines in detail what happened, what caused it, what Amazon did to resolve it, and what the company is doing to make sure that this type of problem doesn’t arise again.

This type of communication is invaluable in making users aware of the situation surrounding the outage. More importantly, I feel that it shows users that Amazon did everything in its power to fix things in a timely and efficient manner, and gives them a better understanding that the circumstances were really out of Amazon’s reasonable control. I know that I for one am even more confident in Amazon Web Services than I was before, simply because this letter has opened my eyes to the sheer complexity of AWS and to how hard the company works to maintain its services. For Amazon, I think this is one of the best moves it could have made, as this kind of openness with one’s customers pays off in any business.

So what am I getting at? Nobody is perfect and no service is flawless. But Amazon is being open about the whole situation, stepping up to the plate, and admitting its fault, and that is a rarity at best.