Why the EC2 Outage Might Have Killed Amazon’s Shot at PaaS Dominance

It is the Monday morning after the biggest cloud outage ever, and one thing is certain: CEOs in boardrooms around the world are heatedly demanding of their CTOs that nothing like this ever happen to their businesses again.

CTO: “But we did plan for fast recovery into a different availability zone. Who knew that it would be dead, too?” CEO: “Never mind availability zones, whatever those may be. You made our business hostage to one company, we lost a gazillion dollars and so many customers, and you must never let that happen again. Go get our site running on a few other clouds, too, starting yesterday, or I will get a CTO who will.”

So the CTO will turn around and port the site to Rackspace, GoGrid, Joyent, Savvis or some other IaaS provider, and perhaps even take it back into the company’s own data center. Which means throwing out all code that is specific to Amazon and programming towards APIs that exist on other clouds, too.

Elastic Beanstalk? Very cool, but sorry, can’t do that any more. CloudFormation? Monitoring? Build tools from the command line? The next cool service to be announced next week? Sorry, can’t do: we must now have alternate ways of doing this that don’t lock us into Amazon and that work on other clouds just as well.

As many commentators have pointed out, Amazon’s rise has been one of the stealthiest of any major platform provider in computing history. It started with crappy low-level functionality, yet lately hardly a month seems to have gone by without another tab of cool functionality showing up in the Amazon management console. That so many sites went off-line as a result of the outage, and that so few had a working failover strategy, underlines just how successful Amazon has been at becoming an indispensable infrastructure provider for many businesses. Too indispensable, I’m sure, for many CEOs this morning.

So here’s what I expect to happen:

  • The hosted PaaS companies that went down (like Engine Yard and Heroku) will accelerate their plans to become multi-cloud/multi-IaaS. Being multi-cloud will become one of the “checkbox features” required of any serious new hosted PaaS company: all other things being equal, as a developer, why would you run your code on a single-cloud PaaS service if you could also run it on a multi-cloud PaaS service?
  • Open-source PaaS systems (like Cloud Foundry) will see a surge of interest from SaaS companies currently running on EC2, in the hope that they can use the same PaaS infrastructure on multiple clouds. If they are fast enough, OS companies (like Red Hat with Makara) might be able to play a significant role there.
  • Other IaaS companies (like Rackspace) will emphasize the need for standard APIs (e.g. OpenStack) even further, this time as an insurance policy against cloud failure.
  • Some startups may see an opportunity here to offer “federated cloud” solutions, along the lines of “if you program towards our APIs, it will run on any cloud”. Sort of the Java “write once, run anywhere” proposition in the age of the cloud. It could be a hugely disruptive cloud play for an established software platform company (Microsoft? the old Sun? HP?) if they could pull it off (which, of course, is highly doubtful).
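
To make the “program towards our APIs” idea a bit more concrete, here is a minimal sketch of what such a provider-neutral layer might look like. Everything in it is an assumption made up for illustration: `CloudProvider`, `FakeProvider` and `deploy_with_failover` are hypothetical names, not any vendor’s actual API, and a real federated cloud would have to cover far more than a single deploy call.

```python
from abc import ABC, abstractmethod


class CloudProvider(ABC):
    """Hypothetical provider-neutral interface (not a real vendor API)."""

    @abstractmethod
    def is_healthy(self) -> bool:
        """Report whether this cloud currently accepts deployments."""

    @abstractmethod
    def deploy(self, app: str) -> str:
        """Deploy the named app and return a status string."""


class FakeProvider(CloudProvider):
    """Stand-in for a real cloud, used only to exercise the sketch."""

    def __init__(self, name: str, healthy: bool = True) -> None:
        self.name = name
        self.healthy = healthy

    def is_healthy(self) -> bool:
        return self.healthy

    def deploy(self, app: str) -> str:
        if not self.healthy:
            raise RuntimeError(f"{self.name} is unavailable")
        return f"{app} running on {self.name}"


def deploy_with_failover(app: str, providers: list) -> str:
    """Try each provider in order; return the first successful deployment."""
    for provider in providers:
        if not provider.is_healthy():
            continue  # skip clouds that report themselves as down
        try:
            return provider.deploy(app)
        except RuntimeError:
            continue  # the cloud failed mid-deploy; move on to the next one
    raise RuntimeError("no healthy provider available")


# The first cloud is down, so the app lands on the second one.
clouds = [FakeProvider("cloud-a", healthy=False), FakeProvider("cloud-b")]
print(deploy_with_failover("shop", clouds))  # prints: shop running on cloud-b
```

The application only ever talks to `CloudProvider`, so adding or dropping a cloud means writing or removing one adapter class rather than touching the application itself. The price is that the application can only use what every adapter supports.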

I had been wondering how long Amazon could keep moving further and further up the cloud computing stack and still succeed at keeping it a vertically integrated solution without a second supplier. This event is the first serious damper on its ambition to move from IaaS to PaaS. And it is a great opportunity for the rest of the market not only to catch up, but to build fundamentally more capable solutions than reliance on a single supplier can give us, even one as mighty as Amazon.