Thursday, January 07, 2010

More Cloudy Thinking

Since my last post was about the cloud, why not another? I was reading the blog of Nicholas Carr (whose "Does IT Matter?" currently holds a place of honor in the introduction to my book-in-progress), where he noted approvingly Amazon's attempt to create a spot market for computing cycles. After failing to get my comments on that thought posted there (ain't technology grand?), I thought "Hey, I have one of those blog things myself." So, one cut, one paste, and here we are.

The attempt to analogize computing services to electrical utilities breaks down at some interesting points. The one relevant to this post is the service provider's attempt to avoid unused capacity. Chicago Edison's customers were buying its services for themselves; Amazon's customers are buying its services primarily for their own customers. That means one set of customers has far more ability to regulate their usage in the face of cost fluctuations than the other, especially under the modern web business model, in which the customers of AWS customers expect web services to be free and fast rather than expecting to pay more for better response times.

Most of AWS's "juice" is sold to companies that are running websites, or some other form of web-based service, for their own customers. These are applications that use both bandwidth and CPU in a notoriously bursty and uneven fashion; it is not uncommon to plan for a factor of 10 in overcapacity to cover the difference between normal and peak usage. Indeed, a big part of what companies hope(!) they are buying from AWS is exactly that: overcapacity -- the guarantee that they can add an order of magnitude of resources in a matter of minutes, without having to pay for that enormous difference in capacity when it is not needed.
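
To make that arithmetic concrete, here is a minimal back-of-envelope sketch in Python. All the specific numbers are made-up illustrations (the instance counts, hourly rate, and hours spent at peak are my assumptions, not AWS figures); only the factor of 10 comes from the paragraph above.

    # Back-of-envelope capacity arithmetic -- illustrative numbers only.
    AVG_INSTANCES = 10    # machines needed for normal load (assumed)
    PEAK_FACTOR = 10      # peak load is ~10x normal, per the rule of thumb above
    PEAK_INSTANCES = AVG_INSTANCES * PEAK_FACTOR
    HOURLY_RATE = 0.10    # assumed price per machine-hour, in dollars
    HOURS_PER_MONTH = 720
    PEAK_HOURS = 20       # assumed hours per month actually spent at peak

    # Option 1: provision for peak around the clock.
    provisioned_for_peak = PEAK_INSTANCES * HOURLY_RATE * HOURS_PER_MONTH

    # Option 2: run at normal capacity and rent the extra 9x only during bursts.
    burst_on_demand = (AVG_INSTANCES * HOURLY_RATE * HOURS_PER_MONTH
                       + (PEAK_INSTANCES - AVG_INSTANCES) * HOURLY_RATE * PEAK_HOURS)

    print(f"Provisioned for peak: ${provisioned_for_peak:,.2f}/month")
    print(f"Burst on demand:      ${burst_on_demand:,.2f}/month")

The gap between those two numbers is the overcapacity that customers hope AWS is quietly absorbing on their behalf.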

This places AWS in the conflicted position of joyously promising customers that they will indeed have overcapacity without paying for its idle time, while not really wanting to (or, in the long run, being able to) pay for enough overcapacity to deliver on that promise. Trying to create a spot market (note that it is aimed specifically at customers who do NOT have customer-facing, uncontrollably bursty needs) is an attempt to lessen the cost of this fundamental flaw in their business model. It would be interesting to know whether Amazon adjusts its own internal demands on the fly to lessen that cost as well; Chicago Edison was not using massive amounts of its own electricity to compete directly with some of its customers' businesses -- another point where the analogy breaks down.
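
To see who a spot market is actually aimed at, here is a minimal sketch of the kind of interruptible batch job it targets. The get_current_spot_price and process_next_chunk functions are hypothetical placeholders of my own (the fluctuating price is simulated; this is not any real AWS API): the job does work only when the going price is below the buyer's bid, and simply sits out the expensive hours -- something a customer-facing website cannot afford to do.

    import random
    import time

    BID_PRICE = 0.04  # assumed maximum price per machine-hour the buyer will pay

    def get_current_spot_price():
        # Hypothetical placeholder: a real job would query the provider's market.
        # Here we just simulate a fluctuating price.
        return random.uniform(0.02, 0.10)

    def process_next_chunk():
        # Hypothetical placeholder for one interruptible unit of batch work.
        print("crunching one chunk of the batch job")

    def run_when_cheap(iterations=10):
        # A website has to answer right now, at whatever the going rate is;
        # a batch job can afford to wait for cheap cycles.
        for _ in range(iterations):
            if get_current_spot_price() <= BID_PRICE:
                process_next_chunk()
            else:
                time.sleep(1)  # price too high; idle and check again later

    run_when_cheap()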

I think the label of "the cloud," and to a lesser extent the attempt to analogize computing to an electric utility, distracts from basic facts. The "cloud" is really only an incremental alteration of the boring old business of running an ISP: how much overbooking can you get away with before you piss off too many customers? How many outages can you have before customers gain a more realistic grasp of your quality of service? How many DDoS attacks and RBL listings before customers realize that centralization means incurring difficult-to-estimate risks instigated by the behavior of other centralized customers? I believe the more relevant historical situation to examine is the mainframe, and the lessons learned (or, often, not learned) about the pros and cons of centralizing versus decentralizing computing. Computing time was commoditized quite heavily nearly 50 years ago -- just about the right amount of time for the lessons learned then to have been forgotten.
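
Coming back to the overbooking question, here is a minimal toy model in Python, with made-up numbers and an independent-usage assumption that is mine and deliberately optimistic (real demand is bursty and correlated): how often does simultaneous demand exceed the physical capacity the provider actually owns?

    import math

    def p_oversubscribed(customers, p_active, physical_capacity):
        # Probability that more customers are active at once than the provider
        # has physical capacity for, assuming each customer independently uses
        # their full allocation with probability p_active (a simple binomial model).
        return sum(math.comb(customers, k)
                   * p_active ** k * (1 - p_active) ** (customers - k)
                   for k in range(physical_capacity + 1, customers + 1))

    # Illustrative numbers only: 1000 customers each sold one unit of capacity,
    # 10% typically active at once, provider owns 120 units (roughly 8x overbooked).
    print(f"{p_oversubscribed(1000, 0.10, 120):.2%} chance of falling over")

Nudge p_active up a little, or let customers burst together (a flash crowd, a DDoS), and that number stops being small -- which is exactly the quality-of-service gamble described above.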

1 comment:

codeslinger at compsalot said...

Hi Ron,
That is one of the most intelligent things that I have heard said on the subject (hype) of Cloud Computing.

I've looked at it several times now, and for me it just does not pencil out. Cloud Computing is (currently) too expensive for the resources delivered; traditional ISP hosting is much cheaper.

The only advantage of the cloud that I can see is the concept of being able to handle bursty loads, but you just put the bust to that boon.

The thing that Cloud providers don't like to talk about is how ephemeral and vulnerable your data actually is.

It's fine for building a temporary infrastructure for a short duration project (the shorter the better). But for anything serious, and for data that is important, it just does not appear to add up.

On the other hand, for internal use, I do think that the Ubuntu Private/Enterprise Cloud approach is an interesting way of managing resources; you do gain a lot of flexibility. It looks like a good option in the virtualization arena.

The promise of the Cloud was that it could provide the service more cheaply due to scale. But the reality, which is perhaps driven by their need for massive overcapacity, is that -- so far -- the Cloud costs much more than conventional ISPs for similar levels of bandwidth and CPU/memory.


The interesting thing is that, with all of this talk about automatic burst scaling, I don't perceive it. You are still adding capacity in chunks of complete virtual computers. Beyond some small limits, you cannot add more memory or more cores to your existing virtual computer, so there is no apparent win here. Also, the process of adding capacity appears to require manual deployment, or possibly some very complex programming with proprietary APIs.