Cloud computing users think on-demand availability is the best thing since sliced bread: it enables elastic computing, supports outsourcing for applications requiring urgent or interactive response, and reduces wait times in batch queues. But if you are a cloud provider you might not think so… To ensure on-demand availability you need to overprovision: keep a lot of nodes idle so that they can service an on-demand request, which could come at any time. This means low utilization. The only way to improve it is to keep fewer nodes idle. But this means rejecting more requests – at which point you’re not really on-demand… a veritable catch-22.
Low utilization is particularly hard to swallow in the scientific community, where batch-scheduled resources routinely run at high utilization – why configure a cloud if the good old batch scheduler will amortize your resource so much better?
This gave us an idea. What if we deployed a VM on every idle cloud node and joined it to something that is used to operating in an environment where resources are coming and going, something that makes use of screensaver time on people’s desktops, something like SETI@home or a Condor pool? We went to work and extended Nimbus so that an administrator can configure a “backfill VM” that gets deployed on cloud nodes by default. When an on-demand request comes in, enough backfill VMs get terminated to service the request; when the on-demand request finishes, the backfill VMs get deployed again.
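The preemption cycle described above can be sketched as a toy model. This is only an illustration of the idea, not the Nimbus implementation: the `Cloud` class, its fields, and its method names are hypothetical, and a node is assumed to hold exactly one VM.

```python
from dataclasses import dataclass, field


@dataclass
class Cloud:
    """Toy model (hypothetical): idle nodes run backfill VMs by default."""
    total_nodes: int
    on_demand: dict = field(default_factory=dict)  # request id -> nodes held

    @property
    def busy_nodes(self) -> int:
        return sum(self.on_demand.values())

    @property
    def backfill_nodes(self) -> int:
        # Every node not serving an on-demand request runs a backfill VM.
        return self.total_nodes - self.busy_nodes

    def request(self, request_id: str, nodes: int) -> bool:
        """Admit an on-demand request by terminating enough backfill VMs."""
        if nodes <= 0:
            raise ValueError("a request must ask for at least one node")
        if request_id in self.on_demand or nodes > self.backfill_nodes:
            # Duplicate id, or too few nodes even after preempting all backfill.
            return False
        self.on_demand[request_id] = nodes
        return True

    def release(self, request_id: str) -> None:
        """Finish an on-demand request; freed nodes get backfill VMs again."""
        self.on_demand.pop(request_id, None)


cloud = Cloud(total_nodes=8)
cloud.request("urgent-job", 3)   # three backfill VMs are terminated
print(cloud.backfill_nodes)      # five backfill VMs keep running
cloud.release("urgent-job")      # backfill VMs return to the freed nodes
print(cloud.backfill_nodes)
```

In this model no node is ever idle: each one runs either an on-demand VM or a backfill VM, which is why utilization can climb while on-demand requests are still served immediately.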
For the user, this solution means that they can now choose from two types of instances: on-demand instances (what you get from a typical EC2-style cloud) and opportunistic instances (a pre-configured VM joining a volunteer computing pool). To find out what this means for the provider, Paul Marshall ran some experiments with backfill VMs configured to join a Condor pool and came up with the graphs on the right: a purely on-demand cloud is cold – add backfill VMs and the system heats up… you can get 100% utilization! More details in Paul’s CCGrid paper.
But now we had more questions. Who can configure the backfill VM: does it have to be the administrator, or could it be the user? And can we have more than one type of backfill VM? One simple refinement would be for the admin to use multiple VMs and have policies on what percentage of available cycles should be devoted to each. But if we are going to allow users to submit such VMs, how is this percentage set? Somebody has already figured this out: why not auction the cycles off and provide spot instances? Both the backfill VMs and spot pricing are special cases of a more general mechanism… that we released today in Nimbus 2.7.
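To make the auction idea concrete, here is a minimal sketch of a generic uniform-price auction over spare VM slots. It is not taken from Nimbus: the function name `clear_spot_auction`, the bid format, and the rule that the spot price equals the lowest accepted bid are all assumptions chosen for illustration.

```python
def clear_spot_auction(bids, slots):
    """Allocate spare slots to the highest bidders (illustrative only).

    bids  -- list of (bidder, price_per_slot, slots_wanted) tuples
    slots -- number of spare VM slots on offer
    Returns (spot_price, allocation), where spot_price is the lowest
    accepted bid, or None if nothing was allocated.
    """
    allocation = {}
    spot_price = None
    remaining = slots
    # Serve bids from the highest price down until the slots run out.
    for bidder, price, wanted in sorted(bids, key=lambda b: b[1], reverse=True):
        if remaining <= 0:
            break
        granted = min(wanted, remaining)
        if granted > 0:
            allocation[bidder] = allocation.get(bidder, 0) + granted
            remaining -= granted
            spot_price = price  # the last accepted bid sets the price
    return spot_price, allocation


bids = [("alice", 0.10, 2), ("bob", 0.30, 1), ("carol", 0.05, 4)]
price, allocation = clear_spot_auction(bids, slots=3)
print(price, allocation)
```

With three slots on offer, the two highest bids are filled and the lowest bid is left out. When on-demand load shrinks the number of spare slots, rerunning the auction with fewer slots would raise the price and preempt the lowest bidders, which is the sense in which a backfill VM resembles a spot instance that always accepts the going price.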
Nimbus 2.7 contains both the backfill and spot pricing implementations, which are different configurations of roughly the same thing. This makes Nimbus the only EC2-compatible open source IaaS implementation with support for spot pricing. Backfill instances may be more relevant to scientific settings, where the concept of payment is not explicit and simulating it with auctions has been known to cause “inflation”. We hope this mechanism will allow providers to leverage their cloud cycles better. And we also hope that it will provide a flexible tool for all of you investigating and fine-tuning the relationships between various aspects of resource utilization, energy savings, cost, and pricing.
Best for last: this is probably the first time that content is not the most important feature of a Nimbus release. You’ve seen us mention the hard work of our open source community contributors before – but this is the first time that open source contributions are primarily responsible for a Nimbus release. The spot pricing implementation is the work of our brilliant Brazilian contributor Paulo Ricardo Motta Gomes (just one person, despite appearances to the contrary), sponsored by the Google Summer of Code (GSoC) last summer. And incidentally, there may be new opportunities this year: watch the Nimbus news feed for details.