Science Cloud Workshop

Posted by Kate Keahey | Posted in News | 01-12-2011 | One response

Happy New Year!

To get it off to a good start, check out the call for papers for the ScienceCloud 2011 workshop, announced right before Christmas!

Last year’s Science Cloud workshop was a great venue for anybody interested in cloud computing for science. The program covered everything from scientific cloud platforms (and how to set them up), through standards and middleware, to case studies of scientific applications on commercial cloud platforms such as Amazon and Azure. The latter were perhaps the most interesting part of the workshop; in fact one of them, a performance study of a cosmology application on Amazon from Lawrence Berkeley National Lab, won the best paper award. The slides and papers can be viewed online and are still a great reference for what’s happening in cloud computing for science. Some of the papers will also appear in the Scientific Programming Journal’s special issue on science-driven cloud computing.

This year’s Science Cloud workshop will again be co-located with HPDC and will solicit papers addressing similar issues. If you are using clouds for science and have a thing or two to say about it, this may be a good opportunity.

Another Barrier Goes Down

Posted by Kate Keahey | Posted in News | 07-16-2010 | No responses

Right on the heels of Amazon’s groundbreaking news about the Cluster Compute instances a couple of days ago comes this announcement of a partnership between CENIC, Pacific NorthWest GigaPoP (PNWGP), and Amazon: two 10 Gigabit per second (Gbps) connections to Amazon S3 and EC2. These connections will be available to CENIC and PNWGP member institutions (educational and research institutions on the West Coast and in the Pacific Northwest). Among them are the many ocean scientists of the Ocean Observatories Initiative (OOI), with whom we are working to develop cloud-based scientific infrastructure.

In other words, we can now not only use IaaS to lease supercomputers, we can also move data to those supercomputers fast. While so far these high-speed connections to Amazon are not generally available, it will be interesting to see what scientists will be able to do with them, what performance will be achievable in practice, and how they will change the scientific use of cloud computing. That’s two great developments for cloud computing for science in one week; it seems that the pace of progress on this front is accelerating ;-) .

There is a New Supercomputer on the Block

Posted by Kate Keahey | Posted in News | 07-13-2010 | No responses

We all woke up to a game-changing announcement today: Amazon announced Cluster Compute instances designed to support the kinds of closely coupled workloads that high performance computing (HPC) relies on. Each Cluster Compute instance has a pair of quad-core Intel “Nehalem” processors, 23 GB of RAM, and 1690 GB of local instance storage. But by far the best part of the offering is the 10 Gbps network that connects Cluster Compute instances, which is essential for HPC applications.

The real headline, though, is that for the first time ever a virtual cluster could be featured on the Top500 list. Amazon published the result of the High Performance Linpack benchmark on a virtual cluster made up of 880 Cluster Compute instances (7040 cores) and measured the overall performance at 41.82 TeraFLOPS. This would place a virtual cluster made out of Cluster Compute instances in the 146th position on the Top500 list. For a sense of scale, the somewhat larger TACC Lonestar cluster, serving as a computational resource in TeraGrid, currently occupies the 123rd position on this list.
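To put the Linpack result in per-node terms, the figures quoted above can simply be divided out. The snippet below is a minimal sketch that uses only the numbers from this post; the variable names are made up for illustration.

```python
# Figures quoted in the post for Amazon's High Performance Linpack run.
INSTANCES = 880
CORES = 7040
LINPACK_TFLOPS = 41.82

# Two quad-core processors per instance should give 8 cores each.
cores_per_instance = CORES // INSTANCES

# Convert TeraFLOPS to GigaFLOPS before dividing.
gflops_per_instance = LINPACK_TFLOPS * 1000 / INSTANCES
gflops_per_core = LINPACK_TFLOPS * 1000 / CORES

print(f"cores per instance:   {cores_per_instance}")
print(f"GFLOPS per instance:  {gflops_per_instance:.1f}")
print(f"GFLOPS per core:      {gflops_per_core:.2f}")
```

This is measured Linpack throughput spread evenly over the cluster, not a statement about the theoretical peak of the hardware.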

How much does it all cost? A quick back-of-the-envelope calculation shows that at $1.60 per hour, the Cluster Compute on-demand instances cost about $14K per node per year. However, if you use reserved instances the price drops significantly. Based on 100% utilization for a 3-year reserved instance (which is more similar to buying a supercomputer for 3 years) you’d pay only $0.81 per instance-hour ($6590 up front and $0.56 per hour), in other words about $7K per node per year, and that’s all-inclusive, with no additional operating costs. This rough calculation does not include the cost of EBS and data transfer, which depend to some extent on how the cluster is used; still, something to keep in mind.
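The back-of-the-envelope numbers above can be reproduced with a few lines of Python. This is only a sketch: the prices are the ones quoted in this post, the function names are invented for illustration, and the `utilization` parameter assumes that hourly charges scale with the hours actually used while the up-front fee does not.

```python
HOURS_PER_YEAR = 24 * 365  # 8760, ignoring leap days

# Prices quoted in the post (USD).
ON_DEMAND_PER_HOUR = 1.60
RESERVED_UPFRONT = 6590.0   # one-time fee for the 3-year reservation
RESERVED_PER_HOUR = 0.56
RESERVED_TERM_YEARS = 3


def on_demand_cost_per_year(utilization=1.0):
    """Yearly on-demand cost per node at the given utilization (0..1]."""
    return ON_DEMAND_PER_HOUR * HOURS_PER_YEAR * utilization


def reserved_cost_per_year(utilization=1.0):
    """Yearly reserved cost per node: amortized up-front fee plus hourly fee."""
    upfront_per_year = RESERVED_UPFRONT / RESERVED_TERM_YEARS
    return upfront_per_year + RESERVED_PER_HOUR * HOURS_PER_YEAR * utilization


def reserved_effective_hourly(utilization=1.0):
    """Reserved cost per hour actually used."""
    return reserved_cost_per_year(utilization) / (HOURS_PER_YEAR * utilization)


def break_even_utilization():
    """Utilization above which the reserved instance is cheaper (assumed model)."""
    upfront_per_year = RESERVED_UPFRONT / RESERVED_TERM_YEARS
    hourly_saving = ON_DEMAND_PER_HOUR - RESERVED_PER_HOUR
    return upfront_per_year / (hourly_saving * HOURS_PER_YEAR)


print(f"on-demand per year:       ${on_demand_cost_per_year():,.0f}")
print(f"reserved per year:        ${reserved_cost_per_year():,.0f}")
print(f"reserved effective $/hr:  ${reserved_effective_hourly():.2f}")
print(f"break-even utilization:   {break_even_utilization():.0%}")
```

At 100% utilization this gives roughly $14K and $7K per node per year, matching the figures above. Under the stated assumption, the reserved instance becomes the cheaper option once a node is busy about a quarter of the time. EBS and data transfer costs are still left out.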

EC2’s boot from EBS capability

Posted by Tim Freeman | Posted in News | 12-04-2009 | No responses

Amazon AWS recently announced that EC2 instances can be configured to launch from EBS volumes instead of bundled disk images.

Science users launching heterogeneous clusters may be able to take advantage of this to streamline image bundling. The nodes of such clusters often share a base image layout. Because AMIs can now reference any number of EBS volumes in their external description, including one for the root disk, you can customize each partition and “mix and match” root disks and partitions more easily to make a cohesive cluster. That’s more convenient than maintaining such a partition organization separately and bundling images for each cluster node type, which has traditionally been time-consuming.

Another change is that the AMI can be above 10 GB (up to 1 TB) when launched in this manner: some clusters we have seen are pushing that limit even without any data sets!

Using EBS does add cost, which must be taken into account: EBS is charged by both disk size and number of I/O operations, so this approach may not pay off in many cases.