I often get asked if there is any published work evaluating the performance and cost of scientific applications on IaaS clouds and comparing them to clusters, and I always say LOTS! …and then can’t remember more than a few off the top of my head ;-). So I recently put together a list, included below, of the various evaluation and comparison efforts I’ve been able to find. They look at all sorts of aspects of performance: from low-level benchmarks to applications of various types, from reliability to cost. Each tends to focus on a somewhat different aspect of the issue, and collectively they paint a picture of the blessings and challenges of cloud computing for science.
My personal favorite is “Performance Analysis of High Performance Computing Applications on the Amazon Web Services Cloud”, at the top of the list since, having just come out at CloudCom 2010 last December, it is the most recent. The authors evaluate the AWS IaaS offering based on the NERSC benchmarks framework, a comprehensive set of benchmarks capturing the typical workload in a scientific datacenter. They report not only the performance characteristics of scientific applications on virtual clusters created in the cloud but also the mean time between failures (MTBF) of a virtual cluster deployed on cloud resources, the consequences of which I (coincidentally) blogged about around the time this paper was presented.
Finally, I have a favor to ask: if you know of papers evaluating various aspects of scientific applications on IaaS clouds, have favorites in the field, or have opinions on what you would like to see evaluated, please tell us about them. I will do a post of lessons learned.
Evaluation of IaaS clouds for scientific applications:
- “Performance Analysis of High Performance Computing Applications on the Amazon Web Services Cloud”, K. Jackson, L. Ramakrishnan, K. Muriki, S. Canon, S. Cholia, J. Shalf, H. Wasserman and N. Wright, CloudCom 2010
- “Developing a Cloud Computing Charging Model for High-Performance Computing Resources”, M. Woitaszek and H. Tufo, CIT 2010
- “Seeking Supernovae in the Clouds: A Performance Study”, K. Jackson, L. Ramakrishnan, R. Thomas and K. Runge, Science Cloud 2010
- “The Impact of Virtualization on Network Performance of Amazon EC2 Data Center”, G. Wang and T. E. Ng, INFOCOM 2010.
- “Data Sharing Options for Scientific Workflows on Amazon EC2”, G. Juve, E. Deelman, K. Vahi and G. Mehta, SC 2010
- “Scientific computing in the cloud”, J. Rehr, F. Vila, J. Gardner, L. Svec, and M. Prange, Computing in Science and Engineering, 2010.
- “High Performance Computing with Clouds”, R. Masud
- “Can Cloud Computing Reach the TOP500?”, J. Napper and P. Bientinesi, Unconventional High-Performance Computing (UCHPC) 2009
- “Using Clouds for Metagenomics”, J. Wilkening, A. Wilke, N. Desai and F. Meyer, CLUSTER 2009
- “On the Use of Cloud Computing for Scientific Workflows”, C. Hoffa, G. Mehta, T. Freeman, E. Deelman, K. Keahey, B. Berriman and J. Good, SWBES 2008
- “An early performance analysis of cloud computing services for scientific computing”, S. Ostermann, A. Iosup, N. Yigitbasi, R. Prodan, T. Fahringer, and D. Epema, Delft University of Technology, Tech. Rep, 2008.
- “Scientific computing using virtual high-performance computing: a case study using the Amazon elastic computing cloud”, S. Hazelhurst, conference of the South African Institute of Computer Scientists and Information Technologists on IT research in developing countries, 2008
- “Amazon S3 for Science Grids: A Viable Solution?”, M. Palankar, A. Onibokun, A. Iamnitchi and M. Ripeanu, International Workshop on Data-Aware Distributed Computing, 2008
- “Cloud Computing for parallel Scientific HPC Applications: Feasibility of running Coupled Atmosphere-Ocean Climate Models on Amazon’s EC2”, C. Evangelinos and C. N. Hill, Cloud Computing and its Applications (CCA) Workshop, 2008
- “Benchmarking Amazon EC2 for high-performance scientific computing”, E. Walker, ;login:, vol. 33, no. 5, 2008
- “An Evaluation of Amazon’s Grid Computing Services: EC2, S3 and SQS”, S. Garfinkel, Technical Report TR-08-07, 2007