We just reduced image propagation time on FutureGrid’s Sierra cloud at UCSD from hours to minutes! This magic comes courtesy of Nimbus LANTorrent.
We blogged about LANTorrent before: it distributes the same file to many nodes using peer-to-peer techniques. It has been available in Nimbus since version 2.6 and lets users efficiently deploy a cluster of virtual machines based on the same image. Installing and configuring LANTorrent on the Nimbus nodes (both service and hypervisor nodes) is easy; it took only a couple of hours on Sierra (all the details are explained in the LANTorrent Configuration section of the Nimbus documentation).
Granted, Sierra’s configuration helped make this spectacular. Sierra is backed by an NFS server connected to the cluster via Gigabit Ethernet. To transfer a virtual machine to a hypervisor node, a copy of the virtual machine image is made using scp. While a single virtual machine image can be copied in less than one minute, deploying a large virtual cluster takes much longer because all file transfers originate from the centralized NFS server. With LANTorrent, however, deployment is dramatically faster! Instead of forcing 100 copies of the same file through the NFS server’s single NIC, LANTorrent uses the collective power of every receiving node’s NIC to transmit the data, bringing us approximately a 3x speedup! I made a graph that compares the deployment time of a 4 GB virtual machine image (SCP in red, LANTorrent in blue). Raw speed is not the whole story: the growth rate is where we really reap the benefits!
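To see why the growth rate matters, here is a rough back-of-envelope model in Python. It is only a sketch under idealized assumptions: a perfect 1 Gb/s link, no disk, scp, or protocol overhead, and a simple chain in which each node forwards chunks to the next as they arrive. The function names and the 1 MiB chunk size are made up for illustration and are not taken from LANTorrent itself.

```python
# Idealized model only: all names and numbers here are illustrative assumptions.

GIGABIT_BYTES_PER_SEC = 1e9 / 8  # 1 Gb/s link, ignoring all overhead


def central_copy_seconds(image_bytes, nodes, link=GIGABIT_BYTES_PER_SEC):
    """Every copy leaves through the server's single NIC: time grows linearly."""
    return nodes * image_bytes / link


def pipelined_relay_seconds(image_bytes, nodes, chunk_bytes=1 << 20,
                            link=GIGABIT_BYTES_PER_SEC):
    """Hypothetical chain: each node relays chunks onward as they arrive.

    The server sends the image once; each extra node adds one chunk of delay.
    """
    return (image_bytes + (nodes - 1) * chunk_bytes) / link


if __name__ == "__main__":
    image = 4 * 10**9  # a 4 GB image, as in the graph
    for n in (1, 10, 100):
        print(n, central_copy_seconds(image, n), pipelined_relay_seconds(image, n))
```

In this model the centralized copy time scales with the number of nodes, while the relayed transfer stays nearly flat. Real deployments have overheads the model ignores, so measured gains (like the roughly 3x we saw) are much smaller than the idealized numbers, but the difference in growth rate is the same idea.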