====== On-demand Scaling of Computing Clusters ======

In the particular case of computing services, such as worker nodes managed by existing LRMs (Local Resource Managers) like SGE, LSF or OpenPBS, OpenNebula allows a physical cluster to dynamically execute multiple virtual clusters, providing the following benefits:

  * **On-demand resource provisioning**: the number of worker nodes can grow according to user demand, so that computing slots are always available.
  * **Cluster consolidation**: virtual worker nodes can run on fewer physical resources, reducing the number of physical systems and therefore the space, administration, power and cooling requirements. The allocation of physical resources to virtual nodes can be dynamic, depending on their computing demands, by leveraging the migration functionality provided by existing VMMs.
  * **Cluster partitioning**: the physical resources of a cluster can be used to execute worker nodes bound to different virtual clusters.
  * **Support for heterogeneous workloads** with multiple (even conflicting) software requirements, allowing the execution of software with strict requirements, such as jobs that only run with a specific version of a library, or legacy applications.

This solution is a way to **separate resource provisioning from execution management**, which is performed at the service layer by the LRM.

====== Architecture ======

{{documentation:one-sge.png?direct&600}}

====== Sample Deployment with SGE ======

You need to configure the network as described in [[:documentation:documentation|Howto configure networking]]. This will be the basis for creating new virtual SGE nodes with different names.

To create the base VM image you can use the **xen-create-image** command, install a pristine OS into an image, or use an already working VM image. Make sure you configure the network as described in the aforementioned networking howto.
You also need to configure the machine as an SGE worker node, the same way you configured your physical SGE nodes (NFS, NIS, execd, etc.). This image will be the base for your new virtual SGE worker nodes.

For each virtual node you have to copy the images to another location. We keep the images in a directory called sgebase and copy the whole directory to a new one named after the new node:

  $ cp -R sgebase sgehost01

Then you have to create a new VM template (or copy and edit an existing one). Here is an example:

  MEMORY=64
  CPU=1
  OS=[ kernel="/boot/vmlinuz-2.6.18-4-xen-amd64",
       initrd="/boot/initrd.img-2.6.18-4-xen-amd64",
       root="sda1",
       boot="hd"]
  DISK=[ source="/local/xen/domains/xen/domains/sgehost/disk.img", target="sda1", readonly=no]
  DISK=[ source="/local/xen/domains/xen/domains/sgehost/swap.img", target="sda2", readonly=no]
  NIC=[mac="00:16:3e:01:01:03"]

Change the paths to the images, the kernel and ramdisk versions, and any other parameters to match your setup. For the MAC address, pick one from your DHCP server configuration that corresponds to the name you want for this machine.

Then, to bring up your new virtual host, execute this command:

  $ onevm create

After that you only need to tell SGE that there is a new host ready to run jobs:

  $ qconf -ah sgehost01
  $ qconf -as sgehost01
  $ qconf -se

You may want to add this new host to a host group:

  $ qconf -mhgrp @allhosts
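
The per-node steps above (copy the base images, write a template, create the VM, register the host with SGE) could be collected in a small script. The following is only a minimal sketch under stated assumptions: the function names `render_template`, `run` and `provision_node`, the `DRY_RUN` switch, the `<name>.one` template file name, the one-directory-per-node image layout, and passing the template file as an argument to `onevm create` are all illustrative choices that do not come from the text above. The template values are copied from the example and must be adapted to your installation.

```shell
#!/bin/sh
# Sketch: prepare a new virtual SGE worker node from the base image directory.
# Everything named here (functions, DRY_RUN, the .one file name, the image
# directory layout) is hypothetical and only illustrates the manual steps.

# Print a VM template for node $1 with MAC address $2, images under $3/$1.
render_template() {
    name=$1 mac=$2 image_dir=$3
    cat <<EOF
MEMORY=64
CPU=1
OS=[ kernel="/boot/vmlinuz-2.6.18-4-xen-amd64",
     initrd="/boot/initrd.img-2.6.18-4-xen-amd64",
     root="sda1", boot="hd" ]
DISK=[ source="${image_dir}/${name}/disk.img", target="sda1", readonly=no ]
DISK=[ source="${image_dir}/${name}/swap.img", target="sda2", readonly=no ]
NIC=[ mac="${mac}" ]
EOF
}

# Run a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Copy the base images, write the template, create the VM, register the host.
provision_node() {
    name=$1 mac=$2 image_dir=$3
    run cp -R "${image_dir}/sgebase" "${image_dir}/${name}"
    # The template is written to the current directory even in dry-run mode.
    render_template "$name" "$mac" "$image_dir" > "${name}.one"
    # Assumption: onevm create takes the template file as its argument.
    run onevm create "${name}.one"
    run qconf -ah "$name"
    run qconf -as "$name"
}
```

Setting `DRY_RUN=1` before calling, for example, `provision_node sgehost01 00:16:3e:01:01:03 /local/xen/domains` prints the commands instead of executing them, which makes it possible to review the generated `sgehost01.one` template before anything is created.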