The new High Performance Computing Cloud (HPC Cloud) at SURFsara has recently opened up to the whole Dutch research and academic community. The Oort cluster is born! Existing and new users can benefit from various improvements over the old Cloud, such as more powerful hardware, a robust and user-friendly interface and many other new features.
Our HPC Cloud service started in 2011. We implemented the HPC Cloud as an IaaS (Infrastructure as a Service) platform, relying on the OpenNebula software. Our goal was to provide scientists with their own virtual environment, including processing, storage and networking resources, and full control over their own virtual machines (VMs). Scientists can log into their virtual machines, install their own operating system and applications, upload data and launch their HPC analyses. Depending on the research requirements, some users prepare a single multi-core or high-memory VM, while others launch multiple VMs that cooperate in a private cluster.
Many scientists benefit from the system’s flexibility, because running their applications on the HPC Cloud requires less modification than running them on traditional HPC environments. Our users appreciate this, and we have seen a steady increase in the number of projects. Thus far, about 300 projects from various research fields (Life Sciences, Informatics, Ecology, Business, Social Sciences, Engineering, etc.) have successfully used the HPC Cloud.
However, the old cloud was designed mainly for compute power. With the growing popularity of the HPC Cloud, the shift in focus towards more data-centric science and the variety of projects run on the cloud, we realised that our users needed a more optimized platform. Therefore, we decided to build a completely new system to meet these demands. The new cloud is designed to fulfil as many user requirements as possible: a lot of computing power, lots of fast data storage and a fast network, all delivered through a user-friendly web interface.
The thing about Oort
Our engineering team, Martijn, Rogier and Esteban, has installed and configured Oort piece by piece. Together with the advisors, the SURFsara Cloud team has worked hard to make the massive infrastructure upgrade possible. The upgrade includes major changes in both the software and the hardware backend.
The new HPC Cloud runs on the latest version of OpenNebula (v4.12). This offers an intuitive user interface, a scheduler for well-balanced and efficient use of our resources and the possibility to accommodate various user profiles: novice, advanced and master users.
The new HPC Cloud uses Intel Haswell CPUs for good overall performance, and offers GPU nodes for GPU-accelerated applications. We offer two storage types: Ceph and SSD. Data stored on Ceph is replicated, to protect against data loss in case of hardware failure. Users can keep the operating system images on SSD, and store the bulk data on Ceph datablocks. Small, local and fast storage is typical for High Performance Computing and is mainly used for computations and operating systems, whereas “big data” input and computed results are copied to the larger and more reliable network storage (Ceph).
Launching VMs on the new HPC Cloud is easier than ever. Anyone with an account on the user interface can import a pre-made OS image from our AppMarket repository, configure the Virtual Machine’s number of CPUs, amount of RAM, network, boot and data images at will, and then fire up the machine. This way, the VM is tailored to the user’s needs.
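To make the list of configurable settings concrete, here is a minimal sketch that renders them as an OpenNebula-style VM template. The web interface takes care of this for you; the sketch only illustrates which pieces go together. The function name, the image names and the network name are hypothetical, and setting CPU equal to VCPU is an assumption, not a description of how SURFsara configures its templates.

```python
from typing import List


def render_vm_template(name: str, vcpus: int, memory_mb: int,
                       images: List[str], network: str) -> str:
    """Render VM settings as OpenNebula template text.

    All values passed in are illustrative; image and network names
    must match what exists in your own project.
    """
    lines = [
        f'NAME = "{name}"',
        f"CPU = {vcpus}",        # assumption: one physical core share per virtual CPU
        f"VCPU = {vcpus}",       # number of virtual CPUs seen by the guest
        f"MEMORY = {memory_mb}", # OpenNebula expects megabytes
    ]
    # The first image is the boot disk, the others are data disks.
    for image in images:
        lines.append(f'DISK = [ IMAGE = "{image}" ]')
    lines.append(f'NIC = [ NETWORK = "{network}" ]')
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical example: a 4-core, 8 GB VM with one boot and one data image.
    print(render_vm_template(
        name="my-analysis-vm",
        vcpus=4,
        memory_mb=8192,
        images=["my-os-image", "my-ceph-datablock"],
        network="my-project-network",
    ))
```

Keeping the boot image and the data image as separate disks mirrors the storage advice above: the operating system on SSD, the bulk data on a Ceph datablock.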
The Oort cluster consists of powerful compute nodes and high-performance storage nodes, the physical machines on which users run their Virtual Machines (VMs). The hardware powering Oort:
- 32 HPC compute nodes @ 64 cores, 256 GB RAM, 3.2 TB local SSD
- 1 High Memory node @ 40 cores, 2 TB RAM, 3.2 TB local SSD
- 12 GPU compute nodes @ 32 cores, 256 GB RAM, 3.2 TB local SSD
- 900 TB storage on distributed object storage Ceph (2.7 PB gross, with 3-fold redundancy)
- Fast network between compute Virtual Machines
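As a quick sanity check on the figures above, the following sketch adds up the node counts and shows how 3-fold replication turns 2.7 PB of gross Ceph capacity into 900 TB of usable storage. It assumes decimal units (1 PB = 1000 TB) and that the per-node numbers in the list are per machine; the class and function names are ours, chosen for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class NodeType:
    name: str
    count: int           # number of physical machines of this type
    cores: int           # cores per machine
    local_ssd_tb: float  # local SSD per machine, in TB


# Figures copied from the hardware list above.
NODES = [
    NodeType("HPC compute", 32, 64, 3.2),
    NodeType("High Memory", 1, 40, 3.2),
    NodeType("GPU compute", 12, 32, 3.2),
]

CEPH_GROSS_TB = 2700  # 2.7 PB, assuming 1 PB = 1000 TB
CEPH_REPLICAS = 3     # every object is stored three times


def total_nodes(nodes: List[NodeType]) -> int:
    return sum(n.count for n in nodes)


def total_cores(nodes: List[NodeType]) -> int:
    return sum(n.count * n.cores for n in nodes)


def total_local_ssd_tb(nodes: List[NodeType]) -> float:
    # Rounded to one decimal to hide floating-point noise.
    return round(sum(n.count * n.local_ssd_tb for n in nodes), 1)


def usable_ceph_tb(gross_tb: int, replicas: int) -> float:
    # With N-fold replication, only 1/N of the raw capacity holds unique data.
    return gross_tb / replicas


if __name__ == "__main__":
    print("nodes:", total_nodes(NODES))                # 45
    print("cores:", total_cores(NODES))                # 2472
    print("local SSD (TB):", total_local_ssd_tb(NODES))
    print("usable Ceph (TB):", usable_ceph_tb(CEPH_GROSS_TB, CEPH_REPLICAS))  # 900.0
```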
We introduce a new, more sophisticated user interface (UI). Users interact with the HPC Cloud via the web interface that OpenNebula offers. This is far more intuitive than before; it is now possible to run a VM on the Oort cluster with just a few clicks. It is that simple.
The new role of the “groupadmin” allows master users to have full control of their projects. This includes creating new users, assigning quotas and tracking the project accounting in a dashboard.
Similarly to the old Cloud Wizard, the new HPC Cloud allows easy creation of a Linux VM from the AppMarket or through our appliances repository.
The facility for sharing files among the project’s VMs has been removed, as it did not perform well enough under heavy load. With the new setup, users can install their own file server, and the underlying architecture is no longer a bottleneck.
Ceph datablocks are used as local disks on virtual machines. They can be partitioned and formatted in any file system type, so they can be used on Linux or Windows systems, with full control of permissions and ownership of files.
It is also possible to connect/disconnect disks to/from running VMs, without the need for a machine reboot. All changes will take effect instantly with new disks appearing ‘automagically’ inside your machines after a successful action.
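A freshly attached datablock shows up inside a Linux VM as a disk without partitions or a file system. The sketch below shows one way to spot such disks by parsing the JSON output of `lsblk`. It is an illustration only: the device names `vda` and `vdb` in the sample are hypothetical, the function name is ours, and the sample text stands in for real `lsblk` output so the example runs anywhere.

```python
import json
from typing import List


def find_unformatted_disks(lsblk_json: str) -> List[str]:
    """Return device paths of whole disks with no partitions and no file system."""
    tree = json.loads(lsblk_json)
    blank = []
    for dev in tree.get("blockdevices", []):
        is_disk = dev.get("type") == "disk"
        has_fs = dev.get("fstype") is not None
        has_partitions = bool(dev.get("children"))
        if is_disk and not has_fs and not has_partitions:
            blank.append("/dev/" + dev["name"])
    return blank


# Hypothetical sample: a boot disk with one mounted partition,
# plus a newly attached, still empty datablock.
SAMPLE = """
{
  "blockdevices": [
    {"name": "vda", "type": "disk", "fstype": null, "mountpoint": null,
     "children": [
       {"name": "vda1", "type": "part", "fstype": "ext4", "mountpoint": "/"}
     ]},
    {"name": "vdb", "type": "disk", "fstype": null, "mountpoint": null}
  ]
}
"""

if __name__ == "__main__":
    print(find_unformatted_disks(SAMPLE))  # ['/dev/vdb']
```

On a real VM with a reasonably recent util-linux, the input could come from `lsblk --json -o NAME,TYPE,FSTYPE,MOUNTPOINT`. Always double-check the device name before partitioning or formatting, since formatting the wrong disk destroys its data.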
First user reactions
Several users have already experienced the new features of our new HPC Cloud during the Beta phase and, more recently, the production phase. The feedback so far is very positive and encouraging. Users find the UI a lot better than before, VM instantiation and boot on local SSD very fast, Ceph more stable and faster than the previous NFS implementation, and the AppMarket easy to use. They also report performance improvements in long-running computations and like the possibility to reserve large amounts of RAM.
New services under the hood
Up to now, SURFsara’s HPC Cloud mainly accommodated individual scientists or small research groups who required HPC resources to successfully run their projects. Although the HPC Cloud will remain the most suitable HPC platform for building this type of self-service system, we anticipate that the new cloud capabilities will attract new user segments such as:
- Communities looking for a place to collaborate.
The new user interface allows different projects to share the same software and tools under the supervision of master users, who have full control of and insight into resources and users.
- Institution clusters maintained by IT administrators.
Oort’s infrastructure offers an ideal environment for building your own compute cluster. We can provide infrastructure services for compute clusters, while the IT administrator remains the access point for multiple users who run multiple jobs at a time, all without having to maintain the underlying hardware.
For more information on setting up your project on the HPC Cloud please contact us at firstname.lastname@example.org.
Author: Anatoli Danezi
Edits: Jan Bot & Lykle Voort