Traditionally, the main workloads run on supercomputers consist of various forms of numerical simulation, such as weather modelling and molecular dynamics. Over the past decades, advances in chip-manufacturing technology have ensured that (super)computers became increasingly powerful, and were thus able to run larger and/or more accurate models. However, the shrinking of transistors is nearing its physical limits, and the steady increase in compute performance over time may no longer be a given.
While improving efficiency on the hardware side may become more challenging, new opportunities arise on the software side. Research in machine learning and deep learning has surged over the past years. Recently, scientists have started exploring the use of machine learning techniques to enhance traditional simulations, such as weather predictions. Early results indicate that these models, which combine machine learning with traditional simulation, can improve accuracy, accelerate time to solution and significantly reduce costs.
For us this forms the starting point of the Machine Learning Enhanced High Performance Computing Applications project, which is carried out in the SURF Open Innovation Lab at SURFsara. In this project we will investigate whether and how machine learning and deep learning are suitable technologies to augment, accelerate or replace scientific workloads like numerical simulations. In that context, is machine learning a pre- or post-processing step that helps filter and understand the input data or the final simulation results, or is it poised to (partly) replace the decades-old codes that make up many HPC workloads?
Many machine learning techniques, such as neural networks, focus on some form of prediction. A typical example is object recognition in images: does an image show a cat, or a dog? These predictive capabilities may enable one to predict or approximate simulation outcomes using machine learning, instead of performing a full numerical simulation. The input and output data of a conventional simulation can be used to train a deep neural network, which is then used in inference mode to efficiently simulate the system being studied. While numerical models are programs that algorithmically embody the known science, machine learning models are learned from vast stores of data. The advantage is that, while training a machine learning model is a compute-intensive task, running a learned model in inference mode is generally not. Going back to the ‘cats and dogs’ example: training an object recognition network may require a supercomputer, but running it in inference mode can be done on a 40-euro Raspberry Pi.
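To make the idea concrete, here is a minimal sketch of the surrogate-model workflow described above, using only NumPy. The “simulation” (a toy damped-oscillator response) and the tiny network are illustrative stand-ins, not part of the actual project: real use cases would use a production code and a deep learning framework.

```python
import numpy as np

# Hypothetical stand-in for an expensive numerical simulation:
# the response of a damped oscillator at t = 1 for given parameters.
def run_simulation(params):
    omega, damping = params
    return np.exp(-damping) * np.cos(omega)

rng = np.random.default_rng(0)

# 1. Generate training data by running the "simulation" many times.
X = rng.uniform([1.0, 0.1], [3.0, 0.5], size=(500, 2))
y = np.array([run_simulation(p) for p in X]).reshape(-1, 1)

# 2. Train a tiny one-hidden-layer MLP surrogate with plain gradient descent.
W1 = rng.normal(0.0, 0.5, (2, 16)); b1 = np.zeros(16)
W2 = rng.normal(0.0, 0.5, (16, 1)); b2 = np.zeros(1)
lr = 0.05
for _ in range(3000):
    h = np.tanh(X @ W1 + b1)                 # forward pass
    err = (h @ W2 + b2) - y                  # gradient of 0.5 * MSE
    gW2 = h.T @ err / len(X); gb2 = err.mean(axis=0)
    gh = (err @ W2.T) * (1.0 - h ** 2)       # backprop through tanh
    gW1 = X.T @ gh / len(X); gb1 = gh.mean(axis=0)
    W2 -= lr * gW2; b2 -= lr * gb2
    W1 -= lr * gW1; b1 -= lr * gb1

# 3. Inference: approximate new simulation outcomes cheaply.
def surrogate(params):
    return float(np.tanh(params @ W1 + b1) @ W2 + b2)

mse = float(((np.tanh(X @ W1 + b1) @ W2 + b2 - y) ** 2).mean())
```

Once trained, `surrogate` replaces the full simulation call at a fraction of the cost; this is the training-versus-inference asymmetry described above.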
Consequently, early research projects have shown that machine learning often requires orders of magnitude fewer computational resources, and can unlock problems that have been beyond the grasp of traditional techniques. Since performance increases in traditional high-performance computing (HPC) have been highly dependent on Moore’s Law, this approach presents a promising avenue to explore.
Perhaps the most impactful application of machine learning in HPC is to replace production numerical simulation models with machine-learned approximations. This approach has the potential to transform HPC. However, adoption will require scientists to embrace a method that may eventually render obsolete the codes they have spent decades developing and optimizing. On the other hand, the approach frees scientists to focus more on the underlying science and to study divergent scales, such as planetary systems in star clusters.
When using machine learning for approximation, numerical codes can actually be useful even when relatively inefficient, as these codes would no longer be used to run the application at final scale. For example, for grid-based simulations, the traditional code could be used to simulate at a coarse grid size, while a machine-learned model could take this coarse-grid data as input to make predictions at a refined grid size.
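The coarse-to-fine idea can be sketched in a few lines. In this toy version (all names and grid sizes are my own illustration, not the project’s setup), a cheap 1-D diffusion run on a coarse grid is mapped to the fine-grid result by a model trained on coarse/fine pairs; a plain least-squares fit stands in for the neural network a real project would use.

```python
import numpy as np

rng = np.random.default_rng(1)
FINE, COARSE = 64, 16                      # illustrative grid sizes
xf = np.arange(FINE) / FINE                # fine-grid coordinates

def diffuse(u, steps=50, dt=0.2):
    """Explicit 1-D heat equation with periodic boundaries."""
    for _ in range(steps):
        u = u + dt * (np.roll(u, 1) - 2 * u + np.roll(u, -1))
    return u

def smooth_field():
    """Random smooth initial condition built from a few low-frequency modes."""
    u = np.zeros(FINE)
    for k in range(1, 4):
        a, b = rng.normal(size=2)
        u += a * np.cos(2 * np.pi * k * xf) + b * np.sin(2 * np.pi * k * xf)
    return u

# Build training pairs: cheap coarse-grid result -> expensive fine-grid result.
Xc, Yf = [], []
for _ in range(200):
    u0 = smooth_field()
    Yf.append(diffuse(u0))                   # "expensive" fine-grid run
    Xc.append(diffuse(u0[::FINE // COARSE])) # cheap coarse-grid run
Xc, Yf = np.array(Xc), np.array(Yf)

# Learn a map from the coarse result to the fine result.
W, *_ = np.linalg.lstsq(Xc, Yf, rcond=None)

# Inference: run only the cheap coarse simulation, then refine its output.
u0 = smooth_field()
refined = diffuse(u0[::FINE // COARSE]) @ W
```

At final scale, only the coarse run and the learned refinement are executed; the expensive fine-grid solver is needed only to generate training data.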
Machine learning enhanced HPC applications project
In the machine learning enhanced HPC applications project we selected four use cases in different domains to investigate the enhancement of traditional HPC simulations with machine learning algorithms:
- Machine-Learned turbulence in next-generation weather models – Dr. Chiel van Heerwaarden, Meteorology and Air Quality Group, Wageningen University;
- Generating physics events without an event generator – Dr. Sacha Caron, Experimental High Energy Physics, Radboud University;
- Distinguishing biological interfaces from crystal artifacts in biomolecular complexes using deep learning – Prof. Alexandre M.J.J. Bonvin, Computational Structural Biology, Utrecht University; and
- Machine learning for accelerating planetary dynamics in stellar clusters – Prof. Simon Portegies Zwart, Computational Astrophysics, Leiden University.
We are going to stimulate and support these advanced use cases with funding and technical support, to validate the machine learning approach and its potential for HPC. We’ll do this in close collaboration with the scientific research groups, in order to develop in-depth knowledge of what has the potential to become a genuinely groundbreaking technique. We’ll also set up meetings across the projects so they can support one another and develop this new field together.
At the moment we are starting things up, but as scientists become more comfortable with this new approach, and as the methodologies become more robust, we believe that machine learning has the potential to emerge as a mainstream tool for many areas of scientific computing.
I’ll keep you updated about the project in a future blog post.
Author: Casper van Leeuwen