Enabling machine learning for exascale simulations

2 November 2021

SiMLInt is an interface for applying machine learning (ML) techniques to large-scale physical simulations more efficiently, so that the simulations consume fewer resources without sacrificing precision.

In a recent breakthrough, Kochkov et al. at Google Research demonstrated the successful use of Convolutional Neural Networks (CNNs) to solve systems of partial differential equations (PDEs) that describe the evolution of complex physical systems, such as turbulent flows, while using considerably fewer computational resources.

The underlying idea is that a properly trained CNN can learn the basic rules that govern flows, then use this knowledge to describe a new flow and provide a better approximation in terms of the coefficients of the PDEs describing it. However, given the pace at which the field of ML evolves, domain experts would need to develop a second specialisation to set up such models, in terms of both model design and the tools and libraries to use.
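As a rough illustration of what "coefficients" can mean here, the sketch below approximates a spatial derivative on a periodic 1-D grid using one triple of stencil weights per grid point. It is not code from SiMLInt or from Kochkov et al.; the function name `apply_stencil` and the grid set-up are hypothetical. The weights are the classical fixed central-difference values, and the assumption is that a learned scheme would have a trained CNN supply a different triple at each point instead.

```python
import math


def apply_stencil(u, coeffs, dx):
    """Approximate du/dx on a periodic 1-D grid.

    coeffs[i] holds the (left, centre, right) weights used at grid point i.
    """
    n = len(u)
    return [
        (c[0] * u[(i - 1) % n] + c[1] * u[i] + c[2] * u[(i + 1) % n]) / dx
        for i, c in enumerate(coeffs)
    ]


n = 64
dx = 2 * math.pi / n
u = [math.sin(i * dx) for i in range(n)]

# Fixed second-order central-difference weights, identical at every point.
# Assumption: in a learned scheme, a trained CNN would replace this list
# with weights that depend on the local state of the flow.
central = [(-0.5, 0.0, 0.5)] * n

dudx = apply_stencil(u, central, dx)

# Compare against the exact derivative, cos(x).
max_error = max(abs(d - math.cos(i * dx)) for i, d in enumerate(dudx))
print(f"max error with fixed weights: {max_error:.2e}")
```

Keeping the weights as data separate from the routine that applies them is what would allow a trained model to be swapped in without changing the rest of the solver.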

We are building on the domain expertise of the School of Mathematics at the University of Edinburgh to help us identify the most robust and general ways of enabling ML algorithms to assist in solving systems of PDEs. Our focus is on simulations that are considered high priority and are known to require extensive resources to solve, such as turbulent flows or fusion modelling.

Given EPCC’s experience with software development and knowledge transfer, we are well placed to navigate the “zoo of tools” the current ML landscape offers and consider their strengths when run on a variety of HPC machines.

Based on a detailed comparison of ML libraries and toolkits across different hardware set-ups, including testbeds such as Cerebras, we will present the user with an optimal configuration no matter where they choose to run their simulation. The comparison focuses on CNNs and their performance in terms of both accuracy and resource consumption.

Moreover, SiMLInt’s interface will ensure that basic ML operations can be implemented straightforwardly, and the accompanying explanatory and training materials will help domain specialists set up their ML models correctly and efficiently, without having to build up full ML expertise.

The project is at an early stage and will use community engagement, in the form of workshops and other knowledge exchange activities, to inform SiMLInt’s development and ensure the result is useful for the scientific community.

ExCALIBUR

The ExCALIBUR programme is supported by the UKRI Strategic Priorities Fund.

The programme is led by the Met Office and the Engineering and Physical Sciences Research Council (EPSRC) along with the Public Sector Research Establishment, the UK Atomic Energy Authority (UKAEA) and UK Research and Innovation (UKRI) research councils, including the Natural Environment Research Council (NERC), the Medical Research Council (MRC) and the Science and Technology Facilities Council (STFC).

Paper: Machine learning–accelerated computational fluid dynamics

www.pnas.org/content/118/21/e2101784118.short

SiMLInt is being developed in close collaboration between EPCC and the School of Mathematics at the University of Edinburgh with ExCALIBUR funding, which comes from the Strategic Priorities Fund and is dedicated to enhancing high-priority computer codes and algorithms in line with the latest developments in hardware, software and algorithmic tools.

Image: MediaProduction via Getty Images

Author

Dr Anna Roubíčková