Raising programmer productivity for stencil-based workloads on Cerebras WSE

29 January 2026

Research to be published at ASPLOS’26 demonstrates the potential benefits of Cerebras’ Wafer Scale Engine architecture for scientific computing.

Cerebras is a highly valued partner of EPCC, and we have enjoyed a close working relationship for a long time. Driven by the opportunities unlocked by Cerebras’ Wafer Scale Engine (WSE) architecture, EPCC was an early customer of the CS-1, installing the first in Europe in 2021.

Since then, in response to high user demand and the capability the architecture offers, EPCC upgraded to the CS-2 in 2022. We now provide four CS-3s as part of the Edinburgh International Data Facility (EIDF). These are a crucial compute resource for researchers across the world who are working on some of the grand challenges associated with AI workloads.

The raw performance and memory bandwidth provided by the WSE mean that the architecture has great potential, not just for AI, but also more generally for scientific computing. There have been numerous successes using the WSE for more general workloads, with the greatest benefits seen for codes that are memory bound and have exhausted the capabilities of other architectures such as GPUs or CPUs. To date, however, a challenge has been that programming the WSE is rather different from programming traditional architectures, so existing codes require significant porting effort.

Compiler-based approach

Working with collaborators at Cambridge University, Imperial College London and the Technical University of Berlin, we have been researching a compiler-based approach with the aim of significantly improving programmer productivity by enabling specific types of code to run on the WSE unchanged. This work, which builds on xDSL and has been accepted to ASPLOS’26 (the premier conference on practical aspects of computer architecture, programming languages and operating systems), focuses on the stencil-based algorithms ubiquitous in scientific computing.

By developing new abstractions and transformations within the MLIR compiler framework via xDSL, this research has helped close the semantic gap between the WSE architecture and existing programming frontends such as Devito (a DSL for seismology), PSyclone (a DSL for weather and climate) and Flang (a Fortran compiler).
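For readers unfamiliar with the term, the following is a minimal sketch of what a stencil computation looks like, written in plain Python. It is purely illustrative: it is not taken from the paper, it is not the output of Devito, PSyclone or Flang, and the names `jacobi_step` and `make_grid` are hypothetical helpers introduced here.

```python
def make_grid(n, top=1.0):
    """Build an n x n grid of zeros with the top boundary row set to `top`.

    Hypothetical helper for illustration only.
    """
    grid = [[0.0] * n for _ in range(n)]
    grid[0] = [top] * n
    return grid


def jacobi_step(grid):
    """Apply one 5-point Jacobi sweep; boundary values are held fixed."""
    rows, cols = len(grid), len(grid[0])
    # Copy the grid so that boundary cells carry over unchanged and
    # every update reads only values from the previous iteration.
    new = [row[:] for row in grid]
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            # Each interior point becomes the average of its four neighbours.
            new[i][j] = 0.25 * (
                grid[i - 1][j] + grid[i + 1][j]
                + grid[i][j - 1] + grid[i][j + 1]
            )
    return new


if __name__ == "__main__":
    grid = make_grid(4)
    for _ in range(10):
        grid = jacobi_step(grid)
    for row in grid:
        print(row)
```

Each output point depends only on a fixed pattern of neighbouring points from the previous iteration, which is the defining property of a stencil. Kernels of this shape perform little arithmetic per value loaded, so they tend to be limited by memory bandwidth rather than by compute.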

Lowering the barrier to WSE benefits

The importance of this research is highlighted by Cerebras.

“This research has the potential to significantly lower the barrier to programming the Wafer Scale Engine, opening up new performance opportunities for the HPC community,” said Leighton Wilson, Senior Member of Technical Staff, Cerebras.

“A key advantage of the approach is that, for stencil-based workloads, programmers do not need to modify their existing code to run efficiently on the CS-3. By enabling strong performance without manual porting or architecture-specific rewrites, this work demonstrates how compiler-driven solutions can make the power of the WSE accessible to a much broader range of scientific applications.”

A step change in capability

Overall, this work demonstrates that, compared to manually crafted WSE codes, a compiler-based approach is highly competitive and in some situations can even outperform the manually written code. Furthermore, when running the exact same code as on other architectures, the architectural benefits of the CS-3 result in significantly greater performance. Ultimately, this research demonstrates a path for the scientific community to gain a step change in capability by leveraging the WSE whilst maintaining programmer productivity.

The author-accepted version of the paper, An MLIR Lowering Pipeline for Stencils at Wafer-Scale, can be viewed on arXiv. It will be published in the ASPLOS ACM proceedings.

Further information

The EIDF Cerebras service gives users access to a Cerebras Wafer-Scale Cluster optimised for training and tuning deep learning models. It is available for industry and academic use. For details, see the Cerebras page on the Edinburgh International Data Facility website.

Cerebras website.

ASPLOS, the ACM International Conference on Architectural Support for Programming Languages and Operating Systems, is the premier academic forum for multidisciplinary computer systems research spanning hardware, software, and their interaction. It focuses on computer architecture, programming languages, operating systems, and associated areas such as networking and storage. ASPLOS 2026 will be held in Pittsburgh, USA, from the 22nd to the 26th of March 2026.

Author

Dr Nick Brown

n.brown@epcc.ed.ac.uk
