Leveraging AI hardware for accelerating HPC applications (in collaboration with AWE)

Project Description

From Wafer Scale Engines to Tensix cores, a wide range of accelerator architectures has been released over the past few years. Typically focussed on AI workloads, their specialisation has demonstrated significant performance and energy-efficiency gains. This research will explore leveraging these architectures for traditional, scientific computing, HPC applications.

Primary Supervisor: Dr Nick Brown

Project Overview

Technological advancements in the field of Artificial Intelligence (AI) and the proliferation of AI models for everyday use have led to rapid growth in hardware specialised for these workloads. The ongoing trend of hardware specialisation away from traditional scientific computing is accelerating, and accelerators that traditional HPC workloads rely on, such as GPUs, are now themselves focussing more heavily on AI. For example, the performance of such hardware for the lower-precision arithmetic used by AI tends to be very significantly better than for the higher-precision arithmetic required by traditional HPC. As such, there is a risk that the high-performance applications of the future will no longer be able to leverage the bleeding edge of hardware effectively.

The objectives of this research are to investigate the performance of HPC applications and algorithms on this new class of AI-centric hardware, developing optimised implementations and assessing their viability and performance. Areas of interest also include total energy efficiency and compute throughput per watt, which are becoming increasingly important.

This project is in collaboration with AWE, who will provide mini-application(s) and identify classes of algorithms of interest to be explored within this project.

Overview of the research area

From the Cerebras Wafer Scale Engine (WSE) to the Tenstorrent Tensix, EPCC hosts a wide range of specialised hardware that has been developed for AI. Fundamentally, however, this hardware is designed to undertake floating-point arithmetic in a highly efficient manner, and it has been demonstrated that such specialisation can be exploited more widely for a range of HPC workloads. Many codes do not currently fully exploit CPUs or GPUs, for example those which are memory bound, and studies have demonstrated that the Cerebras WSE's unique architecture provides much higher memory performance than more traditional hardware. Ultimately, this class of applications has found great success running on the WSE, removing the bottleneck that exists with other hardware.
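The memory-bound argument can be made concrete with a simple roofline-style estimate: a kernel is memory bound whenever its arithmetic intensity (flops per byte moved) falls below the hardware's machine balance (peak flops divided by peak memory bandwidth). A minimal sketch in Python, using illustrative placeholder hardware figures rather than vendor specifications:

```python
# Back-of-envelope roofline check. A kernel is memory bound when its
# arithmetic intensity (flops per byte moved) falls below the hardware's
# machine balance (peak flop/s divided by peak memory bandwidth).
# The hardware numbers below are ILLUSTRATIVE placeholders, not vendor specs.

def arithmetic_intensity(flops_per_point: float, bytes_per_point: float) -> float:
    """Flops performed per byte of memory traffic for one grid point."""
    return flops_per_point / bytes_per_point

# 7-point Jacobi stencil in fp64: ~7 flops per point; with ideal cache reuse
# roughly one 8-byte read and one 8-byte write actually move per point.
ai = arithmetic_intensity(7, 16)

# e.g. 10 Tflop/s peak compute over 1 TB/s memory bandwidth = 10 flops/byte
machine_balance = 10e12 / 1e12

print(f"AI = {ai:.2f} flops/byte -> memory bound: {ai < machine_balance}")
```

With an intensity well below the machine balance, such a stencil leaves most of a CPU's or GPU's compute idle, which is why architectures with far higher memory bandwidth per core, such as the WSE, can remove the bottleneck.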

Underlying all of this, however, is the key question of how best to leverage these different architectures to gain optimal performance. This is not simple, as one must move away from the Von Neumann model and really understand both the specific execution model of each architecture and its idiosyncrasies. This involves redesigning algorithms and numerics, but there can be great benefit in doing so for the right code.

Potential research questions

The overarching hypothesis for this project is that emerging architectures can provide performance and energy-efficiency benefits for specific scientific computing workloads. The sub-questions we will use as a starting point are:

  • What are the most appropriate algorithmic techniques to obtain best performance?
  • Can we categorise these classes of emerging hardware to better understand which algorithms/applications suit the different types?
  • From a performance and energy perspective, how does this emerging hardware compare with, for instance, CPUs and GPUs?
  • Can reduced precision and alternative number representations, such as bf16, sufficiently express the numerics required for some HPC codes?
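As an illustration of the reduced-precision question, bf16 keeps the 8-bit exponent of fp32 but only a 7-bit mantissa, giving roughly 2-3 significant decimal digits. The sketch below (plain Python, simulating bf16 by truncating a float32 to its top 16 bits; real hardware typically rounds rather than truncates) shows both the coarse representation and how a naive bf16 accumulation stalls:

```python
import struct

def to_bf16(x: float) -> float:
    """Approximate bfloat16 (8-bit exponent, 7-bit mantissa) by truncating
    a float32 to its top 16 bits. A simplification: hardware usually
    applies round-to-nearest-even rather than truncation."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    return struct.unpack("<f", struct.pack("<I", bits & 0xFFFF0000))[0]

# bf16 resolves only ~2-3 significant decimal digits
print(to_bf16(3.14159265))  # 3.140625

# Naive bf16 accumulation stalls once the addend falls below one ulp of the sum
s = 0.0
for _ in range(10000):
    s = to_bf16(s + to_bf16(0.001))
print(s)  # stalls at 0.25 -- far from the true 10.0
```

Compensated (Kahan) summation or accumulating in a wider format are the usual remedies; assessing when such techniques make reduced precision sufficient for HPC numerics is exactly what this sub-question asks.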

We will start with a specific hardware architecture, likely the Cerebras WSE, and there is flexibility around whether we go deep and explore just a few architectures, or broad and explore a whole range. This research will be driven by algorithms of interest to AWE.

Funding

This PhD comes with joint AWE-EPSRC funding for a student with UK home fees status. The annual stipend is around £2,500 above the UKRI base level and rises each year; the funding also includes a generous travel budget.

Student Requirements

A UK Masters degree, a 1st class undergraduate integrated Masters degree, or their international equivalent, in a relevant subject such as computer science, informatics, physics, mathematics, or engineering.

The student must be a strong programmer in at least one of C, C++ or Fortran ideally with experience of developing or contributing to scientific applications. The student must be familiar with mathematical concepts such as algebra, linear algebra, probability and statistics.

English language requirements as set by the University of Edinburgh

Important: The student must be a UK national

Recommended/Desirable Skills

Experience with numerical methods, scientific programming and HPC is highly desirable, as is an understanding of the fundamentals of CFD, combustion and/or numerical analysis.

How to apply

Applications should be made via the University application form, available via the degree finder. Please note the proposed supervisor and project title from this page and include these in your application. You may also find this page a useful starting point for a research proposal, and we would strongly recommend discussing this further with the potential supervisor.

Further Information

  • An MLIR Lowering Pipeline for Stencils at Wafer-Scale. Stawinoga, N., Katz, D., Lydike, A., Zarins, J., Brown, N., Bisbas, G., Grosser, T. In 2026 ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS). https://arxiv.org/pdf/2601.17754
  • Exploring Fast Fourier Transforms on the Tenstorrent Wormhole. Brown, N., Davies, J., Le Clair, F. In International European Workshop on RISC-V for HPC (RISC-V HPC) 2025. https://arxiv.org/pdf/2506.15437
  • Accelerating stencils on the Tenstorrent Grayskull RISC-V accelerator. Brown, N., Barton, R., In International workshop on RISC-V for HPC (RISCV-HPC) at SC24. https://arxiv.org/pdf/2409.18835
  • Exploring the Versal AI engines for accelerating stencil-based atmospheric advection simulation. Brown, N. In The 31st ACM/SIGDA International Symposium on Field-Programmable Gate Arrays (FPGA). https://arxiv.org/abs/2301.13016  
  • Exploring the acceleration of Nekbone on reconfigurable architectures. Brown, N. In IEEE/ACM International Workshop on Heterogeneous High-performance Reconfigurable Computing (H2RC). https://arxiv.org/pdf/2011.04981.pdf