Meet our PhD students

Introducing some of our community of students at EPCC.

You can view project posters relating to some of our students' research on EPCC's Zenodo site:
https://zenodo.org/communities/epcc/records?q=&l=list&p=1&s=10&sort=newest 
 

Jakub Adamski 

Jakub.Adamski@ed.ac.uk
Supervisor: Dr Oliver Brown

Jakub Adamski is a second-year PhD candidate investigating high performance simulations of quantum computers. 

I have a joint degree in computer science and physics, which is why I chose to take up quantum computing research as an opportunity to bridge the disciplines. It is exciting to be a part of such a revolutionary field, and to witness it redefine the limits of computation – just like the emergence of digital computers did half a century ago.

Classical simulations of quantum computing are an active area of research, offering an alternative to real quantum hardware, which tends to be expensive and unreliable. Furthermore, while a quantum processor outputs only single probabilistic bitstrings, an emulation can reveal all information about the underlying system without the need to sample from numerous runs. This is why simulations are crucial for the development of the field.

To model quantum circuits, two distinct representations are used. The statevector method is more common, as it updates the state sequentially and in place. The alternative is to use tensor networks, which can process the circuit in any order, but at the expense of potential information loss. My research was at first centred around existing statevector simulators, but I am now gradually bringing in tensor networks. Eventually, I intend to develop a brand-new library in C++, which takes advantage of high performance computing by distributing the tensors with MPI.
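As a rough illustration of the statevector method described above, the following minimal Python sketch applies gates one after another, updating the amplitudes in place. It is not taken from any existing simulator: the function names, the bit-ordering convention (qubit 0 is the least significant bit of the index) and the two-qubit example circuit are all assumptions made for illustration.

```python
import math

def apply_single_qubit_gate(state, gate, target):
    """Apply a 2x2 gate to qubit `target`, updating `state` in place."""
    stride = 1 << target
    for base in range(0, len(state), 2 * stride):
        for offset in range(stride):
            i0 = base + offset      # index with the target bit equal to 0
            i1 = i0 + stride        # same index with the target bit equal to 1
            a0, a1 = state[i0], state[i1]
            state[i0] = gate[0][0] * a0 + gate[0][1] * a1
            state[i1] = gate[1][0] * a0 + gate[1][1] * a1

def apply_cnot(state, control, target):
    """Swap amplitude pairs that differ in the target bit when the control bit is 1."""
    control_mask = 1 << control
    target_mask = 1 << target
    for i in range(len(state)):
        if (i & control_mask) and not (i & target_mask):
            j = i | target_mask
            state[i], state[j] = state[j], state[i]

# Hadamard gate
H = [[1 / math.sqrt(2), 1 / math.sqrt(2)],
     [1 / math.sqrt(2), -1 / math.sqrt(2)]]

# Two qubits, starting in |00>: a Hadamard followed by a CNOT gives a Bell state.
state = [0j] * 4
state[0] = 1 + 0j
apply_single_qubit_gate(state, H, target=0)
apply_cnot(state, control=0, target=1)

# Unlike real hardware, the simulation exposes every probability directly.
probabilities = [abs(amplitude) ** 2 for amplitude in state]
```

The list `probabilities` holds the full outcome distribution, which on a real quantum processor could only be estimated by sampling many runs.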

Outside university work, I am an active person and enjoy running and hillwalking in the Scottish countryside. I am also an avid astronomy fan, keen on getting good photographs of the Milky Way, or the aurora borealis on lucky days. When the weather inevitably worsens, I relax at home and play the keyboard, or get immersed in factory-building video games.

Follow Jakub's work

GitHub: https://github.com/jjacobx
LinkedIn: https://www.linkedin.com/in/kuba-adams/
Personal website: jjacob.xyz

Felicity (Flic) Anderson 

Felicity.Anderson@ed.ac.uk 
Supervisors: Prof. Neil Chue Hong and Dr Julien Sindt

Flic is investigating real-world research software practices amongst academics who write code for their research. 

By mining GitHub repository data for research software projects, I hope to build up a picture of how people interact with their codebases, what software practices are most common amongst their developers, and which of these can be shown to be most effective. 

Initially I aim to do this by looking at whether people's development activities fall into groupings of similar practices - do research software developers have different 'personas' which can describe how they write and engage with their code? 

This will be done by combining lots of different properties from their codebase data (such as code commits and engagement with features such as issue tickets or pull requests) and using clustering analysis to check whether we can describe specific 'types' of developer from the data. 
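The idea of grouping developers by their activity can be sketched with a very small clustering example. Everything here is hypothetical: the two features (commits per week and the fraction of commits linked to a pull request), the six data points, and the choice of k-means itself, which is just one common clustering method and not necessarily the one used in this research.

```python
def kmeans(points, k, iterations=20):
    """Plain k-means; initial centroids are simply the first k points."""
    centroids = [list(p) for p in points[:k]]
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        for idx, point in enumerate(points):
            distances = [sum((a - b) ** 2 for a, b in zip(point, c)) for c in centroids]
            assignment[idx] = distances.index(min(distances))
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignment, centroids

# Hypothetical per-developer features: (commits per week, fraction of commits linked to a PR)
developers = [(2, 0.1), (30, 0.9), (3, 0.2), (28, 0.8), (1, 0.0), (35, 0.95)]
assignment, centroids = kmeans(developers, k=2)
```

With this toy data the occasional committers and the frequent, pull-request-driven committers end up in separate groups. Real repository data would need feature scaling and a more careful choice of the number of clusters.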

Not all projects have the same goals, so are different skills and approaches needed? Or are some best practices commonly required across all types of projects? Some approaches may also be more effective than others, so gathering an understanding about what works could fine-tune software development approaches and allow researchers to create better software.

Ultimately, being able to give an evidence base for which techniques research software developers are really using (and whether they work) will help research software projects identify which techniques would be most helpful when starting new projects. It will also help individual researchers prioritise which development skills would benefit them most for the types of projects they want to work on. 

Follow Felicity's work

GitHub: https://github.com/FlicAnderson
LinkedIn: https://www.linkedin.com/in/flicanderson 
ORCID ID: https://orcid.org/0000-0001-8778-6779 

Shrey Bhardwaj

shrey.bhardwaj@ed.ac.uk
Supervisors: Prof. Mark Parsons and Dr Paul Bartholomew

Shrey is a final-year PhD student researching HPC I/O bottlenecks. 

I am researching Input/Output (I/O) bottlenecks in high performance computing (HPC) as part of the ASiMoV Strategic Partnership. Before this PhD, I studied for an MEng in Aerospace Engineering at the University of Bristol. 

I/O bottlenecks are a significant problem for HPC applications because the I/O speeds in such systems are much slower than computing speeds. For example, the highest I/O bandwidth currently reported for the fastest HPC systems is around 11.3 TiB/s (IO500), while the highest compute performance is around 1.2 ExaFlop/s (TOP500). Because of this disparity, an application can be slowed down by its I/O tasks even as the number of parallel processes increases. 

As part of my research, I am investigating the impact of using Simultaneous Multithreading (SMT) hardware threads as dedicated I/O servers to improve the effective I/O bandwidth for the user. To investigate this, I have created iocomp, a portable abstraction library that lets users create I/O servers with simple function calls, hiding the complexity of implementing them. 

I have also ported multiple benchmarks, such as STREAM, HPCG and FEniCSx, to use this library. 

Follow Shrey's work

GitHub: https://github.com/shreybh1
LinkedIn: https://www.linkedin.com/in/shreybhardwaj

Gabriel Rodríguez Canal

gabriel.rodcanal@ed.ac.uk
Supervisor: Dr Nick Brown

Gabriel is a third-year PhD student interested in compilers for FPGAs. 

My work revolves around enabling FPGAs in HPC systems, where the current tooling still needs significant user intervention to generate highly performant hardware. In particular, my research focuses on two areas that are key to the adoption of FPGAs in high-end systems.  

First, the task-based programming model, fundamental in HPC for the execution of scientific workloads on accelerators, is not fully supported on FPGAs. Swapping running tasks in and out of the device is a hardware-level process, where the actual configuration of the FPGA changes. This incurs a massive performance penalty that CPUs and GPUs do not suffer from, as on those devices the process happens at the software level. I have developed a solution that enables the advanced flow of partial reconfiguration and abstracts away all the low-level details from the user, fully enabling task-based programming on FPGAs. 

Second, the hardware generation process from high-level source codes written in C/C++, known as high-level synthesis (HLS), presents important limitations. The choice of C/C++ as the frontend language forces domain experts to port their codes to this language before getting anywhere near an FPGA, and codes written in this language do not naturally suit the dataflow paradigm, which is necessary to get performance out of an FPGA. 

I have addressed the first issue with Fortran HLS, an LLVM-based tool around Flang that enables Fortran in the HLS ecosystem for AMD Xilinx FPGAs. It also provides a methodology to enable hardware generation from any other high-level language with an LLVM frontend leveraging its modular implementation. This point is proven through Stencil-HMLS, a tool developed in my PhD to accelerate stencils on FPGAs supported by a combination of MLIR, as the frontend, and the Fortran HLS backend. The results show around 100x improvement in runtime and energy efficiency with respect to the state-of-the-art, and prove it is possible to generate dataflow hardware from an imperative specification without user intervention. 

As part of my PhD, I am involved with the EPSRC-funded xDSL project (https://xdsl.dev), where I have contributed the HLS dialect that is used in the Stencil-HMLS work, as well as support for the AIEs in the AMD Xilinx Versal architecture. 

Finally, being a firm believer in knowledge transfer with other institutions, I undertook an industry placement at Hewlett-Packard Enterprise after the completion of the first year of my PhD, where I worked on Fortran HLS. Later this year I will be doing another internship at Pacific Northwest National Laboratory (PNNL) in the US, where I will be working on SODA-opt/Bambu HLS in collaboration with Politecnico di Milano. 

Follow Gabriel's work

Fortran HLS GitLab: https://gitlab.com/cerl/fortran-hls/
GitHub: https://github.com/gabrielrodcanal
LinkedIn: https://www.linkedin.com/in/gabriel-rodriguez-canal-572604109/
ORCID ID: https://orcid.org/0009-0005-0511-3922

Ananya Gangopadhyay

a.gangop@ed.ac.uk
Supervisors: Prof. Michèle Weiland and Dr Paul Bartholomew

Ananya is a third-year PhD student working on improving the performance of the mesh generation phase in Computational Fluid Dynamics (CFD) workflows using existing performance optimisation techniques and novel machine learning approaches.

Mesh generation is a pre-processing step where the fluid domain is discretised into a mesh of cells prior to the principal solver phase. It can be comparable to the solver in terms of runtime, making it a major bottleneck that results in the inefficient utilisation of compute resources and loss of real-world time. Along with looking at standard code optimisation methods such as alternative parallelisation, improved load balancing and vectorisation, I am developing machine learning methods to achieve a significant improvement in performance while maintaining accuracy.

I first became acquainted with computer simulation modelling and numerical methods during a project in the final year of my BSc(Eng) degree in Electrical and Computer Engineering at the University of Cape Town, South Africa. I eventually built upon the project while pursuing an MSc in Computational Science. Working with the Molecular Dynamics (MD) simulation method, I learned how these computational tools act as the third pillar in scientific research, alongside theoretical and experimental analysis. However, I also came to understand that, even with high-performance computing techniques to improve their performance, these tools can be resource-hungry with long runtimes when they have to meet high accuracy demands. As a result, they may not always provide an improvement in research turnaround times. 

This understanding and experience is the basis of my research targets: to improve the accuracy, performance and time-to-solution of computational simulation and numerical analysis methods, making them reliable and resource efficient supplements (and in some cases viable alternatives) to experimental methods. Through my PhD project with EPCC, I aim to contribute towards making mesh generation easier to configure for better performance and quicker runtimes, which should shorten development cycles and enable engineers to explore the design space more thoroughly.

Follow Ananya's work

GitHub: https://github.com/agango93
LinkedIn: https://www.linkedin.com/in/agangopadhyay
ORCID ID: https://orcid.org/0000-0002-4948-9674

David Kacs

D.Kacs@sms.ed.ac.uk
Supervisor: Dr Nick Brown

David is a first-year PhD candidate, working towards utilising cutting-edge compiler techniques to target novel architectures.

My research focusses on lowering the barrier to entry to programming the Cerebras CS-2 by utilising modern, state-of-the-art compiler technology.

The Cerebras CS-2 is a co-processor originally designed to accelerate the training of machine learning models; however, it has been shown to perform well in other common HPC tasks such as computational fluid dynamics. The system is programmed using a bespoke programming language called CSL.

CSL has often proved challenging for first-time and even experienced users, as both its syntax and programming model are very different from those of common HPC languages. To flatten this steep learning curve, I am planning to create a new compiler targeting the system. This compiler will apply architecture-specific optimisations, designed to accelerate HPC applications, to regular code in a language like Fortran, with which researchers are more familiar.

I am using Multi-Level Intermediate Representation (MLIR) to accomplish this. MLIR is a component of the LLVM compiler framework which provides a mechanism for combining different compiler passes and optimisations at different levels of abstraction. This allows existing transformations to be combined with new ones targeting the architecture, which should, in theory, result in a very efficient executable.

Follow David's work

GitHub: https://github.com/dk949/
LinkedIn: https://www.linkedin.com/in/david-kacs-38a561226/

Mark Klaisoongnoen 

Mark.Klaisoongnoen@ed.ac.uk
Supervisor: Dr Nick Brown

Mark is a final-year PhD candidate exploring the acceleration of quantitative finance workloads on field-programmable gate arrays (FPGAs).

I investigate approaches for recasting von Neumann-based algorithms into a dataflow style suitable for FPGAs. During my PhD project, I have been developing and optimising financial workloads on recent generations of Intel and Xilinx FPGAs via high-level synthesis (HLS) in C/C++, comparing them against traditional x86 and GPU architectures.

FPGAs enable programmers to tailor the electronics directly to the application in question, bypassing the general-purpose, black-box microarchitecture of CPUs and GPUs. This specialization can help ameliorate issues such as memory-bound codes, and FPGAs are also well known to be highly energy efficient. I therefore believe they are a very interesting potential target for future HPC workloads. While the difficulty of programming FPGAs has traditionally been a significant drawback, recent advances in the ecosystem by the two major vendors, Xilinx and Intel, mean that we can now write code in C++ and have much of the underlying process automated.

However, while such recent advances have made programming FPGAs more a question of software development than hardware design, a key challenge is finding appropriate dataflow algorithmic techniques to achieve optimal performance. CPU-based codes often require significant changes to effectively exploit FPGAs, and a major part of my research has been developing and understanding these porting techniques.

In mid-2021 I started collaborating with STAC ( https://stacresearch.com/ ), who provide industry-standard financial benchmarks, to explore the acceleration of their benchmark suite on reconfigurable architectures.

Follow Mark's work

Google Scholar: https://scholar.google.com/citations?user=lAGBYWEAAAAJ&hl=en
LinkedIn: https://www.linkedin.com/in/mark-klaisoongnoen-049550127/
Personal website: https://markxio.github.io/

 

Christodoulos (Chris) Stylianou

c.stylianou@ed.ac.uk
Supervisor: Prof. Michèle Weiland

Chris's work lies at the intersection of cutting-edge computing and artificial intelligence, focusing on making computations more efficient on the diverse and complex computer systems of today. 

I am an advanced PhD student in high performance computing (HPC), Computational & Data Science, and Software Engineering, part of the ASiMoV Strategic Partnership. With an MEng in Electrical & Electronic Engineering from Imperial College London and an MSc in HPC with Data Science from EPCC, I have built a strong foundation in both theory and application.

I focus on enhancing how computers process large, complex datasets, similar to those in scientific research or big data analytics, by optimizing Sparse Matrix Vector Multiplication (SpMV). This improvement is key for applications ranging from climate modelling to machine learning, making computations more efficient. By leveraging AI, I am developing methods to automatically choose the best way to store and process this data, optimising for the unique features of each computing system.
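To make the idea of storage formats concrete, the sketch below performs the same Sparse Matrix Vector Multiplication with two widely used layouts, coordinate (COO) and compressed sparse row (CSR). It is plain illustrative Python written for this page, not an excerpt from any library, and all function names and the example matrix are assumptions.

```python
def dense_to_coo(matrix):
    """COO stores one (row, column, value) triple per non-zero entry."""
    rows, cols, values = [], [], []
    for i, row in enumerate(matrix):
        for j, v in enumerate(row):
            if v != 0:
                rows.append(i)
                cols.append(j)
                values.append(v)
    return rows, cols, values

def dense_to_csr(matrix):
    """CSR replaces the per-entry row index with one offset per row."""
    values, col_indices, row_ptr = [], [], [0]
    for row in matrix:
        for j, v in enumerate(row):
            if v != 0:
                values.append(v)
                col_indices.append(j)
        row_ptr.append(len(values))
    return values, col_indices, row_ptr

def spmv_coo(rows, cols, values, x, n_rows):
    y = [0.0] * n_rows
    for i, j, v in zip(rows, cols, values):
        y[i] += v * x[j]
    return y

def spmv_csr(values, col_indices, row_ptr, x):
    y = [0.0] * (len(row_ptr) - 1)
    for i in range(len(y)):
        # Non-zeros of row i live in values[row_ptr[i]:row_ptr[i + 1]].
        for k in range(row_ptr[i], row_ptr[i + 1]):
            y[i] += values[k] * x[col_indices[k]]
    return y

A = [[4, 0, 0],
     [0, 0, 2],
     [1, 0, 3]]
x = [1.0, 2.0, 3.0]

y_coo = spmv_coo(*dense_to_coo(A), x, n_rows=len(A))
y_csr = spmv_csr(*dense_to_csr(A), x)
```

Both formats give the same result, but they traverse memory differently, which is why the best choice can depend on the sparsity pattern of the matrix and on the target hardware.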

A notable contribution is my development of Morpheus, an innovative library-based ecosystem for managing sparse matrix formats. This tool allows researchers and engineers to easily switch between data storage formats, optimising performance without the need to overhaul their software. It’s a step towards making high performance computing more accessible and efficient, potentially accelerating advancements in numerous fields.

Follow Chris's work

Personal website: https://cstyl.github.io

Chao Tang

c.tang@ed.ac.uk

Supervisors: Adrian Jackson and Yves Wiaux

Chao is a third-year PhD student focusing on extreme-scale computational imaging for radio astronomy.

My research focuses on the intersection of astronomical imaging and high-performance computing. This involves designing algorithms to reconstruct high-precision images from the extensive measurements collected by modern radio telescopes, as well as deploying these algorithms on high-performance computing systems. Prior to my PhD study, I obtained my master’s degree in signal and image processing from the University of California, San Diego.

Aperture synthesis via interferometry in radio astronomy is a crucial technique that allows us to observe the sky with antenna arrays at high angular resolution and sensitivity. The raw measurements of a radio telescope provide incomplete linear information about the sky in the spatial Fourier domain. Extracting images from this data enables us to achieve various scientific goals, including studying cosmic magnetism, dark matter, dark energy, and understanding the structure and evolution of stars and galaxies. However, due to the sub-Nyquist sampling strategy, the reconstruction process poses a highly ill-posed Fourier inverse problem, necessitating powerful computational imaging algorithms to incorporate proper prior information.

Next-generation radio telescopes, exemplified by the Square Kilometre Array, feature increasingly large data volumes reaching exabytes in scale. This allows them to resolve complex structures with higher resolution and dynamic range, posing unprecedented requirements for the joint scalability and precision of imaging algorithms. 

The CLEAN algorithm, first proposed by Högbom in 1974, and its variations are widely used in astronomical imaging due to their simplicity. However, CLEAN-based algorithms require manual fine-tuning, and the resolution of the recovered images is limited by the point spread function of the telescope system. Building on compressive sensing theory, advanced convex optimisation algorithms have proven superior in image quality. In this versatile framework, the reconstructed image minimises an objective function composed of a data-fidelity term ensuring physical correctness, and a regularization term promoting prior models. Solutions can be found through iterative proximity splitting algorithms, such as forward-backward, primal-dual, ADMM, etc.
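The forward-backward scheme mentioned above can be sketched on a toy problem. The example below is only an illustration under stated assumptions: it minimises 0.5 * ||A x - y||^2 + lam * ||x||_1, where the l1 norm stands in for the regularization term (a simple sparsity prior chosen for this sketch), and the small matrix A, the data y and the weight lam are made up. It is not the imaging algorithm used in this research.

```python
def soft_threshold(v, threshold):
    """Proximity operator of threshold * |v| (the backward step)."""
    if v > threshold:
        return v - threshold
    if v < -threshold:
        return v + threshold
    return 0.0

def forward_backward(A, y, lam, iterations=500):
    n = len(A[0])
    # The squared Frobenius norm bounds the Lipschitz constant of the gradient,
    # so this step size is small enough for the iteration to converge.
    step = 1.0 / sum(a * a for row in A for a in row)
    x = [0.0] * n
    for _ in range(iterations):
        # Forward step: gradient of the data-fidelity term, A^T (A x - y).
        residual = [sum(a * xj for a, xj in zip(row, x)) - yi for row, yi in zip(A, y)]
        gradient = [sum(A[i][j] * residual[i] for i in range(len(A))) for j in range(n)]
        # Backward step: proximity operator of the regularization term.
        x = [soft_threshold(xj - step * gj, step * lam) for xj, gj in zip(x, gradient)]
    return x

# Toy diagonal measurement operator, so the exact minimiser is easy to check by hand.
A = [[1.0, 0.0, 0.0],
     [0.0, 2.0, 0.0],
     [0.0, 0.0, 0.5]]
y = [3.0, -0.2, 4.0]
x_hat = forward_backward(A, y, lam=0.5)
```

For this diagonal example the minimiser can be worked out by hand as (2.5, 0, 6): the weakly measured middle component is set exactly to zero by the prior, while the other two are shrunk slightly relative to the unregularised solution.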

With advancements in data decomposition and 3D image cube faceting techniques, block optimisation algorithms like Faceted-HyperSARA can be further parallelised. However, they may still face challenges due to computational complexity arising from sophisticated image priors and potential sub-optimality of handcrafted priors. Therefore, replacing the sophisticated proximity operator with data-driven deep neural networks offers appealing alternatives known as Plug-and-Play (PnP) algorithms. 

Leveraging the advantages of maximally monotone operator theory, we have proposed a PnP algorithm with a convergence guarantee, namely AIRI. We have validated the image quality of AIRI and its variations using real measurements from the MeerKAT telescope, demonstrating the robustness of our algorithms. In the near future, we will explore other physically informed deep learning frameworks to handle even higher-dimensional, larger-scale astronomical imaging tasks. Additionally, we will extend our algorithms to tackle problems in other imaging modalities, such as high-dimensional magnetic resonance imaging.

Follow Chao's work

GitHub: https://github.com/ChaoTang0330

LinkedIn: www.linkedin.com/in/chaotang

Weiyu Tu

W.Tu-3@sms.ed.ac.uk
Supervisor: Dr Mark Bull

After completing her Master's in High Performance Computing at EPCC, Weiyu Tu is now researching the microbenchmarking of accelerator offloading.

My research involves understanding the fundamentals and identifying key areas for improvement in GPU benchmarking methodologies, primarily focusing on kernel launch overheads and computational task assessments. I aim to develop a cross-API benchmark suite for equitable HPC performance evaluations and explore the effects of unified CPU/GPU memory architectures on performance.

I am dedicated to laying a solid foundation for my research while exploring the intricacies of high performance computing environments. My goal is to contribute to developing more accurate and reliable benchmarking processes in my field.