Obtaining performance portability via Domain Specific Languages (DSLs) and MLIR

Writing efficient parallel code for current-generation supercomputers is difficult and remains the domain of (relatively) few experts. This situation is set to become even more challenging as heterogeneity (i.e. the use of accelerators) and scale both increase significantly with next-generation exascale machines. Put simply, the sequential languages that we have relied upon for so long do not provide the abstractions needed to write parallel codes. As a community we have worked around this by making it the job of the programmer to determine all aspects of parallelism for their code and to provide their own parallel abstractions (e.g. by explicitly designing at the code level for geometric decomposition, divide-and-conquer, or pipeline parallelism). Determining this low-level and tricky detail is time consuming, requires significant expertise, and will not scale to future supercomputers that are much larger and more complex. There is, however, another way: the use of Domain Specific Languages (DSLs).

Primary Supervisor: Dr Nick Brown

Additional Supervisor: Dr Tobias Grosser (School of Informatics)

Further Description

Domain Specific Languages (DSLs) are languages which, out of the box, provide specific abstractions that the programmer can use as a basis for writing their code. The idea is that, by encouraging the programmer to work within the abstractions and restrictions of a specific domain, the compiler gains a significant amount of information upon which it can act to determine details that the programmer has traditionally had to specify manually. In fact, the word language is a bit of a misnomer here: the key is the abstractions, as many of these technologies are embedded within existing languages such as Python.
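
As a rough illustration of this idea, the sketch below embeds a toy stencil DSL in Python. It is not taken from any existing DSL: the names Access, Sum, halo_width and apply_stencil are hypothetical and chosen purely for this example. The programmer states only what is computed at each grid point, and the "compiler" side derives the halo width (how many neighbouring cells are needed) from that description rather than asking the programmer to supply it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Access:
    """A read of the input field at a relative offset from the current point."""
    offset: int

    def __add__(self, other):
        # Promote to a Sum so that chained additions stay flat.
        return Sum((self,)) + other


@dataclass(frozen=True)
class Sum:
    """A sum of field accesses; the only operation this toy DSL supports."""
    terms: tuple

    def __add__(self, other):
        extra = other.terms if isinstance(other, Sum) else (other,)
        return Sum(self.terms + extra)


def halo_width(expr):
    """Derive, from the expression alone, how many neighbour cells are needed."""
    return max(abs(term.offset) for term in expr.terms)


def apply_stencil(expr, data):
    """Evaluate the stencil on every interior point of a 1D list."""
    halo = halo_width(expr)
    return [
        sum(data[i + term.offset] for term in expr.terms)
        for i in range(halo, len(data) - halo)
    ]


# The programmer describes the computation at a single point (hypothetical syntax).
u = Access
three_point = u(-1) + u(0) + u(1)

print(halo_width(three_point))                       # 1
print(apply_stencil(three_point, [1, 2, 3, 4, 5]))   # [6, 9, 12]
```

In a real DSL the same derived information could, for example, inform how data is decomposed across processes and which boundary cells must be exchanged; here it only determines the loop bounds.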

DSLs have demonstrated their potential to play an important role in programming future exascale simulation codes, but there is a big problem. The issue lies in the underlying compilation stack, where DSLs are often siloed and tend to share very little, if any, underlying infrastructure. This means that it can be costly to develop new DSLs, the underlying technology stack can be brittle, and there can be a lack of third-party tools such as debuggers and profilers. There is a potential solution, however: Multi-Level Intermediate Representation (MLIR), a framework for intermediate representations (IRs) that enables source code to be lowered effectively, through a series of pre-built abstractions known as dialects, to the general representation required by the LLVM compiler. Many MLIR dialects already exist and it is possible to write new ones, enabling many different languages, abstractions, and domains to integrate more readily with the existing and mature LLVM tooling without losing information in the translation process.
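
The notion of lowering through several levels of abstraction can be sketched in a few lines of plain Python. This is a conceptual toy only: it uses neither the MLIR nor the xDSL API, and the Op class and the "domain", "loop" and "scalar" dialect names are hypothetical. A single high-level operation is rewritten first into an explicit loop and then into flat per-element operations, with each level retaining only the information that the next step needs.

```python
from dataclasses import dataclass, field


@dataclass
class Op:
    """A generic operation: a 'dialect.name' string, attributes and a nested body."""
    name: str
    attrs: dict = field(default_factory=dict)
    body: list = field(default_factory=list)


def lower_domain_to_loops(op):
    """Rewrite the hypothetical 'domain.scale' op into an explicit loop."""
    if op.name != "domain.scale":
        return op
    multiply = Op("scalar.mul_elem", {"factor": op.attrs["factor"]})
    return Op("loop.for", {"start": 0, "stop": op.attrs["size"]}, [multiply])


def lower_loops_to_scalar(op):
    """Unroll a 'loop.for' into a flat list of per-element scalar ops.

    Full unrolling is used only to keep the example short.
    """
    if op.name != "loop.for":
        return [op]
    return [
        Op(inner.name, {**inner.attrs, "index": i})
        for i in range(op.attrs["start"], op.attrs["stop"])
        for inner in op.body
    ]


def run(scalar_ops, data):
    """Interpret the lowest level, standing in for real code generation."""
    out = list(data)
    for op in scalar_ops:
        assert op.name == "scalar.mul_elem"
        out[op.attrs["index"]] *= op.attrs["factor"]
    return out


# One high-level op: "scale a 4-element vector by 3".
high_level = Op("domain.scale", {"factor": 3, "size": 4})
loops = lower_domain_to_loops(high_level)
scalar = lower_loops_to_scalar(loops)

print(loops.name, len(scalar))     # loop.for 4
print(run(scalar, [1, 2, 3, 4]))   # [3, 6, 9, 12]
```

The point of the staged design is that decisions such as how to parallelise or which target to use can be taken at the level where the relevant information is still visible, before it is lost in the flat low-level form.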

Overview of research area

DSLs sit across numerous research communities, including programming language design, compilers, and HPC. We have recently started a project called xDSL, which is a collaboration between Informatics and EPCC at Edinburgh, and Imperial College London. xDSL aims to develop a unified DSL ecosystem based upon MLIR, with the idea being that DSL front-ends will be able to integrate readily with our ecosystem and the appropriate MLIR dialects. Upon doing so, a DSL will benefit from the mature and well-supported LLVM tooling whilst still being able to exploit the high-level domain-specific information provided by the programmer when making important decisions around how to map to the hardware (e.g. choices around parallelism and specific accelerators). Ultimately this will significantly reduce the effort required to develop DSLs and provide a rich, well-supported compilation stack with a large variety of third-party tooling.

MLIR was originally developed at Google and has since become part of the main LLVM project. At EPCC we are involved in an EPSRC-funded ExCALIBUR project which is developing a Python toolbox for MLIR, known as xDSL, that significantly lowers the barrier to entry. This has become an active open source project that is being used by several groups around the world, and the PhD student would join this exciting activity and use the xDSL framework to develop their dialects and transformations.

Potential research questions

A key potential benefit of such a DSL ecosystem is performance portability, where a single source code can, to some extent at least, run across numerous different hardware platforms with minimal changes required on behalf of the programmer. Whilst this has been demonstrated to some degree across CPU families for MLIR, the objective is significantly more challenging when considering accelerators such as GPUs, the Cerebras CS-1, FPGAs, AI accelerators, or even novel CPUs such as RISC-V. A key question is therefore whether one can, using MLIR and based upon the rich domain-specific information encoded within a single source code, target many different types of accelerator with minimal changes to the code whilst obtaining good performance. Furthermore, which programmer-driven optimisations are still required in code, and how can these best be expressed within a DSL to achieve this objective?

Due to the large scope here, there is flexibility for the student to work at different levels of the computing stack, and the project can be driven largely by their interests. For instance, this ranges from optimising the generation of binaries on specific hardware, to the compiler support required to enable this portability, to the design of language-level abstractions, and to support for third-party tooling. EPCC hosts a wide range of exciting next-generation hardware that the student will be given access to as part of this project.

Student Requirements

A UK 2:1 honours degree, or its international equivalent, in computer science/informatics, physics, mathematics, or engineering.

You must be a strong programmer with some experience of C or C++. You must be comfortable learning new languages and concepts, as this will form a significant part of the first year.

English Language requirements as set by the University of Edinburgh.

Student Recommended/Desirable Skills and Experience

Experience in developing HPC codes and using the corresponding technologies (e.g. MPI, OpenMP). Experience in developing compilers (e.g. LLVM/MLIR) or using accelerators such as GPUs.

How to apply

Applications should be made via the University application form, available via the degree finder. Please note the proposed supervisor and project title from this page and include these in your application. You may also find that this page is a useful starting point for a research proposal, and we would strongly recommend discussing this further with the potential supervisor.

References

Brown, Nick. "Accelerating advection for atmospheric modelling on Xilinx and Intel FPGAs." 2021 IEEE International Conference on Cluster Computing (CLUSTER). IEEE, 2021.

Gysi, Tobias, et al. "Domain-specific multi-level IR rewriting for GPU: The Open Earth compiler for GPU-accelerated climate simulation." ACM Transactions on Architecture and Code Optimization (TACO) 18.4 (2021): 1-23.

Chelini, Lorenzo, et al. "MultiLevel Tactics: Lifting loops in MLIR." 2020 European LLVM Developers' Meeting, Paris, France. 2020.

Ben-Nun, Tal, et al. "Stateful dataflow multigraphs: A data-centric model for performance portability on heterogeneous architectures." Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis. 2019.

https://www.xdsl.dev

https://mlir.llvm.org/docs/

https://github.com/xdslproject/xdsl

https://mlir.llvm.org/docs/Tutorials/Toy/