Daniel Holmes's blog

Preparing programming models for Exascale

Author: Daniel Holmes
Posted: 18 Jun 2019 | 15:03

To make future heterogeneous systems easier to use efficiently, the EPiGRAM-HS project is improving and extending existing programming models with Exascale potential. We are working primarily with the MPI and GASPI programming models, but we are also applying our changes to HPC applications such as Nek5000, OpenIFS and iPIC3D, and to AI frameworks such as TensorFlow and Caffe. We expect that the trend towards specialisation of hardware will continue and that large machines will therefore become more and more heterogeneous.

What is MPI “nonblocking” for? Correctness and performance

Author: Daniel Holmes
Posted: 27 Feb 2019 | 15:53

The MPI Standard states that nonblocking communication operations can be used to “improve performance… by overlapping communication with computation”. This is an important performance optimisation in many parallel programs, especially when scaling up to large systems with lots of inter-process communication.

However, nonblocking operations can also help to make a code correct, without introducing additional dependencies that can degrade performance.
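
As an illustration (not taken from the post itself), a minimal halo-exchange sketch shows both benefits: posting MPI_Irecv and MPI_Isend up front removes the deadlock risk of a symmetric blocking exchange, and the gap before MPI_Waitall is where independent computation can overlap the communication. The function and buffer names below are hypothetical.

    /* Minimal sketch: each rank exchanges a buffer with a neighbour.
     * Posting the nonblocking receive and send first avoids the deadlock a
     * symmetric MPI_Send/MPI_Recv pair can cause, and lets computation that
     * does not touch the buffers proceed while the messages are in flight. */
    #include <mpi.h>

    void exchange_halo(double *send_buf, double *recv_buf, int count,
                       int neighbour, MPI_Comm comm)
    {
        MPI_Request reqs[2];

        MPI_Irecv(recv_buf, count, MPI_DOUBLE, neighbour, 0, comm, &reqs[0]);
        MPI_Isend(send_buf, count, MPI_DOUBLE, neighbour, 0, comm, &reqs[1]);

        /* ... computation that does not read or write the buffers ... */

        MPI_Waitall(2, reqs, MPI_STATUSES_IGNORE);
    }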

Making complex machines easier to use efficiently

Author: Daniel Holmes
Posted: 24 Oct 2018 | 16:48

Supercomputers are getting more complex. Faster components would be impossible to cool, but by doing more with less we can still solve bigger problems faster than ever before.

March 2018 meeting of the MPI Forum

Author: Daniel Holmes
Posted: 21 Apr 2018 | 16:21

In the March 2018 meeting of the MPI Forum, the “Persistent Collectives” proposal began the formal ratification procedure and the “Sessions” proposal took a step forward, but the “Fault Tolerance” saga took a step sideways.

The proposal to add persistent collective operations to MPI was formally read at the March meeting, and was well-received by all those present. The first vote for this proposal will happen in June and the second vote in September. If all goes well, this addition to MPI will be announced at SC18.
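
For readers unfamiliar with the proposal, the sketch below shows the intended usage pattern, assuming the interface that was eventually standardised in MPI 4.0 (MPI_Allreduce_init with MPI_Start/MPI_Wait). The surrounding function and buffer names are made up for illustration.

    /* Sketch of the persistent-collective pattern: the operation is planned
     * once, before the loop, and each iteration only starts and completes
     * the pre-planned reduction. */
    #include <mpi.h>

    void iterate(double *local, double *global, int count, int nsteps,
                 MPI_Comm comm)
    {
        MPI_Request req;

        MPI_Allreduce_init(local, global, count, MPI_DOUBLE, MPI_SUM,
                           comm, MPI_INFO_NULL, &req);

        for (int step = 0; step < nsteps; ++step) {
            /* ... update local values ... */
            MPI_Start(&req);
            MPI_Wait(&req, MPI_STATUS_IGNORE);
        }

        MPI_Request_free(&req);
    }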

Planning for high performance in MPI

Author: Daniel Holmes
Posted: 25 Jan 2018 | 14:36

Many HPC applications contain some sort of iterative algorithm and so perform the same steps over and over again, with the data gradually converging to a stable solution. There are examples of this archetype in structural engineering, fluid flow, and all manner of other physical simulation codes.
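
One MPI feature aimed at exactly this pattern is persistent point-to-point requests (MPI_Send_init and MPI_Recv_init): plan the communication once, then restart the same exchange every iteration. The hypothetical solver loop below is a sketch of the idea, not code from the post.

    /* Illustrative sketch: an iterative solver plans its halo exchange once
     * with persistent requests, then restarts the identical communication
     * each iteration with MPI_Startall. */
    #include <mpi.h>

    void solve(double *send_buf, double *recv_buf, int count, int neighbour,
               int niters, MPI_Comm comm)
    {
        MPI_Request reqs[2];

        MPI_Recv_init(recv_buf, count, MPI_DOUBLE, neighbour, 0, comm, &reqs[0]);
        MPI_Send_init(send_buf, count, MPI_DOUBLE, neighbour, 0, comm, &reqs[1]);

        for (int iter = 0; iter < niters; ++iter) {
            MPI_Startall(2, reqs);
            /* ... compute on interior points while the halo is in flight ... */
            MPI_Waitall(2, reqs, MPI_STATUSES_IGNORE);
            /* ... update boundary points, check convergence ... */
        }

        MPI_Request_free(&reqs[0]);
        MPI_Request_free(&reqs[1]);
    }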

The Message Passing Interface: On the Road to MPI 4.0 and Beyond (SC17 event)

Author: Daniel Holmes
Posted: 8 Nov 2017 | 10:23

This year’s MPI Birds-of-a-Feather meeting at SC17 will be held on Wednesday 15th November. I’ll be talking about the Sessions proposal – and explaining why it’s no longer called Sessions!

Spoiler: the working group has been looking at how Teams might interact with Endpoints.

Deep Learning at scale: SC17 talk

Author: Daniel Holmes
Posted: 8 Nov 2017 | 10:13

Are you interested in using machine learning for something big enough to need supercomputing resources?

Have you worked on, or with, one of the Deep Learning frameworks, like TensorFlow or Caffe?

Are you just curious about the state-of-the-art at the crossover between AI and HPC?

MPI 3.1 ratified

Author: Daniel Holmes
Posted: 8 Jun 2015 | 13:17

The MPI 3.1 Standard, a minor update to the existing MPI 3.0 Standard, was ratified last week at the latest MPI Forum meeting.

McMPI at the EuroMPI 2013 conference

Author: Daniel Holmes
Posted: 26 Sep 2013 | 10:05

Following my presentation about McMPI at the EuroMPI 2013 conference last week, some people asked me to post the slides. The presentation and the associated paper give a brief introduction to the McMPI software and quickly cover some of my thoughts about threading in MPI.

MPI: sending and receiving in multi-threaded MPI implementations

Author: Daniel Holmes
Posted: 5 Jul 2013 | 16:06

Have you ever wanted to send a message using MPI to a specific thread in a multi-threaded MPI process? With the current MPI Standard, there is no way to distinguish one thread from another: the whole MPI process has a single rank in each communicator.
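
A common workaround, sketched below and not drawn from the post, is to give each thread its own tag (or its own communicator) so that a message intended for a particular thread is matched only by that thread; this assumes the implementation provides MPI_THREAD_MULTIPLE. The function names are hypothetical.

    /* Sketch of the per-thread-tag workaround: the process has one rank, so
     * threads are distinguished by tag. Each thread posts a receive matching
     * only its own tag, and the sender targets a thread by choosing that tag.
     * Requires MPI initialised with MPI_THREAD_MULTIPLE. */
    #include <mpi.h>

    void recv_on_thread(int my_thread_id, double *buf, int count, int src,
                        MPI_Comm comm)
    {
        /* The tag encodes the logical destination thread within this rank. */
        MPI_Recv(buf, count, MPI_DOUBLE, src, my_thread_id,
                 comm, MPI_STATUS_IGNORE);
    }

    void send_to_thread(int dest_rank, int dest_thread_id, double *buf,
                        int count, MPI_Comm comm)
    {
        MPI_Send(buf, count, MPI_DOUBLE, dest_rank, dest_thread_id, comm);
    }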
