Daniel Holmes's blog
Preparing programming models for Exascale
Author: Daniel Holmes | Posted: 18 Jun 2019 | 15:03
To make future heterogeneous systems easier to use efficiently, the EPiGRAM-HS project is improving and extending existing programming models with Exascale potential. We are working primarily with the MPI and GASPI programming models, but are also applying our changes to HPC applications such as Nek5000, OpenIFS and iPIC3D, and to AI frameworks such as TensorFlow and Caffe. We expect the trend towards hardware specialisation to continue, so large machines will become increasingly heterogeneous.
What is MPI “nonblocking” for? Correctness and performance
Author: Daniel Holmes | Posted: 27 Feb 2019 | 15:53
The MPI Standard states that nonblocking communication operations can be used to “improve performance… by overlapping communication with computation”. This is an important performance optimisation in many parallel programs, especially when scaling up to large systems with lots of inter-process communication.
However, nonblocking operations can also help to make a code correct, without introducing additional dependencies that can degrade performance.
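As an illustration (a minimal sketch of my own, not taken from the post itself), consider a ring exchange in which every rank sends to its right-hand neighbour and receives from its left-hand one. With blocking calls, a naive ordering can deadlock; posting MPI_Irecv and MPI_Isend first and waiting later avoids that, and the gap before the wait is exactly where communication can be overlapped with computation:

```c
#include <mpi.h>
#include <stdio.h>

/* Sketch: each rank exchanges a buffer with its neighbours in a ring.
 * Posting the nonblocking operations first avoids the deadlock that two
 * matching blocking sends could cause, and the gap before MPI_Waitall
 * can be filled with independent local work. */
int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    int right = (rank + 1) % size;
    int left  = (rank + size - 1) % size;

    double sendbuf[1024], recvbuf[1024];
    for (int i = 0; i < 1024; ++i) sendbuf[i] = (double)rank;

    MPI_Request reqs[2];
    MPI_Irecv(recvbuf, 1024, MPI_DOUBLE, left,  0, MPI_COMM_WORLD, &reqs[0]);
    MPI_Isend(sendbuf, 1024, MPI_DOUBLE, right, 0, MPI_COMM_WORLD, &reqs[1]);

    /* ... independent computation can overlap with the transfers here ... */

    MPI_Waitall(2, reqs, MPI_STATUSES_IGNORE);

    printf("rank %d received data from rank %d\n", rank, left);

    MPI_Finalize();
    return 0;
}
```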
Making complex machines easier to use efficiently
Author: Daniel Holmes | Posted: 24 Oct 2018 | 16:48
Supercomputers are getting more complex. Simply making individual components faster would make them impossible to cool, but by doing more with less we can still solve bigger problems faster than ever before.
March 2018 meeting of the MPI Forum
Author: Daniel Holmes | Posted: 21 Apr 2018 | 16:21
In the March 2018 meeting of the MPI Forum, the “Persistent Collectives” proposal began the formal ratification procedure, the “Sessions” proposal took a step forward, but the “Fault Tolerance” saga took a step sideways.
The proposal to add persistent collective operations to MPI was formally read at the March meeting, and was well-received by all those present. The first vote for this proposal will happen in June and the second vote in September. If all goes well, this addition to MPI will be announced at SC18.
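As a rough sketch of what the proposal enables (my illustration, using the function names as later standardised in MPI 4.0, which may differ in detail from the version read at this meeting), a collective operation can be set up once and then started and completed on every iteration of a loop:

```c
#include <mpi.h>

/* Sketch only: set up a persistent allreduce once, then start and
 * complete the same operation repeatedly. The trailing MPI_Info
 * argument follows the persistent-collectives interface as it was
 * eventually standardised; details may have differed in the proposal
 * as read at this meeting. */
void iterate(double *local, double *global, int count, int iterations)
{
    MPI_Request req;
    MPI_Allreduce_init(local, global, count, MPI_DOUBLE, MPI_SUM,
                       MPI_COMM_WORLD, MPI_INFO_NULL, &req);

    for (int it = 0; it < iterations; ++it) {
        /* ... update the values in 'local' ... */
        MPI_Start(&req);
        /* ... independent work can overlap with the reduction ... */
        MPI_Wait(&req, MPI_STATUS_IGNORE);
        /* ... use the reduced result in 'global' ... */
    }

    MPI_Request_free(&req);
}
```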
Planning for high performance in MPI
Author: Daniel Holmes | Posted: 25 Jan 2018 | 14:36
Many HPC applications contain some sort of iterative algorithm and so perform the same steps over and over again, with the data gradually converging to a stable solution. There are examples of this archetype in structural engineering, fluid flow, and all manner of other physical simulation codes.
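MPI already lets such codes describe a repeated communication pattern once, up front, using persistent point-to-point requests. The following sketch (my illustration, assuming a simple halo exchange with a single neighbour, not necessarily the post's own example) shows the init/start/wait pattern:

```c
#include <mpi.h>

/* Sketch, assuming a halo exchange with one neighbour inside an
 * iterative solver: the message "plan" is set up once with
 * MPI_Send_init/MPI_Recv_init and then merely started and completed
 * inside the loop, reducing the per-iteration cost. */
void solve(double *halo_out, double *halo_in, int count,
           int neighbour, int iterations)
{
    MPI_Request reqs[2];
    MPI_Recv_init(halo_in,  count, MPI_DOUBLE, neighbour, 0,
                  MPI_COMM_WORLD, &reqs[0]);
    MPI_Send_init(halo_out, count, MPI_DOUBLE, neighbour, 0,
                  MPI_COMM_WORLD, &reqs[1]);

    for (int it = 0; it < iterations; ++it) {
        MPI_Startall(2, reqs);
        /* ... compute on interior points while the halo is in flight ... */
        MPI_Waitall(2, reqs, MPI_STATUSES_IGNORE);
        /* ... compute on boundary points using the received halo ... */
    }

    MPI_Request_free(&reqs[0]);
    MPI_Request_free(&reqs[1]);
}
```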
The Message Passing Interface: On the Road to MPI 4.0 and Beyond (SC17 event)
Author: Daniel Holmes | Posted: 8 Nov 2017 | 10:23
This year’s MPI Birds-of-a-Feather meeting at SC17 will be held on Wednesday 15th November. I’ll be talking about the Sessions proposal – and explaining why it’s no longer called Sessions!
Spoiler: the working group has been looking at how Teams might interact with Endpoints.
Deep Learning at scale: SC17 talk
Author: Daniel Holmes | Posted: 8 Nov 2017 | 10:13
Are you interested in using machine learning for something big enough to need supercomputing resources?
Have you worked on, or with, one of the Deep Learning frameworks, like TensorFlow or Caffe?
Are you just curious about the state-of-the-art at the crossover between AI and HPC?
MPI 3.1 ratified
Author: Daniel Holmes | Posted: 8 Jun 2015 | 13:17
The MPI 3.1 Standard, a minor update to the existing MPI 3.0 Standard, was ratified last week at the latest MPI Forum meeting.
McMPI at the EuroMPI 2013 conference
Author: Daniel Holmes | Posted: 26 Sep 2013 | 10:05
Following my presentation about McMPI at the EuroMPI 2013 conference last week, some people asked me to post the slides. The presentation and the associated paper give a brief introduction to the McMPI software and quickly cover some of my thoughts about threading in MPI.
MPI: sending and receiving in multi-threaded MPI implementations
Author: Daniel Holmes | Posted: 5 Jul 2013 | 16:06
Have you ever wanted to send a message using MPI to a specific thread in a multi-threaded MPI process? With the current MPI Standard, there is no way to identify one thread from another. The whole MPI process has a single rank in each communicator.
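A common workaround (an illustration of my own, not necessarily the one discussed in the post) is for the sender and receiver to agree on a tag convention, or a separate communicator per thread, so that each thread only matches messages intended for it:

```c
#include <mpi.h>

/* Illustration of one workaround (an assumption, not part of the post):
 * encode the destination thread's id in the message tag so that each
 * thread in a multi-threaded rank only matches messages meant for it.
 * Requires the MPI library to provide MPI_THREAD_MULTIPLE support. */

#define TAG_FOR_THREAD(t) (1000 + (t))

/* Called by the sending process (any thread). */
void send_to_thread(const double *buf, int count,
                    int dest_rank, int dest_thread)
{
    MPI_Send(buf, count, MPI_DOUBLE, dest_rank,
             TAG_FOR_THREAD(dest_thread), MPI_COMM_WORLD);
}

/* Called by thread 'my_thread' inside the receiving process. */
void recv_on_thread(double *buf, int count,
                    int src_rank, int my_thread)
{
    MPI_Recv(buf, count, MPI_DOUBLE, src_rank,
             TAG_FOR_THREAD(my_thread), MPI_COMM_WORLD, MPI_STATUS_IGNORE);
}
```

This works, but it pushes the burden of thread addressing onto the application rather than the MPI library, which is exactly the gap the post discusses.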