Daniel Holmes's blog

Making complex machines easier to use efficiently

Author: Daniel Holmes
Posted: 24 Oct 2018 | 16:48

Supercomputers are getting more complex. Faster components would be impossible to cool but, by doing more with less, we can still solve bigger problems faster than ever before.

March 2018 meeting of the MPI Forum

Author: Daniel Holmes
Posted: 21 Apr 2018 | 16:21

In the March 2018 meeting of the MPI Forum, the “Persistent Collectives” proposal began the formal ratification procedure, the “Sessions” proposal took a step forward, but the “Fault Tolerance” saga took a step sideways.

The proposal to add persistent collective operations to MPI was formally read at the March meeting and was well received by those present. The first vote for this proposal will happen in June and the second vote in September. If all goes well, this addition to MPI will be announced at SC18.
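For readers unfamiliar with the idea: a persistent collective is planned once and then started many times, mirroring MPI's existing persistent point-to-point operations. Below is a minimal sketch of that init/start/wait pattern, using the proposal's function names (MPI_Allreduce_init and friends, which are not yet part of the official Standard):

    #include <mpi.h>

    int main(int argc, char **argv)
    {
        double local = 1.0, global = 0.0;
        MPI_Request req;

        MPI_Init(&argc, &argv);

        /* Set up the collective once; the library can plan and
         * optimise the operation here, ahead of any communication. */
        MPI_Allreduce_init(&local, &global, 1, MPI_DOUBLE, MPI_SUM,
                           MPI_COMM_WORLD, MPI_INFO_NULL, &req);

        for (int iter = 0; iter < 100; iter++) {
            /* ... refresh the local contribution each iteration ... */
            MPI_Start(&req);               /* launch the planned operation */
            MPI_Wait(&req, MPI_STATUS_IGNORE);
        }

        MPI_Request_free(&req);            /* release the persistent request */
        MPI_Finalize();
        return 0;
    }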

Planning for high performance in MPI

Author: Daniel Holmes
Posted: 25 Jan 2018 | 14:36

Many HPC applications contain some sort of iterative algorithm and so perform the same steps over and over again, with the data gradually converging to a stable solution. There are examples of this archetype in structural engineering, fluid flow, and all manner of other physical simulation codes.
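As a concrete (purely illustrative) example of the archetype, the skeleton below repeats one update sweep until a globally agreed residual drops below a tolerance:

    #include <math.h>
    #include <mpi.h>

    /* Illustrative skeleton of an iterative solver: repeat the same
     * update step until the change between iterations is small enough. */
    void solve(double *u, int n, MPI_Comm comm)
    {
        const double tol = 1.0e-9;
        double global_residual = 1.0;

        while (global_residual > tol) {
            double local_residual = 0.0;

            /* One sweep of the update step (application-specific;
             * here, a simple neighbour average on the local data). */
            for (int i = 1; i < n - 1; i++) {
                double unew = 0.5 * (u[i - 1] + u[i + 1]);
                local_residual += fabs(unew - u[i]);
                u[i] = unew;
            }

            /* Combine per-process residuals into one global
             * convergence test, so every process stops together. */
            MPI_Allreduce(&local_residual, &global_residual, 1,
                          MPI_DOUBLE, MPI_SUM, comm);
        }
    }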

The Message Passing Interface: On the Road to MPI 4.0 and Beyond (SC17 event)

Author: Daniel Holmes
Posted: 8 Nov 2017 | 10:23

This year’s MPI Birds-of-a-Feather meeting at SC17 will be held on Wednesday 15th November. I’ll be talking about the Sessions proposal – and explaining why it’s no longer called Sessions!

Spoiler: the working group has been looking at how Teams might interact with Endpoints.

Deep Learning at scale: SC17 talk

Author: Daniel Holmes
Posted: 8 Nov 2017 | 10:13

Are you interested in using machine learning for something big enough to need supercomputing resources?

Have you worked on, or with, one of the Deep Learning frameworks, like TensorFlow or Caffe?

Are you just curious about the state-of-the-art at the crossover between AI and HPC?

MPI 3.1 ratified

Author: Daniel Holmes
Posted: 8 Jun 2015 | 13:17

The MPI 3.1 Standard, a minor update to the existing MPI 3.0 Standard, was ratified last week at the latest MPI Forum meeting.

McMPI at the EuroMPI 2013 conference

Author: Daniel Holmes
Posted: 26 Sep 2013 | 10:05

Following my presentation about McMPI at the EuroMPI 2013 conference last week, some people asked me to post the slides. The presentation and the associated paper give a brief introduction to the McMPI software and quickly cover some of my thoughts about threading in MPI.

MPI: sending and receiving in multi-threaded MPI implementations

Author: Daniel Holmes
Posted: 5 Jul 2013 | 16:06

Have you ever wanted to send a message using MPI to a specific thread in a multi-threaded MPI process? With the current MPI Standard, there is no way to distinguish one thread from another. The whole MPI process has a single rank in each communicator.
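A common workaround – not an addressing mechanism in the Standard itself – is to encode a logical thread identifier in the message tag, so that each thread posts receives matching only "its" messages. A sketch, assuming the implementation provides MPI_THREAD_MULTIPLE:

    #include <mpi.h>

    /* Sketch: direct a message to a particular thread within a rank
     * by using the tag as a thread identifier. Requires a thread
     * support level of MPI_THREAD_MULTIPLE from MPI_Init_thread. */

    /* Sender side: address "thread thread_id" within process dest_rank. */
    void send_to_thread(const double *buf, int count,
                        int dest_rank, int thread_id)
    {
        MPI_Send(buf, count, MPI_DOUBLE, dest_rank,
                 /* tag = */ thread_id, MPI_COMM_WORLD);
    }

    /* Receiver side: each thread passes its own id, so it only
     * matches messages tagged for it. */
    void recv_on_thread(double *buf, int count, int my_thread_id)
    {
        MPI_Recv(buf, count, MPI_DOUBLE, MPI_ANY_SOURCE,
                 /* tag = */ my_thread_id, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
    }

The obvious cost is that the tag can no longer carry its usual application meaning, which is one motivation for proposals such as Endpoints.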

McMPI

Author: Daniel Holmes
Posted: 3 May 2013 | 15:18

This article originally appeared on Jeff Squyres's Cisco blog and was written while I was undertaking my PhD, before I joined EPCC as a member of staff. I thought it would be of interest to folks reading this blog.

My PhD involved building a message-passing library using C#: not accessing an existing MPI library from C# code, but creating a brand-new MPI library written entirely in pure C#. The result is McMPI (Managed-code MPI), which is compliant with MPI-1 – as far as it can be, given that there are no language bindings for C# in the MPI Standard. It also achieves reasonably good performance in micro-benchmarks of latency and bandwidth, in both shared memory and distributed memory.
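For anyone curious what those micro-benchmarks look like, the classic test is a ping-pong between two ranks: half the round-trip time gives the latency, and message size divided by transfer time gives the bandwidth. A minimal sketch (in C here, although McMPI itself is written in C#):

    #include <mpi.h>
    #include <stdio.h>

    /* Minimal ping-pong micro-benchmark: rank 0 sends to rank 1 and
     * waits for the echo; half the average round-trip time approximates
     * the one-way latency. Larger NBYTES values measure bandwidth. */
    int main(int argc, char **argv)
    {
        enum { NBYTES = 8, REPS = 1000 };
        char buf[NBYTES];
        int rank;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        double t0 = MPI_Wtime();
        for (int i = 0; i < REPS; i++) {
            if (rank == 0) {
                MPI_Send(buf, NBYTES, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
                MPI_Recv(buf, NBYTES, MPI_CHAR, 1, 0, MPI_COMM_WORLD,
                         MPI_STATUS_IGNORE);
            } else if (rank == 1) {
                MPI_Recv(buf, NBYTES, MPI_CHAR, 0, 0, MPI_COMM_WORLD,
                         MPI_STATUS_IGNORE);
                MPI_Send(buf, NBYTES, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
            }
        }
        double t1 = MPI_Wtime();

        if (rank == 0)
            printf("one-way latency ~ %g us\n",
                   (t1 - t0) / (2.0 * REPS) * 1.0e6);

        MPI_Finalize();
        return 0;
    }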