March 2018 meeting of the MPI Forum

21 April 2018

At the March 2018 meeting of the MPI Forum, the “Persistent Collectives” proposal began the formal ratification procedure and the “Sessions” proposal took a step forward, but the “Fault Tolerance” saga took a step sideways.

The proposal to add persistent collective operations to MPI was formally read at the March meeting and was well received by all those present. The first vote on this proposal will happen in June and the second vote in September. If all goes well, this addition to MPI will be announced at SC18.

The Sessions working group is now starting to write new text for the next version of the MPI Standard. Initially, the group will focus on adding new object constructors for Communicators, Windows, and Files that take an MPI_Group rather than an MPI_Comm as a “parent”.

The motivation for these new functions is that they avoid creating intermediate Communicator objects and the resources those objects consume. The ultimate goal, of course, is to remove the reliance on the built-in Communicator, MPI_COMM_WORLD, entirely, and to allow MPI to be initialised in a better way than via the MPI_INIT or MPI_INIT_THREAD functions.

However, it will be possible to use these new functions without buying in to the rest of the changes included in the full Sessions proposal, and so the full proposal can be neatly split up into more manageable pieces.

The Fault Tolerance working group brought forward the proposal for User-Level Failure Mitigation (ULFM) for another formal reading at the March meeting, but there are still unresolved concerns about several aspects of this large proposal.

The working group was asked to split their proposal into smaller pieces so that some of the less controversial changes can be voted on separately from the more contentious aspects.

With luck, this will be a route to getting some basic fault-tolerance capability into MPI soon.

There is a lot going on in the MPI Forum right now. If you want to get involved, send me an email!