MPI: sending and receiving in multi-threaded MPI implementations

Author: Daniel Holmes
Posted: 5 Jul 2013 | 16:06

Have you ever wanted to send a message using MPI to a specific thread in a multi-threaded MPI process? With the current MPI Standard, there is no way to distinguish one thread from another: the whole MPI process has a single rank in each communicator.

Let’s take a simple example. We start with two operating system processes. We call MPI_Init_thread in each of them, requesting the MPI_THREAD_MULTIPLE thread support level, so now each OS process is also an MPI process (with ranks 0 and 1 in MPI_COMM_WORLD). We then create a second thread in each of these processes, for example by entering an OpenMP parallel region. At some point during the program, each thread wants to send some information to the other threads. If only one thread in each MPI process wants to communicate, there is no problem. However, what happens when both threads in rank 0 send a message to rank 1 and both threads in rank 1 receive a message from rank 0? If the messages use the same communicator and tag, they are “logically concurrent”, so there is no guarantee of the matching order. Each thread in rank 1 could receive the message that was intended for the other thread in rank 1.

With the current MPI Standard, this is resolved by using the “tag” parameter to force the correct matching behaviour, but this means using “tag” like an extension of rank. Would it not be better to give each thread its own rank?
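To make the tag workaround concrete, here is a minimal sketch of one way to fold a thread index into the tag. The helper names (make_thread_tag, tag_dest_thread, tag_user_part) and the encoding are illustrative assumptions, not part of MPI; only the MPI_Send and MPI_Recv calls shown in the comment are real MPI functions.

```c
#include <assert.h>

/* Hypothetical helper (not part of MPI): pack an application-level tag and
 * the index of the destination thread into a single MPI tag value.
 * Assumes every process runs the same number of threads. */
int make_thread_tag(int user_tag, int dest_thread, int threads_per_process)
{
    assert(user_tag >= 0);
    assert(dest_thread >= 0 && dest_thread < threads_per_process);
    return user_tag * threads_per_process + dest_thread;
}

/* Recover the destination thread index from a packed tag. */
int tag_dest_thread(int tag, int threads_per_process)
{
    return tag % threads_per_process;
}

/* Recover the application-level tag from a packed tag. */
int tag_user_part(int tag, int threads_per_process)
{
    return tag / threads_per_process;
}

/* Intended use in the two-process, two-thread example, where t is the
 * index of the calling thread (0 or 1):
 *
 *   sender, thread t on rank 0:
 *     MPI_Send(buf, n, MPI_INT, 1, make_thread_tag(0, t, 2), MPI_COMM_WORLD);
 *
 *   receiver, thread t on rank 1:
 *     MPI_Recv(buf, n, MPI_INT, 0, make_thread_tag(0, t, 2), MPI_COMM_WORLD,
 *              MPI_STATUS_IGNORE);
 */
```

Because each receiving thread posts its receive with a different tag, a message can only match the receive of the thread it was meant for. The cost is that part of the tag space is given up to thread addressing, and the MPI Standard only guarantees that tags up to 32767 are valid, so the packed value has to stay within the implementation's MPI_TAG_UB.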

There is a proposal being discussed by the MPI Forum to introduce a way to create a new communicator that has more ranks in it than the parent communicator. These extra ranks are called end-points. The new method for creating an end-points communicator is:

MPI_Comm_create_endpoints(
            MPI_Comm parent,
            int my_num_ep,
            MPI_Info info,
            MPI_Comm out_comm[] )

In our simple example, we could create an end-points communicator using MPI_COMM_WORLD as the parent, requesting 2 end-points in the local MPI process (my_num_ep) and supplying an info object, if desired. The function returns an array of communicator handles, one handle for each local end-point requested. In our case, this new communicator would have 4 ranks. Each thread in each MPI process will use its own communicator handle and therefore has its own rank.
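The rank arithmetic can be sketched in plain C. This is an illustration only: the text above does not say how end-point ranks are ordered, so the sketch assumes they are numbered contiguously in parent-rank order, and the helper names endpoints_comm_size and endpoint_rank are hypothetical rather than part of the proposal.

```c
#include <assert.h>

/* Hypothetical helper: size of the end-points communicator, given the
 * number of end-points requested by each process of the parent
 * communicator (num_ep[p] is the my_num_ep value passed by parent rank p). */
int endpoints_comm_size(const int num_ep[], int parent_size)
{
    int total = 0;
    for (int p = 0; p < parent_size; ++p) {
        assert(num_ep[p] >= 0);
        total += num_ep[p];
    }
    return total;
}

/* Hypothetical helper: rank of a given local end-point in the new
 * communicator. ASSUMPTION: ranks are assigned contiguously, ordered by
 * parent rank first and by local end-point index second. */
int endpoint_rank(const int num_ep[], int parent_size,
                  int parent_rank, int local_ep)
{
    assert(parent_rank >= 0 && parent_rank < parent_size);
    assert(local_ep >= 0 && local_ep < num_ep[parent_rank]);

    int rank = local_ep;
    for (int p = 0; p < parent_rank; ++p) {
        rank += num_ep[p];  /* skip the end-points of lower parent ranks */
    }
    return rank;
}
```

With two processes requesting two end-points each (num_ep = {2, 2}), the communicator size comes out as 4 and, under the assumed ordering, the second thread of parent rank 1 would hold rank 3. A sender could then address that thread directly by rank instead of encoding its identity in the tag.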

As the number of threads in each process increases and the programming models for threads become more widely adopted, the MPI Forum is looking for ways to modify the MPI Standard to make programming with MPI and threads easier.

Contact

Daniel Holmes, EPCC
