Cross-compiling OpenFOAM for KNL

Author: Adrian Jackson
Posted: 27 Jan 2017 | 14:30

Breaking of a dam (from the OpenFOAM user guide)

For those of you not acquainted with OpenFOAM, it's a large open-source computational fluid dynamics (CFD) package used by a wide variety of scientists and companies to investigate a whole range of scientific and engineering problems.

We support it on ARCHER and have a number of different versions available and in use on the machine. As part of our Intel Parallel Computing Centre (IPCC) work we are interested in looking at the performance of OpenFOAM on the latest Xeon Phi processor, Knights Landing (KNL).

However, it isn't always the easiest package to install. It uses its own build environment, called wmake, and can require a large number of third-party libraries and packages to be built as well. A typical ARCHER build takes a number of hours to complete, and installing new versions often requires some modification of the build environment(s) to work (see here for more details).

So, it was with some trepidation that I tried building OpenFOAM for the ARCHER KNL system. As KNL hosts an operating system, it should be possible to build OpenFOAM directly on the KNL, and the compilation should be straightforward. However, as ARCHER's KNLs are set up as a production environment, with a batch system and login nodes, you cannot easily compile an application directly on the KNL. Partly this is to ensure the valuable hardware resources aren't used for activities that can be done on cheaper hardware (i.e. the login nodes), and partly it's to keep the operating system on the production nodes as simple as possible.

The upshot is that you need to compile on different hardware from that on which the code will run. The compilers are set up to cross-compile, i.e. to produce binaries for the production hardware, but if your build process executes any of the programs it has just compiled, building the application becomes more complicated. For the versions of OpenFOAM we were building on KNL (2.4.0 and 4.1), one stage of the third-party library build (specifically Scotch) builds and then runs an executable as part of the compilation.

On our ARCHER setup this causes the compilation to fail, because the executable (dummysizes) is built with floating-point instructions that do not exist on the login node processors (the compilers are configured to generate executables for the KNL floating-point instruction set). Specifically, it fails with the following error:

./dummysizes library.h scotch.h 
Makefile:2951: recipe for target 'scotch.h' failed 
make[1]: *** [scotch.h] Illegal instruction

There is probably a neat cross-compilation fix for this, but I got around the issue by building that executable specifically for the login nodes, rather than for the KNL, and then re-running the build process. This involved unloading the module that sets up the compilers for KNL on ARCHER (the craype-mic-knl module), going into the ThirdParty-2.4.0/scotch_6.0.3/src/libscotch directory, building dummysizes manually (make dummysizes), and then modifying the makefile so that dummysizes isn't deleted when make clean is run.

We also encountered another issue building the Scotch libraries for the KNL: some variables were not being passed through the compilation process. The file ThirdParty-2.4.0/scotch_6.0.3/src/libscotch/library_version contains some definitions that should be filled in at compile time, i.e.:

void 
SCOTCH_version ( 
int * const                 versptr, 
int * const                 relaptr, 
int * const                 patcptr) 
{
   *versptr = SCOTCH_VERSION;
   *relaptr = SCOTCH_RELEASE;
   *patcptr = SCOTCH_PATCHLEVEL;
}

However, in our build SCOTCH_VERSION, SCOTCH_RELEASE, and SCOTCH_PATCHLEVEL were not being recognised. It's likely this was a misconfiguration on our part, but a simple fix was to put these numbers in manually.

That enabled the third-party libraries to be built correctly, and the rest of the OpenFOAM build was straightforward (using the same configuration as we use on the main ARCHER machine). Interestingly, though, because we're using newer compilers on the KNL system than on the main ARCHER machine (gcc 6.1.0 vs gcc 5.3.0), we also had to modify a single OpenFOAM source file for version 4.1 because of a conflict between the C and C++ MPI header files.

The file OpenFOAM-4.1/src/parallel/decompose/ptscotchDecomp/ptscotchDecomp.C failed to compile, with errors about multiple declarations of MPI variables and routines. After a bit of googling and investigation we managed to fix this by editing the file and moving the MPI header include outside the extern "C" block, i.e. changing this:

...
extern "C" {
     #include <mpi.h>
     #include <stdio.h>
     #include "ptscotch.h"
}
...

to this:

...
#include <mpi.h>
extern "C" {
     #include <stdio.h>
     #include "ptscotch.h"
}
...

Once this file was updated, the build of OpenFOAM-4.1 completed successfully. The plan now is to run some performance tests, and to build versions using the Intel compilers as well, to allow performance comparisons between the two compilers on KNL.

We're also aware that others have worked to optimise OpenFOAM on KNL, and there is some code available on the OpenFOAM GitHub that provides optimised solver routines (implemented to make efficient use of the KNL vector instructions) and an improved memory allocator that makes efficient use of the MCDRAM on the processor. Once we've got our initial performance baseline we'll look at adding these optimisations and seeing what impact they have.

We'll write a new blog post when we have some performance data and share the results.

Author

Adrian Jackson, EPCC
Adrian on Twitter: @adrianjhpc