The experience of a lifetime: ISC’17 Student Cluster Competition

13 July 2017

A team of students from EPCC's MSc programmes took part in this year's Student Cluster Competition at the International Supercomputing Conference (ISC) in Germany. The competition requires teams to design and configure a cluster on which they optimise and run benchmarks and applications within a power budget of 3000 watts.

Here Team EPCC and its coach Emmanouil Farsarakis tell us about their hard work and its rewards.

Benchmarks and applications

A few months before the competition, the HPC Advisory Council announced three benchmarks (HPCC, HPL and HPCG) and three applications (FEniCS, TensorFlow and miniDFT). Students have the opportunity to investigate their performance and power consumption before the competition, as well as to try to optimise them with respect to the hardware and software configuration.

Since some of the applications, such as FEniCS and TensorFlow, may be unfamiliar to students, the first step is learning how to use them. During that period, we could also change the design of our cluster in response to what our investigation showed. On the second day of the competition, teams were given the 'secret' application LAMMPS to optimise and run on their clusters.

Our cluster and sponsors

Team EPCC’s cluster was designed in collaboration with our team’s sponsor Boston Limited, which also provided the hardware. We would like to thank our sponsor for giving us the opportunity to choose among state-of-the-art hardware, namely Intel KNL, NVIDIA P100 and Intel Xeon CPUs, as well as the liquid cooling provided by CoolIT Systems.

Our cluster is composed of three nodes, each with three NVIDIA P100 GPUs and two Intel Xeon E5-2630v4 CPUs. In addition, each node has two SSDs and 128GB of DDR4 RAM. A Mellanox InfiniBand switch provides the interconnect. Initially our cluster used only air cooling. Adding liquid cooling allowed us to remove power-hungry fans, reducing power consumption overall.

The preparation period also included a two-day training trip to our sponsor, Boston Limited, in London. That was the first time we had hands-on experience with the cluster, giving us an opportunity to learn how to handle the hardware properly in case any hardware changes were necessary during the competition. During this trip, the second and third nodes of the cluster were added and configured. This was an excellent learning experience that taught us more about the software and hardware configurations necessary to build a cluster.

Our work

One of our primary goals for the competition was the Highest LINPACK award, which heavily influenced our decision regarding the hardware configuration of our cluster. Having extensively experimented with two GPUs per node (using four nodes) and three GPUs per node (using three nodes) for HPL, we decided that the latter option was more promising in terms of both performance and power consumption. Our final result for the competition was 33.99 TFLOPS, the third highest HPL performance achieved during the competition.
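
As a rough illustration of the trade-off described above, the short Python sketch below tallies the GPU count of the two layouts and derives a lower bound on power efficiency from the figures quoted in this article. The actual power drawn during the HPL run is not stated here, so the sketch assumes only that the run stayed within the 3000 watt budget; the function and variable names are ours, chosen for illustration.

```python
# Figures quoted in the article; names and structure are illustrative only.
POWER_BUDGET_W = 3000        # competition power limit in watts
HPL_RESULT_TFLOPS = 33.99    # Team EPCC's reported HPL result

# The two layouts that were compared: (number of nodes, GPUs per node).
layouts = {
    "four nodes, two GPUs each": (4, 2),
    "three nodes, three GPUs each": (3, 3),
}


def total_gpus(nodes, gpus_per_node):
    """Total number of GPUs in a layout."""
    return nodes * gpus_per_node


def min_gflops_per_watt(tflops, watts):
    """Lower bound on efficiency, assuming at most `watts` was drawn."""
    return tflops * 1000 / watts


for name, (nodes, gpus) in layouts.items():
    print(f"{name}: {total_gpus(nodes, gpus)} GPUs across {nodes} nodes")

efficiency = min_gflops_per_watt(HPL_RESULT_TFLOPS, POWER_BUDGET_W)
print(f"HPL efficiency: at least {efficiency:.2f} GFLOPS per watt")
```

The three-node layout fits one more GPU into one fewer chassis, so fewer CPUs, disks and memory modules share the same power budget.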

Our work for the competition consisted of porting and optimising the applications and benchmarks on the cluster. This included investigating input parameters, compilers, optimisation flags, MPI implementations and different parallelisation techniques, as well as the use of several libraries. FEniCS, a well-known library for solving partial differential equations, was used to solve a Poisson equation during the competition. TensorFlow, a numerical computation library using data flow graphs, was used in a CAPTCHA challenge in which neural networks had to correctly identify a series of four- and six-character strings. miniDFT, which was extracted from QUANTUM ESPRESSO, was used for the code challenge, which required a more detailed investigation of optimisations within the code itself.
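
To give a flavour of the kind of problem FEniCS solves, here is a minimal sketch of a one-dimensional Poisson problem, -u'' = 1 on (0, 1) with u(0) = u(1) = 0, discretised with piecewise-linear finite elements. This is plain Python, not FEniCS code and not the competition input; the problem, the mesh size and the function name are assumptions chosen purely for illustration.

```python
def solve_poisson_1d(n_elements):
    """Solve -u'' = 1 on (0, 1), u(0) = u(1) = 0, with linear finite elements.

    Needs n_elements >= 2. Returns the node coordinates and the
    approximate solution at those nodes.
    """
    h = 1.0 / n_elements
    n = n_elements - 1                      # number of interior nodes
    # The stiffness matrix is tridiagonal: (1/h) * [-1, 2, -1].
    lower = [-1.0 / h] * n
    diag = [2.0 / h] * n
    upper = [-1.0 / h] * n
    # Load vector for f = 1: each interior hat function integrates to h.
    rhs = [h] * n

    # Thomas algorithm: forward elimination ...
    for i in range(1, n):
        factor = lower[i] / diag[i - 1]
        diag[i] -= factor * upper[i - 1]
        rhs[i] -= factor * rhs[i - 1]
    # ... followed by back substitution.
    u = [0.0] * n
    u[-1] = rhs[-1] / diag[-1]
    for i in range(n - 2, -1, -1):
        u[i] = (rhs[i] - upper[i] * u[i + 1]) / diag[i]

    nodes = [i * h for i in range(n_elements + 1)]
    return nodes, [0.0] + u + [0.0]         # add the boundary values


nodes, u = solve_poisson_1d(8)
# The exact solution of this problem is u(x) = x * (1 - x) / 2.
for x, value in zip(nodes, u):
    print(f"x = {x:.3f}  u = {value:.5f}  exact = {x * (1 - x) / 2:.5f}")
```

A real FEniCS run works in two or three dimensions on much larger meshes and in parallel, which is where the choice of compilers, MPI implementations and libraries starts to matter.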

The overall experience

Undoubtedly, this was the experience of a lifetime. First of all, it was a unique learning experience. We would not otherwise have had the opportunity to experience so many different aspects of HPC, in terms of both hardware and software, in such a short time. Moreover, we had the chance to collaborate with Boston Limited, a major HPC vendor in the UK, and to be introduced to more companies and HPC experts during the competition. Talking with other teams was very inspiring, as we had the chance to exchange ideas and learn from each other. All in all, participating in this conference and trying our best to achieve great results in the competition was a rewarding experience that was worth all the hard work and time invested.

Acknowledgements

We would like to thank our supervisors as well as our coach Emmanouil Farsarakis for their support and guidance. Finally, special thanks go to Boston Limited, and to Konstantinos Mouzakitis and David Power, for the hardware and technical support provided throughout the entire period.

Team EPCC comprised students of the MSc in High Performance Computing and the MSc in HPC with Data Science: Alexandros Nakos, Andriani Mappoura, Chao Peng and Jingmei Zhang. Team Coach Emmanouil Farsarakis is an Applications Consultant at EPCC.

The Student Cluster Competition took place at the 2017 International Supercomputing Conference in Frankfurt, Germany from 19–21 June.