Adrian Jackson's blog

Spreading the love

Author: Adrian Jackson
Posted: 10 Mar 2017 | 13:54

Binding processes to coresThread and process binding

Note, this post was updated on the 23rd March 2017 to include how to bind threads correctly on Cray systems (aprun -cc rather than taskset)

Making sure threads and processes are correctly placed, or bound, on cores or processors is essential to ensure good performance for a range of parallel applications. 

This is not a new topic, and has been covered well by others before, ie http://www.glennklockwood.com/hpc-howtos/process-affinity.html. Generally this is just handled for you; if you're running an MPI program then your mpirun/mpiexec/aprun job launcher will do sensible process binding to cores. 

Optimised tidal modelling

Author: Adrian Jackson
Posted: 2 Feb 2017 | 11:37

Fluidity for tidal modelling

Tidal model

Figure 1: Mesh for the Sound of Islay tidal simulation. Courtesy Dr Creech.

We were recently involved in a project to optimise the CFD modelling package Fluidity for tidal modelling. This ARCHER eCSE project was primarily carried out by Dr Angus Creech from the Institute of Energy Systems in Edinburgh.

Cross-compiling OpenFOAM for KNL

Author: Adrian Jackson
Posted: 27 Jan 2017 | 14:30

Breaking of a dam (from the OpenFOAM user guide)

For those of you not acquainted with OpenFOAM, it's a large open source CFD package used by a wide variety of scientists and companies to investigate a whole range of scientific and engineering problems. 

We support it on ARCHER and have a number of different versions available and in use on the machine. As part of our IPCC work we are interested in looking at the performance of OpenFOAM on the latest Xeon Phi processor, Knights Landing (KNL).

EPCC at SC16 (booth 701)

Author: Adrian Jackson
Posted: 8 Nov 2016 | 23:59

SC16 LogoSupercomputing 2016

EPCC is gearing up for next week's Supercomputing conference, SC16, in Salt Lake City, USA. If you'll be there too, come and visit us at booth #701. We'll also be involved in a whole range of workshops, panels, birds of a feather sessions (BoFs), and talks.

Benevolent dictator vs democracy: which are you coding for?

Author: Adrian Jackson
Posted: 27 Sep 2016 | 16:47

Developing for the real world

As part of a recent ARCHER eCSE project I developed a new parallelisation strategy for a computational simulation application to enable it to scale efficiently to larger process counts. We managed to significantly reduce the parallel overheads, so the code was accepted into the main repository for users to exploit.

Self racing cars

Author: Adrian Jackson
Posted: 16 Sep 2016 | 11:34

Roborace DevBot on the trackAutonomous racing

Recently EPCC's Alan Gray and I attended a workshop at Donington Park held by Roborace.  For those who've not heard of Roborace, it's a project to build and race autonomous cars, along the lines of Formula 1 but without any drivers or human control of the cars.  Actually, it's more like Formula E but without drivers, as the plan is for the cars to be electric.

Secure access to remote systems

Author: Adrian Jackson
Posted: 5 Sep 2016 | 10:35

06/09/16: As pointed out by my colleague Stephen in the comments after this post, the way to solve most of these issues is to tunnel the key authentication and therefore bypass the need to have private keys anywhere but on my local machine.  I'm always learning :)

Password vs key

Having to remember a range of passwords for systems that I don't use regularly is hard. 

You can use a password manager, but that only helps if I'm only ever trying to log in from my own laptop. If I have to log in from someone else's machine for any reason then I'd need to know the password.

MPI performance on KNL

Author: Adrian Jackson
Posted: 30 Aug 2016 | 12:22

Knights Landing MPI performance

Following on from our recent post on early experiences with KNL performance, we have been looking at MPI performance on Intel's latest many-core processor.

MPI ping-pong latency on KNC and IvyBridge
Figure 1

The MPI performance on the first generation of Xeon Phi processor (KNC) was one of the reasons that some of the applications we ported to KNC had poor performance.  Figures 1 and 2 show the latency and bandwidth of an MPI ping-pong benchmark running on a single KNC and on a 2x8-core IvyBridge node.

Early experiences with KNL

Author: Adrian Jackson
Posted: 29 Jul 2016 | 16:45

Initial experiences on early KNL

Updated 1st August 2016 to add a sentence describing the MPI configurations of the benchmarks run.
Updated 30th August 2016 to add CASTEP performance numbers on Broadwell with some discussion

EPCC was lucky enough to be allowed access to Intel's early KNL (Knights Landing, Intel's new Xeon Phi processor) cluster, through our IPCC project.  KNL Processor Die

KNL is a many-core processor, successor to the KNC, that has up to 72 cores, each of which can run 4 threads, and 16 GB of high bandwidth memory stacked directly on to the chip.

Latest Top500 list, looking beyond the number 1

Author: Adrian Jackson
Posted: 21 Jun 2016 | 17:13

There's been a lot of discussion about the latest Top500 list, released this week at ISC16.  Most of the interest has been in the whopping new Chinese system, Sunway TaihuLight, which has come in at number 1 on the list with a massive 93 PFlop/s rpeak Linpack performance, and 125 PFlop/s rmax theoretical peak performance (3 times bigger than the previous number 1 system).Top500

Whilst this is a very interesting system, and much bigger than is currently planned elsewhere, it's not unknown for very large systems to come in and dominate the list like this.  Back in 2002, the Japanese Earth Simulator system became the number 1 machine with an rpeak of ~5x that of the previous number 1 system, and it stayed as the top machine for a number of years.

Pages