Adrian Jackson's blog
Posted: 17 Jul 2019 | 14:11
As part of the NEXTGenIO project we have a prototype HPC system that has two Intel Omni-Path networks attached to each node. The aim of having a dual-rail network setup for that system is to investigate the performance and functionality benefits of having separate networks for MPI communications and for I/O storage communications, either directing Lustre traffic and MPI traffic over separate networks, or using a separate network to access NVDIMMs over RDMA. We were also interested in the performance benefits for general applications exploiting multiple networks for MPI traffic, if and where possible.
Posted: 12 Dec 2017 | 11:16
November 2017 Top500
My initial impression of the latest Top500 list, released last month at the SC17 conference in Denver, was that little has changed. This might not be the conclusion that many will have reached, and indeed we will come on to consider some big changes (or perceived big changes) that have been widely discussed, but looking at the Top 10 entries there has been little movement since the previous list (released in June).
Posted: 16 Aug 2017 | 15:59
I recently attended the 2017 Flash Memory Summit, a conference primarily aimed at storage technology and originally based around flash memory, although it has expanded to cover all forms of non-volatile storage technology.
Non-volatile memory is a big deal nowadays. It is memory that stores data even when it has no power (unlike the volatile memory in computers that lose data when power is switched off). Flash memory is a particular form for non-volatile memory, it's been used for a long time, and has had a massive impact on consumer technology, from the storage in your cameras and phones, to SSD hard drives routinely installed in laptop and desktop systems.
Posted: 10 Aug 2017 | 20:22
Distributed ledgers, the core technology underlying digital currencies such as BitCoin, offer some interesting functionality for constructing distributed data infrastructures.
Ledgers can be considered to be simple data stores. They are styled on accounting ledgers, books where transactions are recorded one after the other, and the overall state of the accounts can be evaluated by working through the recorded transactions to calculate how much money has flowed in and out of the accounts.
Posted: 15 Jun 2017 | 13:41
We are entering the fourth year of the Intel Parallel Computing Centre (IPCC). This collaboration on code porting and optimisation has focussed on improving the performance of scientific applications on Intel hardware, specifically its Xeon and Xeon Phi processors.
Posted: 30 May 2017 | 11:01
Posted: 24 May 2017 | 19:30
When we parallelise and optimise computational simulation codes we always have choices to make. Choices about the type of parallel model to use (distributed memory, shared memory, PGAS, single sided, etc), whether the algorithm used needs to be changed, what parallel functionality to use (loop parallelisation, blocking or non-blocking communications, collective or point-to-point messages, etc).
Posted: 11 May 2017 | 00:06
As part of the ARCHER Knights Landing (KNL) processor testbed, we have produced and collected a set of benchmark reports on the performance of various scientific applications on the system. This has involved the ARCHER CSE team, EPCC's Intel Parallel Computing Center (IPCC) team, and various users of the system all benchmarking and documenting the performance they have experienced.
Posted: 11 Apr 2017 | 17:59
Shall I compare thee...
Performance comparisons are always tricky to get exactly right. They are needed to ensure that we can demonstrate the performance improvements that optimisations, new hardware, new algorithms, etc... have had on an application or benchmark, but there is a lot of latitude in what can be compared, which makes it easy to get a performance comparison wrong and not properly demonstrate whatever it is you're trying to show.
Posted: 10 Mar 2017 | 13:54
Thread and process binding
Note, this post was updated on the 23rd March 2017 to include how to bind threads correctly on Cray systems (aprun -cc rather than taskset)
Making sure threads and processes are correctly placed, or bound, on cores or processors is essential to ensure good performance for a range of parallel applications.
This is not a new topic, and has been covered well by others before, ie http://www.glennklockwood.com/hpc-howtos/process-affinity.html. Generally this is just handled for you; if you're running an MPI program then your mpirun/mpiexec/aprun job launcher will do sensible process binding to cores.