Posted: 30 Aug 2016 | 12:22
Knights Landing MPI performance
Following on from our recent post on early experiences with KNL performance, we have been looking at MPI performance on Intel's latest many-core processor.
Poor MPI performance on the first generation of Xeon Phi processor (KNC) was one of the reasons that some of the applications we ported to KNC performed badly. Figures 1 and 2 show the latency and bandwidth of an MPI ping-pong benchmark running on a single KNC and on a 2x8-core IvyBridge node.
Posted: 9 Aug 2016 | 17:18
BioExcel is a newly launched Centre of Excellence that helps academic and industrial researchers to use high-performance computing and high-throughput computing in biomolecular research.
We are running a number of events in September and October, and I would be very grateful to anybody who circulates these to people or groups that may be interested (bonus points if you let me know where and to whom you publicise them, so that we can reduce cross-posting).
Posted: 29 Jul 2016 | 16:45
Initial experiences on early KNL
Updated 1st August 2016 to add a sentence describing the MPI configurations of the benchmarks run.
Updated 30th August 2016 to add CASTEP performance numbers on Broadwell with some discussion.
KNL is a many-core processor, successor to the KNC, that has up to 72 cores, each of which can run 4 threads, and 16 GB of high bandwidth memory stacked directly onto the chip.
Posted: 1 Jul 2016 | 10:58
This week I have been at the FEAT (Future Emerging Art and Technology) workshop in Vienna, which aims to promote collaboration between scientists and artists. As I am sure many people will be aware, the EU-funded Future and Emerging Technology (FET) programme consists of scientific projects looking to push the boundaries of research in specific fields.
Posted: 15 Jun 2016 | 13:35
This week sees our annual collaboration workshop with Tsukuba University, Japan (more details are available here). This is a great chance to get a flavour of the kind of research another HPC centre is undertaking, how they work, and what platforms they are investing in.
The Centre for Computational Sciences (CCS) at Tsukuba is a department very like EPCC, in that it is responsible for high performance and parallel computing at the university, runs and supports large-scale computers for researchers, and undertakes parallel computing research.
Posted: 3 Jun 2016 | 16:09
It's a good time to take stock of our achievements and reflect on how to focus our efforts in the final phase. It is also time to consider life after the project ends: how do we want to exploit the technologies we have developed and the knowledge we have gained? How do we ensure a lasting legacy for Adept?
Posted: 27 May 2016 | 10:15
The NEXTGenIO project represents a step along the Exascale pathway.
We are developing a prototype platform that utilises the latest developments in memory technology, and that will offer vastly improved I/O performance compared to current HPC machines. The system will be developed end-to-end by the project partners – from inception through to delivery, with a full suite of systemware that can make use of the new technologies.
Posted: 5 May 2016 | 16:43
Recently I seem to have had many conversations about programming languages for HPC. In some ways this is not a new subject - I have been having similar conversations for the last 20 years. However, as HPC hardware evolves, machines become more complex, and so do the issues that programmers need to address. So it is not surprising that we are wondering if there is more the compiler could be doing to help us.
Posted: 19 Apr 2016 | 23:14
Anyone taking more than a passing interest in HPC hardware recently will have noticed that there are a number of reasonably significant trends coming to fruition in 2016. Of particular interest to me are on-package memory, integrated functionality, and new processor competitors.
On-package memory, that is, memory directly attached to the processor, has been promised for a number of years now. The first product of this type I can remember was Micron's Hybrid Memory Cube around 2010/2011, but it has taken a few years for the hardware to become mature enough (or technically feasible and cheap enough) to make it into mass-market chips. We now have it in the form of MCDRAM for Intel's upcoming Xeon Phi processor (Knights Landing), and as HBM2 on Nvidia's recently announced P100 GPU.
Posted: 30 Mar 2016 | 17:46
Does array index order affect performance?
A couple of weeks ago I was teaching an ARCHER Modern Fortran course, and one of the things we discuss during the course is index ordering for multi-dimensional arrays. The course is an introduction to modern Fortran (primarily F90/F95), so we don't go into much detail about parallel or performance programming. However, as attendees are likely to be using Fortran for computational simulation, it is important they understand which array dimensions are contiguous in memory so that they don't accidentally write code that is much slower than it should be.
Figure 1: Performance using the GNU compiler
During one of the practical sessions on the course, one of the students wrote a little program to investigate the performance impact of iterating through array elements in a non-contiguous order. They also included some code to investigate whether there is a performance impact when using allocatable arrays rather than static arrays (I'd mentioned it shouldn't impact performance but I obviously wasn't convincing enough...).
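To make the contiguity point concrete, here is a minimal sketch of how the two loop orders walk through memory. It is not the student's program and does not measure any timings. It is written in Python rather than Fortran, and simply models Fortran's column-major layout, in which the first index varies fastest in memory. The function names (column_major_offset, visit_order, max_stride) are hypothetical and chosen for illustration only.

```python
def column_major_offset(i, j, n_rows):
    """Offset, in elements, of a(i, j) in a column-major array (0-based indices).

    In Fortran's layout, elements of one column sit next to each other,
    so moving along the first index moves one element in memory.
    """
    return i + j * n_rows


def visit_order(n_rows, n_cols, first_index_innermost):
    """Return the memory offsets touched, in order, by a doubly nested loop."""
    offsets = []
    if first_index_innermost:
        # Inner loop over the first index: contiguous access.
        for j in range(n_cols):
            for i in range(n_rows):
                offsets.append(column_major_offset(i, j, n_rows))
    else:
        # Inner loop over the second index: strided access.
        for i in range(n_rows):
            for j in range(n_cols):
                offsets.append(column_major_offset(i, j, n_rows))
    return offsets


def max_stride(offsets):
    """Largest jump, in elements, between two consecutive accesses."""
    return max(abs(b - a) for a, b in zip(offsets, offsets[1:]))


if __name__ == "__main__":
    n_rows, n_cols = 4, 3
    contiguous = visit_order(n_rows, n_cols, first_index_innermost=True)
    strided = visit_order(n_rows, n_cols, first_index_innermost=False)
    print("first index innermost: ", contiguous, "max stride", max_stride(contiguous))
    print("second index innermost:", strided, "max stride", max_stride(strided))
```

For a 4 x 3 array, the first ordering visits offsets 0 to 11 in sequence. The second jumps by 4 elements (the column length) as it moves along each row, and on a large array those jumps are what defeat the cache.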