Hardware

First phase of ARCHER2 installed at Advanced Computing Facility

Author: Lorna Smith
Posted: 15 Jul 2020 | 15:41

ARCHER2, the new UK national supercomputing service, is a world-class advanced computing resource for UK researchers. The service is due to commence later in 2020, replacing the current ARCHER service. 

The four-cabinet Shasta Mountain system completed its journey from Cray’s Chippewa Falls factory in the US to EPCC’s Advanced Computing Facility in July. This is the first phase of the 23-cabinet system of ARCHER2, the UK’s next national supercomputing service.

First phase of ARCHER2 arrives in Edinburgh!

Author: Lorna Smith
Posted: 13 Jul 2020 | 10:45

The four-cabinet Shasta Mountain system, the first phase of the 23-cabinet system, has completed its journey from Chippewa Falls in Wisconsin, making its way from Prestwick Airport to Edinburgh this morning.

The arrival of these large crates has, I admit, generated quite a lot of excitement here. Moving these specialist systems and getting the right people here to install them is a logistical challenge at the best of times, but with the necessary Covid-19 restrictions this has been considerably more challenging than usual. We are really grateful to our colleagues at Cray/HPE for all their planning and perseverance! It is a huge step forward to see these systems on site. You will be reassured to know that all necessary safety precautions have been taken to meet Covid-19 guidance and to keep everyone safe.  

ARCHER2: first four cabinets ship from the US

Author: Lorna Smith
Posted: 7 Jul 2020 | 10:04

Covid-19 has created significant challenges for the delivery of the new ARCHER2 system. It is therefore really exciting to see the first four cabinets of ARCHER2 leave Cray/HPE’s factory in Chippewa Falls, Wisconsin to begin their journey to Edinburgh.

ARCHER2 will replace the current ARCHER system, a Cray XC30, as the UK’s National HPC system. Once fully configured, this should provide an average of over 11 times the science throughput of ARCHER.

Under pressure

Author: Adrian Jackson
Posted: 23 Mar 2020 | 10:45

Squeezed performance

Memory under pressure

I was recently working with a colleague to investigate performance issues on a login node for one of our HPC systems. I should say upfront that looking at performance on a login node is generally not advisable: login nodes are shared resources and are not optimised for performance.

We always tell our students not to run performance benchmarking on login nodes, because it's hard to ensure the results are reproducible. However, in this case we were just running a very small (serial) test program on the login node to ensure it worked before submitting it to the batch system, and my colleague noticed an unusual performance variation across the login nodes.
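By way of illustration only (the actual test program isn't shown here), a serial test of this kind can be as small as timing a single pass over an array; the file name, array size and repeat count below are arbitrary choices, not details from the post.

    /* Hypothetical sketch of a small serial timing test of the kind
     * described above. Build with: cc -O2 -o stream_time stream_time.c */
    #define _POSIX_C_SOURCE 199309L
    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>

    #define N (64 * 1024 * 1024)   /* 64M doubles, ~512 MB: illustrative size */

    int main(void)
    {
        double *a = malloc(N * sizeof(double));
        if (!a) { perror("malloc"); return 1; }

        /* Touch the memory so it is actually allocated before timing. */
        for (size_t i = 0; i < N; i++) a[i] = (double)i;

        struct timespec t0, t1;
        clock_gettime(CLOCK_MONOTONIC, &t0);

        double sum = 0.0;
        for (size_t i = 0; i < N; i++) sum += a[i];   /* stream through memory */

        clock_gettime(CLOCK_MONOTONIC, &t1);
        double secs = (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) * 1e-9;

        printf("sum = %g, time = %.3f s, read bandwidth ~ %.2f GB/s\n",
               sum, secs, (N * sizeof(double)) / secs / 1e9);
        free(a);
        return 0;
    }

Running the same binary on each login node a few times and comparing the reported times is enough to expose the sort of node-to-node variation mentioned above, even though it says nothing yet about the cause.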

Edinburgh International Data Facility: an overview of Phase 1

Author: Rob Baxter
Posted: 22 Nov 2019 | 12:10

Developed by EPCC, the Edinburgh International Data Facility (EIDF) will facilitate new products, services, and research by bringing together regional, national and international datasets.

NEXTGenIO: the end is just the beginning

Author: Michele Weiland
Posted: 7 Nov 2019 | 14:55

After four years of hard work, the NEXTGenIO project has now come to an end. It has been an extremely enjoyable and successful collaboration with a dedicated group of HPC users, software and tools developers, and hardware providers from across Europe.

Precision persistent programming

Author: Adrian Jackson
Posted: 30 Oct 2019 | 12:48

Targeted Performance

Optane DIMM

Blog post updated 8th November 2019 to add Figure 6 highlighting PMDK vs fsdax performance for a range of node counts.

Following on from the recent blog post on our initial performance experiences when using byte-addressable persistent memory (B-APM), in the form of Intel's Optane DCPMM memory modules, for data storage and access within compute nodes, we have been exploring the performance of, and programming models for, such memory beyond simple filesystem functionality.

For our previous performance results we used what is known as an fsdax (Filesystem Direct Access) filesystem, which bypasses the operating system (O/S) page cache and the associated extra memory copies for I/O operations. We used an ext4 filesystem on fsdax, although ext2 and xfs filesystems are also supported.
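For readers unfamiliar with the programming side, the sketch below shows the basic PMDK libpmem pattern for writing through a file on a DAX-capable filesystem. It is a minimal, hypothetical example, not code from the project: the mount point and sizes are placeholders.

    /* Illustrative use of PMDK's libpmem on an fsdax-mounted filesystem.
     * The pool path is a placeholder. Build with: cc pmem_demo.c -lpmem */
    #include <stdio.h>
    #include <string.h>
    #include <libpmem.h>

    #define POOL_PATH "/mnt/pmem0/demo-file"   /* placeholder fsdax location */
    #define POOL_SIZE (64 * 1024 * 1024)       /* 64 MB, illustrative */

    int main(void)
    {
        size_t mapped_len;
        int is_pmem;

        /* Map (creating if necessary) a file on the DAX filesystem. */
        char *addr = pmem_map_file(POOL_PATH, POOL_SIZE, PMEM_FILE_CREATE,
                                   0666, &mapped_len, &is_pmem);
        if (addr == NULL) {
            perror("pmem_map_file");
            return 1;
        }

        const char *msg = "hello, persistent memory";
        if (is_pmem) {
            /* Store and flush with CPU instructions, no page cache involved. */
            pmem_memcpy_persist(addr, msg, strlen(msg) + 1);
        } else {
            /* Not real pmem (e.g. an ordinary disk): fall back to msync. */
            memcpy(addr, msg, strlen(msg) + 1);
            pmem_msync(addr, strlen(msg) + 1);
        }

        pmem_unmap(addr, mapped_len);
        return 0;
    }

The point of the comparison in Figure 6 is essentially the difference between this direct load/store path and going through conventional file I/O on the same fsdax mount.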

Global or local - which is best?

Author: Adrian Jackson
Posted: 9 Oct 2019 | 17:30

Selfish performance

Sharing resources creates challenges for the performance and scaling of large parallel applications. In the NEXTGenIO project we have been focusing specifically on I/O and data management/storage costs, working from the realisation that current filesystems will struggle to efficiently load and store data for millions of processes or tasks all requesting different data sets or pieces of information.

Multi-network MPI on Intel Omni-Path

Author: Adrian Jackson
Posted: 17 Jul 2019 | 14:11

Networks

As part of the NEXTGenIO project we have a prototype HPC system with two Intel Omni-Path networks attached to each node. The aim of this dual-rail setup is to investigate the performance and functionality benefits of having separate networks for MPI communications and for I/O storage communications: either directing Lustre traffic and MPI traffic over separate networks, or using a separate network to access NVDIMMs over RDMA. We were also interested in the performance benefits, where possible, of exploiting multiple networks for MPI traffic in general applications.
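For context, the measurements involved are typically point-to-point bandwidth microbenchmarks run over each network configuration in turn. The sketch below is a generic example of such a test, not the benchmark used in the project; the message size and repetition count are arbitrary.

    /* Minimal MPI bandwidth ping-pong, the kind of microbenchmark used to
     * compare single- and dual-rail network configurations.
     * Build: mpicc -O2 pingpong.c ; run with 2 ranks on different nodes. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <mpi.h>

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);

        int rank, size;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);
        if (size < 2) {
            if (rank == 0) fprintf(stderr, "needs at least 2 ranks\n");
            MPI_Abort(MPI_COMM_WORLD, 1);
        }

        const int msg_bytes = 4 * 1024 * 1024;  /* 4 MB messages */
        const int reps = 100;
        char *buf = malloc(msg_bytes);

        MPI_Barrier(MPI_COMM_WORLD);
        double t0 = MPI_Wtime();

        for (int i = 0; i < reps; i++) {
            if (rank == 0) {
                MPI_Send(buf, msg_bytes, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
                MPI_Recv(buf, msg_bytes, MPI_CHAR, 1, 0, MPI_COMM_WORLD,
                         MPI_STATUS_IGNORE);
            } else if (rank == 1) {
                MPI_Recv(buf, msg_bytes, MPI_CHAR, 0, 0, MPI_COMM_WORLD,
                         MPI_STATUS_IGNORE);
                MPI_Send(buf, msg_bytes, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
            }
        }

        double elapsed = MPI_Wtime() - t0;
        if (rank == 0)
            printf("bandwidth ~ %.2f GB/s\n",
                   2.0 * reps * msg_bytes / elapsed / 1e9);

        free(buf);
        MPI_Finalize();
        return 0;
    }

Running the same test with MPI traffic constrained to one rail and then allowed to use both (via the MPI library's network selection settings) gives a first indication of whether a dual-rail configuration pays off for communication-bound codes.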

HPC for urgent decision-making

Author: Nick Brown
Posted: 5 Jul 2019 | 11:13

The EU VESTEC research project is focused on the use of HPC for urgent decision-making, and the project team will be running a workshop at SC’19.

VESTEC will build a flexible toolchain to combine multiple data sources, efficiently extract essential features, enable flexible scheduling and interactive supercomputing, and realise 3D visualisation environments for interactive explorations.
