Hardware

Meet the ARCHER2 HPC Systems team

Author: Kieran Leach
Posted: 14 Apr 2021 | 10:46

The HPC Systems Team provides the System Development and System Operations functions for ARCHER2 - but who are we and what do we do?

We are a team of fifteen systems administrators and developers who deploy, manage, and maintain the services and systems offered by EPCC, along with the infrastructure required to host and support them.

EPCC's Advanced Computing Facility: planning for future growth

Author: Paul Clark
Posted: 1 Feb 2021 | 12:10

Since 2005, the Advanced Computing Facility (ACF) has housed all the major systems managed by EPCC. It has expanded and evolved since its creation, becoming one of the most innovative and efficient facilities of its kind in the world.

The building and its internals have changed greatly since I started in February 2018, as part of a drive to ensure that our wider master planning for the site is reflected in what visitors see. This includes a video wall using Raspberry Pis and PiWall software to allow us to demonstrate HPC visualisations to visitors.

We are developing a site-wide Data Centre Infrastructure Management (DCIM) approach which allows us to view real-time data on room and system performance, both on screens outside the different rooms and on our video wall.
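As a rough illustration of the idea, the minimal Python sketch below polls a sensor endpoint and prints the readings that a screen like this might display. The URL and the JSON layout are assumptions made for illustration only, not our actual DCIM API.

```python
import json
import time
import urllib.request

# Hypothetical endpoint: a DCIM setup like the one described implies some
# aggregation service exposing room/system sensor readings. The URL and
# JSON shape here are illustrative assumptions, not EPCC's actual API.
SENSORS_URL = "http://dcim.example.org/api/rooms/cr4/sensors"

def poll_once():
    with urllib.request.urlopen(SENSORS_URL, timeout=10) as resp:
        readings = json.load(resp)
    # Assumed shape: [{"name": ..., "value": ..., "unit": ...}, ...]
    for sensor in readings:
        print(f"{sensor['name']}: {sensor['value']} {sensor['unit']}")

if __name__ == "__main__":
    while True:
        poll_once()
        time.sleep(30)  # refresh the display every 30 seconds
```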

In addition, the ACF has had significant investment over the years, most recently with the creation of Computer Room 4, the home of the new Edinburgh International Data Facility (EIDF). We also host and support a number of other HPC systems at the ACF, such as the National Tier-2 system, Cirrus. The first phase of the next UK national supercomputing service, ARCHER2, has also been installed.

Update: The Edinburgh International Data Facility

Author: Rob Baxter
Posted: 13 Jan 2021 | 10:35

Work on the Edinburgh International Data Facility has passed three key milestones, bringing the infrastructure that will underpin the £600m Data-Driven Innovation Programme significantly closer to reality.

Firstly, and perhaps most importantly, EIDF’s home, Computer Room 4 (cr4) at the University’s Advanced Computing Facility, completed its main construction phase at the end of the third quarter of 2020 and has now entered its commissioning and fit-out phase. If everything goes to plan, we will start to build infrastructure in the room from January 2021.

First phase of ARCHER2 installed at Advanced Computing Facility

Author: Lorna Smith
Posted: 15 Jul 2020 | 15:41

ARCHER2, the new UK national supercomputing service, is a world-class advanced computing resource for UK researchers. The service is due to commence later in 2020, replacing the current ARCHER service. 

The four-cabinet Shasta Mountain system completed its journey from Cray’s Chippewa Falls factory in the US to EPCC’s Advanced Computing Facility in July. This is the first phase of the 23-cabinet ARCHER2 system, the UK’s next national supercomputing service.

First phase of ARCHER2 arrives in Edinburgh!

Author: Lorna Smith
Posted: 13 Jul 2020 | 10:45

The four-cabinet Shasta Mountain system, the first phase of the 23-cabinet system, has completed its journey from Chippewa Falls in Wisconsin, making its way from Prestwick airport to Edinburgh this morning.   

The arrival of these large crates has, I admit, generated quite a lot of excitement here. Moving these specialist systems and getting the right people here to install them is a logistical challenge at the best of times, and the necessary Covid-19 restrictions have made it considerably harder than usual. We are really grateful to our colleagues at Cray/HPE for all their planning and perseverance! It is a huge step forward to see these systems on site. You will be reassured to know that all necessary safety precautions have been taken to meet Covid-19 guidance and to keep everyone safe.

ARCHER2: first four cabinets ship from the US

Author: Lorna Smith
Posted: 7 Jul 2020 | 10:04

Covid-19 has created significant challenges for the delivery of the new ARCHER2 system. It is therefore really exciting to see the first four cabinets of ARCHER2 leave Cray/HPE’s factory in Chippewa Falls, Wisconsin, to begin their journey to Edinburgh.

ARCHER2 will replace the current ARCHER system, a Cray XC30, as the UK’s National HPC system. Once fully configured, it should deliver, on average, over 11 times the science throughput of ARCHER.

Under pressure

Author: Adrian Jackson
Posted: 23 Mar 2020 | 10:45

I was recently working with a colleague to investigate performance issues on a login node for one of our HPC systems. I should say upfront that looking at performance on a login node is generally not advisable: login nodes are shared resources and are not optimised for performance.

We always tell our students not to run performance benchmarks on login nodes, because it is hard to ensure the results are reproducible. However, in this case we were just running a very small (serial) test program on the login node to check that it worked before submitting it to the batch system, and my colleague noticed an unusual performance variation across login nodes.
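To give a flavour of the kind of check involved, here is a minimal Python sketch (a stand-in for illustration, not the actual test program from the post) that times a small serial, memory-bound kernel several times and reports the run-to-run spread. On a busy or memory-pressured login node, that spread can be surprisingly large.

```python
import statistics
import time

def serial_kernel(n=200_000_000):
    # Small, serial, memory-bound loop: allocate a ~200 MB buffer and
    # walk it one page at a time, so timings are sensitive to memory
    # pressure and competing activity on a shared node.
    data = bytearray(n)
    total = 0
    for i in range(0, n, 4096):
        total += data[i]
    return total

samples = []
for _ in range(5):
    start = time.perf_counter()
    serial_kernel()
    samples.append(time.perf_counter() - start)

print(f"min={min(samples):.3f}s max={max(samples):.3f}s "
      f"stdev={statistics.stdev(samples):.3f}s")
```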

Edinburgh International Data Facility: an overview of Phase 1

Author: Rob Baxter
Posted: 22 Nov 2019 | 12:10

Developed by EPCC, the Edinburgh International Data Facility (EIDF) will facilitate new products, services, and research by bringing together regional, national and international datasets.

NEXTGenIO: the end is just the beginning

Author: Michele Weiland
Posted: 7 Nov 2019 | 14:55

After four years of hard work, the NEXTGenIO project has now come to an end. It has been an extremely enjoyable and successful collaboration with a dedicated group of HPC users, software and tools developers, and hardware providers from across Europe.

Precision persistent programming

Author: Adrian Jackson
Posted: 30 Oct 2019 | 12:48

Blog post updated 8th November 2019 to add Figure 6 highlighting PMDK vs fsdax performance for a range of node counts.

Following on from the recent blog post on our initial performance experiences using byte-addressable persistent memory (B-APM), in the form of Intel's Optane DCPMM memory modules, for data storage and access within compute nodes, we have been exploring the performance and programming of such memory beyond simple filesystem functionality.

For our previous performance results we used what is known as an fsdax (Filesystem Direct Access) filesystem, which bypasses the operating system (O/S) page cache and the associated extra memory copies for I/O operations. We were using an ext4 filesystem on fsdax, although ext2 and xfs filesystems are also supported.
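As a minimal sketch of what fsdax gives you from user space: memory-mapping a file on a dax-mounted filesystem means that loads and stores through the mapping reach the persistent memory directly, with no page-cache copy in between. The mount point and file path below are assumptions for illustration.

```python
import mmap
import os

# Assumed layout: a file on an ext4 filesystem mounted with the dax
# option (e.g. /dev/pmem0 mounted at /mnt/pmem0). Path is illustrative.
PATH = "/mnt/pmem0/example.dat"
SIZE = 4096

fd = os.open(PATH, os.O_CREAT | os.O_RDWR, 0o644)
os.ftruncate(fd, SIZE)

# Under fsdax, loads and stores through this mapping go straight to the
# persistent memory, bypassing the O/S page cache mentioned above.
buf = mmap.mmap(fd, SIZE)
buf[0:5] = b"hello"
buf.flush()   # msync(): make sure the stores are actually persistent
buf.close()
os.close(fd)
```

Libraries such as PMDK (mentioned in the update note above) build finer-grained persistence primitives on top of this same direct-access mechanism.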
