Posted: 14 Apr 2021 | 10:46
The HPC Systems Team provides the System Development and System Operations functions for ARCHER2 - but who are we and what do we do?
We are a team of fifteen System Administrators and Developers who work to deploy, manage, and maintain the services and systems offered by EPCC, as well as the infrastructure required to host and support all of EPCC’s services and systems.
Posted: 1 Feb 2021 | 12:10
Since 2005, the Advanced Computing Facility (ACF) has housed all the major systems managed by EPCC. It has expanded and evolved since its creation, becoming one of the most innovative and efficient facilities of its kind in the world.
The building and its internals have changed greatly since I started in February 2018, as part of a drive to ensure that our wider master planning for the site is reflected in what visitors see. This includes a video wall using Raspberry Pis and PiWall software to allow us to demonstrate HPC visualisations to visitors.
We are developing a site-wide Data Centre Infrastructure Management (DCIM) approach which allows us to view real-time data on room and system performance on screens outside the different rooms and on our video wall.
In addition, the ACF has had significant investment over the years, most recently with the creation of Computer Room 4, the home of the new Edinburgh International Data Facility (EIDF). We also host and support a number of other HPC systems at the ACF, such as the National Tier-2 system, Cirrus. The first phase of the next UK national supercomputing service, ARCHER2, has also been installed.
Posted: 13 Jan 2021 | 10:35
Work on the Edinburgh International Data Facility has passed three key milestones, bringing the infrastructure that will underpin the £600m Data-Driven Innovation Programme significantly closer to reality.
Firstly, and perhaps most importantly, EIDF’s home, Computer Room 4 (cr4) at the University’s Advanced Computing Facility, completed its main construction phase at the end of the third quarter of 2020 and cr4 has entered its commissioning and fit-out phase. If everything goes to plan, we will start to build infrastructure in the room from January 2021.
Posted: 15 Jul 2020 | 15:41
ARCHER2, the new UK national supercomputing service, is a world-class advanced computing resource for UK researchers. The service is due to commence later in 2020, replacing the current ARCHER service.
The four-cabinet Shasta Mountain system completed its journey from Cray’s Chippewa Falls factory in the US to EPCC’s Advanced Computing Facility in July. This is the first phase of the 23-cabinet system of ARCHER2, the UK’s next national supercomputing service.
Posted: 13 Jul 2020 | 10:45
The four-cabinet Shasta Mountain system, the first phase of the 23-cabinet system, has completed its journey from Chippewa Falls in Wisconsin, making its way from Prestwick airport to Edinburgh this morning.
The arrival of these large crates has, I admit, generated quite a lot of excitement here. Moving these specialist systems and getting the right people here to install them is a logistical challenge at the best of times, but with the necessary Covid-19 restrictions this has been considerably more challenging than usual. We are really grateful to our colleagues at Cray/HPE for all their planning and perseverance! It is a huge step forward to see these systems on site. You will be reassured to know that all necessary safety precautions have been taken to meet Covid-19 guidance and to keep everyone safe.
Posted: 7 Jul 2020 | 10:04
Covid-19 has created significant challenges for the delivery of the new ARCHER2 system. It is therefore really exciting to see the first 4 cabinets of ARCHER2 leave Cray/HPE’s factory in Chippewa Falls, Wisconsin to begin their journey to Edinburgh.
ARCHER2 will replace the current ARCHER system, a Cray XC30, as the UK’s national HPC system. Once fully configured, it should provide an average of over 11 times the science throughput of ARCHER.
Posted: 23 Mar 2020 | 10:45
I was recently working with a colleague to investigate performance issues on a login node for one of our HPC systems. I should say upfront that looking at performance on a login node is generally not advisable: login nodes are shared resources that are not optimised for performance.
We always tell our students not to run performance benchmarking on login nodes, because it's hard to ensure the results are reproducible. However, in this case we were just running a very small (serial) test program on the login node to ensure it worked before submitting it to the batch system, and my colleague noticed an unusual performance variation across the login nodes.
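The kind of spot check described above can be scripted. Below is a minimal sketch, not the actual test program we used: it times an illustrative serial workload a few times on one node and reports the spread, so that running it on several login nodes makes variation easy to compare. The workload and the size parameters are placeholders.

```python
import statistics
import time

def time_kernel(n=200_000, repeats=5):
    """Time a small serial workload several times; return the samples."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        total = 0
        for i in range(n):          # illustrative compute loop
            total += i * i
        samples.append(time.perf_counter() - start)
    return samples

samples = time_kernel()
median = statistics.median(samples)
# Spread as a fraction of the median run time; a large value on one
# login node but not another hints at contention on the busy node.
spread = (max(samples) - min(samples)) / median
print(f"median {median:.4f}s, spread {spread:.1%} of median")
```

Run on each login node in turn, a consistently larger median or spread on one node points at that node rather than at the test program.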
Posted: 22 Nov 2019 | 12:10
Developed by EPCC, the Edinburgh International Data Facility (EIDF) will facilitate new products, services, and research by bringing together regional, national and international datasets.
Posted: 7 Nov 2019 | 14:55
After four years of hard work, the NEXTGenIO project has now come to an end. It has been an extremely enjoyable and successful collaboration with a dedicated group of HPC users, software and tools developers, and hardware providers from across Europe.
Posted: 30 Oct 2019 | 12:48
Blog post updated 8th November 2019 to add Figure 6 highlighting PMDK vs fsdax performance for a range of node counts.
Following on from the recent blog post on our initial performance experiences when using byte-addressable persistent memory (B-APM) in the form of Intel's Optane DCPMM memory modules for data storage and access within compute nodes, we have been exploring performance and programming such memory beyond simple filesystem functionality.
For our previous performance results we used what is known as an fsdax (Filesystem Direct Access) filesystem, which bypasses the operating system (OS) page cache and the associated extra memory copies for I/O operations. We were using an ext4 filesystem on fsdax, although ext2 and xfs filesystems are also supported.
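The programming model on an fsdax filesystem is the familiar one: files are memory-mapped and accessed with ordinary loads and stores, and on a DAX mount those accesses go straight to the persistent memory rather than through the page cache. A minimal sketch of that pattern, using a placeholder path rather than a real fsdax mount point (on an ordinary filesystem the same code simply goes through the page cache):

```python
import mmap
import os

PATH = "/tmp/dax_demo.bin"   # stand-in; on a real system this would sit on the fsdax mount
SIZE = 4096

# Create and size the backing file.
fd = os.open(PATH, os.O_CREAT | os.O_RDWR, 0o644)
os.ftruncate(fd, SIZE)

# Map the file and store data with plain memory writes.
with mmap.mmap(fd, SIZE) as m:
    m[0:13] = b"hello, pmem!\n"
    m.flush()   # on fsdax this ensures the stores reach the media
os.close(fd)

# Read it back through the normal file interface.
with open(PATH, "rb") as f:
    print(f.read(13))
```

Going beyond this, libraries such as PMDK replace the explicit flush bookkeeping with persistence-aware primitives, which is the direction the measurements in this post explore.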