Posted: 13 Jan 2021 | 10:35
Work on the Edinburgh International Data Facility has passed three key milestones, bringing the infrastructure that will underpin the £600m Data-Driven Innovation Programme significantly closer to reality.
Firstly, and perhaps most importantly, EIDF’s home, Computer Room 4 (cr4) at the University’s Advanced Computing Facility, completed its main construction phase at the end of the third quarter of 2020 and has now entered its commissioning and fit-out phase. If everything goes to plan, we will start to build infrastructure in the room from January 2021.
Posted: 15 Jul 2020 | 15:41
ARCHER2, the new UK national supercomputing service, is a world-class advanced computing resource for UK researchers. The service is due to commence later in 2020, replacing the current ARCHER service.
The four-cabinet Shasta Mountain system completed its journey from Cray’s Chippewa Falls factory in the US to EPCC’s Advanced Computing Facility in July. This is the first phase of the 23-cabinet system of ARCHER2, the UK’s next national supercomputing service.
Posted: 13 Jul 2020 | 10:45
The four-cabinet Shasta Mountain system, the first phase of the 23-cabinet system, has completed its journey from Chippewa Falls in Wisconsin, making its way from Prestwick airport to Edinburgh this morning.
The arrival of these large crates has, I admit, generated quite a lot of excitement here. Moving these specialist systems and getting the right people here to install them is a logistical challenge at the best of times, but with the necessary Covid-19 restrictions this has been considerably more challenging than usual. We are really grateful to our colleagues at Cray/HPE for all their planning and perseverance! It is a huge step forward to see these systems on site. You will be reassured to know that all necessary safety precautions have been taken to meet Covid-19 guidance and to keep everyone safe.
Posted: 7 Jul 2020 | 10:04
Covid-19 has created significant challenges for the delivery of the new ARCHER2 system. It is therefore really exciting to see the first 4 cabinets of ARCHER2 leave Cray/HPE’s factory in Chippewa Falls, Wisconsin to begin their journey to Edinburgh.
ARCHER2 will replace the current ARCHER system, a Cray XC30, as the UK’s National HPC system. Once fully configured, ARCHER2 should provide an average of over 11 times the science throughput of ARCHER.
Posted: 23 Mar 2020 | 10:45
I was recently working with a colleague to investigate performance issues on a login node for one of our HPC systems. I should say upfront that looking at performance on a login node is generally not advisable: login nodes are shared resources and are not optimised for performance.
We always tell our students not to run performance benchmarking on login nodes, because it's hard to ensure the results are reproducible. However, in this case we were just running a very small (serial) test program on the login node to ensure it worked before submitting it to the batch system, and my colleague noticed an unusual performance variation across login nodes.
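As a rough illustration of how such a variation might be spotted, the sketch below times a small serial kernel several times and summarises the spread. It is not the test program from the post: `serial_kernel`, `time_runs` and `summarise` are hypothetical names, and the kernel is only a stand-in for a real serial test.

```python
import socket
import statistics
import time


def serial_kernel(n=200_000):
    """Hypothetical stand-in for a small serial test program."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def time_runs(func, repeats=5):
    """Run func `repeats` times and return the wall-clock time of each run."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        timings.append(time.perf_counter() - start)
    return timings


def summarise(timings):
    """Return the minimum, the median and the relative spread (max - min) / min."""
    fastest = min(timings)
    return {
        "min": fastest,
        "median": statistics.median(timings),
        "spread": (max(timings) - fastest) / fastest,
    }


if __name__ == "__main__":
    # Print the hostname so results from different login nodes can be compared.
    print(socket.gethostname(), summarise(time_runs(serial_kernel)))
```

Running the same script on each login node and comparing the minimum times gives a first hint of node-to-node differences, although on a shared node any single measurement should be treated with caution.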
Posted: 22 Nov 2019 | 12:10
Developed by EPCC, the Edinburgh International Data Facility (EIDF) will facilitate new products, services, and research by bringing together regional, national and international datasets.
Posted: 7 Nov 2019 | 14:55
After four years of hard work, the NEXTGenIO project has now come to an end. It has been an extremely enjoyable and successful collaboration with a dedicated group of HPC users, software and tools developers, and hardware providers from across Europe.
Posted: 30 Oct 2019 | 12:48
Blog post updated 8th November 2019 to add Figure 6 highlighting PMDK vs fsdax performance for a range of node counts.
Our recent blog post described our initial performance experiences with byte-addressable persistent memory (B-APM), in the form of Intel's Optane DCPMM memory modules, for data storage and access within compute nodes. Following on from that, we have been exploring the performance and programming of such memory beyond simple filesystem functionality.
For our previous performance results we used what is known as an fsdax (Filesystem Direct Access) filesystem, which enables bypassing the operating system (O/S) page cache and associated extra memory copies for I/O operations. We were using an ext4 filesystem on fsdax, although ext2 and xfs filesystems are also supported.
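To give a feel for the access pattern involved, here is a minimal sketch of writing to a file through a memory mapping rather than through explicit write calls. It uses only Python's standard `mmap` module on an ordinary file, so it still goes through the page cache; the assumption is that, on a filesystem mounted with DAX, the same kind of mapping would address the persistent memory directly. It is not PMDK code, and `write_via_mmap` and `read_back` are hypothetical helper names.

```python
import mmap
import os
import tempfile


def write_via_mmap(path, payload, size=4096):
    """Map `path` into memory and store `payload` with plain byte assignment."""
    with open(path, "w+b") as f:
        f.truncate(size)  # the mapping needs a file of non-zero length
        with mmap.mmap(f.fileno(), size) as region:
            region[:len(payload)] = payload  # a memory store, not a write() call
            region.flush()  # ask the OS to write the changes back to the file


def read_back(path, length):
    """Read the first `length` bytes of the file with ordinary file I/O."""
    with open(path, "rb") as f:
        return f.read(length)


if __name__ == "__main__":
    # An ordinary temporary file; a path on an fsdax mount is an assumption
    # that this sketch does not test.
    path = os.path.join(tempfile.mkdtemp(), "region.bin")
    write_via_mmap(path, b"hello")
    print(read_back(path, 5))
```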
Posted: 9 Oct 2019 | 17:30
Sharing of resources has challenges for the performance and scaling of large parallel applications. In the NEXTGenIO project we have been focusing specifically on I/O and data management/storage costs, working from the realisation that current filesystems will struggle to efficiently load and store data from millions of processes or tasks all requesting different data sets or bits of information.
Posted: 17 Jul 2019 | 14:11
As part of the NEXTGenIO project we have a prototype HPC system that has two Intel Omni-Path networks attached to each node. The aim of having a dual-rail network setup for that system is to investigate the performance and functionality benefits of having separate networks for MPI communications and for I/O storage communications, either directing Lustre traffic and MPI traffic over separate networks, or using a separate network to access NVDIMMs over RDMA. We were also interested in the performance benefits for general applications exploiting multiple networks for MPI traffic, if and where possible.