Posted: 23 Jun 2021 | 10:05
For over a decade our community has enjoyed significant performance benefits by leveraging heterogeneous supercomputers. Whilst GPUs are the most common form of accelerator, there are also other hardware technologies that can be complementary.
Field Programmable Gate Arrays (FPGAs) enable developers to configure the chip directly, effectively running their application at the electronics level. Tailoring code execution and avoiding the general-purpose architecture imposed by CPUs and GPUs offers potential performance and power benefits. As such, FPGAs have been popular in embedded computing for many years, but they have not yet enjoyed the same uptake in HPC.
Posted: 21 Jun 2021 | 10:55
We provide world-class computing systems, data storage and support services. Here are some highlights from our work in this area.
ExCALIBUR FPGA testbed
ExCALIBUR is a £45.7m programme to address the challenges and opportunities offered by computing at the exascale (high performance computing at 10¹⁸ floating point operations per second). The programme will address problems of strategic importance, and how to approach them in an efficient, effective, and productive fashion on the world’s largest computers.
Posted: 11 May 2021 | 10:36
EPCC held its first GPU hackathon this April in partnership with NVIDIA, hosting 28 participants and 20 mentors across seven teams. The event was held virtually due to the ongoing Covid-19 pandemic. Using Zoom and Slack, individual teams were able to work alongside mentors in separate breakout rooms and channels.
Posted: 14 Apr 2021 | 10:46
The HPC Systems Team provides the System Development and System Operations functions for ARCHER2 - but who are we and what do we do?
We are a team of fifteen System Administrators and Developers who work to deploy, manage, and maintain the services and systems offered by EPCC, as well as the infrastructure required to host and support all of EPCC’s services and systems.
Posted: 1 Feb 2021 | 12:10
Since 2005, the Advanced Computing Facility (ACF) has housed all the major systems managed by EPCC. It has expanded and evolved since its creation, becoming one of the most innovative and efficient facilities of its kind in the world.
The building and its internals have changed greatly since I started in February 2018, as part of a drive to ensure that our wider master planning for the site is reflected in what visitors see. This includes a video wall using Raspberry Pis and PiWall software that allows us to demonstrate HPC visualisations to visitors.
We are developing a site-wide Data Centre Infrastructure Management (DCIM) approach which allows us to view real-time data on room and system performance on screens outside the different rooms and on our video wall.
In addition, the ACF has had significant investment over the years, most recently with the creation of Computer Room 4, the home of the new Edinburgh International Data Facility (EIDF). We also host and support a number of other HPC systems at the ACF, such as the National Tier-2 system, Cirrus. The first phase of the next UK national supercomputing service, ARCHER2, has also been installed.
Posted: 13 Jan 2021 | 10:35
Work on the Edinburgh International Data Facility has passed three key milestones, bringing the infrastructure that will underpin the £600m Data-Driven Innovation Programme significantly closer to reality.
Firstly, and perhaps most importantly, EIDF’s home, Computer Room 4 (cr4) at the University’s Advanced Computing Facility, completed its main construction phase at the end of the third quarter of 2020 and cr4 has entered its commissioning and fit-out phase. If everything goes to plan, we will start to build infrastructure in the room from January 2021.
Posted: 15 Jul 2020 | 15:41
ARCHER2, the new UK national supercomputing service, is a world-class advanced computing resource for UK researchers. The service is due to commence later in 2020, replacing the current ARCHER service.
The four-cabinet Shasta Mountain system completed its journey from Cray’s Chippewa Falls factory in the US to EPCC’s Advanced Computing Facility in July. This is the first phase of the 23-cabinet system of ARCHER2, the UK’s next national supercomputing service.
Posted: 13 Jul 2020 | 10:45
The four-cabinet Shasta Mountain system, the first phase of the 23-cabinet system, has completed its journey from Chippewa Falls in Wisconsin, making its way from Prestwick airport to Edinburgh this morning.
The arrival of these large crates has, I admit, generated quite a lot of excitement here. Moving these specialist systems and getting the right people here to install them is a logistical challenge at the best of times, but with the necessary Covid-19 restrictions this has been considerably more challenging than usual. We are really grateful to our colleagues at Cray/HPE for all their planning and perseverance! It is a huge step forward to see these systems on site. You will be reassured to know that all necessary safety precautions have been taken to meet Covid-19 guidance and to keep everyone safe.
Posted: 7 Jul 2020 | 10:04
Covid-19 has created significant challenges for the delivery of the new ARCHER2 system. It is therefore really exciting to see the first 4 cabinets of ARCHER2 leave Cray/HPE’s factory in Chippewa Falls, Wisconsin to begin their journey to Edinburgh.
ARCHER2 will replace the current ARCHER system, a Cray XC30, as the UK’s National HPC system. Once fully configured, this should provide an average of over 11 times the science throughput of ARCHER.
Posted: 23 Mar 2020 | 10:45
I was recently working with a colleague to investigate performance issues on a login node for one of our HPC systems. I should say upfront that looking at performance on a login node is generally not advisable: login nodes are shared resources, not optimised for performance.
We always tell our students not to run performance benchmarking on login nodes, because it is hard to ensure the results are reproducible. However, in this case we were just running a very small (serial) test program on the login node to ensure it worked before submitting it to the batch system, and my colleague noticed an unusual performance variation across login nodes.