Training at EPCC: Fundamentals of HPC System Administration

9 December 2025

EPCC's online Fundamentals of HPC System Administration course provides essential skills required by the sector.

When recruiting for our HPC Systems team, we frequently interview candidates with very strong system administration skills but very little experience of HPC systems. Informal discussions with other HPC service providers confirm that this is also their experience. Since providing training and education is a core objective of EPCC, we decided to bridge the skills gap in this specialised area of system administration by creating the Fundamentals of HPC System Administration (FHPCSA) course.

Lecturers from EPCC's Systems Team (with a combined 67 years' experience in HPC systems administration) delivered this course for the first time in Semester Two of 2024/2025, and feedback from our first class of students was excellent. 

Course content

The course covers a wide range of topics aimed at preparing students to administer HPC systems. In particular:

  • Networks: from common Ethernet to fast HPC interconnects
  • Clustering: how to deploy many hosts with automated tools
  • Schedulers: how to configure and deploy a workload manager
  • Filesystems: network-attached and parallel file systems
  • Logging and monitoring: automated alerts and logs of past actions
  • Information security: how to protect the system and user data
  • User environments: how to enable different types of users to efficiently use HPC systems
  • Automation: how recurring tasks can be automated. 

This course is fully online. There are weekly releases of pre-recorded lectures on each topic as well as any relevant exercises and supplementary material. 

Tutorial sessions with the lecturers are also held weekly, with students encouraged to discuss and ask questions about the topic of the week or the exercises. 

Practical exercises, as well as the final assessment, are all run using the Edinburgh International Data Facility's VM cluster. This gives each student a pool of resources (CPU cores, memory, storage) that can then be partitioned into different virtual servers and clustered together. The exercises give hands-on experience of the different topics and build on one another, so that by the end of the semester students will be able to deploy, configure and manage a fully functional HPC system.

How to apply 

Applications for the next run of our Fundamentals of HPC System Administration course are open now. This course is only available online. 

To apply, visit the same page that you would use to apply for our full MSc programme. See: High Performance Computing programmes. Select “PgProfDev High Performance Computing (2 years)” from the drop-down menu that appears on this page under “Apply”.

For more details about the FHPCSA course, please see the page Fundamentals of HPC Systems Administration.

The course is currently scheduled to run in the second semester of every academic year, from mid-January to May. 

Please note that anyone who wishes to take only this course should apply through the Postgraduate Professional Development route.

MSc programmes at EPCC

To apply to EPCC's online MSc programmes, please see:

EPCC is a leading provider of high performance computing and data science education and training in Europe. Find out how we can help build your career: 
https://www.epcc.ed.ac.uk/education-and-training 

Authors

Dr Rui Apóstolo