How it feels to be a winner!
26 May 2023
Members of TeamEPCC give their first reactions to their incredible win in the ISC23 Student Cluster Competition.
TeamEPCC comprised five students from our MSc programmes in High Performance Computing (HPC) and HPC with Data Science (Hristo Belchev, Ikraduya Edian, Jaffery Irudayasamy, Oleksandr Piekhota, and Tomas Rubio Cruz) and a third-year undergraduate student intern from Edinburgh Napier University's Networking and Cybersecurity programme, Kris Tanev.
It was one of seven teams selected for the on-site Student Cluster Competition (SCC) at ISC23 in Hamburg, Germany, with another 15 teams competing virtually.
The ISC SCC is an annual event in which the selected student teams demonstrate their ability to design their own clusters and achieve the best performance on a selection of benchmarks and scientific applications within a set power usage limit.
I think that what led us to winning was the collaborative spirit of the team as well as the drive of all its members - I felt like I was part of something much bigger than myself and that every member was giving their best to propel everyone forward.
Winning the competition was an amazing experience. Everyone in the team committed an enormous effort to optimise all the applications and benchmarks throughout the year and the hard work really paid off in the end.
Arriving on site, we faced some issues with the cluster, such as missing power adapters and network configuration issues, but the systems support and team members really stepped up to overcome these challenges, allowing for the team's excellent performance.
I am proud to have been a part of TeamEPCC and to have worked with my teammates - all of them are exceptional at what they do and brought amazing energy and passion that allowed us to excel in all the benchmarks. Taking part in ISC has allowed me to forge what I believe are going to be life-long friendships, both with TeamEPCC members and with members of the wider HPC community.
I was originally interested in joining TeamEPCC because it was an opportunity to get hands-on experience with real HPC hardware and applications, as well as a chance to meet the wider community. Preparing for the competition, however, proved to be much more than that - it exposed me to the full software and hardware HPC stacks, allowing me to learn about compilers, NUMA memory, RDMA communications, OpenMPI components and so much more. Arriving at the competition, I felt that the months I spent preparing were worth it, but that experience also showed me that there was much I still didn't know.
If there is anything I have learned, it is that having such capable and driven team members will always lead to an amazing time and learning something new, and that hard work is bound to be rewarded!
To future students/team members: if you are motivated, have a passion for HPC and want to learn beyond the curriculum, joining TeamEPCC is the best thing you can ever do!
At first it felt unbelievable. However, looking back on our preparation for the past 6 months and our effort during the competition, the result felt really well deserved. I really appreciate all the support that we received from our mentors, EPCC, sponsors, and everyone who continues to support us.
I joined TeamEPCC because I am interested in the competition and wanted to apply the knowledge I gained from my MSc in HPC. I believe that I learn and retain knowledge better when I can use it directly, not just learn it from lectures.
Our preparation for the competition spanned approximately six months. After the TeamEPCC members were announced, we held weekly meetings to ensure our progress was on track. These meetings focused on various aspects, including application setup, benchmarking, performance, and addressing any challenges or logistical concerns. As a result, we felt fully prepared for each application and had a well thought out plan for execution during the competition.
Honestly, the competition itself was both tiring and exciting. We felt the pressure from other teams but also made valuable connections with them along the way.
We encountered several challenges during the preparation phase. Initially, the late arrival of our second node limited our time to practice using two nodes simultaneously. Additionally, one of our original nodes was faulty, prompting us to introduce a third node and migrate our application. This added complexity and required additional time.
During the competition, we faced further challenges. Just before the competition began, we noticed our cluster was not behaving as expected due to the low clock speed on our GPU. The issue was traced back to inadequate power supply, which forced us to urgently source additional cables. Thanks to the efforts of Hristo and Spyro, we managed to find the necessary cables in time, resolving the problem.
Another challenge arose when the committee announced the secret application, named MILC. We had not managed to build and run this application by the end of the second day. However, we persevered and leveraged another cluster, ACF (Advanced Computing Facility), in our data centre to figure out how to build and run MILC. With some late-night experimentation, we eventually achieved excellent performance with MILC on the actual cluster.
I believe I performed particularly well in running the FluTAS application and MILC, the secret application. While our FluTAS result may not have been the best according to our exchanges with other teams, our MILC result stood out as the best among the competitors.
The competition has been a tremendous learning experience for me, specifically in the field of HPC clusters and hardware. Coming from a Computer Science background, I had limited knowledge of computer hardware, and this competition has significantly expanded my understanding in this area.
Finally, I would like to express my gratitude to my colleagues in the MSc in High Performance Computing and MSc in High Performance Computing with Data Science programmes for their unwavering support. I am also immensely thankful for the teachings and guidance provided by our dedicated lecturers.
Being part of EPCC and being able to support our MSc students in HPC is a part of my career that I will remember for the rest of my life.
A big day to remember!
Winning first place overall in this year's Student Cluster Competition is a feeling that I can hardly express in a few words, but I can safely say that it's one of a kind.
When I first heard about ISC23 and the opportunity to meet new people and help out fellow colleagues, I immediately wanted to get involved and leave a mark that could inspire future students in our industry.
Of course, we had to fight hard for our success, as the competition was very good, but because of the team's dedication and hard work, we pulled it off. We faced multiple challenges like package problems, networking problems, and some hardware faults in preparation for the competition, but thanks to everybody's teamwork, nothing stopped us.
This experience and the things that I have learned are invaluable to me. From balancing workload, efficient communication, and task prioritisation to technicalities such as running applications with OpenMPI, I have realised that there is always room for improvement, and I will do my utmost to keep growing.
We met many interesting people, learned about many modern HPC trends, saw many HPC companies, and made many connections.
I joined TeamEPCC because it was a great chance to practise and improve our skills in Linux administration, application benchmarking, performance optimization, and running software in a cluster environment. The competition demanded software optimization as well as skill in working with the hardware and monitoring power consumption. In addition, we had to work through many architectural questions, finding the best cluster fit for the competition and deciding between a GPU-first and a CPU-first approach. Last but not least, being part of a great team, visiting an international conference, meeting other HPC students, networking, and representing EPCC were also very important.
Preparing for the competition was a long process. We started during the first semester and did a lot of initial preparation and work before we were approved for the competition.
After we were confirmed, we started working on our application improvements. In my case, working on the HPCC benchmark required extra work to understand the benchmark architecture and apply a tailored approach to each HPCC benchmark component, such as HPL, STREAM, FFT, and others. Experimenting with dependencies, compiler optimization flags, and specific test parameters, and benchmarking how performance changed with the balance between MPI processes and OpenMP threads, allowed us to get as much as possible from the selected hardware.
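As an illustration of the MPI/OpenMP balance mentioned above, here is a minimal sketch of how the candidate process/thread splits for a node could be listed before benchmarking each one. It is not the team's actual tooling: the function names, the `./hpcc` binary path, and the 8-core example are all assumptions made for illustration. Only `OMP_NUM_THREADS` and `mpirun -np` are standard.

```python
def hybrid_layouts(total_cores):
    """Return every (mpi_ranks, omp_threads) pair whose product is total_cores.

    Hypothetical helper: each pair uses every core exactly once, so the
    results of the corresponding runs are directly comparable.
    """
    return [
        (ranks, total_cores // ranks)
        for ranks in range(1, total_cores + 1)
        if total_cores % ranks == 0
    ]


def launch_command(ranks, threads, binary="./hpcc"):
    """Build a launch line for one layout.

    OMP_NUM_THREADS is the standard OpenMP environment variable and
    "mpirun -np" is the usual launcher form; the binary path is assumed.
    """
    return f"OMP_NUM_THREADS={threads} mpirun -np {ranks} {binary}"


if __name__ == "__main__":
    # Print one candidate launch line per layout for an assumed 8-core node.
    for ranks, threads in hybrid_layouts(8):
        print(launch_command(ranks, threads))
```

Each printed line could then be timed in turn, keeping the layout that performs best for a given benchmark component.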
We faced many challenges during the competition: issues with power cables, limited time for running our benchmarks, and problems running and optimizing the secret application. Throughout the competition we endured a lot of emotional pressure and tiredness. Still, all of that disappeared when we saw that our work had produced great results and won us first place in the competition.
The biggest challenge was to find extra cables and electrical adapters to power up our cluster. We hadn't expected that the power delivery units would need more suitable power connectors to supply electricity to the server cluster. We came close to failing the whole competition before it had even started. Thanks to Spyro Nita, Hristo Belchev, and the SCC Supervisors, we were able to fix the issue. However, that happened only several hours before the competition started, and the rush after that point required us to work very precisely and under enormous pressure. In addition, the secret application was challenging to compile and run, so we had to work until very late to make progress. Luckily, we were able to handle that and show good performance.
All the other team members were blocked while we were looking for the cables and adapters. However, my application could still be tested and run. I therefore spent that day, until very late, testing and optimizing its performance, which saved time for the other team members once we had found the cables.
I tried to help the team with every problem they faced during their benchmarks and did not interfere when my presence was not needed. Everyone in the group put significant effort into improving the secret application.
We gained a huge amount of practical experience optimizing applications under tight, time-critical conditions, having to consider even the temperature outside the conference hall. The challenges and experience of sitting and working next to the cluster, monitoring power usage, and benchmarking the applications, knowing that time was limited, were a huge lesson in themselves.
Thanks to our supporters
We thank our sponsor, Hewlett-Packard Enterprise (HPE), and the EPCC systems team for their support.
Below: TeamEPCC competing in ISC23 SCC; submitting final results; awaiting the announcement of the competition winners; on stage at the awards ceremony; celebrating in the TeamEPCC booth at ISC; and with EPCC mentors.
Image credits: results submission by Ophir Maor, HPC-AI Advisory Council; all others by Xu Guo, Spyro Nita, and Kris Tanev.
MSc programmes at EPCC
Read about all our training opportunities on our Education and Training pages.
If you think you could be part of the next TeamEPCC, apply today!
Application deadline for 2023 entry (international students): 31 May 2023.
Application deadline for 2023 entry (UK-based students, including EU/EEA settled/pre-settled status): 15 July 2023.
Application deadline for 2023 entry: August 2023
Applications for 2024 entry will open in early October 2023, with deadlines usually similar to those listed above.