Energy efficient supercomputing

Author: Nick Johnson
Posted: 19 Nov 2013 | 06:35

Today I attended the "First International Workshop on Energy Efficient Supercomputing (E2SC)".

It was really interesting (and confidence boosting) to find that what we are trying to do in the Adept project is similar in approach to the work being done at other research labs. The talks could be divided into two categories: modelling the effect of system parameters (such as cache) on energy and performance efficiency; and methods for measuring energy consumption.

The keynote talk by Pradip Bose (IBM TJ Watson) introduced the trade-off between performance and power. Most HPC programmers understand performance well but may not be aware of its implications for power. Squeezing that last 5 or 10% of performance from your system (or code) costs a huge amount of energy, which is one of the justifications for multi-core over single-threaded operation. However, you might be operating in a system where energy usage is a constraint that must be managed like runtime or IO, so there are trade-offs. To lower power consumption you can reduce voltage, but this can lead to soft errors, which we generally try to avoid. These choices have clear energy implications: trying to reduce power use may increase error rates, and we then have to expend energy to fix those errors.
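
To put rough numbers on that trade-off, here is a minimal Python sketch. It is not taken from the keynote. It assumes the common textbook approximation that dynamic power scales as C * V^2 * f and that voltage has to rise roughly in proportion to frequency, so dynamic power grows with the cube of frequency. Static power, memory-bound code and imperfect parallel scaling are all ignored, and the function names are my own.

```python
def relative_dynamic_power(freq_scale: float) -> float:
    """Dynamic power relative to baseline when frequency is scaled by freq_scale.

    Assumption: P_dyn ~ C * V^2 * f with V proportional to f, hence P_dyn ~ f^3.
    """
    return freq_scale ** 3


def relative_energy(freq_scale: float) -> float:
    """Energy relative to baseline for a fixed amount of compute-bound work.

    Runtime is assumed to scale as 1/f, so energy = power * time ~ f^2.
    """
    return relative_dynamic_power(freq_scale) / freq_scale


if __name__ == "__main__":
    for scale in (1.00, 1.05, 1.10):
        print(
            f"{scale:.2f}x frequency: "
            f"{relative_dynamic_power(scale):.2f}x power, "
            f"{relative_energy(scale):.2f}x energy"
        )

    # Two cores at half frequency versus one core at full frequency,
    # assuming the work parallelises perfectly across both cores.
    two_core_power = 2 * relative_dynamic_power(0.5)
    print(f"two half-speed cores: {two_core_power:.2f}x power, same throughput")
```

Under these assumptions, two half-speed cores deliver the same throughput as one full-speed core at a quarter of the dynamic power, which is the multi-core argument in miniature. Real chips have voltage floors and static power, so the actual gain is smaller.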

The other talks concentrated on power measurement, optimization and modelling strategies. For example, in a modern processor with DVFS (dynamic voltage and frequency scaling), you can try to tune the core frequency to minimise power for the current set of instructions. To do this you need a good model of what's happening in your CPU or system (at the node level). A simple but effective model can be built from a few observations, and it allows you to implement power control algorithms. Alongside this, you need to be able to compare apples with apples to establish how efficient system A is compared to system B. Generally, the current thinking is to characterize the CPU in terms of per-core power consumption for processing a unit of computation.
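
As an illustration of the "simple model from a few observations" idea, here is a hedged Python sketch. It does not reproduce any model presented at the workshop. It assumes per-core power has the form P(f) = p_static + k * f^3, fits the two parameters by ordinary least squares, and then picks the frequency with the lowest energy per unit of work rather than the lowest instantaneous power. The observations are made-up numbers and all names are hypothetical.

```python
# Hypothetical observations: (core frequency in GHz, measured core power in W).
# Made up for illustration; they follow P = 2.0 + 1.0 * f**3 exactly.
OBSERVATIONS = [(1.0, 3.0), (1.5, 5.375), (2.0, 10.0), (2.5, 17.625)]


def fit_power_model(samples):
    """Least-squares fit of P(f) = p_static + k * f**3 (an assumed model form)."""
    xs = [freq ** 3 for freq, _ in samples]
    ys = [power for _, power in samples]
    n = len(samples)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    k = sxy / sxx
    p_static = mean_y - k * mean_x
    return p_static, k


def energy_per_unit_work(freq, p_static, k):
    """Joules per giga-cycle of work at freq GHz (power divided by cycle rate)."""
    return (p_static + k * freq ** 3) / freq


def best_frequency(available, p_static, k):
    """Pick the available frequency with the lowest energy per unit of work."""
    return min(available, key=lambda f: energy_per_unit_work(f, p_static, k))


if __name__ == "__main__":
    p_static, k = fit_power_model(OBSERVATIONS)
    choice = best_frequency([0.8, 1.0, 1.2, 1.5, 2.0], p_static, k)
    print(f"p_static={p_static:.2f} W, k={k:.2f} W/GHz^3, best frequency={choice} GHz")
```

Joules per giga-cycle is one possible "unit of computation" metric for comparing system A with system B. Note that with static power in the model, the lowest available frequency is not automatically the most energy efficient, because the core stays powered for longer.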

Finally, last night I got the chance to meet the folks from the Energy Efficient High Performance Computing Working Group, who are trying to bring some standardization to this emerging world of energy measurement. For those with an interest, their website contains slides from past webinars and various documents.

Author

Nick Johnson, EPCC