Improving the performance of TINKER, a molecular dynamics codebase
Posted: 4 Nov 2014 | 10:52
Justs Zarins reports on his work to improve the performance of TINKER, a molecular dynamics codebase. This 3-month dissertation project was undertaken as part of his MSc in High Performance Computing at EPCC. Justs has now joined EPCC as a post-graduate researcher.
TINKER has a leading implementation of the AMOEBA polarizable force field, which affords more accurate simulations at the cost of increased computation time. To focus the work, it was decided to look only at the implicit solvation part of the code, which calculates properties of systems immersed in a solvent without explicitly simulating the particles that make up the solvent. The project was run in collaboration with researchers from Washington and Southampton universities.
The original code was partially parallelised using OpenMP. Running some test cases showed that parallel scaling was poor, peaking at a speed-up factor of roughly 1.4. Analysing the code revealed that this was not actually that bad: only 35% of the code's run time was parallelised, which allows a theoretical maximum speed-up of just 1.5 according to Amdahl’s law.
The first task, then, was to increase parallel coverage. Most of the run time was spent in a few large loops, which made the code well suited to OpenMP. Parallel coverage was thus increased to 90%, which yielded a speed-up of 3.9. This is much smaller than the theoretical Amdahl’s law prediction of a maximal speed-up of 10! Further investigation provided evidence of suboptimal use of the hardware, which could be caused by memory contention, since the code was originally optimised for serial execution.
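As a quick check of the two Amdahl’s law figures quoted above, here is a minimal Python sketch. It is not part of TINKER: the function name `amdahl_speedup` and its `workers` parameter are made up for illustration, and it assumes the quoted percentages are fractions of serial run time.

```python
def amdahl_speedup(parallel_fraction, workers=None):
    """Speed-up predicted by Amdahl's law.

    parallel_fraction: share of the serial run time that can be
        parallelised, as a number between 0 and 1.
    workers: number of threads; None means the limit of infinitely
        many threads, i.e. the theoretical maximum.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    serial_fraction = 1.0 - parallel_fraction

    if workers is None:
        # With unlimited threads only the serial part remains.
        if serial_fraction == 0.0:
            return float("inf")
        return 1.0 / serial_fraction

    if workers < 1:
        raise ValueError("workers must be at least 1")
    # The parallel part shrinks with the thread count, the serial part does not.
    return 1.0 / (serial_fraction + parallel_fraction / workers)


if __name__ == "__main__":
    # The two coverage levels mentioned in the post.
    for fraction in (0.35, 0.90):
        limit = amdahl_speedup(fraction)
        print(f"parallel fraction {fraction:.2f}: maximum speed-up {limit:.2f}")
```

For 35% coverage the limit works out to about 1.54, and for 90% coverage to 10, which matches the figures above. Passing a finite `workers` value gives the smaller bound that applies on a real machine with that many threads.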
In an ideal world I would have spent time exploring these issues in more depth and rewriting the code to fix them, as any industrious HPC practitioner would. However, it is worth taking a step back and noting that the actual users of the software were glad to receive the speed boost. So while bigger code changes are likely needed to push the performance further, at least for the moment the researchers’ curiosity can be satisfied a few times faster.
Molecular dynamics (MD) is a valuable technique for many scientists, including biologists, chemists and physicists. It allows insight to be gained into atomic-scale processes through computer simulation, which is cheaper and more convenient than direct experimentation. However, the need to explore larger systems more realistically and in greater detail increases the time needed to run these simulations. Code parallelisation and optimisation address this issue and are crucial for ensuring that performance can continue to improve on future hardware.
Image shows Staphylococcal nuclease, an important model system for the study of protein folding, which was used to test the performance of TINKER. PDB ID: 3BDC. Castaneda et al. (2009) Molecular determinants of the pKa values of Asp and Glu residues in staphylococcal nuclease. Proteins 77:570-588.
See also the earlier blog post: Optimising OpenMP implementation of Tinker, a molecular dynamics modelling package