Better software, better COVID-19 research

Author: Mike Jackson
Posted: 30 Apr 2020 | 10:08

A friend recently forwarded me a tweet from Professor Neil Ferguson, director of J-IDEA and the MRC Centre for Global Infectious Disease Analysis at Imperial College London, concerning their COVID-19 pandemic modeller, used by the UK Government:

"I'm conscious that lots of people would like to see and run the pandemic simulation code we are using to model control measures against COVID-19. To explain the background - I wrote the code (thousands of lines of undocumented C) 13+ years ago to model flu pandemics..." @neil_ferguson - 22 Mar

In a follow-up tweet, Prof. Ferguson announced that refactoring of the COVID-19 pandemic modeller had begun:

"I am happy to say that @Microsoft and @GitHub are working with @Imperial_JIDEA and @MRC_Outbreak to document, refactor and extend the code to allow others to use without the multiple days training it would currently require (and which we don't have time to give)..." @neil_ferguson - 22 Mar

There followed a (remarkably civilised by Twitter standards) thread touching on issues including software as critical infrastructure, research software development best practice (or lack thereof), open science, open source (or lack thereof) and reproducibility. "Should have gone to The Software Sustainability Institute", I thought. The Institute has been assisting researchers with exactly those challenges for 10 years now!

Some of the Twitter reactions reminded me of the furore around GISS Surface Temperature Analysis (GISTEMP), NASA's simulation to calculate and compare global surface temperature anomalies. In 2007, responding to demands from climate bloggers, the GISTEMP source code was released, to a less than positive reaction. GISTEMP was deemed to be poorly organised, incomplete and bug-ridden, and it would not run. Indeed, there were accusations that the released code was not the actual source code. This motivated the formation of the Clear Climate Code Project to reimplement GISTEMP with an emphasis on code clarity, to encourage people to download, inspect and run the code, and to increase public confidence in climate science results. Nick Barnes and David Jones give a great overview of the project and how it came into being in "Clear Climate Code: Rewriting Legacy Science Software for Clarity" (doi: 10.1109/MS.2011.113).

Prof. Ferguson and his team have now released their refactored C++ code on GitHub, within the covid-sim project, available under the GNU General Public License v3.0.

The Software Sustainability Institute cultivates better, more sustainable research software to enable world-class research. EPCC is a founder member.


Mike Jackson, EPCC