Using prototyping to select software for a research software project

Author: Mike Jackson
Posted: 1 Mar 2021 | 14:03

Choosing the right software for use in a research software project can be challenging. How do we know which software is both fit for purpose and provides a sound basis for our project for the foreseeable future? And how do we make such a choice given that the time and effort to explore what could be myriad alternatives may be limited?

This was a challenge we faced in the RiboViz project, a multi-disciplinary team of biologists, bioinformaticians and research software engineers based at EPCC and The Wallace Lab at the University of Edinburgh, The Shah Lab at Rutgers University, and The Lareau Lab at the University of California, Berkeley. RiboViz is an open source package to help us develop our understanding of protein synthesis via analysis of ribosome profiling data. At the heart of RiboViz is an analysis workflow, implemented in a Python script. To conform to best practices for scientific computing, which recommend the use of build tools to automate workflows and to reuse code instead of rewriting it, we sought to reimplement this workflow within a bioinformatics workflow management system.

To select a workflow management system from the dozens of options available, we undertook a rapid survey and shortlisted four candidates: Snakemake, Toil, cwltool, and Nextflow. The Software Sustainability Institute's guide on Choosing the right open-source software for your project was of great help during this shortlisting.

We then evaluated each candidate by quickly prototyping a subset of the RiboViz workflow, and assessed our experiences against both objective (functional) criteria and subjective (usability) criteria. From this evaluation, we chose Nextflow. Our selection process took 10 person-days, a small cost for the assurance that Nextflow both satisfied our requirements and will continue to do so for the foreseeable future.
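For readers who want to record a similar evaluation, the following is a minimal Python sketch of one way to do it. It is not the procedure or the results from the RiboViz evaluation: the candidate names (tool_a, tool_b, tool_c), the criteria names, the scores, and the rule of filtering on functional criteria before ranking by average usability score are all hypothetical, chosen only to illustrate separating objective from subjective criteria.

```python
from dataclasses import dataclass, field


@dataclass
class Evaluation:
    """Notes from prototyping one candidate tool (all values hypothetical)."""
    name: str
    # Objective (functional) criteria: did the prototype meet each one?
    functional: dict = field(default_factory=dict)
    # Subjective (usability) criteria: score from 1 (poor) to 5 (good).
    usability: dict = field(default_factory=dict)


def meets_requirements(ev):
    """A candidate is eligible only if every functional criterion is met."""
    return all(ev.functional.values())


def usability_score(ev):
    """Average usability score, or 0.0 if nothing was scored."""
    if not ev.usability:
        return 0.0
    return sum(ev.usability.values()) / len(ev.usability)


def rank(evaluations):
    """Drop candidates that fail a functional criterion, then sort the
    remainder by average usability score, best first."""
    eligible = [ev for ev in evaluations if meets_requirements(ev)]
    return sorted(eligible, key=usability_score, reverse=True)


# Hypothetical candidates and scores, for illustration only.
candidates = [
    Evaluation("tool_a",
               {"runs_workflow_subset": True, "supports_cluster": True},
               {"documentation": 4, "ease_of_debugging": 3}),
    Evaluation("tool_b",
               {"runs_workflow_subset": True, "supports_cluster": False},
               {"documentation": 5, "ease_of_debugging": 5}),
    Evaluation("tool_c",
               {"runs_workflow_subset": True, "supports_cluster": True},
               {"documentation": 5, "ease_of_debugging": 4}),
]

for ev in rank(candidates):
    print(ev.name, usability_score(ev))
```

Treating functional criteria as a pass/fail filter, rather than folding them into a single weighted score, keeps a tool that is pleasant to use but missing a required feature from ranking highly. Other projects may reasonably prefer weighted scores or a purely qualitative comparison.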

In our view, the use of prototyping can offer a low-cost way of making a more informed selection of software to use within projects, rather than relying solely upon reviews and recommendations by others.

A comprehensive article describing our approach to shortlisting and prototyping has now been published in PLoS Computational Biology as Jackson M, Kavoussanakis K, Wallace EWJ (2021) Using prototyping to choose a bioinformatics workflow management system. PLoS Comput Biol 17(2): e1008622, doi: 10.1371/journal.pcbi.1008622.

Image: Julian Haler Flickr.

Acknowledgements

This work is funded by BBSRC in the UK and the NSF/BIO in the USA as a BBSRC-NSF/BIO Lead Agency collaboration.

Authors

Mike Jackson, EPCC.
