R users can calculate Hamming distance faster

Author: Terry Sloan
Posted: 10 Jan 2014 | 14:09
EPCC and the Division of Pathway Medicine at the University of Edinburgh have released version 1.0.5 of the SPRINT R software package. This release includes a new, faster parallel implementation of the Hamming distance function.
SPRINT (Simple Parallel R INTerface) is an easy-to-use parallel version of R, a free software environment for statistical computing and graphics that is very popular in both academia and commerce. SPRINT gives R users access to high performance computing without the need to master parallel programming methods, making HPC systems easy to exploit. SPRINT v1.0.5 includes the function pstringdistmatrix(), which uses HPC to compute the Hamming distance between strings (e.g. nucleotide bases, Next Generation Sequencing short reads) much faster than stringdistmatrix(), the existing serial implementation of the function.
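
For readers unfamiliar with the measure, the Hamming distance between two equal-length strings is the number of positions at which they differ. The short Python sketch below illustrates the kind of pairwise distance matrix such a function produces. It is a serial illustration only, not SPRINT's implementation and not R code; the helper names hamming and hamming_matrix are hypothetical.

```python
from itertools import combinations


def hamming(a: str, b: str) -> int:
    """Count the positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance requires equal-length strings")
    return sum(x != y for x, y in zip(a, b))


def hamming_matrix(strings):
    """Build a symmetric matrix of pairwise Hamming distances."""
    n = len(strings)
    dist = [[0] * n for _ in range(n)]
    # Each unordered pair is computed once and mirrored across the diagonal.
    for i, j in combinations(range(n), 2):
        d = hamming(strings[i], strings[j])
        dist[i][j] = dist[j][i] = d
    return dist


# Hypothetical short reads, used only for illustration.
reads = ["ACGT", "ACGA", "TCGA"]
matrix = hamming_matrix(reads)
print(matrix)  # [[0, 1, 2], [1, 0, 1], [2, 1, 0]]
```

Because every pair is independent of the others, the number of comparisons grows quadratically with the number of strings and the work divides naturally across processors, which is why this calculation is a good candidate for a parallel implementation.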

In addition, SPRINT v1.0.5 has been updated to work with the MPI-3 compliant version of the mpich package and has simplified installation instructions. It also includes a number of bug fixes and other updates, which are detailed in the Release Notes and User Guide for v1.0.5.

Download SPRINT v1.0.5 

Please note that SPRINT v1.0.5 is not currently compatible with R version 3 and higher. It is also not available from CRAN at this time, as package submission guidelines have changed with the newly released R 3.0.x.


Terry Sloan, EPCC

