Accelerating key bioinformatics tasks 100-fold by improving memory access
Igor Sfiligoi, Daniel McDonald, Rob Knight
Most experimental sciences now rely on computing, and biological sciences are
no exception. As datasets get bigger, so do the computing costs, making proper
optimization of the codes used by scientists increasingly important. Many of
the codes developed in recent years are built on NumPy, the Python numerical
library, due to its ease of use and good performance characteristics. The
composable nature of NumPy, however, does not generally play well with the
multi-tier memory hierarchy of modern CPUs, leaving any non-trivial multi-step
algorithm limited by main memory access speeds, which are hundreds of times
slower than the CPU's compute capabilities. To fully utilize the CPU compute
capabilities, one must keep the working memory footprint small enough to fit in
the CPU caches, which requires splitting the problem into smaller portions and
fusing together as many steps as possible. In this paper, we present changes
based on these principles to two important functions in the scikit-bio library,
principal coordinates analysis and the Mantel test, that resulted in a more
than 100x speedup of these widely used, general-purpose tools.
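
To make the "split and fuse" principle concrete, here is a minimal sketch in plain NumPy. It is not the scikit-bio implementation; the function names, the `block_rows` parameter, and the choice of a double-centering step (of the kind used in principal coordinates analysis) are assumptions made for illustration. The first function chains whole-array operations and so streams several full-size temporaries through main memory; the second processes a few rows at a time and fuses the squaring, scaling, and mean accumulation while each block is still in cache.

```python
import numpy as np

def center_composed(dist):
    """Step-by-step version: each line walks a full-size array."""
    e = dist * dist                              # full-size temporary
    e = e * -0.5                                 # another full-size temporary
    row_means = e.mean(axis=1, keepdims=True)
    col_means = e.mean(axis=0, keepdims=True)
    return e - row_means - col_means + e.mean()  # more temporaries

def center_blocked(dist, block_rows=64):
    """Same result, computed in row blocks (illustrative sketch only)."""
    n_rows, n_cols = dist.shape
    out = np.empty((n_rows, n_cols), dtype=np.float64)
    row_means = np.empty(n_rows)
    col_sums = np.zeros(n_cols)

    # Pass 1: square, scale, and accumulate means one block at a time.
    for start in range(0, n_rows, block_rows):
        stop = min(start + block_rows, n_rows)
        blk = out[start:stop]                    # view into the output
        np.multiply(dist[start:stop], dist[start:stop], out=blk)
        blk *= -0.5
        row_means[start:stop] = blk.mean(axis=1)
        col_sums += blk.sum(axis=0)

    col_means = col_sums / n_rows
    # All rows have equal length, so the mean of row means is the global mean.
    global_mean = row_means.mean()

    # Pass 2: apply the centering in place, again block by block.
    for start in range(0, n_rows, block_rows):
        stop = min(start + block_rows, n_rows)
        blk = out[start:stop]
        blk -= row_means[start:stop, None]
        blk -= col_means
        blk += global_mean
    return out
```

In pure Python the per-block loop still carries interpreter overhead, so the gain from this sketch alone is likely to be modest; the same structure pays off far more when the inner loops are compiled (for example in C or Cython), where each element can be squared, scaled, and accumulated in a single pass.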