Lossy compression of quality scores in genomic data

Abstract

We present existing compression options for quality score data, and then introduce two new lossy techniques. Experiments measuring the trade-off between compression ratio and information loss are reported, including quantifying the effect of lossy representations on a downstream application that carries out single nucleotide polymorphism and insert/deletion detection. The new methods are demonstrably superior to other techniques when assessed against the spectrum of possible trade-offs between storage required and fidelity of representation.

Publication
Bioinformatics 2014
Date