Systematic benchmarking of omics computational tools

S Mangul, LS Martin, BL Hill, AKM Lam… - Nature …, 2019 - nature.com
Computational omics methods packaged as software have become essential to modern
biological research. The increasing dependence of scientists on these powerful software …

Genomic data compression

M Hernaez, D Pavlichin, T Weissman… - Annual Review of …, 2019 - annualreviews.org
Recently, there has been growing interest in genome sequencing, driven by advances in
sequencing technology, in terms of both efficiency and affordability. These developments …

An introduction to MPEG-G: the first open ISO/IEC standard for the compression and exchange of genomic sequencing data

J Voges, M Hernaez, M Mattavelli… - Proceedings of the …, 2021 - ieeexplore.ieee.org
The development and progress of high-throughput sequencing technologies have
transformed the sequencing of DNA from a scientific research challenge to practice. With the …

CALQ: compression of quality values of aligned sequencing data

J Voges, J Ostermann, M Hernaez - Bioinformatics, 2018 - academic.oup.com
Motivation: Recent advancements in high-throughput sequencing technology have led to a
rapid growth of genomic data. Several lossless compression schemes have been proposed …

An introduction to MPEG-G, the new ISO standard for genomic information representation

C Alberti, T Paridaens, J Voges, D Naro, JJ Ahmad… - bioRxiv, 2018 - biorxiv.org
The MPEG-G standardization initiative is a coordinated international effort to specify a
compressed data format that enables large scale genomic data to be processed, transported …

CMIC: an efficient quality score compressor with random access functionality

H Chen, J Chen, Z Lu, R Wang - BMC Bioinformatics, 2022 - Springer
Background: Over the past few decades, the emergence and maturation of new technologies
have substantially reduced the cost of genome sequencing. As a result, the amount of …

CSAM: compressed SAM format

R Cánovas, A Moffat, A Turpin - Bioinformatics, 2016 - academic.oup.com
Motivation: Next generation sequencing machines produce vast amounts of genomic data.
For the data to be useful, it is essential that it can be stored and manipulated efficiently. This …

AQUa: an adaptive framework for compression of sequencing quality scores with random access functionality

T Paridaens, G Van Wallendael, W De Neve… - …, 2018 - academic.oup.com
Motivation: The past decade has seen the introduction of new technologies that significantly
lowered the cost of genome sequencing. As a result, the amount of genomic data that must …

MZPAQ: a FASTQ data compression tool

A El Allali, M Arshad - Source Code for Biology and Medicine, 2019 - Springer
Background: Due to the technological progress in Next Generation Sequencing (NGS), the
amount of genomic data that is produced daily has seen a tremendous increase. This …

A two-level scheme for quality score compression

J Voges, A Fotouhi, J Ostermann… - Journal of Computational …, 2018 - liebertpub.com
Previous studies on quality score compression can be classified into two main lines: lossy
schemes and lossless schemes. Lossy schemes enable a better management of …