Systematic benchmarking of omics computational tools
Computational omics methods packaged as software have become essential to modern
biological research. The increasing dependence of scientists on these powerful software …
Genomic data compression
Recently, there has been growing interest in genome sequencing, driven by advances in
sequencing technology, in terms of both efficiency and affordability. These developments …
An introduction to MPEG-G: the first open ISO/IEC standard for the compression and exchange of genomic sequencing data
The development and progress of high-throughput sequencing technologies have
transformed the sequencing of DNA from a scientific research challenge to practice. With the …
CALQ: compression of quality values of aligned sequencing data
Motivation: Recent advancements in high-throughput sequencing technology have led to a
rapid growth of genomic data. Several lossless compression schemes have been proposed …
An introduction to MPEG-G, the new ISO standard for genomic information representation
The MPEG-G standardization initiative is a coordinated international effort to specify a
compressed data format that enables large scale genomic data to be processed, transported …
CMIC: an efficient quality score compressor with random access functionality
H Chen, J Chen, Z Lu, R Wang - BMC Bioinformatics, 2022 - Springer
Background: Over the past few decades, the emergence and maturation of new technologies
have substantially reduced the cost of genome sequencing. As a result, the amount of …
CSAM: compressed SAM format
Motivation: Next generation sequencing machines produce vast amounts of genomic data.
For the data to be useful, it is essential that it can be stored and manipulated efficiently. This …
AQUa: an adaptive framework for compression of sequencing quality scores with random access functionality
Motivation: The past decade has seen the introduction of new technologies that significantly
lowered the cost of genome sequencing. As a result, the amount of genomic data that must …
MZPAQ: a FASTQ data compression tool
A El Allali, M Arshad - Source Code for Biology and Medicine, 2019 - Springer
Background: Due to the technological progress in Next Generation Sequencing (NGS), the
amount of genomic data that is produced daily has seen a tremendous increase. This …
A two-level scheme for quality score compression
Previous studies on quality score compression can be classified into two main lines: lossy
schemes and lossless schemes. Lossy schemes enable a better management of …