Data compression for sequencing data

S Deorowicz, S Grabowski - Algorithms for Molecular Biology, 2013 - Springer
Post-Sanger sequencing methods produce tons of data, and there is a general agreement
that the challenge to store and process them must be addressed with data compression. In …

High-throughput DNA sequence data compression

Z Zhu, Y Zhang, Z Ji, S He, X Yang - Briefings in bioinformatics, 2015 - academic.oup.com
The exponential growth of high-throughput DNA sequence data has posed great challenges
to genomic data storage, retrieval and transmission. Compression is a critical tool to address …

A survey on data compression methods for biological sequences

M Hosseini, D Pratas, AJ Pinho - Information, 2016 - mdpi.com
The ever increasing growth of the production of high-throughput sequencing data poses a
serious challenge to the storage, processing and transmission of these data. As frequently …

Efficient DNA sequence compression with neural networks

M Silva, D Pratas, AJ Pinho - GigaScience, 2020 - academic.oup.com
Background: The increasing production of genomic data has led to an intensified need for
models that can cope efficiently with the lossless compression of DNA sequences. Important …

MFCompress: a compression tool for FASTA and multi-FASTA data

AJ Pinho, D Pratas - Bioinformatics, 2014 - academic.oup.com
Motivation: The data deluge phenomenon is becoming a serious problem in most genomic
centers. To alleviate it, general purpose tools, such as gzip, are used to compress the data …

GReEn: a tool for efficient compression of genome resequencing data

AJ Pinho, D Pratas, SP Garcia - Nucleic acids research, 2012 - academic.oup.com
Research in the genomic sciences is confronted with the volume of sequencing and
resequencing data increasing at a higher pace than that of data storage and communication …

Biometric and emotion identification: An ECG compression based method

S Brás, JHT Ferreira, SC Soares, AJ Pinho - Frontiers in psychology, 2018 - frontiersin.org
We present an innovative and robust solution to both biometric and emotion identification
using the electrocardiogram (ECG). The ECG represents the electrical signal that comes …

Efficient compression of genomic sequences

D Pratas, AJ Pinho… - 2016 Data compression …, 2016 - ieeexplore.ieee.org
The number of genomic sequences is growing substantially. Besides discarding part of the
data, the only efficient possibility for coping with this trend is data compression. We present …

The complexity landscape of viral genomes

JM Silva, D Pratas, T Caetano, S Matos - GigaScience, 2022 - academic.oup.com
Background: Viruses are among the shortest yet highly abundant species that harbor
minimal instructions to infect cells, adapt, multiply, and exist. However, with the current …

FQSqueezer: k-mer-based compression of sequencing data

S Deorowicz - Scientific reports, 2020 - nature.com
The amount of data produced by modern sequencing instruments that needs to be stored is
huge. Therefore it is not surprising that a lot of work has been done in the field of specialized …