Sequence Alignment/Map format: a comprehensive review of approaches and applications

Y Liu, X Shen, Y Gong, Y Liu, B Song… - Briefings in …, 2023 - academic.oup.com
Abstract The Sequence Alignment/Map (SAM) format file is the text file used to record
alignment information. Alignment is the core of sequencing analysis, and downstream tasks …

The real cost of sequencing: scaling computation to keep pace with data generation

P Muir, S Li, S Lou, D Wang, DJ Spakowicz, L Salichos… - Genome biology, 2016 - Springer
As the cost of sequencing continues to decrease and the amount of sequence data
generated grows, new paradigms for data storage and analysis are increasingly important …

Genomic data compression

M Hernaez, D Pavlichin, T Weissman… - Annual Review of …, 2019 - annualreviews.org
Recently, there has been growing interest in genome sequencing, driven by advances in
sequencing technology, in terms of both efficiency and affordability. These developments …

Introducing BASE: the Biomes of Australian Soil Environments soil microbial diversity database

A Bissett, A Fitzgerald, T Meintjes, PM Mele, F Reith… - GigaScience, 2016 - Springer
Background Microbial inhabitants of soils are important to ecosystem and planetary
functions, yet there are large gaps in our knowledge of their diversity and ecology. The …

Static analysis tools as early indicators of pre-release defect density

N Nagappan, T Ball - Proceedings of the 27th international conference …, 2005 - dl.acm.org
During software development it is helpful to obtain early estimates of the defect density of
software components. Such estimates identify fault-prone areas of code requiring further …

Storing Images in DNA via base128 Encoding

K Wang, B Cao, T Ma, Y Zhao, Y Zheng… - Journal of Chemical …, 2024 - ACS Publications
Current DNA storage schemes lack flexibility and consistency in processing highly
redundant and correlated image data, resulting in low sequence stability and image …

A survey on data compression methods for biological sequences

M Hosseini, D Pratas, AJ Pinho - Information, 2016 - mdpi.com
The ever increasing growth of the production of high-throughput sequencing data poses a
serious challenge to the storage, processing and transmission of these data. As frequently …

[BOOK][B] Bioinformatics and the Cell

X Xia - 2007 - Springer
Why should we start a book on bioinformatics with BLAST (Altschul et al. 1990) and FASTA
(Lipman and Pearson 1985; Pearson 1990; Pearson and Lipman 1988)? There surely were …

Predicting protein–protein interactions by fusing various Chou's pseudo components and using wavelet denoising approach

B Tian, X Wu, C Chen, W Qiu, Q Ma, B Yu - Journal of Theoretical Biology, 2019 - Elsevier
Research on protein–protein interactions (PPIs) not only helps to reveal the nature of life
activities but also plays a driving role in understanding the mechanisms of disease activity …

Efficient and robust search of microbial genomes via phylogenetic compression

K Břinda, L Lima, S Pignotti, N Quinones-Olvera… - …, 2024 - pmc.ncbi.nlm.nih.gov
Comprehensive collections approaching millions of sequenced genomes have become
central information sources in the life sciences. However, the rapid growth of these …