Variation graph toolkit improves read mapping by representing genetic variation in the reference

E Garrison, J Sirén, AM Novak, G Hickey… - Nature …, 2018 - nature.com
Reference genomes guide our interpretation of DNA sequence data. However, conventional
linear references represent only one version of each locus, ignoring variation in the …

GraphAligner: rapid and versatile sequence-to-graph alignment

M Rautiainen, T Marschall - Genome biology, 2020 - Springer
Genome graphs can represent genetic variation and sequence uncertainty. Aligning
sequences to genome graphs is key to many applications, including error correction …

Fully functional suffix trees and optimal text searching in BWT-runs bounded space

T Gagie, G Navarro, N Prezza - Journal of the ACM (JACM), 2020 - dl.acm.org
Indexing highly repetitive texts—such as genomic databases, software repositories and
versioned text collections—has become an important problem since the turn of the …

MONI: a pangenomic index for finding maximal exact matches

M Rossi, M Oliva, B Langmead, T Gagie… - Journal of …, 2022 - liebertpub.com
Recently, Gagie et al. proposed a version of the FM-index, called the r-index, that can store
thousands of human genomes on a commodity computer. Then Kuhnle et al. showed how to …

Unbiased pangenome graphs

E Garrison, A Guarracino - Bioinformatics, 2023 - academic.oup.com
Motivation: Pangenome variation graphs model the mutual alignment of collections of DNA
sequences. A set of pairwise alignments implies a variation graph, but there are no scalable …

Efficient pedigree recording for fast population genetics simulation

J Kelleher, KR Thornton, J Ashander… - PLoS computational …, 2018 - journals.plos.org
In this paper we describe how to efficiently record the entire genetic history of a population in
forwards-time, individual-based population genetics simulations with arbitrary breeding …

Succinct de Bruijn graphs

A Bowe, T Onodera, K Sadakane, T Shibuya - International workshop on …, 2012 - Springer
We propose a new succinct de Bruijn graph representation. If the de Bruijn graph of k-mers
in a DNA sequence of length N has m edges, it can be represented in 4m + o(m) bits. This is …

Fast search of thousands of short-read sequencing experiments

B Solomon, C Kingsford - Nature biotechnology, 2016 - nature.com
The amount of sequence information in public repositories is growing at a rapid rate.
Although these data are likely to contain clinically important information that has not yet …

Tracy: basecalling, alignment, assembly and deconvolution of sanger chromatogram trace files

T Rausch, MHY Fritz, A Untergasser, V Benes - BMC genomics, 2020 - Springer
Background: DNA sequencing is at the core of many molecular biology laboratories. Despite
its long history, there is a lack of user-friendly Sanger sequencing data analysis tools that …

CompressDB: Enabling efficient compressed data direct processing for various databases

F Zhang, W Wan, C Zhang, J Zhai, Y Chai… - Proceedings of the 2022 …, 2022 - dl.acm.org
In modern data management systems, directly performing operations on compressed data
has been proven to be a big success facing big data problems. These systems have …