Variation graph toolkit improves read mapping by representing genetic variation in the reference
Reference genomes guide our interpretation of DNA sequence data. However, conventional
linear references represent only one version of each locus, ignoring variation in the …
GraphAligner: rapid and versatile sequence-to-graph alignment
Genome graphs can represent genetic variation and sequence uncertainty. Aligning
sequences to genome graphs is key to many applications, including error correction …
Fully functional suffix trees and optimal text searching in BWT-runs bounded space
Indexing highly repetitive texts—such as genomic databases, software repositories and
versioned text collections—has become an important problem since the turn of the …
MONI: a pangenomic index for finding maximal exact matches
Recently, Gagie et al. proposed a version of the FM-index, called the r-index, that can store
thousands of human genomes on a commodity computer. Then Kuhnle et al. showed how to …
Unbiased pangenome graphs
Motivation: Pangenome variation graphs model the mutual alignment of collections of DNA
sequences. A set of pairwise alignments implies a variation graph, but there are no scalable …
Efficient pedigree recording for fast population genetics simulation
In this paper we describe how to efficiently record the entire genetic history of a population in
forwards-time, individual-based population genetics simulations with arbitrary breeding …
Succinct de Bruijn graphs
We propose a new succinct de Bruijn graph representation. If the de Bruijn graph of k-mers
in a DNA sequence of length N has m edges, it can be represented in 4m + o(m) bits. This is …
Fast search of thousands of short-read sequencing experiments
B Solomon, C Kingsford - Nature biotechnology, 2016 - nature.com
The amount of sequence information in public repositories is growing at a rapid rate.
Although these data are likely to contain clinically important information that has not yet …
Tracy: basecalling, alignment, assembly and deconvolution of sanger chromatogram trace files
Background: DNA sequencing is at the core of many molecular biology laboratories. Despite
its long history, there is a lack of user-friendly Sanger sequencing data analysis tools that …
CompressDB: Enabling efficient compressed data direct processing for various databases
In modern data management systems, directly performing operations on compressed data
has been proven to be a big success facing big data problems. These systems have …