Computational pan-genomics: status, promises and challenges

Briefings in bioinformatics, 2018 - academic.oup.com
Many disciplines, from human genetics and oncology to plant breeding, microbiology and
virology, commonly face the challenge of analyzing rapidly increasing numbers of genomes …

A benchmark study of k-mer counting methods for high-throughput sequencing

SC Manekar, SR Sathe - GigaScience, 2018 - academic.oup.com
The rapid development of high-throughput sequencing technologies means that hundreds of
gigabytes of sequencing data can be produced in a single study. Many bioinformatics tools …

ABySS 2.0: resource-efficient assembly of large genomes using a Bloom filter

SD Jackman, BP Vandervalk, H Mohamadi… - Genome …, 2017 - genome.cshlp.org
The assembly of DNA sequences de novo is fundamental to genomics research. It is the first
of many steps toward elucidating and characterizing whole genomes. Downstream …

KMC 3: counting and manipulating k-mer statistics

M Kokot, M Długosz, S Deorowicz - Bioinformatics, 2017 - academic.oup.com
Counting all k-mers in a given dataset is a standard procedure in many bioinformatics
applications. We introduce KMC3, a significant improvement of the former KMC2 algorithm …

Informed and automated k-mer size selection for genome assembly

R Chikhi, P Medvedev - Bioinformatics, 2014 - academic.oup.com
Motivation: Genome assembly tools based on the de Bruijn graph framework rely on a
parameter k, which represents a trade-off between several competing effects that are difficult …

Assembly of long error-prone reads using de Bruijn graphs

Y Lin, J Yuan, M Kolmogorov… - Proceedings of the …, 2016 - National Acad Sciences
The recent breakthroughs in assembling long error-prone reads were based on the
overlap-layout-consensus (OLC) approach and did not utilize the strengths of the alternative de …

Genome-wide association studies of global Mycobacterium tuberculosis resistance to 13 antimicrobials in 10,228 genomes identify new resistance mechanisms

CRyPTIC Consortium - PLoS biology, 2022 - journals.plos.org
The emergence of drug-resistant tuberculosis is a major global public health concern that
threatens the ability to control the disease. Whole-genome sequencing as a tool to rapidly …

Identifying lineage effects when controlling for population structure improves power in bacterial association studies

SG Earle, CH Wu, J Charlesworth, N Stoesser… - Nature …, 2016 - nature.com
Bacteria pose unique challenges for genome-wide association studies because of strong
structuring into distinct strains and substantial linkage disequilibrium across the genome …

Space-efficient and exact de Bruijn graph representation based on a Bloom filter

R Chikhi, G Rizk - Algorithms for Molecular Biology, 2013 - Springer
Background: The de Bruijn graph data structure is widely used in next-generation
sequencing (NGS). Many programs, e.g. de novo assemblers, rely on in-memory …

KMC 2: fast and resource-frugal k-mer counting

S Deorowicz, M Kokot, S Grabowski… - …, 2015 - academic.oup.com
Motivation: Building the histogram of occurrences of every k-symbol long substring of
nucleotide data is a standard step in many bioinformatics applications, known under the …