The khmer software package: enabling efficient nucleotide sequence analysis

MR Crusoe, HF Alameldin, S Awad… - …, 2015 - pmc.ncbi.nlm.nih.gov
The khmer package is a freely available software library for working efficiently with fixed
length DNA words, or k-mers. khmer provides implementations of a probabilistic k-mer …

Re-assembly, quality evaluation, and annotation of 678 microbial eukaryotic reference transcriptomes

LK Johnson, H Alexander, CT Brown - Gigascience, 2019 - academic.oup.com
Background: De novo transcriptome assemblies are required prior to analyzing RNA
sequencing data from a species without an existing reference genome or transcriptome …

Niche partitioning of the N cycling microbial community of an offshore oxygen deficient zone

CA Fuchsman, AH Devol, JK Saunders… - Frontiers in …, 2017 - frontiersin.org
Microbial communities in marine oxygen deficient zones (ODZs) are responsible for up to
half of marine N loss through conversion of nutrients to N2O and N2. This N loss is …

Apollo: a sequencing-technology-independent, scalable and accurate assembly polishing algorithm

C Firtina, JS Kim, M Alser, D Senol Cali, AE Cicek… - …, 2020 - academic.oup.com
Motivation: Third-generation sequencing technologies can sequence long reads that contain
as many as 2 million base pairs. These long reads are used to construct an assembly (i.e., the …

Evaluating metagenome assembly on a simple defined community with many strain variants

S Awad, L Irber, CT Brown - BioRxiv, 2017 - biorxiv.org
We evaluate the performance of three metagenome assemblers, IDBA, MetaSPAdes, and
MEGAHIT, on short-read sequencing of a defined “mock” community containing 64 genomes …

Novel computational techniques for mapping and classification of Next-Generation Sequencing data

K Brinda - 2016 - hal.science
Since their emergence around 2006, Next-Generation Sequencing technologies have been
revolutionizing biological and medical research. Obtaining instantly an extensive amount of …

In silico read normalization using set multi-cover optimization

DA Durai, MH Schulz - Bioinformatics, 2018 - academic.oup.com
Motivation: De Bruijn graphs are a common assembly data structure for sequencing
datasets. But with the advances in sequencing technologies, assembling high coverage …

Exploring neighborhoods in large metagenome assembly graphs reveals hidden sequence diversity

CT Brown, D Moritz, MP O'Brien, F Reidl, T Reiter… - BioRxiv, 2018 - biorxiv.org
Genomes computationally inferred from large metagenomic data sets are often incomplete
and may be missing functionally important content and strain variation. We introduce an …

MQF and buffered MQF: Quotient filters for efficient storage of k-mers with their counts and metadata

M Shokrof, CT Brown, TA Mansour - BMC bioinformatics, 2021 - Springer
Background: Specialized data structures are required for online algorithms to efficiently
handle large sequencing datasets. The counting quotient filter (CQF), a compact hashtable …

K-mer based prediction of Clostridioides difficile relatedness and ribotypes

MP Moore, MH Wilcox, AS Walker… - Microbial …, 2022 - microbiologyresearch.org
Comparative analysis of Clostridioides difficile whole-genome sequencing (WGS) data
enables fine scaled investigation of transmission and is increasingly becoming part of …