Computational graph pangenomics: a tutorial on data structures and their applications

JA Baaijens, P Bonizzoni, C Boucher, G Della Vedova… - Natural computing, 2022 - Springer
Computational pangenomics is an emerging research field that is changing the way
computer scientists are facing challenges in biological sequence analysis. In past decades …

Hardware acceleration of genomics data analysis: challenges and opportunities

T Robinson, J Harkin, P Shukla - Bioinformatics, 2021 - academic.oup.com
The significant decline in the cost of genome sequencing has dramatically changed the
typical bioinformatics pipeline for analysing sequencing data. Where traditionally, the …

Representation of k-Mer Sets Using Spectrum-Preserving String Sets

A Rahman, P Medvedev - Journal of Computational Biology, 2021 - liebertpub.com
Given the popularity and elegance of k-mer-based tools, finding a space-efficient way to
represent a set of k-mers is important for improving the scalability of bioinformatics analyses …

SVDSS: structural variation discovery in hard-to-call genomic regions using sample-specific strings from accurate long reads

L Denti, P Khorsand, P Bonizzoni, F Hormozdiari… - Nature …, 2023 - nature.com
Structural variants (SVs) account for a large amount of sequence variability across genomes
and play an important role in human genomics and precision medicine. Despite intense …

The Statistics of k-mers from a Sequence Undergoing a Simple Mutation Process Without Spurious Matches

A Blanca, RS Harris, D Koslicki… - Journal of Computational …, 2022 - liebertpub.com
k-mer-based methods are widely used in bioinformatics, but there are many gaps in our
understanding of their statistical properties. Here, we consider the simple model where a …

Sequencing technologies and analyses: where have we been and where are we going?

V Bansal, C Boucher - iScience, 2019 - cell.com
A wave of technologies transformed sequencing over a decade ago into the high-throughput
era, demanding research in new computational methods to analyze these data. The …

Disk compression of k-mer sets

A Rahman, R Chikhi, P Medvedev - Algorithms for Molecular Biology, 2021 - Springer
K-mer based methods have become prevalent in many areas of bioinformatics. In
applications such as database search, they often work with large multi-terabyte-sized …

Kevlar: a mapping-free framework for accurate discovery of de novo variants

DS Standage, CT Brown, F Hormozdiari - iScience, 2019 - cell.com
De novo genetic variants are an important source of causative variation in complex genetic
disorders. Many methods for variant discovery rely on mapping reads to a reference …

KAGE: fast alignment-free graph-based genotyping of SNPs and short indels

I Grytten, K Dagestad Rand, GK Sandve - Genome biology, 2022 - Springer
Genotyping is a core application of high-throughput sequencing. We present KAGE, a
genotyper for SNPs and short indels that is inspired by recent developments within graph …

Nebula: ultra-efficient mapping-free structural variant genotyper

P Khorsand, F Hormozdiari - Nucleic acids research, 2021 - academic.oup.com
Large scale catalogs of common genetic variants (including indels and structural variants)
are being created using data from second and third generation whole-genome sequencing …