Computational methods for transcriptome annotation and quantification using RNA-seq
High-throughput RNA sequencing (RNA-seq) promises a comprehensive picture of the
transcriptome, allowing for the complete annotation and quantification of all genes and their …
Compressed full-text indexes
Full-text indexes provide fast substring search over large text collections. A serious problem
of these indexes has traditionally been their space consumption. A recent trend is to develop …
Efficient architecture-aware acceleration of BWA-MEM for multicore systems
Innovations in Next-Generation Sequencing are enabling generation of DNA sequence data
at ever faster rates and at very low cost. For example, the Illumina NovaSeq 6000 sequencer …
Differential gene and transcript expression analysis of RNA-seq experiments with TopHat and Cufflinks
Recent advances in high-throughput cDNA sequencing (RNA-seq) can reveal new genes
and splice variants and quantify expression genome-wide in a single assay. The volume …
BWA-MEME: BWA-MEM emulated with a machine learning approach
Motivation: The growing use of next-generation sequencing and enlarged sequencing
throughput require efficient short-read alignment, where seeding is one of the major …
TopHat: discovering splice junctions with RNA-Seq
Motivation: A new protocol for sequencing the messenger RNA in a cell, known as RNA-
Seq, generates millions of short sequence fragments in a single run. These fragments, or …
Ultrafast and memory-efficient alignment of short DNA sequences to the human genome
Bowtie is an ultrafast, memory-efficient alignment program for aligning short DNA sequence
reads to large genomes. For the human genome, Burrows-Wheeler indexing allows Bowtie …
Indexing compressed text
We design two compressed data structures for the full-text indexing problem that support
efficient substring searches using roughly the space required for storing the text in …
Replacing suffix trees with enhanced suffix arrays
The suffix tree is one of the most important data structures in string processing and
comparative genomics. However, the space consumption of the suffix tree is a bottleneck in …
Compressed suffix arrays and suffix trees with applications to text indexing and string matching
The proliferation of online text, such as on the World Wide Web and in databases, motivates
the need for space-efficient index methods that support fast search. Consider a text T of n …