An overview of automatic speaker diarization systems

SE Tranter, DA Reynolds - IEEE Transactions on audio, speech …, 2006 - ieeexplore.ieee.org
Audio diarization is the process of annotating an input audio channel with information that
attributes (possibly overlapping) temporal regions of signal energy to their specific sources …

Speaker segmentation and clustering

M Kotti, V Moschou, C Kotropoulos - Signal processing, 2008 - Elsevier
This survey focuses on two challenging speech processing topics, namely: speaker
segmentation and speaker clustering. Speaker segmentation aims at finding speaker …

Retrieval and browsing of spoken content

C Chelba, TJ Hazen, M Saraclar - IEEE Signal Processing …, 2008 - ieeexplore.ieee.org
Ever-increasing computing power and connectivity bandwidth, together with falling storage
costs, are resulting in an overwhelming amount of data of various types being produced …

A review on speaker diarization systems and approaches

MH Moattar, MM Homayounpour - Speech Communication, 2012 - Elsevier
Speaker indexing or diarization is an important task in audio processing and retrieval.
Speaker diarization is the process of labeling a speech signal with labels corresponding to …

Spoken content retrieval: A survey of techniques and technologies

M Larson, GJF Jones - Foundations and Trends® in …, 2012 - nowpublishers.com
Speech media, that is, digital audio and video containing spoken content, has blossomed in
recent years. Large collections are accruing on the Internet as well as in private and …

Analysis and compensation of Lombard speech across noise type and levels with application to in-set/out-of-set speaker recognition

JHL Hansen, V Varadarajan - IEEE Transactions on Audio …, 2009 - ieeexplore.ieee.org
Speech production in the presence of noise results in the Lombard effect, which is known to
have a serious impact on speech system performance. In this study, Lombard speech …

Advances in phone-based modeling for automatic accent classification

P Angkititrakul, JHL Hansen - IEEE transactions on audio …, 2006 - ieeexplore.ieee.org
It is suggested that algorithms capable of estimating and characterizing accent knowledge
would provide valuable information in the development of more effective speech systems …

On Growing and Pruning Kneser–Ney Smoothed N-Gram Models

V Siivola, T Hirsimaki, S Virpioja - IEEE Transactions on Audio …, 2007 - ieeexplore.ieee.org
N-gram models are the most widely used language models in large vocabulary continuous
speech recognition. Since the size of the model grows rapidly with respect to the model …

Unsupervised accent classification for deep data fusion of accent and language information

JHL Hansen, G Liu - Speech Communication, 2016 - Elsevier
Automatic Dialect Identification (DID) has recently gained substantial interest in the
speech processing community. Studies have shown that the variation in speech due to …

Rapid yet accurate speech indexing using dynamic match lattice spotting

K Thambiratnam, S Sridharan - IEEE Transactions on Audio …, 2006 - ieeexplore.ieee.org
The support for typically out-of-vocabulary query terms such as names, acronyms, and
foreign words is an important requirement of many speech indexing applications. However …