XIMERA: A new TTS from ATR based on corpus-based technologies

H Kawai, T Toda, J Ni, M Tsuzaki… - Fifth ISCA Workshop on …, 2004 - isca-archive.org
This paper describes a new concatenative TTS system under development at ATR. The
system, named XIMERA, is based on corpus-based technologies, as was the case for the …

HMM-based speech segmentation: Improvements of fully automatic approaches

S Brognaux, T Drugman - IEEE/ACM Transactions on Audio …, 2015 - ieeexplore.ieee.org
Speech segmentation refers to the problem of determining the phoneme boundaries from an
acoustic recording of an utterance together with its orthographic transcription. This paper …

Train&Align: A new online tool for automatic phonetic alignment

S Brognaux, S Roekhaut, T Drugman… - 2012 IEEE Spoken …, 2012 - ieeexplore.ieee.org
Several automatic phonetic alignment tools have been proposed in the literature. They
usually rely on pre-trained speaker-independent models to align new corpora. Their …

Weakly-supervised forced alignment of disfluent speech using phoneme-level modeling

T Kouzelis, G Paraskevopoulos, A Katsamanis… - arXiv preprint arXiv …, 2023 - arxiv.org
The study of speech disorders can benefit greatly from time-aligned data. However, audio-
text mismatches in disfluent speech cause rapid performance degradation for modern …

On using multiple models for automatic speech segmentation

SS Park, NS Kim - IEEE Transactions on Audio, Speech, and …, 2007 - ieeexplore.ieee.org
In this paper, we propose a novel approach to automatic speech segmentation for unit-
selection based text-to-speech systems. Instead of using a single automatic segmentation …

Robust detection of phone boundaries using model selection criteria with few observations

G Almpanidis, M Kotti… - IEEE Transactions on …, 2009 - ieeexplore.ieee.org
Automatic phone segmentation techniques based on model selection criteria are studied.
We investigate the phone boundary detection efficiency of entropy- and Bayesian-based …

Sociophonetics and class differentiation: A study of working- and middle-class English in Cape Town's coloured community

TL Toefy - 2014 - open.uct.ac.za
This thesis provides a detailed acoustic description of the phonetic variation and changes
evident in the monophthongal vowel system of Coloured South African English in Cape …

Using LSTM neural networks for cross‐lingual phonetic speech segmentation with an iterative correction procedure

Z Hanzlíček, J Matoušek, J Vít - Computational Intelligence, 2024 - Wiley Online Library
This article describes experiments on speech segmentation using long short‐term memory
recurrent neural networks. The main part of the paper deals with multi‐lingual and cross …

Automatic phone alignment: A comparison between speaker-independent models and models trained on the corpus to align

S Brognaux, S Roekhaut, T Drugman… - Advances in Natural …, 2012 - Springer
Several automatic phonetic alignment tools have been proposed in the literature. They
generally use speaker-independent acoustic models of the language to align new corpora …

Emilia: a speech corpus for Argentine Spanish text to speech synthesis

HM Torres, JA Gurlekian, DA Evin… - Language Resources …, 2019 - Springer
This paper introduces Emilia, a speech corpus created to build a female voice in Spanish
spoken in Buenos Aires for the Aromo text-to-speech system. Aromo is a unit selection text …