Exploiting morphological and phonological features to improve prosodic phrasing for Mongolian speech synthesis

R Liu, B Sisman, F Bao, J Yang… - IEEE/ACM Transactions …, 2020 - ieeexplore.ieee.org
Prosodic phrasing is an important factor that affects naturalness and intelligibility in text-to-
speech synthesis. Studies show that deep learning techniques improve prosodic phrasing …

Predicting prosodic prominence from text with pre-trained contextualized word representations

A Talman, A Suni, H Celikkanat, S Kakouros… - arXiv preprint arXiv …, 2019 - arxiv.org
In this paper we introduce a new natural language processing dataset and benchmark for
predicting prosodic prominence from written text. To our knowledge this will be the largest …

Self-attention based prosodic boundary prediction for Chinese speech synthesis

C Lu, P Zhang, Y Yan - ICASSP 2019-2019 IEEE International …, 2019 - ieeexplore.ieee.org
Predicting prosodic boundaries from input text plays an important role in Chinese text-to-
speech (TTS) system, which directly influences the naturalness and intelligibility of …

BLSTM-CRF Based End-to-End Prosodic Boundary Prediction with Context Sensitive Embeddings in a Text-to-Speech Front-End

Y Zheng, J Tao, Z Wen, Y Li - Interspeech, 2018 - isca-archive.org
In this paper, we propose a language-independent end-to-end architecture for prosodic
boundary prediction based on BLSTM-CRF. The proposed architecture has three …

Assessing phrase break of ESL speech with pre-trained language models and large language models

Z Wang, S Mao, W Wu, Y Xia, Y Deng, J Tien - arXiv preprint arXiv …, 2023 - arxiv.org
This work introduces approaches to assessing phrase breaks in ESL learners' speech using
pre-trained language models (PLMs) and large language models (LLMs). There are two …

Adversarial multi-task learning for Mandarin prosodic boundary prediction with multi-modal embeddings

J Yi, J Tao, R Fu, T Wang, CY Zhang… - IEEE/ACM Transactions …, 2023 - ieeexplore.ieee.org
Prosodic boundaries are still crucial to the naturalness of end-to-end speech synthesis
systems. This article proposes to use adversarial multi-task learning to predict prosodic …

Pre-Trained Text Representations for Improving Front-End Text Processing in Mandarin Text-to-Speech Synthesis

B Yang, J Zhong, S Liu - INTERSPEECH, 2019 - isca-archive.org
In this paper, we propose a novel method to improve the performance and robustness of the
front-end text processing modules of Mandarin text-to-speech (TTS) synthesis. We use …

Using autonomous agents to improvise music compositions in real-time

P Hutchings, J McCormack - … Intelligence in Music, Sound, Art and Design …, 2017 - Springer
This paper outlines an approach to real-time music generation using melody and harmony
focused agents in a process inspired by jazz improvisation. A harmony agent employs a …

Improving Mongolian Phrase Break Prediction by Using Syllable and Morphological Embeddings with BiLSTM Model

R Liu, F Bao, G Gao, H Zhang, Y Wang - Interspeech, 2018 - ttslr.github.io
In the speech synthesis systems, the phrase break (PB) prediction is the first and most
important step. Recently, the state-of-the-art PB prediction systems mainly rely on word …

The effects of modulating fundamental frequency and speech rate on the intelligibility, communication efficiency, and perceived naturalness of synthetic speech

JM Vojtech, JP Noordzij Jr, GJ Cler… - American Journal of …, 2019 - pubs.asha.org
Purpose This study investigated how modulating fundamental frequency (f0) and speech
rate differentially impact the naturalness, intelligibility, and communication efficiency of …