Systems and methods for multi-speaker neural text-to-speech
Described herein are systems and methods for augmenting neural speech synthesis
networks with low-dimensional trainable speaker embeddings in order to generate speech …
Systems and methods for parallel wave generation in end-to-end text-to-speech
Described herein are embodiments of an end-to-end text-to-speech (TTS) system with
parallel wave generation. In one or more embodiments, a Gaussian inverse autoregressive …
Systems and methods for real-time neural text-to-speech
Embodiments of a production-quality text-to-speech (TTS) system constructed from deep
neural networks are described. System embodiments comprise five major building blocks: a …
Parallel neural text-to-speech
Presented herein are embodiments of a non-autoregressive sequence-to-sequence model
that converts text to an audio representation. Embodiments are fully convolutional, and a …
Systems and methods for neural text-to-speech using convolutional sequence learning
Described herein are embodiments of a fully-convolutional attention-based neural text-to-
speech (TTS) system, which various embodiments may generally be referred to as Deep …
Waveform generation using end-to-end text-to-waveform system
Described herein are embodiments of an end-to-end text-to-speech (TTS) system with
parallel wave generation. In one or more embodiments, a Gaussian inverse autoregressive …
System and method for outlier identification to remove poor alignments in speech synthesis
A system and method are presented for outlier identification to remove poor alignments in
speech synthesis. The quality of the output of a text-to-speech system directly depends on …
Method for pronunciation transcription using speech-to-text model
D Shin - US Patent 12,051,421, 2024 - Google Patents
Disclosed is a pronunciation transcription method performed by a computing device. The
method may include: acquiring a partial audio signal of a first sound unit generated by …
Acoustic model training method, speech recognition method, apparatus, device and medium
H Liang, J Wang, N Cheng, J Xiao - US Patent 11,030,998, 2021 - Google Patents
An acoustic model training method, a speech recognition method, an apparatus, a device
and a medium. The acoustic model training method comprises: performing feature extraction …
Multi-speaker neural text-to-speech
The present disclosure relates generally to systems and methods for machine learning that
can provide improved computer performance, features, and uses. More particularly, the …