- Academic Search

X Song, C Liang, B Zhang, P Zhang, ZY Wang… - arXiv preprint arXiv …, 2024 - arxiv.org

Large Automatic Speech Recognition (ASR) models demand a vast number of parameters,
copious amounts of data, and significant computational resources during the training …

Cited by 2


Interleaved Speech-Text Language Models are Simple Streaming Text to Speech Synthesizers

Y Yang, Z Ma, S Liu, J Li, H Wang, L Meng… - arXiv preprint arXiv …, 2024 - arxiv.org

This paper introduces Interleaved Speech-Text Language Model (IST-LM) for streaming
zero-shot Text-to-Speech (TTS). Unlike many previous approaches, IST-LM is directly …

Cited by 1


TouchTTS: An Embarrassingly Simple TTS Framework that Everyone Can Touch

TouchASP: Elastic Automatic Speech Perception that Everyone Can Touch
