Llama-omni: Seamless speech interaction with large language models

Q Fang, S Guo, Y Zhou, Z Ma, S Zhang… - ar…

Developing Instruction-Following Speech Language Model Without Speech Instruction-Tuning Data

KH Lu, Z Chen, SW Fu, CHH Yang, J Balam… - arxiv preprint arxiv …, 2024 - arxiv.org
Recent end-to-end speech language models (SLMs) have expanded upon the capabilities
of large language models (LLMs) by incorporating pre-trained speech models. However …

Roadmap towards superhuman speech understanding using large language models

F Bu, Y Zhang, X Wang, B Wang, Q Liu, H Li - arxiv preprint arxiv …, 2024 - arxiv.org
The success of large language models (LLMs) has prompted efforts to integrate speech and
audio data, aiming to create general foundation models capable of processing both textual …