Align-SLM: Textless Spoken Language Models with Reinforcement Learning from AI Feedback

GT Lin, PG Shivakumar, A Gourav, Y Gu… - arXiv preprint arXiv …, 2024 - arxiv.org
While textless Spoken Language Models (SLMs) have shown potential in end-to-end
speech-to-speech modeling, they still lag behind text-based Large Language Models …

Data-Centric Improvements for Enhancing Multi-Modal Understanding in Spoken Conversation Modeling

M Chen, R Sun, SÖ Arık - arXiv preprint arXiv:2412.15995, 2024 - arxiv.org
Conversational assistants are increasingly popular across diverse real-world applications,
highlighting the need for advanced multimodal speech modeling. Speech, as a natural …