Audio-CoT: Exploring Chain-of-Thought Reasoning in Large Audio Language Model

Z Ma, Z Chen, Y Wang, ES Chng, X Chen - arXiv preprint arXiv …, 2025 - arxiv.org
Large Audio-Language Models (LALMs) have demonstrated remarkable performance in
tasks involving audio perception and understanding, such as speech recognition and audio …

SpeechPrune: Context-aware Token Pruning for Speech Information Retrieval

Y Lin, Y Fu, J Zhang, Y Liu, J Zhang, J Sun, H Li… - arXiv preprint arXiv …, 2024 - arxiv.org
We introduce Speech Information Retrieval (SIR), a new long-context task for Speech Large
Language Models (Speech LLMs), and present SPIRAL, a 1,012-sample benchmark testing …