- Academic Search

X Yang, RB Bist, B Paneru, T Liu, T Applegate… - … and Electronics in …, 2024 - Elsevier

As global demands on the poultry production and welfare both intensify, the precision
poultry farming technologies such as computer vision-based cybernetics system is …

Cited by 6

Recent advances in speech language models: A survey

W Cui, D Yu, X Jiao, Z Meng, G Zhang, Q Wang… - ar…

… Language-Speech Pre-training via Knowledge Distillation

C Wang, M Liao, Z Huang, J Zhang - arXiv preprint arXiv:2405.19041, 2024 - arxiv.org

Recent end-to-end approaches have shown promise in extending large language models
(LLMs) to speech inputs, but face limitations in directly assessing and optimizing alignment …

Cited by 5

SALSA: Speedy ASR-LLM Synchronous Aggregation

A Mittal, D Prabhu, S Sarawagi, P Jyothi - arXiv preprint arXiv:2408.16542, 2024 - arxiv.org

Harnessing pre-trained LLMs to improve ASR systems, particularly for low-resource
languages, is now an emerging area of research. Existing methods range from using LLMs …

Cited by 2

Roadmap towards superhuman speech understanding using large language models

F Bu, Y Zhang, X Wang, B Wang, Q Liu, H Li - arXiv preprint arXiv …, 2024 - arxiv.org

The success of large language models (LLMs) has prompted efforts to integrate speech and
audio data, aiming to create general foundation models capable of processing both textual …

Cited by 1

Bayesian Example Selection Improves In-Context Learning for Speech, Text, and Visual Modalities

S Wang, CHH Yang, J Wu, C Zhang - arXiv preprint arXiv:2404.14716, 2024 - arxiv.org

Large language models (LLMs) can adapt to new tasks through in-context learning (ICL)
based on a few examples presented in dialogue history without any model parameter …

Cited by 6

Understanding Sounds, Missing the Questions: The Challenge of Object Hallucination in Large Audio-Language Models

CY Kuan, WP Huang, H Lee - arXiv preprint arXiv:2406.08402, 2024 - arxiv.org

Large audio-language models (LALMs) enhance traditional large language models by
integrating audio perception capabilities, allowing them to tackle audio-related tasks …

Cited by 3

Prepending or Cross-Attention for Speech-to-Text? An Empirical Comparison

TK Lam, M Gaido, S Papi, L Bentivogli… - arXiv preprint arXiv …, 2025 - arxiv.org

Following the remarkable success of Large Language Models (LLMs) in NLP tasks, there is
increasing interest in extending their capabilities to speech--the most common form in …

Cosmic: Data efficient instruction-tuning for speech in-context learning

Computer vision-based cybernetics systems for promoting modern poultry farming: a critical review