Computer vision-based cybernetics systems for promoting modern poultry farming: a critical review
As global demands on the poultry production and welfare both intensify, the precision
poultry farming technologies such as computer vision-based cybernetics system is …
Recent advances in speech language models: A survey
SALSA: Speedy ASR-LLM Synchronous Aggregation
Harnessing pre-trained LLMs to improve ASR systems, particularly for low-resource
languages, is now an emerging area of research. Existing methods range from using LLMs …
Roadmap towards superhuman speech understanding using large language models
The success of large language models (LLMs) has prompted efforts to integrate speech and
audio data, aiming to create general foundation models capable of processing both textual …
Bayesian Example Selection Improves In-Context Learning for Speech, Text, and Visual Modalities
Large language models (LLMs) can adapt to new tasks through in-context learning (ICL)
based on a few examples presented in dialogue history without any model parameter …
Understanding Sounds, Missing the Questions: The Challenge of Object Hallucination in Large Audio-Language Models
Large audio-language models (LALMs) enhance traditional large language models by
integrating audio perception capabilities, allowing them to tackle audio-related tasks …
Prepending or Cross-Attention for Speech-to-Text? An Empirical Comparison
Following the remarkable success of Large Language Models (LLMs) in NLP tasks, there is
increasing interest in extending their capabilities to speech--the most common form in …