Audio-Language Datasets of Scenes and Events: A Survey

G Wijngaard, E Formisano, M Esposito… - IEEE …, 2025 - ieeexplore.ieee.org
Audio-language models (ALMs) generate linguistic descriptions of sound-producing events
and scenes. Advances in dataset creation and computational power have led to significant …

EHR-based prediction modelling meets multimodal deep learning: A systematic review of structured and textual data fusion methods

AS Teles, IR de Moura, F Silva, A Roberts, D Stahl - Information Fusion, 2025 - Elsevier
Electronic Health Records (EHRs) have transformed healthcare by digitally
consolidating patient medical history, encompassing structured data (e.g., demographic data …

Clinical entity augmented retrieval for clinical information extraction

I Lopez, A Swaminathan, K Vedula, S Narayanan… - npj Digital …, 2025 - nature.com
Large language models (LLMs) with retrieval-augmented generation (RAG) have improved
information extraction over previous methods, yet their reliance on embeddings often leads …

Kimi k1.5: Scaling Reinforcement Learning with LLMs

K Team, A Du, B Gao, B Xing, C Jiang, C Chen… - arXiv preprint arXiv …, 2025 - arxiv.org
Language model pretraining with next token prediction has proved effective for scaling
compute but is limited to the amount of available training data. Scaling reinforcement …

Vision-Language Models Represent Darker-Skinned Black Individuals as More Homogeneous than Lighter-Skinned Black Individuals

MHJ Lee, S Jeon - arXiv preprint arXiv:2412.09668, 2024 - arxiv.org
Vision-Language Models (VLMs) combine Large Language Model (LLM) capabilities with
image processing, enabling tasks like image captioning and text-to-image generation. Yet …

Beyond Factual Accuracy: Evaluating Coverage of Diverse Factual Information in Long-form Text Generation

C Samarinas, A Krubner, A Salemi, Y Kim… - arXiv preprint arXiv …, 2025 - arxiv.org
This paper presents ICAT, an evaluation framework for measuring coverage of diverse
factual information in long-form text generation. ICAT breaks down a long output text into a …

A recent evaluation on the performance of LLMs on radiation oncology physics using questions of randomly shuffled options

P Wang, J Holmes, Z Liu, D Chen, T Liu, J Shen… - arXiv preprint arXiv …, 2024 - arxiv.org
Purpose: We present an updated study evaluating the performance of large language
models (LLMs) in answering radiation oncology physics questions, focusing on the recently …

Resource-efficient photonic networks for next-generation AI computing

I Oguz, M Yildirim, JL Hsieh, NU Dinc… - Light: Science & …, 2025 - nature.com
Current trends in artificial intelligence toward larger models demand a rethinking of both
hardware and algorithms. Photonics-based systems offer high-speed, energy-efficient …

Lightweight safety classification using pruned language models

M Sawtell, T Masterman, S Besen, J Brown - arXiv preprint arXiv …, 2024 - arxiv.org
In this paper, we introduce a novel technique for content safety and prompt injection
classification for Large Language Models. Our technique, Layer Enhanced Classification …

Integrating personalized and contextual information in fine-grained emotion recognition in text: A multi-source fusion approach with explainability

A Ngo, J Kocoń - Information Fusion, 2025 - Elsevier
Emotion recognition in textual data is a rapidly evolving field with diverse applications. While
the state-of-the-art (SOTA) models based on pre-trained large language models (LLMs) …