A systematic survey and critical review on evaluating large language models: Challenges, limitations, and recommendations

MTR Laskar, S Alqahtani, MS Bari… - Proceedings of the …, 2024 - aclanthology.org
Abstract Large Language Models (LLMs) have recently gained significant attention due to
their remarkable capabilities in performing diverse tasks across various domains. However …

Natural Language Generation for Visualizations: State of the Art, Challenges and Future Directions

E Hoque, MS Islam - Computer Graphics Forum, 2024 - Wiley Online Library
Natural language and visualization are two complementary modalities of human
communication that play a crucial role in conveying information effectively. While …

Alleviating hallucinations of large language models through induced hallucinations

Y Zhang, L Cui, W Bi, S Shi - ar…

Developing a framework for auditing large language models using human-in-the-loop

M Amirizaniani, J Yao, A Lavergne… - arXiv preprint arXiv …, 2024 - researchgate.net
* Work does not relate to position at Amazon. Authors' addresses: Maryam Amirizaniani,
amaryam@uw.edu, University of Washington, Seattle, WA, USA; Jihan Yao, jihany2@uw …

GenAudit: Fixing Factual Errors in Language Model Outputs with Evidence

K Krishna, S Ramprasad, P Gupta, BC Wallace… - arXiv preprint arXiv …, 2024 - arxiv.org
LLMs can generate factually incorrect statements even when provided access to reference
documents. Such errors can be dangerous in high-stakes applications (e.g., document …

ProVision: Programmatically Scaling Vision-centric Instruction Data for Multimodal Language Models

J Zhang, L Xue, L Song, J Wang, W Huang… - arXiv preprint arXiv …, 2024 - arxiv.org
With the rise of multimodal applications, instruction data has become critical for training
multimodal language models capable of understanding complex image-based queries …

An Audit on the Perspectives and Challenges of Hallucinations in NLP

PN Venkit, T Chakravorti, V Gupta… - Proceedings of the …, 2024 - aclanthology.org
We audit how hallucination in large language models (LLMs) is characterized in peer-
reviewed literature, using a critical examination of 103 publications across NLP research …

A Comparative Analysis of Text-Based Explainable Recommender Systems

A Ariza-Casabona, L Boratto, M Salamó - Proceedings of the 18th ACM …, 2024 - dl.acm.org
One way to increase trust among users towards recommender systems is to provide the
recommendation along with a textual explanation. In the literature, extraction-based …