Typos that Broke the RAG's Back: Genetic Attack on RAG Pipeline by Simulating Documents in the Wild via Low-level Perturbations

S Cho, S Jeong, J Seo, T Hwang, JC Park - arXiv preprint arXiv …, 2024 - arxiv.org
The robustness of recent Large Language Models (LLMs) has become increasingly crucial
as their applicability expands across various domains and real-world applications. Retrieval …

Eyes on Google's NotebookLM: using generative AI to create ophthalmology podcasts with a single click

QA Dihan, BR Nihalani, AA Tooley, AM Elhusseiny - Eye, 2024 - nature.com
NotebookLM is a new and exciting artificial intelligence (AI)-powered research assistant by
Google that learns from user-uploaded multimodal information, such as documents, images …

MM-Eval: A Multilingual Meta-Evaluation Benchmark for LLM-as-a-Judge and Reward Models

G Son, D Yoon, J Suk, J Aula-Blasco, M Aslan… - arXiv preprint arXiv …, 2024 - arxiv.org
Large language models (LLMs) are commonly used as evaluators in tasks (e.g., reward
modeling, LLM-as-a-judge), where they act as proxies for human preferences or judgments …

Semiparametric Token-Sequence Co-Supervision

H Lee, D Kim, J Jun, S Joo, J Jang, KW On… - arXiv preprint arXiv …, 2024 - arxiv.org
In this work, we introduce a semiparametric token-sequence co-supervision training method.
It trains a language model by simultaneously leveraging supervision from the traditional next …

DSAI: Unbiased and Interpretable Latent Feature Extraction for Data-Centric AI

H Cho, S Ka, D Park, J Kang, M Seo, B Son - arXiv preprint arXiv …, 2024 - arxiv.org
Large language models (LLMs) often struggle to objectively identify latent characteristics in
large datasets due to their reliance on pre-trained knowledge rather than actual data …