Foundation models in robotics: Applications, challenges, and the future

R Firoozi, J Tucker, S Tian… - … Journal of Robotics …, 2023 - journals.sagepub.com
We survey applications of pretrained foundation models in robotics. Traditional deep
learning models in robotics are trained on small datasets tailored for specific tasks, which …

LM vs LM: Detecting factual errors via cross examination

R Cohen, M Hamri, M Geva, A Globerson - arXiv preprint arXiv …, 2023 - arxiv.org
A prominent weakness of modern language models (LMs) is their tendency to generate
factually incorrect text, which hinders their usability. A natural question is whether such …

Navigating the grey area: How expressions of uncertainty and overconfidence affect language models

K Zhou, D Jurafsky, T Hashimoto - arXiv preprint arXiv:2302.13439, 2023 - arxiv.org
The increased deployment of LMs for real-world tasks involving knowledge and facts makes
it important to understand model epistemology: what LMs think they know, and how their …

Don't Hallucinate, Abstain: Identifying LLM Knowledge Gaps via Multi-LLM Collaboration

S Feng, W Shi, Y Wang, W Ding… - arXiv preprint arXiv …, 2024 - arxiv.org
Despite efforts to expand the knowledge of large language models (LLMs), knowledge gaps-
-missing or outdated information in LLMs--might always persist given the evolving nature of …

Evaluation of uncertainty quantification methods in multi-label classification: A case study with automatic diagnosis of electrocardiogram

M Barandas, L Famiglini, A Campagner, D Folgado… - Information …, 2024 - Elsevier
Artificial Intelligence (AI) use in automated Electrocardiogram (ECG) classification has
continuously attracted the research community's interest, motivated by their promising …

One vs. many: Comprehending accurate information from multiple erroneous and inconsistent AI generations

Y Lee, K Son, TS Kim, J Kim, JJY Chung… - Proceedings of the …, 2024 - dl.acm.org
As Large Language Models (LLMs) are nondeterministic, the same input can generate
different outputs, some of which may be incorrect or hallucinated. If run again, the LLM may …

Knowing what LLMs do not know: A simple yet effective self-detection method

Y Zhao, L Yan, W Sun, G Xing, C Meng, S Wang… - arXiv preprint arXiv …, 2023 - arxiv.org
Large Language Models (LLMs) have shown great potential in Natural Language
Processing (NLP) tasks. However, recent literature reveals that LLMs generate nonfactual …

Gaussian stochastic weight averaging for Bayesian low-rank adaptation of large language models

E Onal, K Flöge, E Caldwell, A Sheverdin… - arXiv preprint arXiv …, 2024 - arxiv.org
Fine-tuned Large Language Models (LLMs) often suffer from overconfidence and poor
calibration, particularly when fine-tuned on small datasets. To address these challenges, we …

Hallucination detection in LLMs: Fast and memory-efficient fine-tuned models

GY Arteaga, TB Schön, N Pielawski - arXiv preprint arXiv:2409.02976, 2024 - arxiv.org
Uncertainty estimation is a necessary component when implementing AI in high-risk
settings, such as autonomous cars, medicine, or insurance. Large Language Models …

Functional-level Uncertainty Quantification for Calibrated Fine-tuning on LLMs

R Niu, D Wu, R Yu, YA Ma - arXiv preprint arXiv:2410.06431, 2024 - arxiv.org
From common-sense reasoning to domain-specific tasks, parameter-efficient fine-tuning
(PEFT) methods for large language models (LLMs) have showcased significant performance …