A comprehensive overview of large language models

H Naveed, AU Khan, S Qiu, M Saqib, S Anwar… - arXiv preprint arXiv …, 2023 - arxiv.org
Large Language Models (LLMs) have recently demonstrated remarkable capabilities in
natural language processing tasks and beyond. This success of LLMs has led to a large …

Datasets for large language models: A comprehensive survey

Y Liu, J Cao, C Liu, K Ding, L Jin - arXiv preprint arXiv:2402.18041, 2024 - arxiv.org
This paper explores Large Language Model (LLM) datasets, which play a crucial role in
the remarkable advancements of LLMs. The datasets serve as …

Judging LLM-as-a-judge with MT-Bench and Chatbot Arena

L Zheng, WL Chiang, Y Sheng… - Advances in …, 2023 - proceedings.neurips.cc
Evaluating large language model (LLM)-based chat assistants is challenging due to their
broad capabilities and the inadequacy of existing benchmarks in measuring human …

Retrieval-augmented generation for large language models: A survey

Y Gao, Y Xiong, X Gao, K Jia, J Pan, Y Bi… - arXiv preprint arXiv …, 2023 - simg.baai.ac.cn
Large language models (LLMs) demonstrate powerful capabilities, but they still face
challenges in practical applications, such as hallucinations, slow knowledge updates, and …

Harnessing the power of LLMs in practice: A survey on ChatGPT and beyond

J Yang, H Jin, R Tang, X Han, Q Feng, H Jiang… - ACM Transactions on …, 2024 - dl.acm.org
This article presents a comprehensive and practical guide for practitioners and end-users
working with Large Language Models (LLMs) in their downstream Natural Language …

Semantic uncertainty: Linguistic invariances for uncertainty estimation in natural language generation

L Kuhn, Y Gal, S Farquhar - arXiv preprint arXiv:2302.09664, 2023 - arxiv.org
We introduce a method to measure uncertainty in large language models. For tasks like
question answering, it is essential to know when we can trust the natural language outputs …

Evaluating correctness and faithfulness of instruction-following models for question answering

V Adlakha, P BehnamGhader, XH Lu… - Transactions of the …, 2024 - direct.mit.edu
Instruction-following models are attractive alternatives to fine-tuned approaches for question
answering (QA). By simply prepending relevant documents and an instruction to their input …

Beyond the imitation game: Quantifying and extrapolating the capabilities of language models

A Srivastava, A Rastogi, A Rao, AAM Shoeb… - arXiv preprint arXiv …, 2022 - arxiv.org
Language models demonstrate both quantitative improvement and new qualitative
capabilities with increasing scale. Despite their potentially transformative impact, these new …

PandaLM: An automatic evaluation benchmark for LLM instruction tuning optimization

Y Wang, Z Yu, Z Zeng, L Yang, C Wang, H Chen… - arXiv preprint arXiv …, 2023 - arxiv.org
Instruction tuning large language models (LLMs) remains a challenging task, owing to the
complexity of hyperparameter selection and the difficulty involved in evaluating the tuned …

Benchmarking foundation models with language-model-as-an-examiner

Y Bai, J Ying, Y Cao, X Lv, Y He… - Advances in …, 2023 - proceedings.neurips.cc
Numerous benchmarks have been established to assess the performance of foundation
models on open-ended question answering, which serves as a comprehensive test of a …