Retrieval-augmented generation for large language models: A survey
Y Gao, Y Xiong, X Gao, K Jia, J Pan, Y Bi, Y Dai… - arXiv preprint arXiv …, 2023 - arxiv.org
Large language models (LLMs) demonstrate powerful capabilities, but they still face
challenges in practical applications, such as hallucinations, slow knowledge updates, and …
Survey on factuality in large language models: Knowledge, retrieval and domain-specificity
This survey addresses the crucial issue of factuality in Large Language Models (LLMs). As
LLMs find applications across diverse domains, the reliability and accuracy of their outputs …
Survey of hallucination in natural language generation
Natural Language Generation (NLG) has improved exponentially in recent years thanks to
the development of sequence-to-sequence deep learning technologies such as Transformer …
Documenting large webtext corpora: A case study on the colossal clean crawled corpus
Large language models have led to remarkable progress on many NLP tasks, and
researchers are turning to ever-larger text corpora to train them. Some of the largest corpora …
Embers of autoregression show how large language models are shaped by the problem they are trained to solve
The widespread adoption of large language models (LLMs) makes it important to recognize
their strengths and limitations. We argue that to develop a holistic understanding of these …
On faithfulness and factuality in abstractive summarization
It is well known that the standard likelihood training and approximate decoding objectives in
neural text generation models lead to less human-like responses for open-ended tasks such …
Repairing the cracked foundation: A survey of obstacles in evaluation practices for generated text
Evaluation practices in natural language generation (NLG) have many known flaws,
but improved evaluation approaches are rarely widely adopted. This issue has become …
Embers of autoregression: Understanding large language models through the problem they are trained to solve
The widespread adoption of large language models (LLMs) makes it important to recognize
their strengths and limitations. We argue that in order to develop a holistic understanding of …
Exploring the benefits of training expert language models over instruction tuning
Recently, Language Models (LMs) instruction-tuned on multiple tasks, also known
as multitask-prompted fine-tuning (MT), have shown capabilities to generalize to unseen …
ToTTo: A controlled table-to-text generation dataset
We present ToTTo, an open-domain English table-to-text dataset with over 120,000 training
examples that proposes a controlled generation task: given a Wikipedia table and a set of …