Advances, challenges and opportunities in creating data for trustworthy AI

W Liang, GA Tadesse, D Ho, L Fei-Fei… - Nature Machine …, 2022 - nature.com
As artificial intelligence (AI) transitions from research to deployment, creating the appropriate
datasets and data pipelines to develop and evaluate AI models is increasingly the biggest …

Survey of hallucination in natural language generation

Z Ji, N Lee, R Frieske, T Yu, D Su, Y Xu, E Ishii… - ACM Computing …, 2023 - dl.acm.org
Natural Language Generation (NLG) has improved exponentially in recent years thanks to
the development of sequence-to-sequence deep learning technologies such as Transformer …

QLoRA: Efficient finetuning of quantized LLMs

T Dettmers, A Pagnoni, A Holtzman… - Advances in Neural …, 2023 - proceedings.neurips.cc
We present QLoRA, an efficient finetuning approach that reduces memory usage enough to
finetune a 65B parameter model on a single 48GB GPU while preserving full 16-bit …
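
To make the approach concrete, here is a minimal sketch of QLoRA-style finetuning using the Hugging Face transformers, peft and bitsandbytes stack; the model name, target modules and hyperparameters are illustrative assumptions, not the paper's exact configuration.

    import torch
    from transformers import AutoModelForCausalLM, BitsAndBytesConfig
    from peft import LoraConfig, get_peft_model

    # Load the base model with 4-bit NormalFloat (NF4) quantization and
    # double quantization, the two memory-saving ideas the paper introduces.
    bnb_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_use_double_quant=True,
        bnb_4bit_compute_dtype=torch.bfloat16,  # compute stays in 16-bit
    )
    model = AutoModelForCausalLM.from_pretrained(
        "meta-llama/Llama-2-7b-hf",  # assumed model; the paper scales to 65B
        quantization_config=bnb_config,
        device_map="auto",
    )

    # Attach small trainable low-rank adapters; the quantized base stays frozen.
    lora = LoraConfig(r=16, lora_alpha=32, lora_dropout=0.05,
                      target_modules=["q_proj", "v_proj"], task_type="CAUSAL_LM")
    model = get_peft_model(model, lora)
    model.print_trainable_parameters()  # typically well under 1% of all weights

Only the adapter weights receive gradients, which is what lets a single 48 GB GPU hold a model that would not fit for full 16-bit finetuning.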

DataComp: In search of the next generation of multimodal datasets

SY Gadre, G Ilharco, A Fang… - Advances in …, 2023 - proceedings.neurips.cc
Multimodal datasets are a critical component in recent breakthroughs such as CLIP, Stable
Diffusion and GPT-4, yet their design does not receive the same research attention as model …

Should ChatGPT be biased? Challenges and risks of bias in large language models

E Ferrara - arXiv preprint arXiv:2304.03738, 2023 - arxiv.org
As the capabilities of generative language models continue to advance, the implications of
biases ingrained within these models have garnered increasing attention from researchers …

Large language models are not fair evaluators

P Wang, L Li, L Chen, Z Cai, D Zhu, B Lin… - arXiv preprint arXiv …, 2023 - arxiv.org
In this paper, we uncover a systematic bias in the evaluation paradigm of adopting large
language models (LLMs), e.g., GPT-4, as a referee to score and compare the quality of …
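
The failure mode the paper documents is easy to probe: score the same answer pair twice with the presentation order swapped and check whether the verdict flips consistently. Below is a minimal sketch, assuming a hypothetical judge() callable that wraps an LLM evaluation prompt and returns "first", "second" or "tie".

    # Position-bias probe: a consistent referee should reverse its verdict
    # when the two candidate answers are presented in the opposite order.
    FLIP = {"first": "second", "second": "first", "tie": "tie"}

    def position_biased(judge, question, answer_a, answer_b):
        verdict_ab = judge(question, first=answer_a, second=answer_b)
        verdict_ba = judge(question, first=answer_b, second=answer_a)
        # Any mismatch here means the ordering, not the content, drove the score.
        return verdict_ba != FLIP[verdict_ab]

Running both orders and aggregating the two verdicts is also a cheap way to reduce the bias rather than merely detect it.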

GPT-NER: Named entity recognition via large language models

S Wang, X Sun, X Li, R Ouyang, F Wu, T Zhang… - arXiv preprint arXiv …, 2023 - arxiv.org
Despite the fact that large-scale Language Models (LLMs) have achieved SOTA
performance on a variety of NLP tasks, their performance on NER is still significantly below …
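
One way to picture the reformulation the authors propose: NER becomes a generation task in which the model rewrites the input with entities wrapped in marker tokens that are then parsed out. The sketch below follows that idea; the exact prompt wording and the complete() function (any LLM completion call) are assumptions for illustration.

    import re

    PROMPT = ("Rewrite the sentence, surrounding every PERSON entity "
              "with @@ and ##.\nSentence: {sentence}\nOutput:")

    def extract_persons(complete, sentence):
        # The model echoes the sentence with markers, e.g.
        # "@@Marie Curie## worked in Paris." -> ["Marie Curie"]
        marked = complete(PROMPT.format(sentence=sentence))
        return re.findall(r"@@(.+?)##", marked)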

TrustLLM: Trustworthiness in large language models

L Sun, Y Huang, H Wang, S Wu, Q Zhang… - arXiv preprint arXiv …, 2024 - mosis.eecs.utk.edu
Large language models (LLMs), exemplified by ChatGPT, have gained considerable
attention for their excellent natural language processing capabilities. Nonetheless, these …

Unnatural instructions: Tuning language models with (almost) no human labor

O Honovich, T Scialom, O Levy, T Schick - arXiv preprint arXiv:2212.09689, 2022 - arxiv.org
Instruction tuning enables pretrained language models to perform new tasks from inference-
time natural language descriptions. These approaches rely on vast amounts of human …
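
The core recipe is simple enough to sketch: condition the model on a few existing task instructions and let it continue the list, so new training examples are generated with (almost) no human labor. The seed instructions and the complete() call below are illustrative assumptions, not the paper's actual prompts.

    # Seed-conditioned instruction generation in the spirit of the paper:
    # show k example instructions, ask the model to produce example k+1.
    SEEDS = [
        "Summarize the following paragraph in one sentence.",
        "Translate the sentence below into French.",
        "List three counterarguments to the given claim.",
    ]

    def generate_instruction(complete):
        prompt = "Examples of task instructions:\n"
        prompt += "".join(f"{i + 1}. {s}\n" for i, s in enumerate(SEEDS))
        prompt += "4."  # the model continues the numbered list with a new task
        return complete(prompt).strip()

Deduplication and filtering of the generated instructions would normally follow, since the model can repeat itself or degenerate.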

The debate over understanding in AI's large language models

M Mitchell, DC Krakauer - Proceedings of the National Academy of …, 2023 - pnas.org
We survey a current, heated debate in the artificial intelligence (AI) research community on
whether large pretrained language models can be said to understand language—and the …