A survey on data selection for language models

A Albalak, Y Elazar, SM Xie, S Longpre… - arXiv preprint arXiv…, 2024 - arxiv.org
A major factor in the recent success of large language models is the use of enormous and
ever-growing text datasets for unsupervised pre-training. However, naively training a model …

A survey on RAG meeting LLMs: Towards retrieval-augmented large language models

W Fan, Y Ding, L Ning, S Wang, H Li, D Yin… - Proceedings of the 30th …, 2024 - dl.acm.org
As one of the most advanced techniques in AI, Retrieval-Augmented Generation (RAG) can
offer reliable and up-to-date external knowledge, providing huge convenience for numerous …

Retrieval-augmented generation (RAG) and beyond: A comprehensive survey on how to make your LLMs use external data more wisely

S Zhao, Y Yang, Z Wang, Z He, LK Qiu… - arXiv preprint arXiv…, 2024 - arxiv.org
Large language models (LLMs) augmented with external data have demonstrated
remarkable capabilities in completing real-world tasks. Techniques for integrating external …

A systematic survey of prompt engineering on vision-language foundation models

J Gu, Z Han, S Chen, A Beirami, B He, G Zhang… - arXiv preprint arXiv…, 2023 - arxiv.org
Prompt engineering is a technique that involves augmenting a large pre-trained model with
task-specific hints, known as prompts, to adapt the model to new tasks. Prompts can be …

Grammar prompting for domain-specific language generation with large language models

B Wang, Z Wang, X Wang, Y Cao… - Advances in Neural …, 2023 - proceedings.neurips.cc
Large language models (LLMs) can learn to perform a wide range of natural language tasks
from just a handful of in-context examples. However, for generating strings from highly …

Label words are anchors: An information flow perspective for understanding in-context learning

L Wang, L Li, D Dai, D Chen, H Zhou, F Meng… - arXiv preprint arXiv…, 2023 - arxiv.org
In-context learning (ICL) emerges as a promising capability of large language models
(LLMs) by providing them with demonstration examples to perform diverse tasks. However …

KICGPT: Large language model with knowledge in context for knowledge graph completion

Y Wei, Q Huang, JT Kwok, Y Zhang - arXiv preprint arXiv:2402.02389, 2024 - arxiv.org
Knowledge Graph Completion (KGC) is crucial for addressing knowledge graph
incompleteness and supporting downstream applications. Many models have been …

Learning to retrieve in-context examples for large language models

L Wang, N Yang, F Wei - arXiv preprint arXiv:2307.07164, 2023 - arxiv.org
Large language models (LLMs) have demonstrated their ability to learn in-context, allowing
them to perform various tasks based on a few input-output examples. However, the …

Do large language models have compositional ability? An investigation into limitations and scalability

Z Xu, Z Shi, Y Liang - arXiv preprint arXiv:2407.15720, 2024 - arxiv.org
Large language models (LLMs) have emerged as powerful tools for many AI problems and
exhibit remarkable in-context learning (ICL) capabilities. Compositional ability, solving …

In-context learning with iterative demonstration selection

C Qin, A Zhang, C Chen, A Dagar, W Ye - arXiv preprint arXiv:2310.09881, 2023 - arxiv.org
Spurred by advancements in scale, large language models (LLMs) have demonstrated
strong few-shot learning ability via in-context learning (ICL). However, the performance of …