Linguistically inspired roadmap for building biologically reliable protein language models

MH Vu, R Akbar, PA Robert, B Swiatczak… - Nature Machine …, 2023 - nature.com
Deep neural-network-based language models (LMs) are increasingly applied to large-scale
protein sequence data to predict protein function. However, being largely black-box models …

A survey of large language models

WX Zhao, K Zhou, J Li, T Tang, X Wang, Y Hou… - arXiv preprint arXiv …, 2023 - arxiv.org
Language is essentially a complex, intricate system of human expressions governed by
grammatical rules. It poses a significant challenge to develop capable AI algorithms for …

A survey on in-context learning

Q Dong, L Li, D Dai, C Zheng, J Ma, R Li, H Xia… - arXiv preprint arXiv …, 2022 - arxiv.org
With the increasing capabilities of large language models (LLMs), in-context learning (ICL)
has emerged as a new paradigm for natural language processing (NLP), where LLMs make …
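
As a concrete illustration of the paradigm surveyed above, the sketch below assembles a few-shot prompt in Python. The sentiment task, its labels, and the prompt wording are hypothetical illustrations, not examples from the paper.

# Minimal sketch of in-context learning: labeled demonstrations are
# concatenated into a prompt so a model can infer the task from
# context alone, with no gradient updates. Task and labels are made up.
demonstrations = [
    ("The movie was a delight from start to finish.", "positive"),
    ("I regret buying this blender.", "negative"),
]

def build_icl_prompt(demos, query):
    lines = [f"Review: {x}\nSentiment: {y}" for x, y in demos]
    lines.append(f"Review: {query}\nSentiment:")
    return "\n\n".join(lines)

prompt = build_icl_prompt(demonstrations, "The soundtrack was forgettable.")
print(prompt)  # a model's completion of this string is the ICL prediction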

Scaling data-constrained language models

N Muennighoff, A Rush, B Barak… - Advances in …, 2023 - proceedings.neurips.cc
The current trend of scaling language models involves increasing both parameter count and
training dataset size. Extrapolating this trend suggests that training dataset size may soon be …

Large language models struggle to learn long-tail knowledge

N Kandpal, H Deng, A Roberts… - International …, 2023 - proceedings.mlr.press
The Internet contains a wealth of knowledge—from the birthdays of historical figures to
tutorials on how to code—all of which may be learned by language models. However, while …

Supervised pretraining can learn in-context reinforcement learning

J Lee, A Xie, A Pacchiano, Y Chandak… - Advances in …, 2024 - proceedings.neurips.cc
Large transformer models trained on diverse datasets have shown a remarkable ability to
learn in-context, achieving high few-shot performance on tasks they were not explicitly …

Synthetic prompting: Generating chain-of-thought demonstrations for large language models

Z Shao, Y Gong, Y Shen, M Huang… - International …, 2023 - proceedings.mlr.press
Large language models can perform various reasoning tasks by using chain-of-thought
prompting, which guides them to find answers through step-by-step demonstrations …
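
To make the prompting style concrete, here is a minimal Python sketch of a chain-of-thought demonstration prepended to a new question. The arithmetic example and its wording are illustrative assumptions; the paper's actual contribution is generating such demonstrations automatically, which this sketch does not attempt.

# One worked demonstration showing intermediate reasoning steps,
# followed by a fresh question the model is expected to answer in the
# same step-by-step style. All content is invented for illustration.
cot_demo = (
    "Q: A pen costs $2 and a notebook costs $3. How much do 4 pens "
    "and 2 notebooks cost?\n"
    "A: 4 pens cost 4 * 2 = 8 dollars. 2 notebooks cost 2 * 3 = 6 "
    "dollars. 8 + 6 = 14. The answer is 14.\n"
)
new_question = "Q: A ticket costs $5. How much do 3 tickets cost?\nA:"
prompt = cot_demo + "\n" + new_question
print(prompt)  # the completion should walk through 3 * 5 = 15 step by step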

Birth of a transformer: A memory viewpoint

A Bietti, V Cabannes, D Bouchacourt… - Advances in …, 2024 - proceedings.neurips.cc
Large language models based on transformers have achieved great empirical successes.
However, as they are deployed more widely, there is a growing need to better understand …

The mystery of in-context learning: A comprehensive survey on interpretation and analysis

Y Zhou, J Li, Y Xiang, H Yan, L Gui… - Proceedings of the 2024 …, 2024 - aclanthology.org
Understanding the in-context learning (ICL) capability that enables large language models
(LLMs) to perform tasks proficiently from demonstration examples is of utmost importance. This …

In-context vectors: Making in context learning more effective and controllable through latent space steering

S Liu, H Ye, L Xing, J Zou - arXiv preprint arXiv:2311.06668, 2023 - arxiv.org
Large language models (LLMs) demonstrate emergent in-context learning capabilities,
where they adapt to new tasks based on example demonstrations. However, in-context …
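
A rough conceptual sketch of latent-space steering, under simplified assumptions: derive a task vector from differences between demonstration hidden states and add a scaled copy of it to a query's hidden state at inference time. The toy dimensions, the simple averaging recipe, and the scaling factor alpha are assumptions for illustration, not the paper's exact procedure.

# Toy NumPy version: hidden states are random stand-ins for what a
# real transformer layer would produce for demonstration pairs.
import numpy as np

rng = np.random.default_rng(0)
d = 16                                  # toy hidden dimension
h_inputs = rng.normal(size=(5, d))      # latent states of demo inputs
h_outputs = rng.normal(size=(5, d))     # latent states of demo outputs

# "In-context vector": the mean shift the demonstrations induce.
icv = (h_outputs - h_inputs).mean(axis=0)

def steer(hidden_state, vector, alpha=0.1):
    # Shift the query's latent state along the task direction.
    return hidden_state + alpha * vector

h_query = rng.normal(size=d)
h_steered = steer(h_query, icv)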