Recent advances in natural language processing via large pre-trained language models: A survey

B Min, H Ross, E Sulem, APB Veyseh… - ACM Computing …, 2023 - dl.acm.org
Large, pre-trained language models (PLMs) such as BERT and GPT have drastically
changed the Natural Language Processing (NLP) field. For numerous NLP tasks …

Datasets for large language models: A comprehensive survey

Y Liu, J Cao, C Liu, K Ding, L Jin - arXiv preprint arXiv:2402.18041, 2024 - arxiv.org
This paper explores Large Language Model (LLM) datasets, which play a crucial role
in the remarkable advancements of LLMs. The datasets serve as …

ChatGPT: Jack of all trades, master of none

J Kocoń, I Cichecki, O Kaszyca, M Kochanek, D Szydło… - Information …, 2023 - Elsevier
OpenAI has released the Chat Generative Pre-trained Transformer (ChatGPT) and
revolutionized the approach to human-model interaction in artificial intelligence. The first …

Weak-to-strong generalization: Eliciting strong capabilities with weak supervision

C Burns, P Izmailov, JH Kirchner, B Baker… - arXiv preprint arXiv …, 2023 - arxiv.org
Widely used alignment techniques, such as reinforcement learning from human feedback
(RLHF), rely on the ability of humans to supervise model behavior, for example, to evaluate …

LegalBench: A collaboratively built benchmark for measuring legal reasoning in large language models

N Guha, J Nyarko, D Ho, C Ré… - Advances in …, 2023 - proceedings.neurips.cc
The advent of large language models (LLMs) and their adoption by the legal community has
given rise to the question: what types of legal reasoning can LLMs perform? To enable …

Language models are Super Mario: Absorbing abilities from homologous models as a free lunch

L Yu, B Yu, H Yu, F Huang, Y Li - Forty-first International Conference …, 2024 - openreview.net
In this paper, we unveil that Language Models (LMs) can acquire new capabilities by
assimilating parameters from homologous models without retraining or GPUs. We first …

Pretraining language models with human preferences

T Korbak, K Shi, A Chen, RV Bhalerao… - International …, 2023 - proceedings.mlr.press
Language models (LMs) are pretrained to imitate text from large and diverse
datasets that contain content that would violate human preferences if generated by an LM …

Can ChatGPT understand too? A comparative study on ChatGPT and fine-tuned BERT

Q Zhong, L Ding, J Liu, B Du, D Tao - arXiv preprint arXiv:2302.10198, 2023 - arxiv.org
Recently, ChatGPT has attracted great attention, as it can generate fluent and high-quality
responses to human inquiries. Several prior studies have shown that ChatGPT attains …

Modern language models refute Chomsky's approach to language

ST Piantadosi - From fieldwork to linguistic theory: A tribute to …, 2023 - books.google.com
Modern machine learning has subverted and bypassed the theoretical framework of
Chomsky's generative approach to linguistics, including its core claims to particular insights …

ZeroQuant: Efficient and affordable post-training quantization for large-scale transformers

Z Yao, R Yazdani Aminabadi… - Advances in …, 2022 - proceedings.neurips.cc
How to efficiently serve ever-larger trained natural language models in practice has become
exceptionally challenging even for powerful cloud servers due to their prohibitive …