A comprehensive overview of large language models

H Naveed, AU Khan, S Qiu, M Saqib, S Anwar… - arXiv preprint arXiv …, 2023 - arxiv.org
Large Language Models (LLMs) have recently demonstrated remarkable capabilities in
natural language processing tasks and beyond. This success of LLMs has led to a large …

Tool learning with foundation models

Y Qin, S Hu, Y Lin, W Chen, N Ding, G Cui… - ACM Computing …, 2024 - dl.acm.org
Humans possess an extraordinary ability to create and utilize tools. With the advent of
foundation models, artificial intelligence systems have the potential to be equally adept in …

Toolformer: Language models can teach themselves to use tools

T Schick, J Dwivedi-Yu, R Dessì… - Advances in …, 2023 - proceedings.neurips.cc
Language models (LMs) exhibit remarkable abilities to solve new tasks from just a
few examples or textual instructions, especially at scale. They also, paradoxically, struggle …

Beyond the imitation game: Quantifying and extrapolating the capabilities of language models

A Srivastava, A Rastogi, A Rao, AAM Shoeb… - arXiv preprint arXiv …, 2022 - arxiv.org
Language models demonstrate both quantitative improvement and new qualitative
capabilities with increasing scale. Despite their potentially transformative impact, these new …

A metaverse: Taxonomy, components, applications, and open challenges

SM Park, YG Kim - IEEE Access, 2022 - ieeexplore.ieee.org
Unlike previous studies on the Metaverse based on Second Life, the current Metaverse is
based on the social value of Generation Z, in that online and offline selves are not different …

mT5: A massively multilingual pre-trained text-to-text transformer

L Xue - arXiv preprint arXiv:2010.11934, 2020 - fq.pkwyx.com
The recent" Text-to-Text Transfer Transformer"(T5) leveraged a unified text-to-text format and
scale to attain state-of-the-art results on a wide variety of English-language NLP tasks. In this …

Crosslingual generalization through multitask finetuning

N Muennighoff, T Wang, L Sutawika, A Roberts… - arXiv preprint arXiv …, 2022 - arxiv.org
Multitask prompted finetuning (MTF) has been shown to help large language models
generalize to new tasks in a zero-shot setting, but so far explorations of MTF have focused …

Unsupervised cross-lingual representation learning at scale

A Conneau - arXiv preprint arXiv:1911.02116, 2019 - fq.pkwyx.com
This paper shows that pretraining multilingual language models at scale leads to significant
performance gains for a wide range of cross-lingual transfer tasks. We train a Transformer …

MiniLM: Deep self-attention distillation for task-agnostic compression of pre-trained transformers

W Wang, F Wei, L Dong, H Bao… - Advances in Neural …, 2020 - proceedings.neurips.cc
Pre-trained language models (e.g., BERT (Devlin et al., 2018) and its variants) have achieved
remarkable success in a variety of NLP tasks. However, these models usually consist of …

Spanish pre-trained BERT model and evaluation data

J Cañete, G Chaperon, R Fuentes, JH Ho… - arXiv preprint arXiv …, 2023 - arxiv.org
The Spanish language is one of the five most widely spoken languages in the world. Nevertheless,
finding resources to train or evaluate Spanish language models is not an easy task. In this …