AMMUS: A survey of transformer-based pretrained models in natural language processing

KS Kalyan, A Rajasekharan, S Sangeetha - arXiv preprint arXiv …, 2021 - arxiv.org
Transformer-based pretrained language models (T-PTLMs) have achieved great success in
almost every NLP task. The evolution of these models started with GPT and BERT. These …

Understanding LLMs: A comprehensive overview from training to inference

Y Liu, H He, T Han, X Zhang, M Liu, J Tian… - arXiv preprint arXiv …, 2024 - arxiv.org
The introduction of ChatGPT has led to a significant increase in the utilization of Large
Language Models (LLMs) for addressing downstream tasks. There's an increasing focus on …

Beyond the imitation game: Quantifying and extrapolating the capabilities of language models

A Srivastava, A Rastogi, A Rao, AAM Shoeb… - arXiv preprint arXiv …, 2022 - arxiv.org
Language models demonstrate both quantitative improvement and new qualitative
capabilities with increasing scale. Despite their potentially transformative impact, these new …

XLS-R: Self-supervised cross-lingual speech representation learning at scale

A Babu, C Wang, A Tjandra, K Lakhotia, Q Xu… - arXiv preprint arXiv …, 2021 - arxiv.org
This paper presents XLS-R, a large-scale model for cross-lingual speech representation
learning based on wav2vec 2.0. We train models with up to 2B parameters on nearly half a …

Language models are multilingual chain-of-thought reasoners

F Shi, M Suzgun, M Freitag, X Wang, S Srivats… - arXiv preprint arXiv …, 2022 - arxiv.org
We evaluate the reasoning abilities of large language models in multilingual settings. We
introduce the Multilingual Grade School Math (MGSM) benchmark, by manually translating …

M3Exam: A multilingual, multimodal, multilevel benchmark for examining large language models

W Zhang, M Aljunied, C Gao… - Advances in Neural …, 2023 - proceedings.neurips.cc
Despite the existence of various benchmarks for evaluating natural language processing
models, we argue that human exams are a more suitable means of evaluating general …

Modular deep learning

J Pfeiffer, S Ruder, I Vulić, EM Ponti - arXiv preprint arXiv:2302.11529, 2023 - arxiv.org
Transfer learning has recently become the dominant paradigm of machine learning. Pre-
trained models fine-tuned for downstream tasks achieve better performance with fewer …

mSLAM: Massively multilingual joint pre-training for speech and text

A Bapna, C Cherry, Y Zhang, Y Jia, M Johnson… - arXiv preprint arXiv …, 2022 - arxiv.org
We present mSLAM, a multilingual Speech and LAnguage Model that learns cross-lingual
cross-modal representations of speech and text by pre-training jointly on large amounts of …

Charformer: Fast character transformers via gradient-based subword tokenization

Y Tay, VQ Tran, S Ruder, J Gupta, HW Chung… - arXiv preprint arXiv …, 2021 - arxiv.org
State-of-the-art models in natural language processing rely on separate rigid subword
tokenization algorithms, which limit their generalization ability and adaptation to new …

JGLUE: Japanese general language understanding evaluation

K Kurihara, D Kawahara, T Shibata - Proceedings of the Thirteenth …, 2022 - aclanthology.org
To develop high-performance natural language understanding (NLU) models, it is
necessary to have a benchmark to evaluate and analyze NLU ability from various …