CulturaX: A cleaned, enormous, and multilingual dataset for large language models in 167 languages

T Nguyen, C Van Nguyen, VD Lai, H Man… - arXiv preprint arXiv …, 2023 - arxiv.org
The driving factors behind the development of large language models (LLMs) with
impressive learning capabilities are their colossal model sizes and extensive training …

What language model to train if you have one million GPU hours?

TL Scao, T Wang, D Hesslow, L Saulnier… - arXiv preprint arXiv …, 2022 - arxiv.org
The crystallization of modeling methods around the Transformer architecture has been a
boon for practitioners. Simple, well-motivated architectural variations can transfer across …

BLOOM+1: Adding language support to BLOOM for zero-shot prompting

ZX Yong, H Schoelkopf, N Muennighoff, AF Aji… - arXiv preprint arXiv …, 2022 - arxiv.org
The BLOOM model is a large publicly available multilingual language model, but its
pretraining was limited to 46 languages. To extend the benefits of BLOOM to other …

A critical analysis of the largest source for generative ai training data: Common crawl

S Baack - Proceedings of the 2024 ACM Conference on Fairness …, 2024 - dl.acm.org
Common Crawl is the largest freely available collection of web crawl data and one of the
most important sources of pre-training data for large language models (LLMs). It is used so …

Representation in AI evaluations

AS Bergman, LA Hendricks, M Rauh, B Wu… - Proceedings of the …, 2023 - dl.acm.org
Calls for representation in artificial intelligence (AI) and machine learning (ML) are
widespread, with "representation" or "representativeness" generally understood to be both …

LoNAS: Elastic low-rank adapters for efficient large language models

JP Munoz, J Yuan, Y Zheng, N Jain - Proceedings of the 2024 …, 2024 - aclanthology.org
Large Language Models (LLMs) continue to grow, reaching hundreds of billions of
parameters and making it challenging for Deep Learning practitioners with resource …

PIVOINE: Instruction tuning for open-world entity profiling

K Lu, X Pan, K Song, H Zhang, D Yu… - Findings of the …, 2023 - aclanthology.org
This work considers the problem of Open-world Entity Profiling, a sub-domain of Open-world
Information Extraction (Open-world IE). Unlike the conventional closed-world IE, Open-world …

PIVOINE: Instruction tuning for open-world information extraction

K Lu, X Pan, K Song, H Zhang, D Yu, J Chen - arXiv preprint arXiv …, 2023 - arxiv.org
We consider the problem of Open-world Information Extraction (Open-world IE), which
extracts comprehensive entity profiles from unstructured texts. Different from the …

Spacerini: Plug-and-play search engines with Pyserini and Hugging Face

C Akiki, O Ogundepo, A Piktus, X Zhang… - arXiv preprint arXiv …, 2023 - arxiv.org
We present Spacerini, a modular framework for seamless building and deployment of
interactive search applications, designed to facilitate the qualitative analysis of large scale …

The Nordic Pile: A 1.2 TB Nordic dataset for language modeling

J Öhman, S Verlinden, A Ekgren, AC Gyllensten… - arXiv preprint arXiv …, 2023 - arxiv.org
Pre-training Large Language Models (LLMs) requires massive amounts of text data, and the
performance of the LLMs typically correlates with the scale and quality of the datasets. This …