A Comprehensive Overview of Large Language Models

H Naveed, AU Khan, S Qiu, M Saqib, S Anwar… - arXiv preprint arXiv …, 2023 - arxiv.org
Large Language Models (LLMs) have recently demonstrated remarkable capabilities in
natural language processing tasks and beyond. This success of LLMs has led to a large …

Datasets for Large Language Models: A Comprehensive Survey

Y Liu, J Cao, C Liu, K Ding, L Jin - arXiv preprint arXiv:2402.18041, 2024 - arxiv.org
This paper presents an exploration of Large Language Model (LLM) datasets,
which play a crucial role in the remarkable advancements of LLMs. The datasets serve as …

The Llama 3 Herd of Models

A Dubey, A Jauhri, A Pandey, A Kadian… - arXiv preprint arXiv …, 2024 - arxiv.org
Modern artificial intelligence (AI) systems are powered by foundation models. This paper
presents a new set of foundation models, called Llama 3. It is a herd of language models …

TIES-Merging: Resolving Interference When Merging Models

P Yadav, D Tam, L Choshen… - Advances in Neural Information Processing Systems, 2023 - proceedings.neurips.cc
Transfer learning, i.e., further fine-tuning a pre-trained model on a downstream task, can
confer significant advantages, including improved downstream performance, faster …
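
The TIES recipe is compact enough to sketch. Below is a minimal Python/PyTorch illustration of its three steps (trim each task vector by magnitude, elect a per-parameter sign, merge only agreeing values); the names ties_merge, density, and lam are illustrative assumptions, not the authors' released code.

import torch

# Hedged sketch of TIES-merging: trim, elect sign, disjoint merge.
def ties_merge(task_vectors, density=0.2, lam=1.0):
    # Trim: keep only the top-`density` fraction of each task vector by magnitude.
    trimmed = []
    for tv in task_vectors:
        k = max(1, int(density * tv.numel()))
        threshold = tv.abs().flatten().topk(k).values.min()
        trimmed.append(torch.where(tv.abs() >= threshold, tv, torch.zeros_like(tv)))
    stacked = torch.stack(trimmed)
    # Elect sign: per parameter, the sign with the larger total magnitude wins.
    elected_sign = torch.sign(stacked.sum(dim=0))
    # Disjoint merge: average only values whose sign agrees with the elected one.
    agree = (torch.sign(stacked) == elected_sign) & (stacked != 0)
    counts = agree.sum(dim=0).clamp(min=1)
    return lam * (stacked * agree).sum(dim=0) / counts

# Usage: task vectors are fine-tuned weights minus the base weights, e.g.
# merged = base + ties_merge([ft_a - base, ft_b - base]).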

Weak-to-Strong Generalization: Eliciting Strong Capabilities with Weak Supervision

C Burns, P Izmailov, JH Kirchner, B Baker… - arXiv preprint arXiv …, 2023 - arxiv.org
Widely used alignment techniques, such as reinforcement learning from human feedback
(RLHF), rely on the ability of humans to supervise model behavior, for example, to evaluate …
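
The setup named in the title is simple to illustrate: train a small "weak" supervisor on ground-truth labels, have it label fresh data, and then train a larger "strong" student on those imperfect labels. Below is a toy sketch with scikit-learn classifiers standing in for the paper's language models; the split sizes and model choices are assumptions for illustration only.

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier

# Generate a synthetic binary classification task.
X, y = make_classification(n_samples=3000, n_features=20, random_state=0)
X_weak, y_weak = X[:1000], y[:1000]   # ground truth for the weak supervisor
X_main, y_main = X[1000:], y[1000:]   # data the strong student learns from

weak = LogisticRegression(max_iter=1000).fit(X_weak, y_weak)
weak_labels = weak.predict(X_main)    # imperfect, weakly supervised labels

strong = MLPClassifier(hidden_layer_sizes=(64, 64), max_iter=500,
                       random_state=0).fit(X_main, weak_labels)

# The question of interest: does the student exceed its supervisor's accuracy?
print("weak supervisor accuracy:", weak.score(X_main, y_main))
print("strong student accuracy: ", strong.score(X_main, y_main))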

Crosslingual Generalization through Multitask Finetuning

N Muennighoff, T Wang, L Sutawika, A Roberts… - arXiv preprint arXiv …, 2022 - arxiv.org
Multitask prompted finetuning (MTF) has been shown to help large language models
generalize to new tasks in a zero-shot setting, but so far explorations of MTF have focused …

Enabling Large Language Models to Generate Text with Citations

T Gao, H Yen, J Yu, D Chen - arXiv preprint arXiv:2305.14627, 2023 - arxiv.org
Large language models (LLMs) have emerged as a widely-used tool for information
seeking, but their generated outputs are prone to hallucination. In this work, our aim is to …

RARR: Researching and Revising What Language Models Say, Using Language Models

L Gao, Z Dai, P Pasupat, A Chen, AT Chaganty… - arXiv preprint arXiv …, 2022 - arxiv.org
Language models (LMs) now excel at many tasks such as few-shot learning, question
answering, reasoning, and dialog. However, they sometimes generate unsupported or …

Prompting GPT-3 To Be Reliable

C Si, Z Gan, Z Yang, S Wang, J Wang… - arXiv preprint arXiv …, 2022 - arxiv.org
Large language models (LLMs) show impressive abilities via few-shot prompting.
Commercialized APIs such as OpenAI GPT-3 further increase their use in real-world …

Unified-IO: A Unified Model for Vision, Language, and Multi-Modal Tasks

J Lu, C Clark, R Zellers, R Mottaghi… - arXiv preprint arXiv …, 2022 - arxiv.org
We propose Unified-IO, a model that performs a large variety of AI tasks spanning classical
computer vision tasks, including pose estimation, object detection, depth estimation and …