A comprehensive overview of large language models

H Naveed, AU Khan, S Qiu, M Saqib, S Anwar… - arXiv preprint arXiv…, 2023 - arxiv.org
Large Language Models (LLMs) have recently demonstrated remarkable capabilities in
natural language processing tasks and beyond. This success of LLMs has led to a large …

Parameter-efficient fine-tuning for large models: A comprehensive survey

Z Han, C Gao, J Liu, J Zhang, SQ Zhang - arXiv preprint arXiv:2403.14608, 2024 - arxiv.org
Large models represent a groundbreaking advancement in multiple application fields,
enabling remarkable achievements across various tasks. However, their unprecedented …
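
One representative technique covered by PEFT surveys of this kind is LoRA, which freezes the pretrained weights and trains only a small low-rank update. Below is a minimal PyTorch sketch of that idea; the class name, rank, and scaling are illustrative choices, not taken from the survey:

```python
import torch.nn as nn

class LoRALinear(nn.Module):
    """A frozen linear layer plus a trainable low-rank update (illustrative sketch)."""
    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # freeze the pretrained weights
        self.lora_a = nn.Linear(base.in_features, rank, bias=False)   # down-projection A
        self.lora_b = nn.Linear(rank, base.out_features, bias=False)  # up-projection B
        nn.init.zeros_(self.lora_b.weight)  # zero update at start, so output matches the base model
        self.scaling = alpha / rank

    def forward(self, x):
        # Frozen full-rank path plus scaled low-rank trainable path.
        return self.base(x) + self.scaling * self.lora_b(self.lora_a(x))
```

Only lora_a and lora_b receive gradients, so the trainable parameter count per adapted layer drops from in_features × out_features to rank × (in_features + out_features).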

Llama 2: Open foundation and fine-tuned chat models

H Touvron, L Martin, K Stone, P Albert… - arXiv preprint arXiv…, 2023 - arxiv.org
In this work, we develop and release Llama 2, a collection of pretrained and fine-tuned large
language models (LLMs) ranging in scale from 7 billion to 70 billion parameters. Our fine …
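
Models from this release can be loaded through the Hugging Face transformers library. A minimal sketch, assuming the transformers and accelerate packages are installed and access to the gated meta-llama/Llama-2-7b-chat-hf repository has been granted:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-2-7b-chat-hf"  # gated repo: requires accepting Meta's license
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")  # needs accelerate

prompt = "Briefly explain what a 7B-parameter model means."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```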

A survey of large language models

WX Zhao, K Zhou, J Li, T Tang, X Wang, Y Hou… - arXiv preprint arXiv…, 2023 - arxiv.org
Language is essentially a complex, intricate system of human expressions governed by
grammatical rules. It poses a significant challenge to develop capable AI algorithms for …

PaLM 2 technical report

R Anil, AM Dai, O Firat, M Johnson, D Lepikhin… - arXiv preprint arXiv…, 2023 - arxiv.org
We introduce PaLM 2, a new state-of-the-art language model that has better multilingual and
reasoning capabilities and is more compute-efficient than its predecessor PaLM. PaLM 2 is …

The Llama 3 herd of models

A Dubey, A Jauhri, A Pandey, A Kadian… - arXiv preprint arXiv…, 2024 - arxiv.org
Modern artificial intelligence (AI) systems are powered by foundation models. This paper
presents a new set of foundation models, called Llama 3. It is a herd of language models …

The RefinedWeb dataset for Falcon LLM: outperforming curated corpora with web data, and web data only

G Penedo, Q Malartic, D Hesslow, R Cojocaru… - arXiv preprint arXiv…, 2023 - arxiv.org
Large language models are commonly trained on a mixture of filtered web data and curated
high-quality corpora, such as social media conversations, books, or technical papers. This …
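
The full pipeline is described in the paper itself; as a rough illustration of the two ingredients the abstract highlights, quality filtering and deduplication, here is a toy sketch. The thresholds and heuristics are invented for the example and are far simpler than the paper's actual extraction, filtering, and fuzzy-deduplication stages:

```python
import hashlib

def keep_document(text: str, min_words: int = 50, max_symbol_ratio: float = 0.1) -> bool:
    """Toy quality filter: drop very short or markup-heavy pages (thresholds illustrative)."""
    words = text.split()
    if len(words) < min_words:
        return False
    symbols = sum(ch in "#<>{}[]|" for ch in text)
    return symbols / max(len(text), 1) <= max_symbol_ratio

def dedup_exact(docs: list[str]) -> list[str]:
    """Exact deduplication by content hash; production pipelines add fuzzy matching."""
    seen, unique = set(), []
    for doc in docs:
        digest = hashlib.sha256(doc.encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(doc)
    return unique
```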

LLM-Pruner: On the structural pruning of large language models

X Ma, G Fang, X Wang - Advances in Neural Information Processing Systems, 2023 - proceedings.neurips.cc
Large language models (LLMs) have shown remarkable capabilities in language
understanding and generation. However, such impressive capability typically comes with a …
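
Structural pruning removes whole rows, columns, or attention heads rather than individual weights, so the compressed model stays dense and fast on ordinary hardware. The sketch below prunes output channels of a single linear layer by weight-norm magnitude; LLM-Pruner itself identifies coupled structures and scores them with gradient-based importance, so this criterion is only a simplified stand-in:

```python
import torch
import torch.nn as nn

def prune_linear_outputs(layer: nn.Linear, keep_ratio: float = 0.75) -> nn.Linear:
    """Keep the output channels with the largest L2 norm (illustrative criterion)."""
    importance = layer.weight.norm(dim=1)                   # one score per output channel
    k = max(1, int(layer.out_features * keep_ratio))
    keep = torch.topk(importance, k).indices.sort().values  # preserve original channel order
    pruned = nn.Linear(layer.in_features, k, bias=layer.bias is not None)
    with torch.no_grad():
        pruned.weight.copy_(layer.weight[keep])
        if layer.bias is not None:
            pruned.bias.copy_(layer.bias[keep])
    return pruned
```

Any layer consuming this output must shrink its input dimension to match, which is exactly the coupling between structures that makes pruning LLMs nontrivial.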

Yi: Open foundation models by 01.AI

A Young, B Chen, C Li, C Huang, G Zhang… - arXiv preprint arXiv…, 2024 - arxiv.org
We introduce the Yi model family, a series of language and multimodal models that
demonstrate strong multi-dimensional capabilities. The Yi model family is based on 6B and …

Mixtral of experts

AQ Jiang, A Sablayrolles, A Roux, A Mensch… - arXiv preprint arXiv…, 2024 - arxiv.org
We introduce Mixtral 8x7B, a Sparse Mixture of Experts (SMoE) language model. Mixtral has
the same architecture as Mistral 7B, with the difference that each layer is composed of 8 …
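
In a sparse MoE layer, a learned router scores every expert for each token, but only the top-k experts (two, in Mixtral's case) are evaluated, keeping the active parameter count per token small. A minimal PyTorch sketch of top-k routing follows; the expert MLP shape and the per-expert loop are simplifications for clarity, not the released implementation:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseMoE(nn.Module):
    """Top-k expert routing in the spirit of an SMoE layer (illustrative sketch)."""
    def __init__(self, dim: int = 64, n_experts: int = 8, top_k: int = 2):
        super().__init__()
        self.router = nn.Linear(dim, n_experts, bias=False)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, 4 * dim), nn.SiLU(), nn.Linear(4 * dim, dim))
            for _ in range(n_experts)
        )
        self.top_k = top_k

    def forward(self, x):                        # x: (tokens, dim)
        logits = self.router(x)                  # one score per (token, expert)
        weights, idx = logits.topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)     # renormalize over the selected experts
        out = torch.zeros_like(x)
        for slot in range(self.top_k):           # only the selected experts run per token
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e
                if mask.any():
                    out[mask] += weights[mask, slot, None] * expert(x[mask])
        return out
```

With 8 experts and top-2 routing, each token touches roughly a quarter of the expert parameters per layer, which is how such models decouple total capacity from per-token compute.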