Dissociating language and thought in large language models
Large language models (LLMs) have come closest among all models to date to mastering
human language, yet opinions about their linguistic and cognitive capabilities remain split …
A review of sparse expert models in deep learning
Sparse expert models are a thirty-year-old concept re-emerging as a popular architecture in
deep learning. This class of architecture encompasses Mixture-of-Experts, Switch …
The Llama 3 herd of models
Modern artificial intelligence (AI) systems are powered by foundation models. This paper
presents a new set of foundation models, called Llama 3. It is a herd of language models …
Mixtral of experts
We introduce Mixtral 8x7B, a Sparse Mixture of Experts (SMoE) language model. Mixtral has
the same architecture as Mistral 7B, with the difference that each layer is composed of 8 …
Revisiting class-incremental learning with pre-trained models: Generalizability and adaptivity are all you need
Class-incremental learning (CIL) aims to adapt to emerging new classes without forgetting
old ones. Traditional CIL models are trained from scratch to continually acquire knowledge …
Efficient large language models: A survey
Large Language Models (LLMs) have demonstrated remarkable capabilities in important
tasks such as natural language understanding and language generation, and thus have the …
Content-aware local GAN for photo-realistic super-resolution
Recently, GAN has successfully contributed to making single-image super-resolution (SISR)
methods produce more realistic images. However, natural images have complex distribution …
Large language models are visual reasoning coordinators
Visual reasoning requires multimodal perception and commonsense cognition of the world.
Recently, multiple vision-language models (VLMs) have been proposed with excellent …
Can knowledge graphs reduce hallucinations in LLMs?: A survey
The contemporary LLMs are prone to producing hallucinations, stemming mainly from the
knowledge gaps within the models. To address this critical limitation, researchers employ …
Modular deep learning
Transfer learning has recently become the dominant paradigm of machine learning. Pre-
trained models fine-tuned for downstream tasks achieve better performance with fewer …