In value-based deep reinforcement learning, a pruned network is a good network

J Obando-Ceron, A Courville, PS Castro - arXiv preprint arXiv:2402.12479, 2024 - arxiv.org
Recent work has shown that deep reinforcement learning agents have difficulty in effectively
using their network parameters. We leverage prior insights into the advantages of sparse …
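
The snippet points at sparse value networks for deep RL agents. As a hedged illustration only, the sketch below applies one-shot L1 magnitude pruning to a toy DQN-style value network with PyTorch's pruning utilities; the layer sizes, the 90% sparsity level, and the one-shot schedule are assumptions for the example, not the authors' training recipe.

```python
# Minimal, illustrative sketch (not the authors' code): one-shot L1 magnitude
# pruning of a toy DQN-style value network. Layer sizes and the 90% sparsity
# level are assumptions made for this example.
import torch.nn as nn
import torch.nn.utils.prune as prune

# Toy Q-network: observation vector in, one Q-value per discrete action out.
q_net = nn.Sequential(
    nn.Linear(8, 256), nn.ReLU(),
    nn.Linear(256, 256), nn.ReLU(),
    nn.Linear(256, 4),
)

# Zero out 90% of each linear layer's weights by L1 magnitude.
for module in q_net.modules():
    if isinstance(module, nn.Linear):
        prune.l1_unstructured(module, name="weight", amount=0.9)

# Check the resulting overall weight sparsity.
zeros = sum(int((m.weight == 0).sum()) for m in q_net.modules()
            if isinstance(m, nn.Linear))
total = sum(m.weight.numel() for m in q_net.modules()
            if isinstance(m, nn.Linear))
print(f"weight sparsity: {zeros / total:.2%}")
```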

Scaling laws for sparsely-connected foundation models

E Frantar, C Riquelme, N Houlsby, D Alistarh… - arXiv preprint arXiv …, 2023 - arxiv.org
We explore the impact of parameter sparsity on the scaling behavior of Transformers trained
on massive datasets (i.e., "foundation models"), in both vision and language domains. In this …
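
The topic here is modeling loss as a joint function of model size, data, and sparsity. As a hedged sketch only, the code below fits an illustrative sparsity-aware power law to synthetic points with SciPy; the functional form, variable scalings, and constants are assumptions for the example, not the scaling law proposed in the paper.

```python
# Hedged sketch: fit an *illustrative* sparsity-aware scaling law to synthetic
# (size, data, sparsity, loss) points. The functional form is an assumption
# for this example, not the law from the paper.
import numpy as np
from scipy.optimize import curve_fit

def loss_model(X, a, alpha, gamma, b, beta, c):
    # n: non-zero params (units of 1e8), d: training tokens (units of 1e10),
    # s: weight sparsity in [0, 1). Capacity term shrinks with size and is
    # modulated by sparsity; data term shrinks with tokens; c is the floor.
    n, d, s = X
    return a * (1.0 - s) ** gamma / n ** alpha + b / d ** beta + c

rng = np.random.default_rng(0)
n = rng.uniform(0.1, 10.0, 200)              # model sizes
d = rng.uniform(0.1, 10.0, 200)              # dataset sizes
s = rng.choice([0.0, 0.5, 0.75], 200)        # sparsity levels
true = (3.0, 0.34, 0.6, 0.8, 0.28, 1.7)      # ground-truth coefficients
L = loss_model((n, d, s), *true) + rng.normal(0.0, 0.01, 200)

p0 = (1.0, 0.3, 0.5, 1.0, 0.3, 1.0)
popt, _ = curve_fit(loss_model, (n, d, s), L, p0=p0, maxfev=20000)
print("recovered (alpha, gamma, beta):", popt[1], popt[2], popt[4])
```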

Navigating Extremes: Dynamic Sparsity in Large Output Spaces

N Ullah, E Schultheis, M Lasby… - Advances in …, 2025 - proceedings.neurips.cc
In recent years, Dynamic Sparse Training (DST) has emerged as an alternative to
post-training pruning for generating efficient models. In principle, DST allows for a much …
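
For context on the DST framing: the sketch below shows the generic magnitude-drop / random-regrow update at the heart of SET-style dynamic sparse training. It is a simplified NumPy illustration, not the large-output-space method the paper develops.

```python
# Minimal sketch of the prune-and-regrow step used in Dynamic Sparse Training
# (SET-style magnitude drop + random regrowth). Generic illustration only.
import numpy as np

rng = np.random.default_rng(0)

def prune_and_regrow(weights, mask, drop_fraction=0.3):
    """Drop the smallest-magnitude active weights, then regrow the same
    number of currently inactive connections at random (sparsity is kept)."""
    active = np.flatnonzero(mask)
    n_drop = int(drop_fraction * active.size)

    # Drop: deactivate the active weights with the smallest magnitude.
    drop_idx = active[np.argsort(np.abs(weights.ravel()[active]))[:n_drop]]
    mask.ravel()[drop_idx] = 0
    weights.ravel()[drop_idx] = 0.0

    # Regrow: activate an equal number of random inactive connections,
    # initialised to zero so training decides their values.
    inactive = np.flatnonzero(mask.ravel() == 0)
    grow_idx = rng.choice(inactive, size=n_drop, replace=False)
    mask.ravel()[grow_idx] = 1
    return weights, mask

# 90%-sparse random layer: only 10% of the 512x512 connections are active.
w = rng.normal(size=(512, 512))
m = (rng.random((512, 512)) < 0.1).astype(np.int8)
w *= m
w, m = prune_and_regrow(w, m)
print("active connections:", int(m.sum()))
```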

Compiler Support for Sparse Tensor Convolutions

P Liu, AJ Root, A Xu, Y Li, F Kjolstad… - Proceedings of the ACM on …, 2024 - dl.acm.org
This paper extends prior work on sparse tensor algebra compilers to generate
asymptotically efficient code for tensor expressions with affine subscript expressions. Our …
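
To make the "asymptotically efficient" point concrete: the hand-written sketch below shows the kind of loop structure such a compiler aims to emit for a convolution with a sparse operand, doing work proportional to the number of non-zeros rather than to the dense size. It was written for this note and is not output of the paper's compiler.

```python
# Hand-written illustration (not compiler-generated code) of a 1-D convolution
# over a sparse signal in COO form: only non-zero inputs are visited.
import numpy as np

def sparse_conv1d(nnz_idx, nnz_val, length, kernel):
    """Convolve a sparse 1-D signal (indices + values) with a dense kernel,
    touching only the non-zero input entries ('same' padding)."""
    out = np.zeros(length)
    offset = len(kernel) // 2
    for i, v in zip(nnz_idx, nnz_val):
        for k, kv in enumerate(kernel):
            j = i + k - offset          # output position hit by this non-zero
            if 0 <= j < length:
                out[j] += v * kv
    return out

# Sparse signal: 4 non-zeros in a length-1000 vector.
idx = np.array([3, 250, 251, 998])
val = np.array([1.0, -2.0, 0.5, 3.0])
kernel = np.array([0.25, 0.5, 0.25])

dense = np.zeros(1000)
dense[idx] = val
assert np.allclose(sparse_conv1d(idx, val, 1000, kernel),
                   np.convolve(dense, kernel, mode="same"))
```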

ELSA: Partial Weight Freezing for Overhead-Free Sparse Network Deployment

P Halvachi, A Peste, D Alistarh, CH Lampert - arXiv preprint arXiv …, 2023 - arxiv.org
We present ELSA, a practical solution for creating deep networks that can easily be
deployed at different levels of sparsity. The core idea is to embed one or more sparse …
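
Based only on this abstract snippet, one plausible reading of "partial weight freezing" is sketched below: a sparse core is embedded in the dense weights and excluded from updates, so either the dense model or the masked sparse model can be served from the same tensors without extra storage or conversion. The masking scheme, the 20% core, and the dummy update loop are assumptions for illustration, not ELSA's actual procedure.

```python
# Hedged sketch of partial weight freezing as read from the abstract: a sparse
# "core" inside the dense weights is frozen during training, so the sparse
# model can be extracted at deployment with no extra overhead.
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(64, 64))   # dense weight matrix
core_mask = rng.random(W.shape) < 0.2      # 20% of weights form the sparse core

def sgd_step(W, grad, lr=0.1):
    # Freeze the sparse core: only weights outside the core receive updates.
    return W - lr * grad * (~core_mask)

frozen_before = W[core_mask].copy()
for _ in range(100):                        # dummy training loop
    grad = rng.normal(size=W.shape)         # stand-in for a real gradient
    W = sgd_step(W, grad)

assert np.allclose(W[core_mask], frozen_before)   # core weights untouched
sparse_model = W * core_mask                # overhead-free sparse extraction
dense_model = W                             # full dense model, same storage
print("sparsity of extracted model:", 1 - core_mask.mean())
```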