Simplified state space layers for sequence modeling

JTH Smith, A Warrington, SW Linderman - arXiv preprint arXiv:2208.04933, 2022 - arxiv.org
Models using structured state space sequence (S4) layers have achieved state-of-the-art
performance on long-range sequence modeling tasks. An S4 layer combines linear state …
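
Concretely, the recurrence such a layer applies is the discrete linear state space model x_k = A x_{k-1} + B u_k, y_k = C x_k. A minimal NumPy sketch of this recurrence follows; the sequential loop and the toy diagonal A are illustrative only, since S4/S5 rely on structured parameterizations and fast scan algorithms:

```python
import numpy as np

def ssm_scan(A, B, C, u):
    """Discrete linear state space recurrence applied over a sequence u:
    x_k = A x_{k-1} + B u_k,   y_k = C x_k."""
    x = np.zeros(A.shape[0])
    ys = []
    for u_k in u:
        x = A @ x + B @ u_k
        ys.append(C @ x)
    return np.stack(ys)

# Toy usage: state size 4, scalar input/output, length-10 sequence.
rng = np.random.default_rng(0)
A = 0.9 * np.eye(4)              # stable toy dynamics (S4/S5 use structured A)
B = rng.normal(size=(4, 1))
C = rng.normal(size=(1, 4))
u = rng.normal(size=(10, 1))
y = ssm_scan(A, B, C, u)         # shape (10, 1)
```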

Monarch Mixer: A simple sub-quadratic GEMM-based architecture

D Fu, S Arora, J Grogan, I Johnson… - Advances in …, 2024 - proceedings.neurips.cc
Machine learning models are increasingly being scaled in both sequence length
and model dimension to reach longer contexts and better performance. However, existing …
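
For context, a Monarch-style matrix-vector product can be sketched as two block-diagonal GEMMs interleaved with a fixed permutation, which is where the sub-quadratic cost comes from; this toy NumPy version is illustrative and omits the paper's exact factorization and batching:

```python
import numpy as np

def monarch_matvec(x, L, R):
    """Multiply a length-(b*b) vector by a Monarch-style matrix: two batches of
    b dense (b x b) blocks applied as block-diagonal matmuls, interleaved with
    a fixed permutation (here a transpose). Cost O(n^{3/2}) vs O(n^2) dense."""
    b = L.shape[0]
    X = x.reshape(b, b)                    # view the vector as a b x b grid
    X = np.einsum('ijk,ik->ij', R, X)      # block-diagonal factor R
    X = X.T                                # fixed permutation
    X = np.einsum('ijk,ik->ij', L, X)      # block-diagonal factor L
    return X.T.reshape(-1)

rng = np.random.default_rng(0)
b = 4                                      # n = b * b = 16
L = rng.normal(size=(b, b, b))
R = rng.normal(size=(b, b, b))
y = monarch_matvec(rng.normal(size=b * b), L, R)
```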

Mega: moving average equipped gated attention

X Ma, C Zhou, X Kong, J He, L Gui, G Neubig… - arXiv preprint arXiv …, 2022 - arxiv.org
The design choices in the Transformer attention mechanism, including weak inductive bias
and quadratic computational complexity, have limited its application for modeling long …
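
The moving-average component referenced in the title is a damped exponential moving average (EMA); a one-dimensional NumPy sketch of that recurrence follows, with scalar alpha and delta standing in for Mega's learned multi-dimensional parameters:

```python
import numpy as np

def damped_ema(x, alpha, delta):
    """Damped exponential moving average:
    h_t = alpha * x_t + (1 - alpha * delta) * h_{t-1}."""
    h, out = 0.0, []
    for x_t in x:
        h = alpha * x_t + (1.0 - alpha * delta) * h
        out.append(h)
    return np.array(out)

# Toy usage: smooth a noisy sine, biasing the model toward local context.
rng = np.random.default_rng(0)
x = np.sin(np.linspace(0, 6, 50)) + 0.1 * rng.normal(size=50)
smoothed = damped_ema(x, alpha=0.3, delta=0.9)
```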

More ConvNets in the 2020s: Scaling up kernels beyond 51x51 using sparsity

S Liu, T Chen, X Chen, X Chen, Q Xiao, B Wu… - arXiv preprint arXiv …, 2022 - arxiv.org
Transformers have quickly shone in the computer vision world since the emergence of
Vision Transformers (ViTs). The dominant role of convolutional neural networks (CNNs) …
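
A hedged sketch of the decomposition trick behind such kernel sizes: replace one very large square kernel with two complementary rectangular kernels applied in parallel and summed (SLaK additionally trains the weights sparsely, which is omitted here):

```python
import numpy as np
from scipy.signal import convolve2d

def decomposed_large_kernel(img, k_wide, k_tall):
    """Approximate one very large square kernel with two rectangular kernels
    (e.g., 5x51 and 51x5) applied in parallel and summed."""
    return (convolve2d(img, k_wide, mode='same')
            + convolve2d(img, k_tall, mode='same'))

rng = np.random.default_rng(0)
img = rng.normal(size=(64, 64))
k_wide = rng.normal(size=(5, 51)) / 255.0            # short-and-wide branch
k_tall = rng.normal(size=(51, 5)) / 255.0            # tall-and-narrow branch
out = decomposed_large_kernel(img, k_wide, k_tall)   # shape (64, 64)
```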

Towards multi-spatiotemporal-scale generalized PDE modeling

JK Gupta, J Brandstetter - arXiv preprint arXiv:2209.15616, 2022 - arxiv.org
Partial differential equations (PDEs) are central to describing complex physical systems.
Their expensive solution techniques have led to an increased interest in deep …
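
For context, a common building block in the Fourier-based neural PDE surrogates benchmarked in this line of work is the spectral convolution; a NumPy sketch follows, assuming a periodic 1D grid and with random toy values in place of learned spectral weights:

```python
import numpy as np

def spectral_conv1d(u, w_modes):
    """Spectral convolution on a periodic 1D grid: FFT, scale the lowest
    Fourier modes by (learned) complex weights, zero the rest, inverse FFT."""
    u_hat = np.fft.rfft(u)
    out_hat = np.zeros_like(u_hat)
    m = len(w_modes)
    out_hat[:m] = u_hat[:m] * w_modes
    return np.fft.irfft(out_hat, n=len(u))

rng = np.random.default_rng(0)
u = np.sin(np.linspace(0, 2 * np.pi, 128, endpoint=False))  # discretized state
w = rng.normal(size=8) + 1j * rng.normal(size=8)            # toy spectral weights
u_next = spectral_conv1d(u, w)                              # shape (128,)
```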

Convolutional networks with oriented 1D kernels

A Kirchmeyer, J Deng - Proceedings of the IEEE/CVF …, 2023 - openaccess.thecvf.com
In computer vision, 2D convolution is arguably the most important operation performed by a
ConvNet. Unsurprisingly, it has been the focus of intense software and hardware …
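
A minimal NumPy sketch of a 1D convolution oriented along an arbitrary lattice direction (dy, dx); the circular boundary handling via np.roll is a simplification, not the paper's implementation:

```python
import numpy as np

def oriented_conv1d(img, kernel, dy, dx):
    """1D convolution along the lattice direction (dy, dx):
    (0, 1) = horizontal, (1, 0) = vertical, (1, 1) = diagonal.
    Uses circular (np.roll) boundary handling for brevity."""
    out = np.zeros_like(img, dtype=float)
    c = len(kernel) // 2
    for i, w in enumerate(kernel):
        s = i - c
        out += w * np.roll(img, (-s * dy, -s * dx), axis=(0, 1))
    return out

rng = np.random.default_rng(0)
img = rng.normal(size=(8, 8))
k = np.array([1.0, 2.0, 1.0]) / 4.0
horiz = oriented_conv1d(img, k, 0, 1)   # smooth along rows
diag = oriented_conv1d(img, k, 1, 1)    # smooth along the main diagonal
```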

Learning long sequences in spiking neural networks

MI Stan, O Rhodes - Scientific Reports, 2024 - nature.com
Spiking neural networks (SNNs) take inspiration from the brain to enable energy-efficient
computations. Since the advent of Transformers, SNNs have struggled to compete with …
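
For context, the leaky integrate-and-fire (LIF) dynamics that spiking sequence models build on can be sketched as follows; the decay, threshold, and soft-reset choices here are generic defaults, not the paper's configuration:

```python
import numpy as np

def lif_neuron(currents, beta=0.9, threshold=1.0):
    """Leaky integrate-and-fire neuron: the membrane potential decays by beta,
    integrates the input current, and emits a binary spike at threshold,
    followed by a soft reset (subtraction)."""
    v, spikes = 0.0, []
    for i_t in currents:
        v = beta * v + i_t
        s = float(v >= threshold)
        v -= s * threshold
        spikes.append(s)
    return np.array(spikes)

currents = 0.4 * np.abs(np.random.default_rng(0).normal(size=20))
spike_train = lif_neuron(currents)   # binary 0/1 sequence
```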

Transformers significantly improve splice site prediction

BA Jónsson, GH Halldórsson, S Árdal… - Communications …, 2024 - nature.com
Mutations that affect RNA splicing significantly impact human diversity and disease. Here we
present a method using transformers, a type of machine learning model, to detect splicing …
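
As a small illustration of the input side of such a model, a DNA window around a candidate splice site is typically one-hot encoded before entering a transformer; the helper below and its example sequence are illustrative, not the paper's preprocessing:

```python
import numpy as np

def one_hot_dna(seq):
    """One-hot encode a DNA window into the (length, 4) array that a
    transformer-style splice-site classifier would consume."""
    lookup = {'A': 0, 'C': 1, 'G': 2, 'T': 3}
    out = np.zeros((len(seq), 4))
    for i, base in enumerate(seq):
        out[i, lookup[base]] = 1.0
    return out

# Canonical donor splice sites begin with the dinucleotide 'GT' on the intron side.
x = one_hot_dna("ACGGTAAGT")   # shape (9, 4)
```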

QuadConv: Quadrature-based convolutions with applications to non-uniform PDE data compression

K Doherty, C Simpson, S Becker, A Doostan - Journal of Computational …, 2024 - Elsevier
We present a new convolution layer for deep learning architectures which we call
QuadConv—an approximation to continuous convolution via quadrature. Our operator is …
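
The quadrature idea in one dimension: approximate the continuous convolution (f * k)(x) = ∫ f(y) k(x − y) dy by a weighted sum over non-uniform sample points. In the NumPy sketch below, a fixed Gaussian kernel stands in for QuadConv's learned kernel:

```python
import numpy as np

def quad_conv(f, kernel, points, weights, x_out):
    """Quadrature approximation to the continuous convolution
    (f * k)(x) = integral of f(y) k(x - y) dy  ~  sum_i w_i f(y_i) k(x - y_i),
    evaluated at each output location x in x_out."""
    return np.array([np.sum(weights * f(points) * kernel(x - points))
                     for x in x_out])

# Toy usage on non-uniform sample points: smooth sin with a Gaussian kernel.
rng = np.random.default_rng(0)
pts = np.sort(rng.uniform(0.0, 2 * np.pi, 64))   # non-uniform sample locations
w = np.gradient(pts)                             # crude quadrature weights
k = lambda r: np.exp(-(r / 0.3) ** 2)
y = quad_conv(np.sin, k, pts, w, np.linspace(0.0, 2 * np.pi, 16))
```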

DNArch: Learning convolutional neural architectures by backpropagation

DW Romero, N Zeghidour - arXiv preprint arXiv:2302.05400, 2023 - arxiv.org
We present Differentiable Neural Architectures (DNArch), a method that jointly learns the
weights and the architecture of Convolutional Neural Networks (CNNs) by backpropagation …
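
One way to make a kernel size learnable by backpropagation, in the spirit of DNArch's differentiable masks over kernel dimensions: multiply an over-sized kernel by a smooth window whose width is a trainable scalar. This forward-pass NumPy sketch is an assumption-laden illustration, not the paper's exact parameterization:

```python
import numpy as np

def masked_kernel(weights, log_sigma):
    """Differentiable kernel-size control: scale an over-sized 1D kernel by a
    Gaussian window whose width sigma = exp(log_sigma) is a trainable scalar;
    small sigma shrinks the effective kernel, large sigma keeps all taps."""
    k = len(weights)
    pos = np.arange(k) - k // 2
    mask = np.exp(-0.5 * (pos / np.exp(log_sigma)) ** 2)
    return weights * mask

rng = np.random.default_rng(0)
w = rng.normal(size=31)                    # over-provisioned kernel, 31 taps
small = masked_kernel(w, log_sigma=0.0)    # sigma = 1: ~3 effective taps
large = masked_kernel(w, log_sigma=2.3)    # sigma ~ 10: nearly the full kernel
```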