GSVA: Generalized segmentation via multimodal large language models

Z Xia, D Han, Y Han, X Pan, S Song… - Proceedings of the …, 2024 - openaccess.thecvf.com
Generalized Referring Expression Segmentation (GRES) extends the scope of
classic RES to refer to multiple objects in one expression or identify the empty targets absent …

Mosaic: in-memory computing and routing for small-world spike-based neuromorphic systems

T Dalgaty, F Moro, Y Demirağ, A De Pra… - Nature …, 2024 - nature.com
The brain's connectivity is locally dense and globally sparse, forming a small-world graph—
a principle prevalent in the evolution of various species, suggesting a universal solution for …

TransXNet: learning both global and local dynamics with a dual dynamic token mixer for visual recognition

M Lou, HY Zhou, S Yang, Y Yu - arXiv preprint arXiv:2310.19380, 2023 - arxiv.org
Recent studies have integrated convolution into transformers to introduce inductive bias and
improve generalization performance. However, the static nature of conventional convolution …

Efficient diffusion transformer with step-wise dynamic attention mediators

Y Pu, Z Xia, J Guo, D Han, Q Li, D Li, Y Yuan… - … on Computer Vision, 2024 - Springer
This paper identifies significant redundancy in the query-key interactions within self-attention
mechanisms of diffusion transformer models, particularly during the early stages of …

MeSAM: Multiscale enhanced segment anything model for optical remote sensing images

X Zhou, F Liang, L Chen, H Liu, Q Song… - … on Geoscience and …, 2024 - ieeexplore.ieee.org
Segment anything model (SAM) has been widely applied to various downstream tasks for its
excellent performance and generalization capability. However, SAM exhibits three …

CT-Net: Asymmetric compound branch transformer for medical image segmentation

N Zhang, L Yu, D Zhang, W Wu, S Tian, X Kang, M Li - Neural Networks, 2024 - Elsevier
The Transformer architecture has been widely applied in the field of image segmentation
due to its powerful ability to capture long-range dependencies. However, its ability to capture …

DAT++: Spatially dynamic vision transformer with deformable attention

Z Xia, X Pan, S Song, LE Li, G Huang - arXiv preprint arXiv:2309.01430, 2023 - arxiv.org
Transformers have shown superior performance on various vision tasks. Their large
receptive field endows Transformer models with higher representation power than their CNN …

On the role of attention masks and LayerNorm in transformers

X Wu, A Ajorlou, Y Wang, S Jegelka… - arXiv preprint arXiv …, 2024 - arxiv.org
Self-attention is the key mechanism of transformers, which are the essential building blocks
of modern foundation models. Recent studies have shown that pure self-attention suffers …

MG-ViT: a multi-granularity method for compact and efficient vision transformers

Y Zhang, Y Liu, D Miao, Q Zhang… - Advances in Neural …, 2023 - proceedings.neurips.cc
Vision Transformer (ViT) faces obstacles in wide application due to its huge
computational cost. Almost all existing studies on compressing ViT adopt the manner of …

ViT-MVT: A unified vision transformer network for multiple vision tasks

T Xie, K Dai, Z Jiang, R Li, S Mao… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
In this work, we seek to learn multiple mainstream vision tasks concurrently using a unified
network, which is storage-efficient as numerous networks with task-shared parameters can …