Visual tuning

BXB Yu, J Chang, H Wang, L Liu, S Wang… - ACM Computing …, 2024 - dl.acm.org
Fine-tuning visual models has been widely shown to deliver promising performance on many
downstream visual tasks. With the surprising development of pre-trained visual foundation …

Token contrast for weakly-supervised semantic segmentation

L Ru, H Zheng, Y Zhan, B Du - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
Weakly-Supervised Semantic Segmentation (WSSS) using image-level labels
typically utilizes a Class Activation Map (CAM) to generate the pseudo labels. Limited by the …

Video transformers: A survey

J Selva, AS Johansen, S Escalera… - … on Pattern Analysis …, 2023 - ieeexplore.ieee.org
Transformer models have shown great success handling long-range interactions, making
them a promising tool for modeling video. However, they lack inductive biases and scale …

AdaMV-MoE: Adaptive multi-task vision mixture-of-experts

T Chen, X Chen, X Du, A Rashwan… - Proceedings of the …, 2023 - openaccess.thecvf.com
Sparsely activated Mixture-of-Experts (MoE) is becoming a promising paradigm for
multi-task learning (MTL). Instead of compressing multiple tasks' knowledge into a single …

Masked relation learning for deepfake detection

Z Yang, J Liang, Y Xu, XY Zhang… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
DeepFake detection aims to differentiate falsified faces from real ones. Most approaches
formulate it as a binary classification problem by solely mining the local artifacts and …

PPT: Token-pruned pose transformer for monocular and multi-view human pose estimation

H Ma, Z Wang, Y Chen, D Kong, L Chen, X Liu… - … on Computer Vision, 2022 - Springer
Recently, the vision transformer and its variants have played an increasingly important role
in both monocular and multi-view human pose estimation. Considering image patches as …

On filtrations of A (V)

J Liu - arXiv preprint arXiv:2103.08090, 2021 - arxiv.org
The filtrations on Zhu's algebra $A(V)$ and bimodules $A(M)$ are studied. As an
application, we prove that $A(V)$ is noetherian when $V$ is strongly finitely generated …

Sparse MoE as the new dropout: Scaling dense and self-slimmable transformers

T Chen, Z Zhang, A Jaiswal, S Liu, Z Wang - arXiv preprint arXiv …, 2023 - arxiv.org
Despite their remarkable achievement, gigantic transformers encounter significant
drawbacks, including exorbitant computational and memory footprints during training, as …

SHViT: Single-head vision transformer with memory efficient macro design

S Yun, Y Ro - Proceedings of the IEEE/CVF Conference on …, 2024 - openaccess.thecvf.com
Recently, efficient Vision Transformers have shown great performance with low
latency on resource-constrained devices. Conventionally, they use 4×4 patch embeddings …

The lighter the better: rethinking transformers in medical image segmentation through adaptive pruning

X Lin, L Yu, KT Cheng, Z Yan - IEEE Transactions on Medical …, 2023 - ieeexplore.ieee.org
Vision transformers have recently set off a new wave in the field of medical image analysis
due to their remarkable performance on various computer vision tasks. However, recent …