Google Scholar

Medical image segmentation review: The success of U-Net

R Azad, EK Aghdam, A Rauland, Y Jia… - … on Pattern Analysis …, 2024 - ieeexplore.ieee.org

Automatic medical image segmentation is a crucial topic in the medical domain and
successively a critical counterpart in the computer-aided diagnosis paradigm. U-Net is the …

Cited by 276

[PDF] arxiv.org

Transformers in vision: A survey

S Khan, M Naseer, M Hayat, SW Zamir… - ACM computing …, 2022 - dl.acm.org

Astounding results from Transformer models on natural language tasks have intrigued the
vision community to study their application to computer vision problems. Among their salient …

Cited by 2920

[PDF] thecvf.com

Segment anything

A Kirillov, E Mintun, N Ravi, H Mao… - Proceedings of the …, 2023 - openaccess.thecvf.com

We introduce the Segment Anything (SA) project: a new task, model, and dataset for
image segmentation. Using our efficient model in a data collection loop, we built the largest …

Cited by 8288

[PDF] arxiv.org

YOLOv9: Learning what you want to learn using programmable gradient information

CY Wang, IH Yeh, HY Mark Liao - European conference on computer …, 2024 - Springer

Today's deep learning methods focus on how to design the objective functions to make the
prediction as close as possible to the target. Meanwhile, an appropriate neural network …

Cited by 1380

[PDF] thecvf.com

BiFormer: Vision transformer with bi-level routing attention

L Zhu, X Wang, Z Ke, W Zhang… - Proceedings of the …, 2023 - openaccess.thecvf.com

As the core building block of vision transformers, attention is a powerful tool to capture long-
range dependency. However, such power comes at a cost: it incurs a huge computation …

Cited by 704

[PDF] neurips.cc

VisionLLM: Large language model is also an open-ended decoder for vision-centric tasks

W Wang, Z Chen, X Chen, J Wu… - Advances in …, 2024 - proceedings.neurips.cc

Large language models (LLMs) have notably accelerated progress towards artificial general
intelligence (AGI), with their impressive zero-shot capacity for user-tailored tasks, endowing …

Cited by 445

[PDF] arxiv.org

Vision Mamba: Efficient visual representation learning with bidirectional state space model

L Zhu, B Liao, Q Zhang, X Wang, W Liu… - arXiv preprint arXiv …, 2024 - arxiv.org

Recently the state space models (SSMs) with efficient hardware-aware designs, i.e., the
Mamba deep learning model, have shown great potential for long sequence modeling …

Cited by 1011

[PDF] thecvf.com

InternImage: Exploring large-scale vision foundation models with deformable convolutions

W Wang, J Dai, Z Chen, Z Huang, Z Li… - Proceedings of the …, 2023 - openaccess.thecvf.com

Compared to the great progress of large-scale vision transformers (ViTs) in recent years,
large-scale models based on convolutional neural networks (CNNs) are still in an early …

Cited by 793

[PDF] thecvf.com

EfficientViT: Memory efficient vision transformer with cascaded group attention

X Liu, H Peng, N Zheng, Y Yang… - Proceedings of the …, 2023 - openaccess.thecvf.com

Vision transformers have shown great success due to their high model capabilities.
However, their remarkable performance is accompanied by heavy computation costs, which …

Cited by 348

[PDF] thecvf.com

Large selective kernel network for remote sensing object detection

Y Li, Q Hou, Z Zheng, MM Cheng… - Proceedings of the …, 2023 - openaccess.thecvf.com

Recent research on remote sensing object detection has largely focused on improving the
representation of oriented bounding boxes but has overlooked the unique prior knowledge …

Cited by 356
