Comparing vision transformers and convolutional neural networks for image classification: A literature review

J Maurício, I Domingues, J Bernardino - Applied Sciences, 2023 - mdpi.com
Transformers are models that implement a mechanism of self-attention, individually
weighting the importance of each part of the input data. Their use in image classification …

Transformer for object detection: Review and benchmark

Y Li, N Miao, L Ma, F Shuang, X Huang - Engineering Applications of …, 2023 - Elsevier
Object detection is a crucial task in computer vision (CV). With the rapid advancement of
Transformer-based models in natural language processing (NLP) and various visual tasks …

Diffusion models for adversarial purification

W Nie, B Guo, Y Huang, C Xiao, A Vahdat… - arXiv preprint arXiv …, 2022 - arxiv.org
Adversarial purification refers to a class of defense methods that remove adversarial
perturbations using a generative model. These methods do not make assumptions on the …

Understanding the robustness in vision transformers

D Zhou, Z Yu, E Xie, C Xiao… - International …, 2022 - proceedings.mlr.press
Recent studies show that Vision Transformers (ViTs) exhibit strong robustness against
various corruptions. Although this property is partly attributed to the self-attention …

Diffusion visual counterfactual explanations

M Augustin, V Boreiko, F Croce… - Advances in Neural …, 2022 - proceedings.neurips.cc
Visual Counterfactual Explanations (VCEs) are an important tool to understand the
decisions of an image classifier. They are “small” but “realistic” semantic changes of the …

On the adversarial robustness of vision transformers

R Shao, Z Shi, J Yi, PY Chen, CJ Hsieh - arXiv preprint arXiv:2103.15670, 2021 - arxiv.org
Following the success in advancing natural language processing and understanding,
transformers are expected to bring revolutionary changes to computer vision. This work …

Towards robust vision transformer

X Mao, G Qi, Y Chen, X Li, R Duan… - Proceedings of the …, 2022 - openaccess.thecvf.com
Recent advances in Vision Transformer (ViT) and its improved variants have
shown that self-attention-based networks surpass traditional Convolutional Neural Networks …

Understanding the Robustness of 3D Object Detection With Bird's-Eye-View Representations in Autonomous Driving

Z Zhu, Y Zhang, H Chen, Y Dong… - Proceedings of the …, 2023 - openaccess.thecvf.com
3D object detection is an essential perception task in autonomous driving to
understand the environments. The Bird's-Eye-View (BEV) representations have significantly …

A comprehensive study on robustness of image classification models: Benchmarking and rethinking

C Liu, Y Dong, W Xiang, X Yang, H Su, J Zhu… - International Journal of …, 2024 - Springer
The robustness of deep neural networks is frequently compromised when faced with
adversarial examples, common corruptions, and distribution shifts, posing a significant …

MulT: An end-to-end multitask learning transformer

D Bhattacharjee, T Zhang… - Proceedings of the …, 2022 - openaccess.thecvf.com
We propose an end-to-end Multitask Learning Transformer framework, named MulT, to
simultaneously learn multiple high-level vision tasks, including depth estimation, semantic …