- Academic Search

A comprehensive survey on pretrained foundation models: A history from bert to chatgpt

C Zhou, Q Li, C Li, J Yu, Y Liu, G Wang… - International Journal of …, 2024 - Springer

Abstract Pretrained Foundation Models (PFMs) are regarded as the foundation for various
downstream tasks across different data modalities. A PFM (e.g., BERT, ChatGPT, GPT-4) is …

Cited by 605

[PDF] arxiv.org

Transformers in medical imaging: A survey

F Shamshad, S Khan, SW Zamir, MH Khan… - Medical Image …, 2023 - Elsevier

Following unprecedented success on the natural language tasks, Transformers have been
successfully applied to several computer vision problems, achieving state-of-the-art results …

Cited by 756

[PDF] neurips.cc

SegNeXt: Rethinking convolutional attention design for semantic segmentation

MH Guo, CZ Lu, Q Hou, Z Liu… - Advances in Neural …, 2022 - proceedings.neurips.cc

We present SegNeXt, a simple convolutional network architecture for semantic
segmentation. Recent transformer-based models have dominated the field of semantic …

Cited by 697

[PDF] springer.com

Visual attention network

MH Guo, CZ Lu, ZN Liu, MM Cheng, SM Hu - Computational Visual Media, 2023 - Springer

While originally designed for natural language processing tasks, the self-attention
mechanism has recently taken various computer vision areas by storm. However, the 2D …

Cited by 792

[PDF] thecvf.com

Large selective kernel network for remote sensing object detection

Y Li, Q Hou, Z Zheng, MM Cheng… - Proceedings of the …, 2023 - openaccess.thecvf.com

Recent research on remote sensing object detection has largely focused on improving the
representation of oriented bounding boxes but has overlooked the unique prior knowledge …

Cited by 354

[PDF] peerj.com

The multi-modal fusion in visual question answering: a review of attention mechanisms

S Lu, M Liu, L Yin, Z Yin, X Liu, W Zheng - PeerJ Computer Science, 2023 - peerj.com

Abstract Visual Question Answering (VQA) is a significant cross-disciplinary issue in the
fields of computer vision and natural language processing that requires a computer to output …

Cited by 217

[PDF] neurips.cc

Spike-driven transformer

M Yao, J Hu, Z Zhou, L Yuan, Y Tian… - Advances in neural …, 2024 - proceedings.neurips.cc

Abstract Spiking Neural Networks (SNNs) provide an energy-efficient deep learning option
due to their unique spike-based event-driven (i.e., spike-driven) paradigm. In this paper, we …

Cited by 115

[PDF] mdpi.com

Convolutional neural networks: A survey

M Krichen - Computers, 2023 - mdpi.com

Artificial intelligence (AI) has become a cornerstone of modern technology, revolutionizing
industries from healthcare to finance. Convolutional neural networks (CNNs) are a subset of …

Cited by 304

[PDF] springer.com

Large-scale multi-modal pre-trained models: A comprehensive survey

X Wang, G Chen, G Qian, P Gao, XY Wei… - Machine Intelligence …, 2023 - Springer

With the urgent demand for generalized deep models, many pre-trained big models are
proposed, such as bidirectional encoder representations (BERT), vision transformer (ViT) …

Cited by 190

[PDF] ieee.org

YOLOv5-Tassel: Detecting tassels in RGB UAV imagery with improved YOLOv5 based on transfer learning

W Liu, K Quijano, MM Crawford - IEEE Journal of Selected …, 2022 - ieeexplore.ieee.org

Unmanned aerial vehicles (UAVs) equipped with lightweight sensors, such as RGB cameras
and LiDAR, have significant potential in precision agriculture, including object detection …

Cited by 233

Attention mechanisms in computer vision: A survey

A comprehensive survey on pretrained foundation models: A history from bert to chatgpt
