Transformers in medical imaging: A survey

F Shamshad, S Khan, SW Zamir, MH Khan… - Medical image …, 2023 - Elsevier
Following unprecedented success on the natural language tasks, Transformers have been
successfully applied to several computer vision problems, achieving state-of-the-art results …

Vision Transformers in medical computer vision—A contemplative retrospection

A Parvaiz, MA Khalid, R Zafar, H Ameer, M Ali… - … Applications of Artificial …, 2023 - Elsevier
Abstract Vision Transformers (ViTs), with the magnificent potential to unravel the information
contained within images, have evolved as one of the most contemporary and dominant …

Cross-city matters: A multimodal remote sensing benchmark dataset for cross-city semantic segmentation using high-resolution domain adaptation networks

D Hong, B Zhang, H Li, Y Li, J Yao, C Li… - Remote Sensing of …, 2023 - Elsevier
Artificial intelligence (AI) approaches nowadays have gained remarkable success in single-
modality-dominated remote sensing (RS) applications, especially with an emphasis on …

Alphapose: Whole-body regional multi-person pose estimation and tracking in real-time

HS Fang, J Li, H Tang, C Xu, H Zhu… - IEEE transactions on …, 2022 - ieeexplore.ieee.org
Accurate whole-body multi-person pose estimation and tracking is an important yet
challenging topic in computer vision. To capture the subtle actions of humans for complex …

Fastvit: A fast hybrid vision transformer using structural reparameterization

PKA Vasu, J Gabriel, J Zhu, O Tuzel… - Proceedings of the …, 2023 - openaccess.thecvf.com
The recent amalgamation of transformer and convolutional designs has led to steady
improvements in accuracy and efficiency of the models. In this work, we introduce FastViT, a …

Images speak in images: A generalist painter for in-context visual learning

X Wang, W Wang, Y Cao, C Shen… - Proceedings of the …, 2023 - openaccess.thecvf.com
In-context learning, as a new paradigm in NLP, allows the model to rapidly adapt to various
tasks with only a handful of prompts and examples. But in computer vision, the difficulties for …

Effective whole-body pose estimation with two-stages distillation

Z Yang, A Zeng, C Yuan, Y Li - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
Whole-body pose estimation localizes the human body, hand, face, and foot keypoints in an
image. This task is challenging due to multi-scale body parts, fine-grained localization for …

Bedlam: A synthetic dataset of bodies exhibiting detailed lifelike animated motion

MJ Black, P Patel, J Tesch… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
We show, for the first time, that neural networks trained only on synthetic data achieve state-
of-the-art accuracy on the problem of 3D human pose and shape (HPS) estimation from real …

Mpdiou: a loss for efficient and accurate bounding box regression

S Ma, Y Xu - arxiv preprint arxiv:2307.07662, 2023 - arxiv.org
Bounding box regression (BBR) has been widely used in object detection and instance
segmentation, which is an important step in object localization. However, most of the existing …

Bevfusion: Multi-task multi-sensor fusion with unified bird's-eye view representation

Z Liu, H Tang, A Amini, X Yang, H Mao… - … on robotics and …, 2023 - ieeexplore.ieee.org
Multi-sensor fusion is essential for an accurate and reliable autonomous driving system.
Recent approaches are based on point-level fusion: augmenting the LiDAR point cloud with …