Cited by 182

Vision Transformers in medical computer vision—A contemplative retrospection

A Parvaiz, MA Khalid, R Zafar, H Ameer, M Ali… - … Applications of Artificial …, 2023 - Elsevier

Vision Transformers (ViTs), with the magnificent potential to unravel the information
contained within images, have evolved as one of the most contemporary and dominant …

Cross-city matters: A multimodal remote sensing benchmark dataset for cross-city semantic segmentation using high-resolution domain adaptation networks

D Hong, B Zhang, H Li, Y Li, J Yao, C Li… - Remote Sensing of …, 2023 - Elsevier

Artificial intelligence (AI) approaches nowadays have gained remarkable success in
single-modality-dominated remote sensing (RS) applications, especially with an emphasis on …

Cited by 341

Cited by 548

AlphaPose: Whole-body regional multi-person pose estimation and tracking in real-time

HS Fang, J Li, H Tang, C Xu, H Zhu… - IEEE transactions on …, 2022 - ieeexplore.ieee.org

Accurate whole-body multi-person pose estimation and tracking is an important yet
challenging topic in computer vision. To capture the subtle actions of humans for complex …

Cited by 191

FastViT: A fast hybrid vision transformer using structural reparameterization

PKA Vasu, J Gabriel, J Zhu, O Tuzel… - Proceedings of the …, 2023 - openaccess.thecvf.com

The recent amalgamation of transformer and convolutional designs has led to steady
improvements in accuracy and efficiency of the models. In this work, we introduce FastViT, a …

Cited by 249

Images speak in images: A generalist painter for in-context visual learning

X Wang, W Wang, Y Cao, C Shen… - Proceedings of the …, 2023 - openaccess.thecvf.com

In-context learning, as a new paradigm in NLP, allows the model to rapidly adapt to various
tasks with only a handful of prompts and examples. But in computer vision, the difficulties for …

Cited by 143

Effective whole-body pose estimation with two-stages distillation

Z Yang, A Zeng, C Yuan, Y Li - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com

Whole-body pose estimation localizes the human body, hand, face, and foot keypoints in an
image. This task is challenging due to multi-scale body parts, fine-grained localization for …

Cited by 124

BEDLAM: A synthetic dataset of bodies exhibiting detailed lifelike animated motion

MJ Black, P Patel, J Tesch… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com

We show, for the first time, that neural networks trained only on synthetic data achieve
state-of-the-art accuracy on the problem of 3D human pose and shape (HPS) estimation from real …

Cited by 274

MPDIoU: A loss for efficient and accurate bounding box regression

S Ma, Y Xu - arXiv preprint arXiv:2307.07662, 2023 - arxiv.org

Bounding box regression (BBR) has been widely used in object detection and instance
segmentation, which is an important step in object localization. However, most of the existing …