FAIR1M: A benchmark dataset for fine-grained object recognition in high-resolution remote sensing imagery

X Sun, P Wang, Z Yan, F Xu, R Wang, W Diao… - ISPRS Journal of …, 2022 - Elsevier
With the rapid development of deep learning, many deep learning-based approaches have
made great achievements in object detection tasks. It is generally known that deep learning …

Vision transformer adapter for dense predictions

Z Chen, Y Duan, W Wang, J He, T Lu, J Dai… - arXiv preprint arXiv …, 2022 - arxiv.org
This work investigates a simple yet powerful adapter for Vision Transformer (ViT). Unlike
recent visual transformers that introduce vision-specific inductive biases into their …

You only look one-level feature

Q Chen, Y Wang, T Yang, X Zhang… - Proceedings of the …, 2021 - openaccess.thecvf.com
This paper revisits feature pyramid networks (FPN) for one-stage detectors and points out
that the success of FPN is due to its divide-and-conquer solution to the optimization problem …

Boundary IoU: Improving object-centric image segmentation evaluation

B Cheng, R Girshick, P Dollár… - Proceedings of the …, 2021 - openaccess.thecvf.com
We present Boundary IoU (Intersection-over-Union), a new segmentation
evaluation measure focused on boundary quality. We perform an extensive analysis across …

The effectiveness of MAE pre-pretraining for billion-scale pretraining

M Singh, Q Duval, KV Alwala, H Fan… - Proceedings of the …, 2023 - openaccess.thecvf.com
This paper revisits the standard pretrain-then-finetune paradigm used in computer vision for
visual recognition tasks. Typically, state-of-the-art foundation models are pretrained using …

Delving into localization errors for monocular 3D object detection

X Ma, Y Zhang, D Xu, D Zhou, S Yi… - Proceedings of the …, 2021 - openaccess.thecvf.com
Estimating 3D bounding boxes from monocular images is an essential component in
autonomous driving, while accurate 3D object detection from this kind of data is very …

Sparse instance activation for real-time instance segmentation

T Cheng, X Wang, S Chen, W Zhang… - Proceedings of the …, 2022 - openaccess.thecvf.com
In this paper, we propose a conceptually novel, efficient, and fully convolutional framework
for real-time instance segmentation. Previously, most instance segmentation methods …

Benchmarking detection transfer learning with vision transformers

Y Li, S Xie, X Chen, P Dollar, K He… - arXiv preprint arXiv …, 2021 - arxiv.org
Object detection is a central downstream task used to test if pre-trained network parameters
confer benefits, such as improved accuracy or training speed. The complexity of object …

V3Det: Vast vocabulary visual detection dataset

J Wang, P Zhang, T Chu, Y Cao… - Proceedings of the …, 2023 - openaccess.thecvf.com
Recent advances in detecting arbitrary objects in the real world are trained and evaluated
on object detection datasets with a relatively restricted vocabulary. To facilitate the …

Occluded video instance segmentation: A benchmark

J Qi, Y Gao, Y Hu, X Wang, X Liu, X Bai… - International Journal of …, 2022 - Springer
Can our video understanding systems perceive objects when a heavy occlusion exists in a
scene? To answer this question, we collect a large-scale dataset called OVIS for occluded …