FAIR1M: A benchmark dataset for fine-grained object recognition in high-resolution remote sensing imagery
With the rapid development of deep learning, many deep learning-based approaches have
made great achievements in object detection tasks. It is generally known that deep learning …
Vision transformer adapter for dense predictions
This work investigates a simple yet powerful adapter for Vision Transformer (ViT). Unlike
recent visual transformers that introduce vision-specific inductive biases into their …
You only look one-level feature
This paper revisits feature pyramid networks (FPN) for one-stage detectors and points out
that the success of FPN is due to its divide-and-conquer solution to the optimization problem …
Boundary IoU: Improving object-centric image segmentation evaluation
We present Boundary IoU (Intersection-over-Union), a new segmentation
evaluation measure focused on boundary quality. We perform an extensive analysis across …
The effectiveness of MAE pre-pretraining for billion-scale pretraining
This paper revisits the standard pretrain-then-finetune paradigm used in computer vision for
visual recognition tasks. Typically, state-of-the-art foundation models are pretrained using …
Delving into localization errors for monocular 3d object detection
Estimating 3D bounding boxes from monocular images is an essential component in
autonomous driving, while accurate 3D object detection from this kind of data is very …
Sparse instance activation for real-time instance segmentation
In this paper, we propose a conceptually novel, efficient, and fully convolutional framework
for real-time instance segmentation. Previously, most instance segmentation methods …
Benchmarking detection transfer learning with vision transformers
Object detection is a central downstream task used to test if pre-trained network parameters
confer benefits, such as improved accuracy or training speed. The complexity of object …
V3det: Vast vocabulary visual detection dataset
Recent advances in detecting arbitrary objects in the real world are trained and evaluated
on object detection datasets with a relatively restricted vocabulary. To facilitate the …
Occluded video instance segmentation: A benchmark
Can our video understanding systems perceive objects when a heavy occlusion exists in a
scene? To answer this question, we collect a large-scale dataset called OVIS for occluded …