Towards open vocabulary learning: A survey

J Wu, X Li, S Xu, H Yuan, H Ding… - … on Pattern Analysis …, 2024 - ieeexplore.ieee.org
In the field of visual scene understanding, deep neural networks have made impressive
advancements in various core tasks like segmentation, tracking, and detection. However …

Deep long-tailed learning: A survey

Y Zhang, B Kang, B Hooi, S Yan… - IEEE transactions on …, 2023 - ieeexplore.ieee.org
Deep long-tailed learning, one of the most challenging problems in visual recognition, aims
to train well-performing deep models from a large number of images that follow a long-tailed …

Simple copy-paste is a strong data augmentation method for instance segmentation

G Ghiasi, Y Cui, A Srinivas, R Qian… - Proceedings of the …, 2021 - openaccess.thecvf.com
Building instance segmentation models that are data-efficient and can handle rare object
categories is an important challenge in computer vision. Leveraging data augmentations is a …

Distribution alignment: A unified framework for long-tail visual recognition

S Zhang, Z Li, S Yan, X He… - Proceedings of the IEEE …, 2021 - openaccess.thecvf.com
Despite the success of the deep neural networks, it remains challenging to effectively build a
system for long-tail visual recognition tasks. To address this problem, we first investigate the …

Cross-modal causal relational reasoning for event-level visual question answering

Y Liu, G Li, L Lin - IEEE Transactions on Pattern Analysis and …, 2023 - ieeexplore.ieee.org
Existing visual question answering methods often suffer from cross-modal spurious
correlations and oversimplified event-level reasoning processes that fail to capture event …

A survey on long-tailed visual recognition

L Yang, H Jiang, Q Song, J Guo - International Journal of Computer Vision, 2022 - Springer
The heavy reliance on data is one of the major reasons that currently limit the development
of deep learning. Data quality directly dominates the effect of deep learning models, and the …

Self-supervised learning is more robust to dataset imbalance

H Liu, JZ HaoChen, A Gaidon, T Ma - arXiv preprint arXiv:2110.05025, 2021 - arxiv.org
Self-supervised learning (SSL) is a scalable way to learn general visual representations
since it learns without labels. However, large-scale unlabeled datasets in the wild often have …

DiGeo: Discriminative geometry-aware learning for generalized few-shot object detection

J Ma, Y Niu, J Xu, S Huang, G Han… - Proceedings of the …, 2023 - openaccess.thecvf.com
Generalized few-shot object detection aims to achieve precise detection on both base
classes with abundant annotations and novel classes with limited training data. Existing …

Seesaw loss for long-tailed instance segmentation

J Wang, W Zhang, Y Zang, Y Cao… - Proceedings of the …, 2021 - openaccess.thecvf.com
Instance segmentation has witnessed a remarkable progress on class-balanced
benchmarks. However, they fail to perform as accurately in real-world scenarios, where the …

ACE: Ally complementary experts for solving long-tailed recognition in one-shot

J Cai, Y Wang, JN Hwang - Proceedings of the IEEE/CVF …, 2021 - openaccess.thecvf.com
One-stage long-tailed recognition methods improve the overall performance in a "seesaw"
manner, i.e., either sacrifice the head's accuracy for better tail classification or elevate the …