Vision-language pre-training: Basics, recent advances, and future trends

Z Gan, L Li, C Li, L Wang, Z Liu… - Foundations and Trends …, 2022 - nowpublishers.com
This monograph surveys vision-language pre-training (VLP) methods for multimodal
intelligence that have been developed in the last few years. We group these approaches …

A review of generalized zero-shot learning methods

F Pourpanah, M Abdar, Y Luo, X Zhou… - IEEE transactions on …, 2022 - ieeexplore.ieee.org
Generalized zero-shot learning (GZSL) aims to train a model for classifying data samples
under the condition that some output classes are unknown during supervised learning. To …

Open-vocabulary object detection via vision and language knowledge distillation

X Gu, TY Lin, W Kuo, Y Cui - arXiv preprint arXiv:2104.13921, 2021 - arxiv.org
We aim at advancing open-vocabulary object detection, which detects objects described by
arbitrary text inputs. The fundamental challenge is the availability of training data. It is costly …

Decoupling zero-shot semantic segmentation

J Ding, N Xue, GS Xia, D Dai - Proceedings of the IEEE/CVF …, 2022 - openaccess.thecvf.com
Zero-shot semantic segmentation (ZS3) aims to segment the novel categories that have not
been seen in the training. Existing works formulate ZS3 as a pixel-level zero-shot …

ELEVATER: A benchmark and toolkit for evaluating language-augmented visual models

C Li, H Liu, L Li, P Zhang, J Aneja… - Advances in …, 2022 - proceedings.neurips.cc
Learning visual representations from natural language supervision has recently shown great
promise in a number of pioneering works. In general, these language-augmented visual …

PromptDet: Towards open-vocabulary detection using uncurated images

C Feng, Y Zhong, Z Jie, X Chu, H Ren, X Wei… - … on Computer Vision, 2022 - Springer
The goal of this work is to establish a scalable pipeline for expanding an object detector
towards novel/unseen categories, using zero manual annotations. To achieve that, we make …

A survey of zero-shot learning: Settings, methods, and applications

W Wang, VW Zheng, H Yu, C Miao - ACM Transactions on Intelligent …, 2019 - dl.acm.org
Most machine-learning methods focus on classifying instances whose classes have already
been seen in training. In practice, many applications require classifying instances whose …

KRISP: Integrating implicit and symbolic knowledge for open-domain knowledge-based VQA

K Marino, X Chen, D Parikh, A Gupta… - Proceedings of the …, 2021 - openaccess.thecvf.com
One of the most challenging question types in VQA is when answering the question requires
outside knowledge not present in the image. In this work we study open-domain knowledge …

Feature generating networks for zero-shot learning

Y Xian, T Lorenz, B Schiele… - Proceedings of the IEEE …, 2018 - openaccess.thecvf.com
Suffering from the extreme training data imbalance between seen and unseen classes, most
of existing state-of-the-art approaches fail to achieve satisfactory results for the challenging …

Zero-shot learning—a comprehensive evaluation of the good, the bad and the ugly

Y Xian, CH Lampert, B Schiele… - IEEE transactions on …, 2018 - ieeexplore.ieee.org
Due to the importance of zero-shot learning, i.e., classifying images where there is a lack of
labeled training data, the number of proposed approaches has recently increased steadily …