A review of human activity recognition methods
Recognizing human activities from video sequences or still images is a challenging task due
to problems, such as background clutter, partial occlusion, changes in scale, viewpoint …
Recent advances in zero-shot recognition: Toward data-efficient understanding of visual content
With the recent renaissance of deep convolutional neural networks (CNNs), encouraging
breakthroughs have been achieved on the supervised recognition tasks, where each class …
Aligning bag of regions for open-vocabulary object detection
Pre-trained vision-language models (VLMs) learn to align vision and language
representations on large-scale datasets, where each image-text pair usually contains a bag …
Unified contrastive learning in image-text-label space
Visual recognition is recently learned via either supervised learning on human-annotated
image-label data or language-image contrastive learning with webly-crawled image-text …
Open-vocabulary object detection via vision and language knowledge distillation
We aim at advancing open-vocabulary object detection, which detects objects described by
arbitrary text inputs. The fundamental challenge is the availability of training data. It is costly …
Decoupling zero-shot semantic segmentation
Zero-shot semantic segmentation (ZS3) aims to segment the novel categories that have not
been seen in the training. Existing works formulate ZS3 as a pixel-level zero-shot …
A survey of zero-shot learning: Settings, methods, and applications
Most machine-learning methods focus on classifying instances whose classes have already
been seen in training. In practice, many applications require classifying instances whose …
f-VAEGAN-D2: A feature generating framework for any-shot learning
When labeled training data is scarce, a promising data augmentation approach is to
generate visual features of unknown classes using their attributes. To learn the class …
Zero-shot learning—a comprehensive evaluation of the good, the bad and the ugly
Due to the importance of zero-shot learning, i.e., classifying images where there is a lack of
labeled training data, the number of proposed approaches has recently increased steadily …
Zero-shot recognition via semantic embeddings and knowledge graphs
We consider the problem of zero-shot recognition: learning a visual classifier for a category
with zero training examples, just using the word embedding of the category and its …