From show to tell: A survey on deep learning-based image captioning

M Stefanini, M Cornia, L Baraldi… - IEEE transactions on …, 2022 - ieeexplore.ieee.org
Connecting Vision and Language plays an essential role in Generative Intelligence. For this
reason, large research efforts have been devoted to image captioning, i.e., describing images …

A comprehensive survey of scene graphs: Generation and application

X Chang, P Ren, P Xu, Z Li, X Chen… - IEEE Transactions on …, 2021 - ieeexplore.ieee.org
Scene graph is a structured representation of a scene that can clearly express the objects,
attributes, and relationships between objects in the scene. As computer vision technology …

Training-free structured diffusion guidance for compositional text-to-image synthesis

W Feng, X He, TJ Fu, V Jampani, A Akula… - arXiv preprint arXiv …, 2022 - arxiv.org
Large-scale diffusion models have achieved state-of-the-art results on text-to-image
synthesis (T2I) tasks. Despite their ability to generate high-quality yet creative images, we …

Stacked hybrid-attention and group collaborative learning for unbiased scene graph generation

X Dong, T Gan, X Song, J Wu… - Proceedings of the …, 2022 - openaccess.thecvf.com
Scene Graph Generation, which generally follows a regular encoder-decoder
pipeline, aims to first encode the visual contents within the given image and then parse them …

A survey on graph neural networks and graph transformers in computer vision: A task-oriented perspective

C Chen, Y Wu, Q Dai, HY Zhou, M Xu… - … on Pattern Analysis …, 2024 - ieeexplore.ieee.org
Graph Neural Networks (GNNs) have gained momentum in graph representation learning
and boosted the state of the art in a variety of areas, such as data mining (e.g., social network …

Devil's on the edges: Selective quad attention for scene graph generation

D Jung, S Kim, WH Kim, M Cho - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
Scene graph generation aims to construct a semantic graph structure from an image such
that its nodes and edges respectively represent objects and their relationships. One of the …

PPDL: Predicate probability distribution based loss for unbiased scene graph generation

W Li, H Zhang, Q Bai, G Zhao… - Proceedings of the …, 2022 - openaccess.thecvf.com
Scene Graph Generation (SGG) has attracted more and more attention from visual
researchers in recent years, since Scene Graph (SG) is valuable in many downstream tasks …

Learning to generate scene graph from natural language supervision

Y Zhong, J Shi, J Yang, C Xu… - Proceedings of the IEEE …, 2021 - openaccess.thecvf.com
Learning from image-text data has demonstrated recent success for many recognition tasks,
yet is currently limited to visual features or individual visual concepts such as objects. In this …

Instance relation graph guided source-free domain adaptive object detection

V VS, P Oza, VM Patel - … of the IEEE/CVF Conference on …, 2023 - openaccess.thecvf.com
Unsupervised Domain Adaptation (UDA) is an effective approach to tackle the issue
of domain shift. Specifically, UDA methods try to align the source and target representations …

Learning to generate language-supervised and open-vocabulary scene graph using pre-trained visual-semantic space

Y Zhang, Y Pan, T Yao, R Huang… - Proceedings of the …, 2023 - openaccess.thecvf.com
Scene graph generation (SGG) aims to abstract an image into a graph structure, by
representing objects as graph nodes and their relations as labeled edges. However, two …