Multimodal learning with graphs
Artificial intelligence for graphs has achieved remarkable success in modelling complex
systems, ranging from dynamic networks in biology to interacting particle systems in physics …
CRIS: CLIP-driven referring image segmentation
Referring image segmentation aims to segment a referent via a natural linguistic expression.
Due to the distinct data properties between text and image, it is challenging for a network to …
SegViT: Semantic segmentation with plain vision transformers
We explore the capability of plain Vision Transformers (ViTs) for semantic segmentation and
propose the SegViT. Previous ViT-based segmentation networks usually learn a pixel-level …
CGNet: A light-weight context guided network for semantic segmentation
The demand for applying semantic segmentation models on mobile devices has been
increasing rapidly. Current state-of-the-art networks have an enormous number of parameters …
CIGAR: Cross-modality graph reasoning for domain adaptive object detection
Unsupervised domain adaptive object detection (UDA-OD) aims to learn a detector by
generalizing knowledge from a labeled source domain to an unlabeled target domain …
SegViT v2: Exploring efficient and continual semantic segmentation with plain vision transformers
This paper investigates the capability of plain Vision Transformers (ViTs) for semantic
segmentation using the encoder–decoder framework and introduces SegViTv2. In this study …
Exploiting edge-oriented reasoning for 3D point-based scene graph analysis
Scene understanding is a critical problem in computer vision. In this paper, we propose a 3D
point-based scene graph generation (SGGpoint) framework to effectively bridge perception …
StructToken: Rethinking semantic segmentation with structural prior
In previous deep-learning-based methods, semantic segmentation has been regarded as a
static or dynamic per-pixel classification task, i.e., classify each pixel representation to a …
An Overview of Text-based Person Search: Recent Advances and Future Directions
Due to the practical significance in smart video surveillance systems, Text-Based Person
Search (TBPS) has been one of the research hotspots recently, which refers to searching for …
Fully transformer networks for semantic image segmentation
Transformers have shown impressive performance in various natural language processing
and computer vision tasks, due to the capability of modeling long-range dependencies …