Deep learning technique for human parsing: A survey and outlook
Human parsing aims to partition humans in image or video into multiple pixel-level semantic
parts. In the last decade, it has gained significantly increased interest in the computer vision …
Transformer-based dual relation graph for multi-label image recognition
The simultaneous recognition of multiple objects in one image remains a challenging task,
spanning multiple events in the recognition field such as various object scales, inconsistent …
Bilateral attention network for RGB-D salient object detection
RGB-D salient object detection (SOD) aims to segment the most attractive objects in a pair of
cross-modal RGB and depth images. Currently, most existing RGB-D SOD methods focus on …
Complementary trilateral decoder for fast and accurate salient object detection
Salient object detection (SOD) has made great progress, but most of existing SOD methods
focus more on performance than efficiency. Besides, the U-shape structure exists some …
LogicSeg: Parsing visual semantics with neural logic learning and reasoning
Current high-performance semantic segmentation models are purely data-driven sub-
symbolic approaches and blind to the structured nature of the visual world. This is in stark …
PGDENet: Progressive guided fusion and depth enhancement network for RGB-D indoor scene parsing
W Zhou, E Yang, J Lei, J Wan… - IEEE Transactions on …, 2022 - ieeexplore.ieee.org
Scene parsing is a fundamental task in computer vision. Various RGB-D (color and depth)
scene parsing methods based on fully convolutional networks have achieved excellent …
Part-aware panoptic segmentation
In this work, we introduce the new scene understanding task of Part-aware Panoptic
Segmentation (PPS), which aims to understand a scene at multiple levels of abstraction, and …
Semantic hierarchy-aware segmentation
Humans are able to recognize structured relations in observation, allowing us to decompose
complex scenes into simpler parts and abstract the visual world at multiple levels. However …
Panoptic-PartFormer: Learning a unified model for panoptic part segmentation
Panoptic Part Segmentation (PPS) aims to unify panoptic segmentation and part
segmentation into one task. Previous work mainly utilizes separated approaches to handle …
MG-LLaVA: Towards multi-granularity visual instruction tuning
Multi-modal large language models (MLLMs) have made significant strides in various visual
understanding tasks. However, the majority of these models are constrained to process low …