A review on 2D instance segmentation based on deep neural networks
W Gu, S Bai, L Kong - Image and Vision Computing, 2022 - Elsevier
Image instance segmentation involves labeling pixels of images with classes and instances,
which is one of the pivotal technologies in many domains, such as natural scenes …
Generalized decoding for pixel, image, and language
We present X-Decoder, a generalized decoding model that can predict pixel-level
segmentation and language tokens seamlessly. X-Decoder takes as input two types of …
Multimodal foundation models: From specialists to general-purpose assistants
Neural compression is the application of neural networks and other machine learning
methods to data compression. Recent advances in statistical machine learning have opened …
Petr: Position embedding transformation for multi-view 3d object detection
In this paper, we develop position embedding transformation (PETR) for multi-view 3D
object detection. PETR encodes the position information of 3D coordinates into image …
Transformer-based visual segmentation: A survey
Visual segmentation seeks to partition images, video frames, or point clouds into multiple
segments or groups. This technique has numerous real-world applications, such as …
Detrs with hybrid matching
One-to-one set matching is a key design for DETR to establish its end-to-end capability, so
that object detection does not require a hand-crafted NMS (non-maximum suppression) to …
A survey of visual transformers
Transformer, an attention-based encoder–decoder model, has already revolutionized the
field of natural language processing (NLP). Inspired by such significant achievements, some …
A survey on vision transformer
Transformer, first applied to the field of natural language processing, is a type of deep neural
network mainly based on the self-attention mechanism. Thanks to its strong representation …
Rank-DETR for high quality object detection
Modern detection transformers (DETRs) use a set of object queries to predict a list of
bounding boxes, sort them by their classification confidence scores, and select the top …
A survey on visual transformer
Transformer, first applied to the field of natural language processing, is a type of deep neural
network mainly based on the self-attention mechanism. Thanks to its strong representation …