- Academic Search

Moviechat: From dense token to sparse memory for long video understanding

E Song, W Chai, G Wang, Y Zhang… - Proceedings of the …, 2024 - openaccess.thecvf.com

Recently, integrating video foundation models and large language models to build a video
understanding system can overcome the limitations of specific pre-defined vision tasks. Yet …

Cited by 182

Transformer-based visual segmentation: A survey

X Li, H Ding, H Yuan, W Zhang, J Pang… - … on Pattern Analysis …, 2024 - ieeexplore.ieee.org

Visual segmentation seeks to partition images, video frames, or point clouds into multiple
segments or groups. This technique has numerous real-world applications, such as …

Cited by 127

Dvis: Decoupled video instance segmentation framework

T Zhang, X Tian, Y Wu, S Ji, X Wang… - Proceedings of the …, 2023 - openaccess.thecvf.com

Video instance segmentation (VIS) is a critical task with diverse applications, including
autonomous driving and video editing. Existing methods often underperform on complex …

Cited by 51

Improving video segmentation via dynamic anchor queries

Y Zhou, T Zhang, S Ji, S Yan, X Li - European Conference on Computer …, 2024 - Springer

Modern video segmentation methods adopt feature transitions between anchor and target
queries to perform cross-frame object association. The smooth feature transitions between …

Cited by 7

General and Task-Oriented Video Segmentation

M Chen, L Li, W Wang, R Quan, Y Yang - European Conference on …, 2024 - Springer

We present GvSeg, a general video segmentation framework for addressing four different
video segmentation tasks (i.e., instance, semantic, panoptic, and exemplar-guided) while …

Cited by 4

Unified embedding alignment for open-vocabulary video instance segmentation

H Fang, P Wu, Y Li, X Zhang, X Lu - European Conference on Computer …, 2024 - Springer

Open-Vocabulary Video Instance Segmentation (VIS) is attracting increasing
attention due to its ability to segment and track arbitrary objects. However, the recent Open …

Cited by 3

Language-driven visual consensus for zero-shot semantic segmentation

Z Zhang, W Ke, Y Zhu, X Liang, J Liu… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org

The pre-trained vision-language model, exemplified by CLIP [1], advances zero-shot
semantic segmentation by aligning visual features with class embeddings through a …

Cited by 4

Dvis++: Improved decoupled framework for universal video segmentation

T Zhang, X Tian, Y Zhou, S Ji, X Wang, X Tao… - arXiv preprint arXiv …, 2023 - arxiv.org

We present the Decoupled VIdeo Segmentation (DVIS) framework, a
novel approach for the challenging task of universal video segmentation, including video …

Cited by 15

ChatVTG: Video Temporal Grounding via Chat with Video Dialogue Large Language Models

M Qu, X Chen, W Liu, A Li… - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com

Video Temporal Grounding (VTG) aims to ground specific segments within an
untrimmed video corresponding to the given natural language query. Existing VTG methods …

Cited by 7

Scale-aware token-matching for transformer-based object detector

A Jung, S Hong, Y Hyun - Pattern Recognition Letters, 2024 - Elsevier

Owing to the advancements in deep learning, object detection has made significant
progress in estimating the positions and classes of multiple objects within an image …

Cited by 1
