- Academic Search

D Zhang, J Han, G Cheng… - IEEE transactions on …, 2021 - ieeexplore.ieee.org

As an emerging and challenging problem in the computer vision community, weakly
supervised object localization and detection plays an important role for developing new …

Cited by 333 · All 11 versions

[PDF] springer.com

Advances in deep concealed scene understanding

DP Fan, GP Ji, P Xu, MM Cheng, C Sakaridis… - Visual Intelligence, 2023 - Springer

Concealed scene understanding (CSU) is a hot computer vision topic aiming to perceive
objects exhibiting camouflage. The current boom in terms of techniques and applications …

Cited by 80 · All 6 versions

[PDF] thecvf.com

AnyDoor: Zero-shot object-level image customization

X Chen, L Huang, Y Liu, Y Shen… - Proceedings of the …, 2024 - openaccess.thecvf.com

This work presents AnyDoor, a diffusion-based image generator with the power to teleport
target objects to new scenes at user-specified locations with desired shapes. Instead of …

Cited by 218 · All 7 versions

[PDF] thecvf.com

Tracking anything with decoupled video segmentation

HK Cheng, SW Oh, B Price… - Proceedings of the …, 2023 - openaccess.thecvf.com

Training data for video segmentation are expensive to annotate. This impedes extensions of
end-to-end algorithms to new video segmentation tasks, especially in large-vocabulary …

Cited by 142 · All 7 versions

[PDF] springer.com

Segment anything is not always perfect: An investigation of SAM on different real-world applications

W Ji, J Li, Q Bi, T Liu, W Li, L Cheng - 2024 - Springer

Recently, Meta AI Research approaches a general, promptable segment anything
model (SAM) pre-trained on an unprecedentedly large segmentation dataset (SA-1B) …

Cited by 196 · All 9 versions

[PDF] arxiv.org

XMem: Long-term video object segmentation with an Atkinson-Shiffrin memory model

HK Cheng, AG Schwing - European Conference on Computer Vision, 2022 - Springer

We present XMem, a video object segmentation architecture for long videos with unified
feature memory stores inspired by the Atkinson-Shiffrin memory model. Prior work on video …

Cited by 418 · All 8 versions

[PDF] thecvf.com

MVImgNet: A large-scale dataset of multi-view images

X Yu, M Xu, Y Zhang, H Liu, C Ye… - Proceedings of the …, 2023 - openaccess.thecvf.com

Being data-driven is one of the most iconic properties of deep learning algorithms. The birth
of ImageNet drives a remarkable trend of "learning from large-scale data" in computer vision …

Cited by 154 · All 6 versions

[PDF] springer.com

Visual attention network

MH Guo, CZ Lu, ZN Liu, MM Cheng, SM Hu - Computational visual media, 2023 - Springer

While originally designed for natural language processing tasks, the self-attention
mechanism has recently taken various computer vision areas by storm. However, the 2D …

Cited by 802 · All 11 versions

[PDF] thecvf.com

Putting the object back into video object segmentation

HK Cheng, SW Oh, B Price, JY Lee… - Proceedings of the …, 2024 - openaccess.thecvf.com

We present Cutie, a video object segmentation (VOS) network with object-level memory
reading which puts the object representation from memory back into the video object …

Cited by 77 · All 6 versions

[PDF] neurips.cc

VisionLLM v2: An end-to-end generalist multimodal large language model for hundreds of vision-language tasks

J Wu, M Zhong, S Xing, Z Lai, Z Liu… - Advances in …, 2025 - proceedings.neurips.cc

We present VisionLLM v2, an end-to-end generalist multimodal large model (MLLM) that
unifies visual perception, understanding, and generation within a single framework. Unlike …

Cited by 33 · All 5 versions

Learning to detect salient objects with image-level supervision

Weakly supervised object localization and detection: A survey