Google Scholar

Exploring multi-modal contextual knowledge for open-vocabulary object detection

Y Xu, M Zhang, X Yang, C Xu - IEEE Transactions on Image …, 2024 - ieeexplore.ieee.org

We explore multi-modal contextual knowledge learned through multi-modal masked
language modeling to provide explicit localization guidance for novel classes in open …

Cited by 4

Transferable Unintentional Action Localization with Language-guided Intention Translation

J Xu, Y Rao, J Zhou, J Lu - IEEE Transactions on Pattern …, 2025 - ieeexplore.ieee.org

Unintentional action localization (UAL) is a challenging task that requires reasoning about
action intention clues to detect the temporal locations of unintentional action occurrences in …

Collaborative Novel Object Discovery and Box-Guided Cross-Modal Alignment for Open-Vocabulary 3D Object Detection

Y Cao, Y Zeng, H Xu, D Xu - arXiv preprint arXiv:2406.00830, 2024 - arxiv.org

Open-vocabulary 3D Object Detection (OV-3DDet) addresses the detection of objects from
an arbitrary list of novel categories in 3D scenes, which remains a very challenging problem …

Cited by 4

AD-DINO: Attention-Dynamic DINO for Distance-Aware Embodied Reference Understanding

H Guo, W Fan, B Wei, J Zhu, J Tian, C Yi… - arXiv preprint arXiv …, 2024 - arxiv.org

Embodied reference understanding is crucial for intelligent agents to predict referents based
on human intention through gesture signals and language descriptions. This paper …

CP-DETR: Concept Prompt Guide DETR Toward Stronger Universal Object Detection

Q Chen, W Jin, J Ge, M Liu, Y Yan, J Jiang, L Yu… - arXiv preprint arXiv …, 2024 - arxiv.org

Recent research on universal object detection aims to introduce language in a SoTA
closed-set detector and then generalize the open-set concepts by constructing large-scale (text …

HA-FGOVD: Highlighting Fine-grained Attributes via Explicit Linear Composition for Open-Vocabulary Object Detection

Y Ma, M Liu, C Zhu, XC Yin - arXiv preprint arXiv:2409.16136, 2024 - arxiv.org

Open-vocabulary object detection (OVD) models are considered to be Large Multi-modal
Models (LMM), due to their extensive training data and a large number of parameters …

DetCLIPv3: Towards Versatile Generative Open-vocabulary Object Detection

Exploring multi-modal contextual knowledge for open-vocabulary object detection