- Academic Search

A survey on open-vocabulary detection and segmentation: Past, present, and future

C Zhu, L Chen - IEEE Transactions on Pattern Analysis and …, 2024 - ieeexplore.ieee.org

As the most fundamental scene understanding tasks, object detection and segmentation
have made tremendous progress in deep learning era. Due to the expensive manual …

Cited by 24 - All 8 versions

Segment anything in 3d with nerfs

J Cen, Z Zhou, J Fang, W Shen, L Xie… - Advances in …, 2023 - proceedings.neurips.cc

Recently, the Segment Anything Model (SAM) emerged as a powerful vision
foundation model which is capable to segment anything in 2D images. This paper aims to …

Cited by 141 - All 5 versions

[PDF] arxiv.org

Conceptgraphs: Open-vocabulary 3d scene graphs for perception and planning

Q Gu, A Kuwajerwala, S Morin… - … on Robotics and …, 2024 - ieeexplore.ieee.org

For robots to perform a wide variety of tasks, they require a 3D representation of the world
that is semantically rich, yet compact and efficient for task-driven perception and planning …

Cited by 144 - All 7 versions

[PDF] ieee.org

Towards open vocabulary learning: A survey

J Wu, X Li, S Xu, H Yuan, H Ding… - … on Pattern Analysis …, 2024 - ieeexplore.ieee.org

In the field of visual scene understanding, deep neural networks have made impressive
advancements in various core tasks like segmentation, tracking, and detection. However …

Cited by 128 - All 13 versions

[PDF] thecvf.com

Ll3da: Visual interactive instruction tuning for omni-3d understanding reasoning and planning

S Chen, X Chen, C Zhang, M Li, G Yu… - Proceedings of the …, 2024 - openaccess.thecvf.com

Recent progress in Large Multimodal Models (LMM) has opened up great
possibilities for various applications in the field of human-machine interactions. However …

Cited by 65 - All 6 versions

[PDF] neurips.cc

Openshape: Scaling up 3d shape representation towards open-world understanding

M Liu, R Shi, K Kuang, Y Zhu, X Li… - Advances in neural …, 2023 - proceedings.neurips.cc

We introduce OpenShape, a method for learning multi-modal joint representations of text,
image, and point clouds. We adopt the commonly used multi-modal contrastive learning …

Cited by 104 - All 7 versions

[PDF] arxiv.org

Openmask3d: Open-vocabulary 3d instance segmentation

A Takmaz, E Fedele, RW Sumner, M Pollefeys… - arXiv preprint arXiv …, 2023 - arxiv.org

We introduce the task of open-vocabulary 3D instance segmentation. Current approaches
for 3D instance segmentation can typically only recognize object categories from a pre …

Cited by 139 - All 8 versions

[PDF] arxiv.org

Shapellm: Universal 3d object understanding for embodied interaction

Z Qi, R Dong, S Zhang, H Geng, C Han, Z Ge… - … on Computer Vision, 2024 - Springer

This paper presents ShapeLLM, the first 3D Multimodal Large Language Model (LLM)
designed for embodied interaction, exploring a universal 3D object understanding with 3D …

Cited by 45 - All 5 versions

[PDF] thecvf.com

Embodiedscan: A holistic multi-modal 3d perception suite towards embodied ai

T Wang, X Mao, C Zhu, R Xu, R Lyu… - Proceedings of the …, 2024 - openaccess.thecvf.com

In the realm of computer vision and robotics embodied agents are expected to explore their
environment and carry out human instructions. This necessitates the ability to fully …

Cited by 47 - All 7 versions

[PDF] thecvf.com

Language embedded 3d gaussians for open-vocabulary scene understanding

JC Shi, M Wang, HB Duan… - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com

Open-vocabulary querying in 3D space is challenging but essential for scene understanding
tasks such as object localization and segmentation. Language-embedded scene …

Cited by 50 - All 6 versions

Pla: Language-driven open-vocabulary 3d scene understanding

A survey on open-vocabulary detection and segmentation: Past, present, and future
