ShapeLLM: Universal 3D object understanding for embodied interaction

Z Qi, R Dong, S Zhang, H Geng, C Han, Z Ge… - … on Computer Vision, 2024 - Springer
This paper presents ShapeLLM, the first 3D Multimodal Large Language Model (LLM)
designed for embodied interaction, exploring a universal 3D object understanding with 3D …

4D contrastive superflows are dense 3D representation learners

X Xu, L Kong, H Shuai, W Zhang, L Pan, K Chen… - … on Computer Vision, 2024 - Springer
In the realm of autonomous driving, accurate 3D perception is the foundation. However,
developing such models relies on extensive human annotations, a process that is both …

Hand-Centric Motion Refinement for 3D Hand-Object Interaction via Hierarchical Spatial-Temporal Modeling

Y Hao, J Zhang, T Zhuo, F Wen, H Fan - Proceedings of the AAAI …, 2024 - ojs.aaai.org
Hands are the main medium when people interact with the world. Generating proper 3D
motion for hand-object interaction is vital for applications such as virtual reality and robotics …

EqvAfford: SE(3) equivariance for point-level affordance learning

Y Chen, C Tie, R Wu, H Dong - arXiv preprint arXiv:2408.01953, 2024 - arxiv.org
Humans perceive and interact with the world with an awareness of equivariance, which
facilitates manipulating different objects in diverse poses. For robotic manipulation, such …

MAMBA4D: Efficient Long-Sequence Point Cloud Video Understanding with Disentangled Spatial-Temporal State Space Models

J Liu, J Han, L Liu, AI Aviles-Rivero, C Jiang… - arXiv preprint arXiv …, 2024 - arxiv.org
Point cloud videos effectively capture real-world spatial geometries and temporal dynamics,
which are essential for enabling intelligent agents to understand the dynamically changing …