CLAY: A Controllable Large-scale Generative Model for Creating High-quality 3D Assets
In the realm of digital creativity, our potential to craft intricate 3D worlds from imagination is
often hampered by the limitations of existing digital tools, which demand extensive expertise …
LL3DA: Visual Interactive Instruction Tuning for Omni-3D Understanding, Reasoning, and Planning
Recent progress in Large Multimodal Models (LMM) has opened up great
possibilities for various applications in the field of human-machine interactions. However …
MotionChain: Conversational Motion Controllers via Multimodal Prompts
Recent advancements in language models have demonstrated their adeptness in
conducting multi-turn dialogues and retaining conversational context. However, this …
SemGrasp: Semantic Grasp Generation via Language Aligned Discretization
Generating natural human grasps necessitates consideration of not just object geometry but
also semantic information. Solely depending on object shape for grasp generation confines …
PM-INR: Prior-Rich Multi-Modal Implicit Large-Scale Scene Neural Representation
Recent advancements in implicit neural representations have contributed to high-fidelity
surface reconstruction and photorealistic novel view synthesis. However, with the expansion …
LLMs Meet Multimodal Generation and Editing: A Survey
With the recent advancement in large language models (LLMs), there is a growing interest in
combining LLMs with multimodal learning. Previous surveys of multimodal large language …
MeshXL: Neural Coordinate Field for Generative 3D Foundation Models
The polygon mesh representation of 3D data exhibits great flexibility, fast rendering speed,
and storage efficiency, which is widely preferred in various applications. However, given its …
When LLMs step into the 3D World: A Survey and Meta-Analysis of 3D Tasks via Multi-modal Large Language Models
As large language models (LLMs) evolve, their integration with 3D spatial data (3D-LLMs)
has seen rapid progress, offering unprecedented capabilities for understanding and …
M3DBench: Towards Omni 3D Assistant with Interleaved Multi-modal Instructions
Recently, the understanding of the 3D world has garnered increased attention, facilitating
autonomous agents to perform further decision-making. However, the majority of existing 3D …
Text‐to‐Microstructure Generation Using Generative Deep Learning
Designing novel materials is greatly dependent on understanding the design principles,
physical mechanisms, and modeling methods of material microstructures, requiring …