Real-world robot applications of foundation models: A review

K Kawaharazuka, T Matsushima… - Advanced …, 2024 - Taylor & Francis
Recent developments in foundation models, like Large Language Models (LLMs) and
Vision-Language Models (VLMs), trained on extensive data, facilitate flexible application across …

Human motion generation: A survey

W Zhu, X Ma, D Ro, H Ci, J Zhang, J Shi… - … on Pattern Analysis …, 2023 - ieeexplore.ieee.org
Human motion generation aims to generate natural human pose sequences and shows
immense potential for real-world applications. Substantial progress has been made recently …

MotionGPT: Human motion as a foreign language

B Jiang, X Chen, W Liu, J Yu, G Yu… - Advances in Neural …, 2023 - proceedings.neurips.cc
Though the advancement of pre-trained large language models unfolds, the exploration of
building a unified model for language and other multimodal data, such as motion, remains …

Motion-X: A large-scale 3D expressive whole-body human motion dataset

J Lin, A Zeng, S Lu, Y Cai, R Zhang… - Advances in Neural …, 2023 - proceedings.neurips.cc
In this paper, we present Motion-X, a large-scale 3D expressive whole-body motion dataset.
Existing motion datasets predominantly contain body-only poses, lacking facial expressions …

InterDiff: Generating 3D human-object interactions with physics-informed diffusion

S Xu, Z Li, YX Wang, LY Gui - Proceedings of the IEEE/CVF …, 2023 - openaccess.thecvf.com
This paper addresses a novel task of anticipating 3D human-object interactions (HOIs). Most
existing research on HOI synthesis lacks comprehensive whole-body interactions with …

ReMoDiffuse: Retrieval-augmented motion diffusion model

M Zhang, X Guo, L Pan, Z Cai, F Hong… - Proceedings of the …, 2023 - openaccess.thecvf.com
3D human motion generation is crucial for creative industry. Recent advances rely
on generative models with domain knowledge for text-driven motion generation, leading to …

Michelangelo: Conditional 3D shape generation based on shape-image-text aligned latent representation

Z Zhao, W Liu, X Chen, X Zeng… - Advances in …, 2024 - proceedings.neurips.cc
We present a novel alignment-before-generation approach to tackle the challenging task of
generating general 3D shapes based on 2D images or texts. Directly learning a conditional …

MotionLCM: Real-time controllable motion generation via latent consistency model

W Dai, LH Chen, J Wang, J Liu, B Dai… - European Conference on …, 2024 - Springer
This work introduces MotionLCM, extending controllable motion generation to a real-time
level. Existing methods for spatial-temporal control in text-conditioned motion generation …

InterGen: Diffusion-based multi-human motion generation under complex interactions

H Liang, W Zhang, W Li, J Yu, L Xu - International Journal of Computer …, 2024 - Springer
We have recently seen tremendous progress in diffusion advances for generating realistic
human motions. Yet, they largely disregard the multi-human interactions. In this paper, we …

EMDM: Efficient motion diffusion model for fast and high-quality motion generation

W Zhou, Z Dou, Z Cao, Z Liao, J Wang, W Wang… - … on Computer Vision, 2024 - Springer
We introduce Efficient Motion Diffusion Model (EMDM) for fast and high-quality
human motion generation. Current state-of-the-art generative diffusion models have …