Large motion model for unified multi-modal motion generation

M Zhang, D Jin, C Gu, F Hong, Z Cai, J Huang… - European Conference on Computer Vision, 2024 - Springer
Human motion generation, a cornerstone technique in animation and video production, has
widespread applications in various tasks like text-to-motion and music-to-dance. Previous …

Blenderalchemy: Editing 3d graphics with vision-language models

I Huang, G Yang, L Guibas - European Conference on Computer Vision, 2024 - Springer
Graphics design is important for various applications, including movie production and game
design. To create a high-quality scene, designers usually need to spend hours in software …

MotionFix: Text-driven 3d human motion editing

N Athanasiou, A Cseke, M Diomataris… - SIGGRAPH Asia 2024 …, 2024 - dl.acm.org
The focus of this paper is 3D motion editing. Given a 3D human motion and a textual
description of the desired modification, our goal is to generate an edited motion as …

Monkey see, monkey do: Harnessing self-attention in motion diffusion for zero-shot motion transfer

S Raab, I Gat, N Sala, G Tevet… - SIGGRAPH Asia 2024 …, 2024 - dl.acm.org
Given the remarkable results of motion synthesis with diffusion models, a natural question
arises: how can we effectively leverage these models for motion editing? Existing diffusion …

Unimotion: Unifying 3d human motion synthesis and understanding

C Li, J Chibane, Y He, N Pearl, A Geiger… - arXiv preprint arXiv …, 2024 - arxiv.org
We introduce Unimotion, the first unified multi-task human motion model capable of both
flexible motion control and frame-level motion understanding. While existing works control …

CLoSD: Closing the Loop between Simulation and Diffusion for multi-task character control

G Tevet, S Raab, S Cohan, D Reda, Z Luo… - arXiv preprint arXiv …, 2024 - arxiv.org
Motion diffusion models and Reinforcement Learning (RL) based control for physics-based
simulations have complementary strengths for human motion generation. The former is …

Text-controlled Motion Mamba: Text-Instructed Temporal Grounding of Human Motion

X Wang, Z Kang, Y Mu - arXiv preprint arXiv:2404.11375, 2024 - arxiv.org
Human motion understanding is a fundamental task with diverse practical applications,
facilitated by the availability of large-scale motion capture datasets. Recent studies focus on …

Versatile Motion Language Models for Multi-Turn Interactive Agents

J Park, S Choi, S Yun - arXiv preprint arXiv:2410.05628, 2024 - arxiv.org
Recent advancements in large language models (LLMs) have greatly enhanced their ability
to generate natural and contextually relevant text, making AI interactions more human-like …

MotionLLM: Multimodal Motion-Language Learning with Large Language Models

Q Wu, Y Zhao, Y Wang, YW Tai, CK Tang - arXiv preprint arXiv …, 2024 - arxiv.org
Recent advancements in Multimodal Large Language Models (MM-LLMs) have
demonstrated promising potential in generalization and robustness when applied to …

Pay Attention and Move Better: Harnessing Attention for Interactive Motion Generation and Training-free Editing

LH Chen, S Lu, W Dai, Z Dou, X Ju, J Wang… - arXiv preprint arXiv …, 2024 - arxiv.org
This research delves into the problem of interactive editing of human motion generation.
Previous motion diffusion models lack explicit modeling of the word-level text-motion …