Instruction-guided editing controls for images and multimedia: A survey in LLM era

TT Nguyen, Z Ren, T Pham, PL Nguyen, H Yin… - arXiv preprint arXiv …, 2024 - arxiv.org
The rapid advancement of large language models (LLMs) and multimodal learning has
transformed digital content creation and manipulation. Traditional visual editing tools require …

ParSEL: Parameterized shape editing with language

A Ganeshan, R Huang, X Xu, RK Jones… - ACM Transactions on …, 2024 - dl.acm.org
The ability to edit 3D assets with natural language presents a compelling paradigm to aid in
the democratization of 3D content creation. However, while natural language is often …

GRS: Generating Robotic Simulation Tasks from Real-World Images

A Zook, FY Sun, J Spjut, V Blukis, S Birchfield… - arXiv preprint arXiv …, 2024 - arxiv.org
We introduce GRS (Generating Robotic Simulation tasks), a novel system to address the
challenge of real-to-sim in robotics, computer vision, and AR/VR. GRS enables the creation …

Chat2SVG: Vector Graphics Generation with Large Language Models and Image Diffusion Models

R Wu, W Su, J Liao - arXiv preprint arXiv:2411.16602, 2024 - arxiv.org
Scalable Vector Graphics (SVG) has become the de facto standard for vector graphics in
digital design, offering resolution independence and precise control over individual …

DI-PCG: Diffusion-based Efficient Inverse Procedural Content Generation for High-quality 3D Asset Creation

W Zhao, YP Cao, J Xu, Y Dong, Y Shan - arXiv preprint arXiv:2412.15200, 2024 - arxiv.org
Procedural Content Generation (PCG) is powerful in creating high-quality 3D contents, yet
controlling it to produce desired shapes is difficult and often requires extensive parameter …