3d gaussian splatting: Survey, technologies, challenges, and opportunities

Y Bao, T Ding, J Huo, Y Liu, Y Li, W Li… - IEEE Transactions on …, 2025 - ieeexplore.ieee.org
3D Gaussian Splatting (3DGS) has emerged as a prominent technique with the potential to
become a mainstream method for 3D representations. It can effectively transform multi-view …

Viewcrafter: Taming video diffusion models for high-fidelity novel view synthesis

W Yu, J Xing, L Yuan, W Hu, X Li, Z Huang… - arxiv preprint arxiv …, 2024 - arxiv.org
Despite recent advancements in neural 3D reconstruction, the dependence on dense multi-
view captures restricts their broader applicability. In this work, we propose …

Vd3d: Taming large video diffusion transformers for 3d camera control

S Bahmani, I Skorokhodov, A Siarohin… - arxiv preprint arxiv …, 2024 - arxiv.org
Modern text-to-video synthesis models demonstrate coherent, photorealistic generation of
complex videos from a text description. However, most existing models lack fine-grained …

Efficient diffusion models: A comprehensive survey from principles to practices

Z Ma, Y Zhang, G Jia, L Zhao, Y Ma, M Ma… - arxiv preprint arxiv …, 2024 - arxiv.org
As one of the most popular and sought-after generative models in the recent years, diffusion
models have sparked the interests of many researchers and steadily shown excellent …

Images that sound: Composing images and sounds on a single canvas

Z Chen, D Geng, A Owens - Advances in Neural …, 2025 - proceedings.neurips.cc
Spectrograms are 2D representations of sound that look very different from the images found
in our visual world. And natural images, when played as spectrograms, make unnatural …

Reconx: Reconstruct any scene from sparse views with video diffusion model

F Liu, W Sun, H Wang, Y Wang, H Sun, J Ye… - arxiv preprint arxiv …, 2024 - arxiv.org
Advancements in 3D scene reconstruction have transformed 2D images from the real world
into 3D models, producing realistic 3D results from hundreds of input photos. Despite great …

No pose, no problem: Surprisingly simple 3d gaussian splats from sparse unposed images

B Ye, S Liu, H Xu, X Li, M Pollefeys, MH Yang… - arxiv preprint arxiv …, 2024 - arxiv.org
We introduce NoPoSplat, a feed-forward model capable of reconstructing 3D scenes
parameterized by 3D Gaussians from unposed sparse multi-view images. Our model …

Gaussianobject: High-quality 3d object reconstruction from four views with gaussian splatting

C Yang, S Li, J Fang, R Liang, L Xie, X Zhang… - ACM Transactions on …, 2024 - dl.acm.org
Reconstructing and rendering 3D objects from highly sparse views is of critical importance
for promoting applications of 3D vision techniques and improving user experience …

Drivedreamer4d: World models are effective data machines for 4d driving scene representation

G Zhao, C Ni, X Wang, Z Zhu, X Zhang, Y Wang… - arxiv preprint arxiv …, 2024 - arxiv.org
Closed-loop simulation is essential for advancing end-to-end autonomous driving systems.
Contemporary sensor simulation methods, such as NeRF and 3DGS, rely predominantly on …

Emergence of hidden capabilities: Exploring learning dynamics in concept space

CF Park, M Okawa, A Lee… - The Thirty-eighth …, 2024 - proceedings.neurips.cc
Modern generative models demonstrate impressive capabilities, likely stemming from an
ability to identify and manipulate abstract concepts underlying their training data. However …