Robot learning in the era of foundation models: A survey

Y Zheng, X Chen, Y Zheng, S Gu… - IEEE Robotics and …, 2024 - ieeexplore.ieee.org
Constructing a 3D scene capable of accommodating open-ended language queries is a
pivotal pursuit in the domain of robotics, which facilitates robots in executing object …

MTMamba: Enhancing multi-task dense scene understanding by Mamba-based decoders

B Lin, W Jiang, P Chen, Y Zhang, S Liu… - European Conference on …, 2024 - Springer
Multi-task dense scene understanding, which learns a model for multiple dense prediction
tasks, has a wide range of application scenarios. Modeling long-range dependency and …

FMGS: Foundation model embedded 3D Gaussian splatting for holistic 3D scene understanding

X Zuo, P Samangouei, Y Zhou, Y Di, M Li - International Journal of …, 2024 - Springer
Precisely perceiving the geometric and semantic properties of real-world 3D objects is
crucial for the continued evolution of augmented reality and robotic applications. To this end …

Semantically-aware neural radiance fields for visual scene understanding: A comprehensive review

TAQ Nguyen, A Bourki, M Macudzinski… - arXiv preprint arXiv …, 2024 - arxiv.org
This review thoroughly examines the role of semantically-aware Neural Radiance Fields
(NeRFs) in visual scene understanding, covering an analysis of over 250 scholarly papers. It …

NeRF-MAE: Masked autoencoders for self-supervised 3D representation learning for neural radiance fields

MZ Irshad, S Zakharov, V Guizilini, A Gaidon… - … on Computer Vision, 2024 - Springer
Neural fields excel in computer vision and robotics due to their ability to understand the 3D
visual world such as inferring semantics, geometry, and dynamics. Given the capabilities of …