- Academic Search

Y Liu, W Chen, Y Bai, X Liang, G Li, W Gao… - arXiv

Gaussian Grouping: Segment and edit anything in 3d scenes

M Ye, M Danelljan, F Yu, L Ke - European Conference on Computer …, 2024 - Springer

Abstract The recent Gaussian Splatting achieves high-quality and real-time novel-view
synthesis of the 3D scenes. However, it is solely concentrated on the appearance and …

Cited by 110

[PDF] ieee.org

Transformer-based visual segmentation: A survey

X Li, H Ding, H Yuan, W Zhang, J Pang… - … on Pattern Analysis …, 2024 - ieeexplore.ieee.org

Visual segmentation seeks to partition images, video frames, or point clouds into multiple
segments or groups. This technique has numerous real-world applications, such as …

Cited by 119

[PDF] thecvf.com

3d-vista: Pre-trained transformer for 3d vision and text alignment

Z Zhu, X Ma, Y Chen, Z Deng… - Proceedings of the …, 2023 - openaccess.thecvf.com

Abstract 3D vision-language grounding (3D-VL) is an emerging field that aims to connect the
3D physical world with natural language, which is crucial for achieving embodied …

Cited by 99

[PDF] thecvf.com

Nerflets: Local radiance fields for efficient structure-aware 3d scene representation from 2d supervision

X Zhang, A Kundu, T Funkhouser… - Proceedings of the …, 2023 - openaccess.thecvf.com

We address efficient and structure-aware 3D scene representation from images. Nerflets are
our key contribution--a set of local neural radiance fields that together represent a scene …

Cited by 49

[PDF] arxiv.org

SceneVerse: Scaling 3D Vision-Language Learning for Grounded Scene Understanding

B Jia, Y Chen, H Yu, Y Wang, X Niu, T Liu, Q Li… - … on Computer Vision, 2024 - Springer

Abstract 3D vision-language (3D-VL) grounding, which aims to align language with 3D
physical environments, stands as a cornerstone in developing embodied agents. In …

Cited by 40

[PDF] arxiv.org

A survey on open-vocabulary detection and segmentation: Past, present, and future

C Zhu, L Chen - IEEE Transactions on Pattern Analysis and …, 2024 - ieeexplore.ieee.org

As the most fundamental scene understanding tasks, object detection and segmentation
have made tremendous progress in deep learning era. Due to the expensive manual …

Cited by 24

Interactive medical image annotation using improved Attention U-net with compound geodesic distance

Y Zhang, J Chen, X Ma, G Wang, UA Bhatti… - Expert systems with …, 2024 - Elsevier

Accurate and massive medical image annotation data is crucial for diagnosis, surgical
planning, and deep learning in the development of medical images. However, creating large …

Cited by 58

[PDF] thecvf.com

Mask-attention-free transformer for 3d instance segmentation

X Lai, Y Yuan, R Chu, Y Chen… - Proceedings of the …, 2023 - openaccess.thecvf.com

Recently, transformer-based methods have dominated 3D instance segmentation, where
mask attention is commonly involved. Specifically, object queries are guided by the initial …

Cited by 30

[PDF] thecvf.com

Human-centric scene understanding for 3d large-scale scenarios

Y Xu, P Cong, Y Yao, R Chen, Y Hou… - Proceedings of the …, 2023 - openaccess.thecvf.com

Human-centric scene understanding is significant for real-world applications, but it is
extremely challenging due to the existence of diverse human poses and actions, complex …

Cited by 23

Mask3d: Mask transformer for 3d semantic instance segmentation

Aligning cyber space with physical world: A comprehensive survey on embodied ai