CREMA: Multimodal compositional video reasoning via efficient modular adaptation and fusion

S Yu, J Yoon, M Bansal - arXiv preprint arXiv:2402.05889, 2024 - southnlp.github.io
Despite impressive advancements in multimodal compositional reasoning approaches, they
are still limited in their flexibility and efficiency by processing fixed modality inputs while …

Comprehensive review on 3D point cloud segmentation in plants

H Song, W Wen, S Wu, X Guo - Artificial Intelligence in Agriculture, 2025 - Elsevier
Segmentation of three-dimensional (3D) point clouds is fundamental in comprehending
unstructured structural and morphological data. It plays a critical role in research related to …

Point-to-pixel prompting for point cloud analysis with pre-trained image models

Z Wang, Y Rao, X Yu, J Zhou… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
Nowadays, pre-training big models on large-scale datasets has achieved great success and
dominated many downstream tasks in natural language processing and 2D vision, while pre …

RGB-D Cube R-CNN: 3D Object Detection with Selective Modality Dropout

J Piekenbrinck, A Hermans… - Proceedings of the …, 2024 - openaccess.thecvf.com
In this paper we create an RGB-D 3D object detector targeted at indoor robotics use cases
where one modality may be unavailable due to a specific sensor setup or a sensor failure …

Weakly Supervised Point Cloud Semantic Segmentation via Artificial Oracle

H Kweon, J Kim, KJ Yoon - … of the IEEE/CVF Conference on …, 2024 - openaccess.thecvf.com
Manual annotation of every point in a point cloud is a costly and labor-intensive process.
While weakly supervised point cloud semantic segmentation (WSPCSS) with sparse …

Masked Image Modeling: A Survey

V Hondru, FA Croitoru, S Minaee, RT Ionescu… - arXiv preprint arXiv …, 2024 - arxiv.org
In this work, we survey recent studies on masked image modeling (MIM), an approach that
emerged as a powerful self-supervised learning technique in computer vision. The MIM task …

Infrastructure 3D target detection based on multi-mode fusion for intelligent and connected vehicles

X Zhang, L He, R Lv, C **, Y Wang - IEEE Access, 2023 - ieeexplore.ieee.org
Autonomous driving technology faces significant safety challenges due to the lack of a
global perspective and the limitations of long-range perception capabilities. It is widely …