Generalized decoding for pixel, image, and language

X Zou, ZY Dou, J Yang, Z Gan, L Li… - Proceedings of the …, 2023 - openaccess.thecvf.com
We present X-Decoder, a generalized decoding model that can predict pixel-level
segmentation and language tokens seamlessly. X-Decoder takes as input two types of …

Multimodal foundation models: From specialists to general-purpose assistants

C Li, Z Gan, Z Yang, J Yang, L Li… - … and Trends® in …, 2024 - nowpublishers.com
Neural compression is the application of neural networks and other machine learning
methods to data compression. Recent advances in statistical machine learning have opened …

Open-Vocabulary Instance Segmentation-Boundary IS-Goal

Q Tang - Chinese Conference on Pattern Recognition and …, 2024 - Springer
Accurate delineation of boundaries and instance semantics is crucial for tasks like object
localization in robotic arm grasping, and vehicle and pedestrian detection in autonomous …

Window to Wall Ratio Detection using SegFormer

Z De Simone, S Biswas, O Wu - arXiv preprint arXiv:2406.02706, 2024 - arxiv.org
Window to Wall Ratios (WWR) are key to assessing the energy, daylight and ventilation
performance of buildings. Studies have shown that window area has a large impact on …