Core challenges in embodied vision-language planning

J Francis, N Kitamura, F Labelle, X Lu, I Navarro… - Journal of Artificial …, 2022 - jair.org
Recent advances in the areas of multimodal machine learning and artificial intelligence (AI)
have led to the development of challenging tasks at the intersection of Computer Vision …

Less is more: Generating grounded navigation instructions from landmarks

S Wang, C Montgomery, J Orbay… - Proceedings of the …, 2022 - openaccess.thecvf.com
We study the automatic generation of navigation instructions from 360-degree images
captured on indoor routes. Existing generators suffer from poor visual grounding, causing …

Successfully Guiding Humans with Imperfect Instructions by Highlighting Potential Errors and Suggesting Corrections

L Zhao, K Nguyen, H Daumé III - arxiv preprint arxiv:2402.16973, 2024 - arxiv.org
Language models will inevitably err in situations with which they are unfamiliar. However, by
effectively communicating uncertainties, they can still guide humans toward making sound …