Vision-and-language navigation: A survey of tasks, methods, and future directions J Gu, E Stefani, Q Wu, J Thomason, XE Wang ACL 2022, 2022 | 128 | 2022 |
Memformer: The memory-augmented transformer Q Wu, Z Lan, J Gu, Z Yu | 69* | 2020 |
ChainCQG: Flow-aware conversational question generation J Gu, M Mirshekari, Z Yu, A Sisto EACL 2021, 2021 | 39 | 2021 |
Photoswap: Personalized subject swapping in images J Gu, Y Wang, N Zhao, TJ Fu, W Xiong, Q Liu, Z Zhang, H Zhang, J Zhang, ... Advances in Neural Information Processing Systems 36, 35202-35217, 2023 | 31 | 2023 |
JARVIS: A neuro-symbolic commonsense reasoning framework for conversational embodied agents K Zheng, K Zhou, J Gu, Y Fan, J Wang, Z Di, X He, XE Wang arXiv preprint arXiv:2208.13266, 2022 | 30 | 2022 |
Muffin or chihuahua? Challenging multimodal large language models with multipanel VQA Y Fan, J Gu, K Zhou, Q Yan, S Jiang, CC Kuo, X Guan, XE Wang arXiv preprint arXiv:2401.15847, 2024 | 20 | 2024 |
A tailored pre-training model for task-oriented dialog generation J Gu, Q Wu, C Wu, W Shi, Z Yu ACL 2021, 2020 | 18* | 2020 |
LLMs assist NLP researchers: Critique paper (meta-)reviewing J Du, Y Wang, W Zhao, Z Deng, S Liu, R Lou, HP Zou, PN Venkit, ... arXiv preprint arXiv:2406.16253, 2024 | 16 | 2024 |
Perception score: A learned metric for open-ended text generation evaluation J Gu, Q Wu, Z Yu Proceedings of the AAAI Conference on Artificial Intelligence 35 (14), 12902 …, 2021 | 14 | 2021 |
SwapAnything: Enabling Arbitrary Object Swapping in Personalized Image Editing J Gu, N Zhao, W Xiong, Q Liu, Z Zhang, H Zhang, J Zhang, HJ Jung, ... European Conference on Computer Vision, 402-418, 2024 | 13 | 2024 |
Data annealing for informal language understanding tasks J Gu, Z Yu ACL 2020 Findings, 2020 | 12 | 2020 |
Temporalbench: Benchmarking fine-grained temporal understanding for multimodal video models M Cai, R Tan, J Zhang, B Zou, K Zhang, F Yao, F Zhu, J Gu, Y Zhong, ... arXiv preprint arXiv:2410.10818, 2024 | 6 | 2024 |
R2H: Building multimodal navigation helpers that respond to help requests Y Fan, J Gu, K Zheng, XE Wang arXiv preprint arXiv:2305.14260, 2023 | 4 | 2023 |
Conquest: Contextual question paraphrasing through answer-aware synthetic question generation M Mirshekari, J Gu, A Sisto Proceedings of the Seventh Workshop on Noisy User-generated Text (W-NUT 2021 …, 2021 | 4 | 2021 |
Temporalbench: Towards fine-grained temporal understanding for multimodal video models M Cai, R Tan, J Zhang, B Zou, K Zhang, F Yao, F Zhu, J Gu, Y Zhong, ... | 3 | 2024 |
VIA: A spatiotemporal video adaptation framework for global and local video editing J Gu, Y Fang, I Skorokhodov, P Wonka, X Du, S Tulyakov, XE Wang arXiv, 2024 | 1 | 2024 |
EditRoom: LLM-parameterized Graph Diffusion for Composable 3D Room Layout Editing K Zheng, X Chen, X He, J Gu, L Li, Z Yang, K Lin, J Wang, L Wang, ... arXiv preprint arXiv:2410.12836, 2024 | | 2024 |
VIA: Unified Spatiotemporal Video Adaptation Framework for Global and Local Video Editing J Gu, Y Fang, I Skorokhodov, P Wonka, X Du, S Tulyakov, XE Wang arXiv preprint arXiv:2406.12831, 2024 | | 2024 |
SlugJARVIS: Multimodal Commonsense Knowledge-based Embodied AI for SimBot Challenge J Gu, K Zheng, K Zhou, Y Fan, X He, J Wang, Z Di, XE Wang Alexa Prize SimBot Challenge Proceedings, 2023 | | 2023 |