UrbanCLIP: Learning Text-enhanced Urban Region Profiling with Contrastive Language-Image Pretraining from the Web Y Yan, H Wen, S Zhong, W Chen, H Chen, Q Wen, R Zimmermann, ... ACM The Web Conference (WWW 2024 Oral), 2023 | 54* | 2023 |
Deep Learning for Cross-Domain Data Fusion in Urban Computing: Taxonomy, Advances, and Outlook X Zou*, Y Yan*, X Hao, Y Hu, H Wen, E Liu, J Zhang, Y Li, T Li, Y Zheng, ... Information Fusion (IF=15), 2024 | 28 | 2024 |
UrbanVLP: A Multi-Granularity Vision-Language Pre-Trained Foundation Model for Urban Indicator Prediction X Hao, W Chen, Y Yan, S Zhong, K Wang, Q Wen, Y Liang Annual AAAI Conference on Artificial Intelligence (AAAI 2025, Special Track …, 2024 | 14 | 2024 |
Explainable and interpretable multimodal large language models: A comprehensive survey Y Dang, K Huang, J Huo, Y Yan, S Huang, D Liu, M Gao, J Zhang, C Qian, ... arXiv preprint arXiv:2412.02104, 2024 | 11 | 2024 |
MMNeuron: Discovering Neuron-Level Domain-Specific Interpretation in Multimodal Large Language Model J Huo, Y Yan, B Hu, Y Yue, X Hu Conference on Empirical Methods in Natural Language Processing (EMNLP 2024 Main), 2024 | 11 | 2024 |
Reefknot: A comprehensive benchmark for relation hallucination evaluation, analysis and mitigation in multimodal large language models K Zheng, J Chen, Y Yan, X Zou, X Hu arXiv preprint arXiv:2408.09429, 2024 | 10 | 2024 |
ErrorRadar: Benchmarking complex mathematical reasoning of multimodal large language models via error detection Y Yan, S Wang, J Huo, H Li, B Li, J Su, X Gao, YF Zhang, T Xu, Z Chu, ... arXiv preprint arXiv:2410.04509, 2024 | 9* | 2024 |
Look Twice Before You Answer: Memory-Space Visual Retracing for Hallucination Mitigation in Multimodal Large Language Models X Zou, Y Wang, Y Yan, S Huang, K Zheng, J Chen, C Tang, X Hu arXiv preprint arXiv:2410.03577, 2024 | 7 | 2024 |
GeoReasoner: Reasoning On Geospatially Grounded Context For Natural Language Understanding Y Yan, J Lee Conference on Information and Knowledge Management (CIKM 2024, Best Short …, 2024 | 7 | 2024 |
Mitigating modality prior-induced hallucinations in multimodal large language models via deciphering attention causality G Zhou, Y Yan, X Zou, K Wang, A Liu, X Hu International Conference on Learning Representations (ICLR 2025), 2024 | 6 | 2024 |
MINER: Mining the underlying pattern of modality-specific neurons in multimodal large language models K Huang, J Huo, Y Yan, K Wang, Y Yue, X Hu arXiv preprint arXiv:2410.04819, 2024 | 5 | 2024 |
Position: Multimodal Large Language Models Can Significantly Advance Scientific Reasoning Y Yan, S Wang, J Huo, J Ye, Z Chu, X Hu, PS Yu, C Gomes, B Selman, ... arXiv preprint arXiv:2502.02871, 2025 | 4 | 2025 |
UrbanCross: Enhancing Satellite Image-Text Retrieval with Cross-Domain Adaptation S Zhong, X Hao, Y Yan, Y Zhang, Y Song, Y Liang ACM International Conference on Multimedia (MM 2024), 2024 | 4 | 2024 |
Position: LLMs Can be Good Tutors in Foreign Language Education J Ye, S Wang, D Zou, Y Yan, K Wang, HT Zheng, Z Xu, I King, PS Yu, ... arXiv preprint arXiv:2502.05467, 2025 | 3 | 2025 |
Exploring response uncertainty in MLLMs: An empirical evaluation under misleading scenarios Y Dang, M Gao, Y Yan, X Zou, Y Gu, A Liu, X Hu arXiv preprint arXiv:2411.02708, 2024 | 3 | 2024 |
FastMem: Fast Memorization of Prompt Improves Context Awareness of Large Language Models J Zhu, S Liu, Y Yu, B Tang, Y Yan, Z Li, F Xiong, T Xu, MB Blaschko Conference on Empirical Methods in Natural Language Processing (EMNLP 2024 …, 2024 | 3 | 2024 |
Learning geospatial region embedding with heterogeneous graph X Zou, J Huang, X Hao, Y Yang, H Wen, Y Yan, C Huang, Y Liang arXiv preprint arXiv:2405.14135, 2024 | 3 | 2024 |
Learning Robust Anymodal Segmentor with Unimodal and Cross-modal Distillation X Zheng, H Xue, J Chen, Y Yan, L Jiang, Y Lyu, K Yang, L Zhang, X Hu arXiv preprint arXiv:2411.17141, 2024 | 1 | 2024 |
SAFEERASER: Enhancing Safety in Multimodal Large Language Models through Multimodal Machine Unlearning J Chen, Z Deng, K Zheng, Y Yan, S Liu, PJ Wu, P Jiang, J Liu, X Hu arXiv preprint arXiv:2502.12520, 2025 | | 2025 |
EssayJudge: A Multi-Granular Benchmark for Assessing Automated Essay Scoring Capabilities of Multimodal Large Language Models J Su*, Y Yan*, F Fu, H Zhang, J Ye, X Liu, J Huo, H Zhou, X Hu arXiv preprint arXiv:2502.11916, 2025 | | 2025 |