MLLM-as-a-Judge: Assessing Multimodal LLM-as-a-Judge with Vision-Language Benchmark D Chen, R Chen, S Zhang, Y Liu, Y Wang, H Zhou, Q Zhang, P Zhou, ... ICML 2024 Oral, 2024 | 48 | 2024 |
LLM-as-a-Coauthor: Can Mixed Human-Written and Machine-Generated Text Be Detected? C Gao, D Chen, Q Zhang, Y Huang, Y Huang, Z Sun, S Zhang, W Li, Z Fu, ... NAACL 2024 (Findings), 2024 | 32* | 2024 |
GUI-WORLD: A Video Benchmark and Dataset for GUI-oriented Understanding D Chen, Y Huang, S Wu, J Tang, L Chen, Y Bai, Z He, C Wang, H Zhou, ... ICLR 2025, 2024 | 24* | 2024 |
Justice or Prejudice? Quantifying Biases in LLM-as-a-Judge J Ye, Y Wang, Y Huang, D Chen, Q Zhang, N Moniz, T Gao, W Geyer, ... ICLR 2025, 2024 | 16 | 2024 |
UniGen: A Unified Framework for Textual Dataset Generation Using Large Language Models S Wu, Y Huang, C Gao, D Chen, Q Zhang, Y Wan, T Zhou, X Zhang, ... ICLR 2025, 2024 | 12 | 2024 |
ObscurePrompt: Jailbreaking Large Language Models via Obscure Input Y Huang, J Tang, D Chen, B Tang, Y Wan, L Sun, X Zhang arXiv preprint arXiv:2406.13662, 2024 | 10 | 2024 |
HonestLLM: Toward an Honest and Helpful Large Language Model C Gao, Y Huang, S Wu, D Chen, Q Zhang, Z Fu, Y Wan, X Zhang, L Sun NeurIPS 2024, 2024 | 8* | 2024 |
The Impact of Large Language Models in Academia: from Writing to Speaking M Geng, C Chen, Y Wu, Y Wan, P Zhou, D Chen SoLaR Workshop @ NeurIPS 2024, 2024 | 5 | 2024 |
Self-Cognition in Large Language Models: An Exploratory Study D Chen, J Shi, Y Wan, P Zhou, NZ Gong, L Sun Large Language Models and Cognition Workshop @ ICML 2024, 2024 | 5 | 2024 |
Evaluating the validity of word-level adversarial attacks with large language models H Zhou, Z Wang, H Wang, D Chen, W Mu, F Zhang ACL 2024 (Findings), 2024 | 5 | 2024 |
Interleaved Scene Graph for Interleaved Text-and-Image Generation Assessment D Chen, R Chen, S Pu, Z Liu, Y Wu, C Chen, B Liu, Y Huang, Y Wan, ... ICLR 2025, 2024 | 1 | 2024 |
nvAgent: Automated Data Visualization from Natural Language via Collaborative Agent Workflow G Ouyang, J Chen, Z Nie, Y Gui, Y Wan, H Zhang, D Chen arXiv preprint arXiv:2502.05036, 2025 | | 2025 |
REALEDIT: Reddit Edits As a Large-scale Empirical Dataset for Image Transformations P Sushko, A Bharadwaj, ZY Lim, V Ilin, B Caffee, D Chen, M Salehi, ... arXiv preprint arXiv:2502.03629, 2025 | | 2025 |
Perception Tokens Enhance Visual Reasoning in Multimodal Language Models M Bigverdi, Z Luo, CY Hsieh, E Shen, D Chen, LG Shapiro, R Krishna arXiv preprint arXiv:2412.03548, 2024 | | 2024 |
WebCode2M: A Real-World Dataset for Code Generation from Webpage Designs Y Gui, Z Li, Y Wan, Y Shi, H Zhang, Y Su, B Chen, D Chen, S Wu, X Zhou, ... The Web Conference 2025 | | |
UICopilot: Automating UI Synthesis via Hierarchical Code Generation from Webpage Designs Y Gui, Y Wan, Z Li, Z Zhang, D Chen, H Zhang, Y Su, B Chen, X Zhou, ... The Web Conference 2025 | | |