T2Ranking: A large-scale Chinese Benchmark for Passage Ranking | JM Xiaohui Xie, Qian Dong, Bingning Wang, Feiyang Lv, Ting Yao, Weinan Gan ... | SIGIR-2023, 2023 | 42* | 2023 |
ToolACE: Enhancing Function Calling with Accuracy, Complexity, and Diversity | W Liu, X Huang, X Zeng, X Hao, S Yu, D Li, S Wang, W Gan, Z Liu, Y Yu, ... | ICLR-2025, 2024 | 11* | 2024 |
GUI agents with foundation models: A comprehensive survey | S Wang, W Liu, J Chen, Y Zhou, W Gan, X Zeng, Y Che, S Yu, X Hao, ... | arXiv preprint arXiv:2411.04890, 2024 | 5 | 2024 |
ACEBench: Who Wins the Match Point in Tool Learning? | C Chen, X Hao, W Liu, X Huang, X Zeng, S Yu, D Li, S Wang, W Gan, ... | arXiv preprint arXiv:2501.12851, 2025 | | 2025 |