NegotiationToM: A Benchmark for Stress-testing Machine Theory of Mind on Negotiation Surrounding C Chan, C Jiayang, Y Yim, Z Deng, W Fan, H Li, X Liu, H Zhang, W Wang, ... arXiv preprint arXiv:2404.13627, 2024 | 10 | 2024 |
Text-Tuple-Table: Towards Information Integration in Text-to-Table Generation via Global Tuple Extraction Z Deng, C Chan, W Wang, Y Sun, W Fan, T Zheng, Y Yim, Y Song arXiv preprint arXiv:2404.14215, 2024 | 8* | 2024 |
Evaluating and enhancing LLMs agent based on theory of mind in Guandan: A multi-player cooperative game under imperfect information Y Yim, C Chan, T Shi, Z Deng, W Fan, T Zheng, Y Song arXiv preprint arXiv:2408.02559, 2024 | 4 | 2024 |
ActPlan-1K: Benchmarking the procedural planning ability of visual language models in household activities Y Su, Z Ling, H Shi, J Cheng, Y Yim, Y Song arXiv preprint arXiv:2410.03907, 2024 | 1 | 2024 |
CLR-Fact: Evaluating the complex logical reasoning capability of large language models over factual knowledge T Zheng, J Bai, Y Wang, T Fang, Y Guo, Y Yim, Y Song arXiv preprint arXiv:2407.20564, 2024 | 1 | 2024 |
Audience Persona Knowledge-Aligned Prompt Tuning Method for Online Debate C Chan, J Cheng, X Liu, Y Yim, Y Jiang, Z Deng, H Li, Y Song, GY Wong, ... ECAI, 2024 | 1 | 2024 |
Distilling Multi-Step Reasoning Capabilities into Smaller Language Model Y Yim, Z Wang Proceedings of the 2024 16th International Conference on Machine Learning …, 2024 | | 2024 |