D Yang, E Kleinman, C Harteveld - arXiv preprint arXiv:2411.00308, 2024 - arxiv.org
Due to GPT's impressive generative capabilities, its applications in games are expanding
rapidly. To offer researchers a comprehensive understanding of the current applications and …

Codenames as a Benchmark for Large Language Models

M Stephenson, M Sidji, B Ronval - arXiv preprint arXiv:2412.11373, 2024 - arxiv.org
In this paper, we propose the use of the popular word-based board game Codenames as a
suitable benchmark for evaluating the reasoning capabilities of Large Language Models …