From generation to judgment: Opportunities and challenges of LLM-as-a-judge

D Li, B Jiang, L Huang, A Beigi, C Zhao, Z Tan… - arXiv preprint arXiv …, 2024 - arxiv.org
Assessment and evaluation have long been critical challenges in artificial intelligence (AI)
and natural language processing (NLP). However, traditional methods, whether matching …

Generalist virtual agents: A survey on autonomous agents across digital platforms

M Gao, W Bu, B Miao, Y Wu, Y Li, J Li, S Tang… - arXiv preprint arXiv …, 2024 - arxiv.org
In this paper, we introduce the Generalist Virtual Agent (GVA), an autonomous entity
engineered to function across diverse digital platforms and environments, assisting users by …

A Survey on Multi-Generative Agent System: Recent Advances and New Frontiers

S Chen, Y Liu, W Han, W Zhang, T Liu - arXiv preprint arXiv:2412.17481, 2024 - arxiv.org
Multi-generative agent systems (MGASs) have become a research hotspot since the rise of
large language models (LLMs). However, with the continuous influx of new related works …