AD-LLM: Benchmarking Large Language Models for Anomaly Detection

T Yang, Y Nian, S Li, R Xu, Y Li, J Li, Z Xiao… - arXiv preprint arXiv …, 2024 - arxiv.org
Anomaly detection (AD) is an important machine learning task with many real-world uses,
including fraud detection, medical diagnosis, and industrial monitoring. Within natural …

A Large-scale Empirical Study on Large Language Models for Election Prediction

C Yu, Z Weng, Y Li, Z Li, X Hu, Y Zhao - arXiv preprint arXiv:2412.15291, 2024 - arxiv.org
Can Large Language Models (LLMs) accurately predict election outcomes? While LLMs
have demonstrated impressive performance in healthcare, legal analysis, and creative …

On the Trustworthiness of Generative Foundation Models: Guideline, Assessment, and Perspective

Y Huang, C Gao, S Wu, H Wang, X Wang… - arXiv preprint arXiv …, 2025 - arxiv.org
Generative Foundation Models (GenFMs) have emerged as transformative tools. However,
their widespread adoption raises critical concerns regarding trustworthiness across …

Benchmarking LLMs for Political Science: A United Nations Perspective

Y Liang, L Yang, C Wang, C Xia, R Meng, X Xu… - arXiv preprint arXiv …, 2025 - arxiv.org
Large Language Models (LLMs) have achieved significant advances in natural language
processing, yet their potential for high-stake political decision-making remains largely …