The rise and potential of large language model based agents: A survey

Z **, W Chen, X Guo, W He, Y Ding, B Hong… - Science China …, 2025 - Springer
For a long time, researchers have sought artificial intelligence (AI) that matches or exceeds
human intelligence. AI agents, which are artificial entities capable of sensing the …

Explainable AI: A review of machine learning interpretability methods

P Linardatos, V Papastefanopoulos, S Kotsiantis - Entropy, 2020 - mdpi.com
Recent advances in artificial intelligence (AI) have led to its widespread industrial adoption,
with machine learning systems demonstrating superhuman performance in a significant …

Can large language models be an alternative to human evaluations?

CH Chiang, H Lee - arXiv preprint arXiv:2305.01937, 2023 - arxiv.org
Human evaluation is indispensable and inevitable for assessing the quality of texts
generated by machine learning models or written by humans. However, human evaluation is …

A survey on LLM-generated text detection: Necessity, methods, and future directions

J Wu, S Yang, R Zhan, Y Yuan, LS Chao… - Computational …, 2025 - direct.mit.edu
The remarkable ability of large language models (LLMs) to comprehend, interpret, and
generate complex language has rapidly integrated LLM-generated text into various aspects …

SmoothLLM: Defending large language models against jailbreaking attacks

A Robey, E Wong, H Hassani, GJ Pappas - arXiv preprint arXiv …, 2023 - arxiv.org
Despite efforts to align large language models (LLMs) with human values, widely-used
LLMs such as GPT, Llama, Claude, and PaLM are susceptible to jailbreaking attacks …

MGTBench: Benchmarking machine-generated text detection

X He, X Shen, Z Chen, M Backes, Y Zhang - Proceedings of the 2024 on …, 2024 - dl.acm.org
Nowadays, powerful large language models (LLMs) such as ChatGPT have demonstrated
revolutionary power in a variety of natural language processing (NLP) tasks such as text …

A survey of safety and trustworthiness of large language models through the lens of verification and validation

X Huang, W Ruan, W Huang, G **, Y Dong… - Artificial Intelligence …, 2024 - Springer
Large language models (LLMs) have sparked a new wave of AI interest through their ability to
engage end-users in human-level conversations with detailed and articulate answers across …

TextAttack: A framework for adversarial attacks, data augmentation, and adversarial training in NLP

JX Morris, E Lifland, JY Yoo, J Grigsby, D **… - arXiv preprint arXiv …, 2020 - arxiv.org
While there has been substantial research using adversarial attacks to analyze NLP models,
each attack is implemented in its own code repository. It remains challenging to develop …

BERT-ATTACK: Adversarial attack against BERT using BERT

L Li, R Ma, Q Guo, X Xue, X Qiu - arXiv preprint arXiv:2004.09984, 2020 - arxiv.org
Adversarial attacks on discrete data (such as text) have proven significantly more
challenging than attacks on continuous data (such as images), since it is difficult to generate adversarial …

BAE: BERT-based adversarial examples for text classification

S Garg, G Ramakrishnan - arXiv preprint arXiv:2004.01970, 2020 - arxiv.org
Modern text classification models are susceptible to adversarial examples, perturbed
versions of the original text indiscernible by humans which get misclassified by the model …