The rise and potential of large language model based agents: A survey

Z **, W Chen, X Guo, W He, Y Ding, B Hong… - Science China …, 2025 - Springer
For a long time, researchers have sought artificial intelligence (AI) that matches or exceeds
human intelligence. AI agents, which are artificial entities capable of sensing the …

Explainable AI: A review of machine learning interpretability methods

P Linardatos, V Papastefanopoulos, S Kotsiantis - Entropy, 2020 - mdpi.com
Recent advances in artificial intelligence (AI) have led to its widespread industrial adoption,
with machine learning systems demonstrating superhuman performance in a significant …

Can large language models be an alternative to human evaluations?

CH Chiang, H Lee - arXiv preprint arXiv:2305.01937, 2023 - arxiv.org
Human evaluation is indispensable and inevitable for assessing the quality of texts
generated by machine learning models or written by humans. However, human evaluation is …

A survey on LLM-generated text detection: Necessity, methods, and future directions

J Wu, S Yang, R Zhan, Y Yuan, LS Chao… - Computational …, 2025 - direct.mit.edu
The remarkable ability of large language models (LLMs) to comprehend, interpret, and
generate complex language has rapidly integrated LLM-generated text into various aspects …

SmoothLLM: Defending large language models against jailbreaking attacks

A Robey, E Wong, H Hassani, GJ Pappas - arXiv preprint arXiv …, 2023 - arxiv.org
Despite efforts to align large language models (LLMs) with human values, widely-used
LLMs such as GPT, Llama, Claude, and PaLM are susceptible to jailbreaking attacks …

MGTBench: Benchmarking machine-generated text detection

X He, X Shen, Z Chen, M Backes, Y Zhang - Proceedings of the 2024 on …, 2024 - dl.acm.org
Nowadays, powerful large language models (LLMs) such as ChatGPT have demonstrated
revolutionary power in a variety of natural language processing (NLP) tasks such as text …

A survey of safety and trustworthiness of large language models through the lens of verification and validation

X Huang, W Ruan, W Huang, G **, Y Dong… - Artificial Intelligence …, 2024 - Springer
Large language models (LLMs) have sparked a new wave of AI interest through their ability to
engage end-users in human-level conversations with detailed and articulate answers across …

TextAttack: A framework for adversarial attacks, data augmentation, and adversarial training in NLP

JX Morris, E Lifland, JY Yoo, J Grigsby, D **… - arXiv preprint arXiv …, 2020 - arxiv.org
While there has been substantial research using adversarial attacks to analyze NLP models,
each attack is implemented in its own code repository. It remains challenging to develop …

BERT-ATTACK: Adversarial attack against BERT using BERT

L Li, R Ma, Q Guo, X Xue, X Qiu - arXiv preprint arXiv:2004.09984, 2020 - arxiv.org
Adversarial attacks on discrete data (such as text) have proven significantly more
challenging than attacks on continuous data (such as images), since it is difficult to generate adversarial …

BAE: BERT-based adversarial examples for text classification

S Garg, G Ramakrishnan - arXiv preprint arXiv:2004.01970, 2020 - arxiv.org
Modern text classification models are susceptible to adversarial examples, perturbed
versions of the original text indiscernible by humans which get misclassified by the model …