AMMUS: A survey of transformer-based pretrained models in natural language processing

KS Kalyan, A Rajasekharan, S Sangeetha - arXiv preprint arXiv …, 2021 - arxiv.org
Transformer-based pretrained language models (T-PTLMs) have achieved great success in
almost every NLP task. The evolution of these models started with GPT and BERT. These …

Large language models for cyber security: A systematic literature review

HX Xu, SA Wang, N Li, K Wang, Y Zhao, K Chen… - arXiv preprint arXiv …, 2024 - arxiv.org
The rapid advancement of Large Language Models (LLMs) has opened up new
opportunities for leveraging artificial intelligence in various domains, including cybersecurity …

Artificial intelligence for the metaverse: A survey

T Huynh-The, QV Pham, XQ Pham, TT Nguyen… - … Applications of Artificial …, 2023 - Elsevier
Alongside the massive growth of the Internet since the 1990s, various innovative
technologies have been created to bring users breathtaking experiences with more virtual …

Charformer: Fast character transformers via gradient-based subword tokenization

Y Tay, VQ Tran, S Ruder, J Gupta, HW Chung… - arXiv preprint arXiv …, 2021 - arxiv.org
State-of-the-art models in natural language processing rely on separate rigid subword
tokenization algorithms, which limit their generalization ability and adaptation to new …

Between words and characters: A brief history of open-vocabulary modeling and tokenization in NLP

SJ Mielke, Z Alyafeai, E Salesky, C Raffel… - arXiv preprint arXiv …, 2021 - arxiv.org
What are the units of text that we want to model? From bytes to multi-word expressions, text
can be analyzed and generated at many granularities. Until recently, most natural language …

Better robustness by more coverage: Adversarial training with mixup augmentation for robust fine-tuning

C Si, Z Zhang, F Qi, Z Liu, Y Wang, Q Liu… - arXiv preprint arXiv …, 2020 - arxiv.org
Pretrained language models (PLMs) perform poorly under adversarial attacks. To improve
the adversarial robustness, adversarial data augmentation (ADA) has been widely adopted …

Analogy generation by prompting large language models: A case study of InstructGPT

B Bhavya, J Xiong, CX Zhai - arXiv preprint arXiv:2210.04186, 2022 - arxiv.org
We propose a novel application of prompting Pre-trained Language Models (PLMs) to
generate analogies and study how to design effective prompts for two task settings …

Bridging the gap between indexing and retrieval for differentiable search index with query generation

S Zhuang, H Ren, L Shou, J Pei, M Gong… - arXiv preprint arXiv …, 2022 - arxiv.org
The Differentiable Search Index (DSI) is an emerging paradigm for information retrieval.
Unlike traditional retrieval architectures where index and retrieval are two different and …

PMANet: Malicious URL detection via post-trained language model guided multi-level feature attention network

R Liu, Y Wang, H Xu, Z Qin, F Zhang, Y Liu, Z Cao - Information Fusion, 2025 - Elsevier
The expansion of the Internet has led to the widespread proliferation of malicious URLs,
which have become a primary vector for cyber threats. Detecting malicious URLs is now essential for …

Square one bias in NLP: Towards a multi-dimensional exploration of the research manifold

S Ruder, I Vulić, A Søgaard - arXiv preprint arXiv:2206.09755, 2022 - arxiv.org
The prototypical NLP experiment trains a standard architecture on labeled English data and
optimizes for accuracy, without accounting for other dimensions such as fairness …