Language model tokenizers introduce unfairness between languages

A Petrov, E La Malfa, P Torr… - Advances in neural …, 2023 - proceedings.neurips.cc
Recent language models have shown impressive multilingual performance, even when not
explicitly trained for it. Despite this, there are concerns about the quality of their outputs …

WeLM: A well-read pre-trained language model for Chinese

H Su, X Zhou, H Yu, X Shen, Y Chen, Z Zhu… - arxiv preprint arxiv …, 2022 - arxiv.org
Large Language Models pre-trained with self-supervised learning have demonstrated
impressive zero-shot generalization capabilities on a wide spectrum of tasks. In this work …

InternLM-Law: An open source Chinese legal large language model

Z Fei, S Zhang, X Shen, D Zhu, X Wang, M Cao… - arxiv preprint arxiv …, 2024 - arxiv.org
While large language models (LLMs) have showcased impressive capabilities, they struggle
with addressing legal queries due to the intricate complexities and specialized expertise …

CBAs: Character-level backdoor attacks against Chinese pre-trained language models

X He, F Hao, T Gu, L Chang - ACM Transactions on Privacy and Security, 2024 - dl.acm.org
Pre-trained language models (PLMs) aim to assist computers in various domains to provide
natural and efficient language interaction and text processing capabilities. However, recent …

Comparing explanation faithfulness between multilingual and monolingual fine-tuned language models

Z Zhao, N Aletras - arxiv preprint arxiv:2403.12809, 2024 - arxiv.org
In many real-world natural language processing applications, practitioners not only aim to
maximize predictive performance but also seek faithful explanations for the model …

Enhancing pre-trained language models with Chinese character morphological knowledge

Z Zheng, X Wu, X Liu - Information Processing & Management, 2025 - Elsevier
Pre-trained language models (PLMs) have demonstrated success in Chinese natural
language processing (NLP) tasks by acquiring high-quality representations through …

A comprehensive evaluation of parameter-efficient fine-tuning on software engineering tasks

W Zou, Q Li, J Ge, C Li, X Shen, L Huang… - arxiv preprint arxiv …, 2023 - arxiv.org
Pre-trained models (PTMs) have achieved great success in various Software Engineering
(SE) downstream tasks following the "pre-train then fine-tune" paradigm. As fully fine-tuning …

Self-training improves few-shot learning in legal artificial intelligence tasks

Y Zhou, Y Qin, R Huang, Y Chen, C Lin… - Artificial Intelligence and …, 2024 - Springer
Labeling costs in legal artificial intelligence tasks are high, so training a robust model at
low cost becomes a challenge. In this paper, we propose a …

Enhance robustness of language models against variation attack through graph integration

Z Xiong, L Qing, Y Kang, J Liu, H Li, C Sun… - arxiv preprint arxiv …, 2024 - arxiv.org
The widespread use of pre-trained language models (PLMs) in natural language processing
(NLP) has greatly improved performance outcomes. However, these models' vulnerability to …

PROTECT: Parameter-Efficient Tuning for Few-Shot Robust Chinese Text Correction

X Feng, T Gu, L Chang, X Liu - IEEE/ACM Transactions on …, 2024 - ieeexplore.ieee.org
Non-normative texts and euphemisms are widespread on the Internet, making content
moderation more difficult. These phenomena result from misspelling errors or …