Large language models for data annotation: A survey

Z Tan, A Beigi, S Wang, R Guo, A Bhattacharjee… - arXiv preprint arXiv:…, 2024 - arxiv.org
Data annotation is the labeling or tagging of raw data with relevant information, essential for
improving the efficacy of machine learning models. The process, however, is labor-intensive …

Large language models for data annotation and synthesis: A survey

Z Tan, D Li, S Wang, A Beigi, B Jiang… - Proceedings of the …, 2024 - aclanthology.org
Data annotation and synthesis generally refer to the labeling or generation of raw data with
relevant information, which could be used for improving the efficacy of machine learning …

Sparsity-guided holistic explanation for llms with interpretable inference-time intervention

Z Tan, T Chen, Z Zhang, H Liu - … of the AAAI Conference on Artificial …, 2024 - ojs.aaai.org
Abstract Large Language Models (LLMs) have achieved unprecedented breakthroughs in
various natural language processing domains. However, the enigmatic "black-box" nature of …

Disinformation detection: An evolving challenge in the age of llms

B Jiang, Z Tan, A Nirmal, H Liu - Proceedings of the 2024 SIAM International …, 2024 - SIAM
The advent of generative Large Language Models (LLMs) such as ChatGPT has catalyzed
transformative advancements across multiple domains. However, alongside these …

Ceb: Compositional evaluation benchmark for fairness in large language models

S Wang, P Wang, T Zhou, Y Dong, Z Tan… - arXiv preprint arXiv:…, 2024 - arxiv.org
As Large Language Models (LLMs) are increasingly deployed to handle various natural
language processing (NLP) tasks, concerns regarding the potential negative societal …

Exploring large language models for feature selection: A data-centric perspective

D Li, Z Tan, H Liu - ACM SIGKDD Explorations Newsletter, 2025 - dl.acm.org
The rapid advancement of Large Language Models (LLMs) has significantly influenced
various domains, leveraging their exceptional few-shot and zero-shot learning capabilities …

Hide and seek in noise labels: Noise-robust collaborative active learning with LLMs-powered assistance

B Yuan, Y Chen, Y Zhang, W Jiang - Proceedings of the 62nd …, 2024 - aclanthology.org
Learning from noisy labels (LNL) is a challenge that arises in many real-world scenarios
where collected training data can contain incorrect or corrupted labels. Most existing …

Catching chameleons: Detecting evolving disinformation generated using large language models

B Jiang, C Zhao, Z Tan, H Liu - 2024 IEEE 6th International …, 2024 - ieeexplore.ieee.org
Despite recent advancements in detecting disinformation generated by large language
models (LLMs), current efforts overlook the ever-evolving nature of this disinformation. In this …

Towards robust and generalized parameter-efficient fine-tuning for noisy label learning

Y Kim, J Kim, SK Lee - Proceedings of the 62nd Annual Meeting of …, 2024 - aclanthology.org
Parameter-efficient fine-tuning (PEFT) has enabled the efficient optimization of cumbersome
language models in real-world settings. However, as datasets in such environments often …

Constructing Concept-Based Models to Mitigate Spurious Correlations with Minimal Human Effort

J Kim, Z Wang, Q Qiu - European Conference on Computer Vision, 2024 - Springer
Enhancing model interpretability can address spurious correlations by revealing how
models draw their predictions. Concept Bottleneck Models (CBMs) can provide a principled …