The emerging trends of multi-label learning

W Liu, H Wang, X Shen… - IEEE transactions on …, 2021 - ieeexplore.ieee.org
Exabytes of data are generated daily by humans, leading to the growing needs for new
efforts in dealing with the grand challenges for multi-label learning brought by big data. For …

Few-shot learning for medical text: A review of advances, trends, and opportunities

Y Ge, Y Guo, S Das, MA Al-Garadi, A Sarker - Journal of Biomedical …, 2023 - Elsevier
Background: Few-shot learning (FSL) is a class of machine learning methods that require
small numbers of labeled instances for training. With many medical topics having limited …

A survey on text classification: From traditional to deep learning

Q Li, H Peng, J Li, C Xia, R Yang, L Sun… - ACM Transactions on …, 2022 - dl.acm.org
Text classification is the most fundamental and essential task in natural language
processing. The last decade has seen a surge of research in this area due to the …

LEGAL-BERT: The muppets straight out of law school

I Chalkidis, M Fergadiotis, P Malakasiotis… - arXiv preprint arXiv …, 2020 - arxiv.org
BERT has achieved impressive performance in several NLP tasks. However, there has been
limited investigation on its adaptation guidelines in specialised domains. Here we focus on …

LexGLUE: A benchmark dataset for legal language understanding in English

I Chalkidis, A Jana, D Hartung, M Bommarito… - arXiv preprint arXiv …, 2021 - arxiv.org
Laws and their interpretations, legal arguments and agreements are typically expressed in
writing, leading to the production of vast corpora of legal text. Their analysis, which is at the …

How does NLP benefit legal system: A summary of legal artificial intelligence

H Zhong, C Xiao, C Tu, T Zhang, Z Liu… - arXiv preprint arXiv …, 2020 - arxiv.org
Legal Artificial Intelligence (LegalAI) focuses on applying the technology of artificial
intelligence, especially natural language processing, to benefit tasks in the legal domain. In …

Pile of law: Learning responsible data filtering from the law and a 256GB open-source legal dataset

P Henderson, M Krass, L Zheng… - Advances in …, 2022 - proceedings.neurips.cc
One concern with the rise of large language models lies with their potential for significant
harm, particularly from pretraining on biased, obscene, copyrighted, and private information …

CUAD: an expert-annotated NLP dataset for legal contract review

D Hendrycks, C Burns, A Chen, S Ball - arXiv preprint arXiv:2103.06268, 2021 - arxiv.org
Many specialized domains remain untouched by deep learning, as large labeled datasets
require expensive expert annotators. We address this bottleneck within the legal domain by …

Privacy risks of general-purpose language models

X Pan, M Zhang, S Ji, M Yang - 2020 IEEE Symposium on …, 2020 - ieeexplore.ieee.org
Recently, a new paradigm of building general-purpose language models (e.g., Google's BERT
and OpenAI's GPT-2) in Natural Language Processing (NLP) for text feature extraction, a …

A survey on text classification: From shallow to deep learning

Q Li, H Peng, J Li, C Xia, R Yang, L Sun, PS Yu… - arXiv preprint arXiv …, 2020 - arxiv.org
Text classification is the most fundamental and essential task in natural language
processing. The last decade has seen a surge of research in this area due to the …