A Survey on Self-supervised Learning: Algorithms, Applications, and Future Trends

J Gui, T Chen, J Zhang, Q Cao, Z Sun… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
Deep supervised learning algorithms typically require a large volume of labeled data to
achieve satisfactory performance. However, the process of collecting and labeling such data …

A primer on contrastive pretraining in language processing: Methods, lessons learned, and perspectives

N Rethmeier, I Augenstein - ACM Computing Surveys, 2023 - dl.acm.org
Modern natural language processing (NLP) methods employ self-supervised pretraining
objectives such as masked language modeling to boost the performance of various …

State-of-the-art generalisation research in NLP: a taxonomy and review

D Hupkes, M Giulianelli, V Dankers, M Artetxe… - arXiv preprint arXiv …, 2022 - arxiv.org
The ability to generalise well is one of the primary desiderata of natural language
processing (NLP). Yet, what 'good generalisation' entails and how it should be evaluated is …

Label-specific document representation for multi-label text classification

L Xiao, X Huang, B Chen, L Jing - Proceedings of the 2019 …, 2019 - aclanthology.org
Multi-label text classification (MLTC) aims to tag most relevant labels for the given document.
In this paper, we propose a Label-Specific Attention Network (LSAN) to learn a label-specific …

Few-shot cross-lingual stance detection with sentiment-based pre-training

M Hardalov, A Arora, P Nakov… - Proceedings of the AAAI …, 2022 - ojs.aaai.org
The goal of stance detection is to determine the viewpoint expressed in a piece of text
towards a target. These viewpoints or contexts are often expressed in many different …

NeurJudge: A circumstance-aware neural framework for legal judgment prediction

L Yue, Q Liu, B Jin, H Wu, K Zhang, Y An… - Proceedings of the 44th …, 2021 - dl.acm.org
Legal Judgment Prediction is a fundamental task in legal intelligence of the civil law system,
which aims to automatically predict the judgment results of multiple subtasks, such as …

SWSR: A Chinese dataset and lexicon for online sexism detection

A Jiang, X Yang, Y Liu, A Zubiaga - Online Social Networks and Media, 2022 - Elsevier
Online sexism has become an increasing concern in social media platforms as it has
affected the healthy development of the Internet and can have negative effects in society …

Linking CVE's to MITRE ATT&CK techniques

A Kuppa, L Aouad, NA Le-Khac - … of the 16th International Conference on …, 2021 - dl.acm.org
The MITRE Corporation is a non-profit organization that has made substantial efforts into
creating and maintaining knowledge bases relevant to cybersecurity and has been widely …

BERT-XML: Large scale automated ICD coding using BERT pretraining

Z Zhang, J Liu, N Razavian - arXiv preprint arXiv:2006.03685, 2020 - arxiv.org
Clinical interactions are initially recorded and documented in free text medical notes. ICD
coding is the task of classifying and coding all diagnoses, symptoms and procedures …

AnnoBERT: Effectively representing multiple annotators' label choices to improve hate speech detection

W Yin, V Agarwal, A Jiang, A Zubiaga… - Proceedings of the …, 2023 - ojs.aaai.org
Supervised machine learning approaches often rely on a "ground truth" label. However,
obtaining one label through majority voting ignores the important subjectivity information in …