Improving named entity recognition by external context retrieving and cooperative learning

X Wang, Y Jiang, N Bach, T Wang, Z Huang… - arXiv preprint arXiv …, 2021 - arxiv.org
Recent advances in Named Entity Recognition (NER) show that document-level contexts
can significantly improve model performance. In many application scenarios, however, such …

Automated concatenation of embeddings for structured prediction

X Wang, Y Jiang, N Bach, T Wang, Z Huang… - arXiv preprint arXiv …, 2020 - arxiv.org
Pretrained contextualized embeddings are powerful word representations for structured
prediction tasks. Recent work found that better word representations can be obtained by …

A primer on pretrained multilingual language models

S Doddapaneni, G Ramesh, MM Khapra… - arXiv preprint arXiv …, 2021 - arxiv.org
Multilingual Language Models (MLLMs) such as mBERT, XLM, XLM-R, etc. have
emerged as a viable option for bringing the power of pretraining to a large number of …

Few-shot class-incremental learning for named entity recognition

R Wang, T Yu, H Zhao, S Kim, S Mitra… - Proceedings of the …, 2022 - aclanthology.org
Previous work on class-incremental learning for Named Entity Recognition (NER) relies on
the assumption that there exists an abundance of labeled data for the training of new classes. In …

[PDF] A survey of research on knowledge distillation (知识蒸馏研究综述)

黄震华, 杨顺志, 林威, 倪娟, 孙圣力, 陈运文, 汤庸 - Chinese Journal of Computers (计算机学报), 2022 - 159.226.43.17
Abstract: High-performance deep learning networks are typically compute- and parameter-intensive, making them difficult to deploy on resource-constrained edge devices.
To run deep learning models on low-resource devices, efficient small-scale networks must be developed …

Cross-lingual cross-modal consolidation for effective multilingual video corpus moment retrieval

J Liu, T Yu, H Peng, M Sun, P Li - Findings of the Association for …, 2022 - aclanthology.org
Existing multilingual video corpus moment retrieval (mVCMR) methods are mainly based on
a two-stream structure. The visual stream utilizes the visual content in the video to estimate …

Multi-teacher distillation with single model for neural machine translation

X Liang, L Wu, J Li, T Qin, M Zhang… - IEEE/ACM Transactions …, 2022 - ieeexplore.ieee.org
Knowledge distillation (KD) is an effective strategy for neural machine translation (NMT) to
improve the performance of a student model. Usually, the teacher can guide the student to …

The Zeno's Paradox of 'Low-Resource' Languages

HH Nigatu, AL Tonja, B Rosman, T Solorio… - arXiv preprint arXiv …, 2024 - arxiv.org
The disparity in the languages commonly studied in Natural Language Processing (NLP) is
typically reflected by referring to languages as low- vs. high-resourced. However, there is …

[HTML] Classifier-adaptation knowledge distillation framework for relation extraction and event detection with imbalanced data

D Song, J Xu, J Pang, H Huang - Information Sciences, 2021 - Elsevier
Fundamental information extraction tasks, such as relation extraction and event detection,
suffer from a data imbalance problem. To alleviate this problem, existing methods rely mostly …

Boosting lightweight CNNs through network pruning and knowledge distillation for SAR target recognition

Z Wang, L Du, Y Li - IEEE Journal of Selected Topics in Applied …, 2021 - ieeexplore.ieee.org
Deep convolutional neural networks (CNNs) have yielded unusually brilliant results in
synthetic aperture radar (SAR) target recognition. However, overparameterization is a widely …