Improving named entity recognition by external context retrieving and cooperative learning
Recent advances in Named Entity Recognition (NER) show that document-level contexts
can significantly improve model performance. In many application scenarios, however, such …
Automated concatenation of embeddings for structured prediction
Pretrained contextualized embeddings are powerful word representations for structured
prediction tasks. Recent work found that better word representations can be obtained by …
A primer on pretrained multilingual language models
Multilingual Language Models (MLLMs) such as mBERT, XLM, XLM-R, etc. have
emerged as a viable option for bringing the power of pretraining to a large number of …
Few-shot class-incremental learning for named entity recognition
Previous work of class-incremental learning for Named Entity Recognition (NER) relies on
the assumption that there exists abundance of labeled data for the training of new classes. In …
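[PDF] A survey of knowledge distillation research (知识蒸馏研究综述)
Z Huang, S Yang, W Lin, J Ni, S Sun, Y Chen, Y Tang - Chinese Journal of Computers (计算机学报), 2022 - 159.226.43.17
Abstract: High-performance deep learning networks are typically computation- and parameter-intensive, which makes them hard to deploy on resource-constrained edge devices.
To run deep learning models on low-resource devices, efficient small-scale networks need to be developed …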
Cross-lingual cross-modal consolidation for effective multilingual video corpus moment retrieval
Existing multilingual video corpus moment retrieval (mVCMR) methods are mainly based on
a two-stream structure. The visual stream utilizes the visual content in the video to estimate …
Multi-teacher distillation with single model for neural machine translation
Knowledge distillation (KD) is an effective strategy for neural machine translation (NMT) to
improve the performance of a student model. Usually, the teacher can guide the student to …
The Zeno's Paradox of 'Low-Resource' Languages
The disparity in the languages commonly studied in Natural Language Processing (NLP) is
typically reflected by referring to languages as low vs high-resourced. However, there is …
[HTML] Classifier-adaptation knowledge distillation framework for relation extraction and event detection with imbalanced data
D Song, J Xu, J Pang, H Huang - Information Sciences, 2021 - Elsevier
Fundamental information extraction tasks, such as relation extraction and event detection,
suffer from a data imbalance problem. To alleviate this problem, existing methods rely mostly …
Boosting lightweight CNNs through network pruning and knowledge distillation for SAR target recognition
Z Wang, L Du, Y Li - IEEE Journal of Selected Topics in Applied …, 2021 - ieeexplore.ieee.org
Deep convolutional neural networks (CNNs) have yielded unusually brilliant results in
synthetic aperture radar (SAR) target recognition. However, overparameterization is a widely …