ZeroGen: Efficient zero-shot learning via dataset generation

J Ye, J Gao, Q Li, H Xu, J Feng, Z Wu, T Yu… - arXiv preprint arXiv …, 2022 - arxiv.org

RAIL-KD: Random intermediate layer mapping for knowledge distillation

MA Haidar, N Anchuri, M Rezagholizadeh… - arXiv preprint arXiv …, 2021 - arxiv.org
Intermediate layer knowledge distillation (KD) can improve the standard KD technique
(which only targets the output of teacher and student models) especially over large pre …
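The snippet above contrasts standard (output-only) KD with intermediate-layer KD. Below is a minimal PyTorch sketch of that distinction, not the cited paper's specific method: the layer pairing, temperature, and the use of MSE on hidden states are illustrative assumptions, and the teacher and student hidden sizes are assumed to match (real methods typically add a learned projection).

```python
# Minimal sketch (illustrative, not the cited method): standard KD matches
# only the teacher's output distribution, while intermediate-layer KD adds
# losses on selected hidden representations.
import torch
import torch.nn.functional as F

def output_kd_loss(student_logits, teacher_logits, T=2.0):
    """Standard KD: KL divergence between temperature-softened outputs."""
    return F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)

def intermediate_kd_loss(student_hiddens, teacher_hiddens, layer_map):
    """Intermediate-layer KD: match chosen hidden states (MSE here).
    layer_map pairs each selected student layer with a teacher layer;
    assumes matching hidden sizes (otherwise project first)."""
    return sum(
        F.mse_loss(student_hiddens[s], teacher_hiddens[t])
        for s, t in layer_map
    )

# Toy usage with random tensors (batch=4, classes=10, hidden=16).
s_logits, t_logits = torch.randn(4, 10), torch.randn(4, 10)
s_h = [torch.randn(4, 16) for _ in range(4)]   # 4 student layers
t_h = [torch.randn(4, 16) for _ in range(12)]  # 12 teacher layers
loss = output_kd_loss(s_logits, t_logits) \
     + intermediate_kd_loss(s_h, t_h, layer_map=[(1, 3), (3, 11)])
```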

SEML: Self-Supervised Information-Enhanced Meta-learning for Few-Shot Text Classification

H Li, G Huang, Y Li, X Zhang, Y Wang, J Li - International Journal of …, 2023 - Springer
Training a deep-learning text classification model usually requires a large amount of labeled
data, yet labeling data is usually labor-intensive and time-consuming. Few-shot text …

Improving question answering performance using knowledge distillation and active learning

Y Boreshban, SM Mirbostani, G Ghassem-Sani… - … Applications of Artificial …, 2023 - Elsevier
Contemporary question answering (QA) systems, including Transformer-based
architectures, suffer from increasing computational and model complexity, which renders them …