Transformers aftermath: Current research and rising trends

ESD Reis, CAD Costa, DED Silveira… - Communications of the …, 2021 - dl.acm.org
Communications of the ACM, April 2021, Vol. 64, No. 4, review articles. Natural language …

Structure-level knowledge distillation for multilingual sequence labeling

X Wang, Y Jiang, N Bach, T Wang, F Huang… - arXiv preprint arXiv …, 2020 - arxiv.org
Multilingual sequence labeling is the task of predicting label sequences using a single unified
model for multiple languages. Compared with relying on multiple monolingual models, using …

Self-attentive Biaffine Dependency Parsing

Y Li, Z Li, M Zhang, R Wang, S Li, L Si - IJCAI, 2019 - ijcai.org
Current state-of-the-art dependency parsing approaches employ BiLSTMs to encode
input sentences. Motivated by the success of transformer-based machine translation, this …

Scalable syntax-aware language models using knowledge distillation

A Kuncoro, C Dyer, L Rimell, S Clark… - arXiv preprint arXiv …, 2019 - arxiv.org
Prior work has shown that, on small amounts of training data, syntactic neural language
models learn structurally sensitive generalisations more successfully than sequential …

Evaluating explanation methods for neural machine translation

J Li, L Liu, H Li, G Li, G Huang, S Shi - arXiv preprint arXiv:2005.01672, 2020 - arxiv.org
Recently, many efforts have been devoted to interpreting black-box NMT models, but little
progress has been made on metrics to evaluate explanation methods. Word Alignment Error …

Distilling neural networks for greener and faster dependency parsing

M Anderson, C Gómez-Rodríguez - arXiv preprint arXiv:2006.00844, 2020 - arxiv.org
The carbon footprint of natural language processing research has been increasing in recent
years due to its reliance on large and inefficient neural network implementations. Distillation …

Improved training of mixture-of-experts language GANs

Y Chai, Q Yin, J Zhang - ICASSP 2023 - 2023 IEEE International …, 2023 - ieeexplore.ieee.org
Despite the dramatic success in image generation, Generative Adversarial Networks (GANs)
still face great challenges in text generation. The difficulty in generator training arises from …

Predicting events in MOBA games: Prediction, attribution, and evaluation

Z Yang, Y Wang, P Li, S Lin, S Shi… - IEEE Transactions on …, 2022 - ieeexplore.ieee.org
Multiplayer online battle arena (MOBA) games have become increasingly popular in
recent years. Consequently, many efforts have been devoted to providing pregame or in …

R-U-SURE? Uncertainty-aware code suggestions by maximizing utility across random user intents

DD Johnson, D Tarlow, C Walder - arXiv preprint arXiv:2303.00732, 2023 - arxiv.org
Large language models show impressive results at predicting structured text such as code,
but also commonly introduce errors and hallucinations in their output. When used to assist …

Knowledge base embedding by cooperative knowledge distillation

R Sourty, JG Moreno, FP Servant… - … Linguistics (COLING 2020), 2020 - hal.science
Knowledge bases are increasingly exploited as gold standard data sources which benefit
various knowledge-driven NLP tasks. In this paper, we explore a new research direction to …