MobileBERT: A compact task-agnostic BERT for resource-limited devices
Natural Language Processing (NLP) has recently achieved great success by using huge pre-trained models with hundreds of millions of parameters. However, these models suffer from …
LLM inference unveiled: Survey and roofline model insights
The field of efficient Large Language Model (LLM) inference is rapidly evolving, presenting a unique blend of opportunities and challenges. Although the field has expanded and is …
A survey on transformer compression
Large models based on the Transformer architecture play increasingly vital roles in artificial intelligence, particularly within the realms of natural language processing (NLP) and …
Speculative decoding with big little decoder
The recent emergence of Large Language Models based on the Transformer architecture has enabled dramatic advancements in the field of Natural Language Processing. However …
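For orientation, a minimal sketch of the generic speculative draft-then-verify loop behind this line of work (the paper's Big Little Decoder uses more involved fallback and rollback policies). The names `draft_model` and `target_model`, HuggingFace-style `.logits` outputs, batch size 1, and greedy acceptance are all assumptions for illustration.

```python
import torch

def speculative_decode(draft_model, target_model, prompt_ids,
                       max_new_tokens=64, draft_len=4):
    """Sketch: a small draft model proposes draft_len tokens; the large
    target model verifies them in one parallel forward pass."""
    ids = prompt_ids
    while ids.shape[1] - prompt_ids.shape[1] < max_new_tokens:
        # 1) Draft: autoregressively generate draft_len tokens (greedy).
        draft = ids
        for _ in range(draft_len):
            logits = draft_model(draft).logits[:, -1, :]
            draft = torch.cat([draft, logits.argmax(-1, keepdim=True)], dim=1)
        # 2) Verify: one forward pass of the big model over the drafted span.
        big_logits = target_model(draft).logits
        verify = big_logits[:, ids.shape[1] - 1:-1, :].argmax(-1)
        proposed = draft[:, ids.shape[1]:]
        # 3) Accept the longest prefix on which both models agree.
        match = (verify == proposed).long().cumprod(dim=1)
        n_accept = int(match.sum())
        ids = torch.cat([ids, proposed[:, :n_accept]], dim=1)
        if n_accept < draft_len:
            # Replace the first rejected token with the big model's choice.
            ids = torch.cat([ids, verify[:, n_accept:n_accept + 1]], dim=1)
    return ids
```

The speedup comes from step 2: one large-model forward pass can validate several cheaply drafted tokens at once.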
A survey on non-autoregressive generation for neural machine translation and beyond
Non-autoregressive (NAR) generation, which was first proposed in neural machine translation (NMT) to speed up inference, has attracted much attention in both machine learning and …
Glancing transformer for non-autoregressive neural machine translation
Recent work on non-autoregressive neural machine translation (NAT) aims to improve efficiency through parallel decoding without sacrificing quality. However, existing NAT …
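A minimal sketch of the glancing-sampling idea from this paper: the worse the first parallel decoding pass, the more ground-truth tokens are revealed in the decoder input for the training pass. The function name, the fixed `ratio` (standing in for the paper's adaptive sampling schedule), and the uniform position sampling are illustrative assumptions.

```python
import torch

def glancing_inputs(dec_input_ids, target_ids, first_pass_pred, mask_id, ratio=0.5):
    """Sketch: reveal gold tokens in the decoder input in proportion to the
    number of first-pass prediction errors; train only on masked positions."""
    wrong = (first_pass_pred != target_ids)                 # (batch, seq)
    n_glance = (wrong.sum(dim=1).float() * ratio).long()    # tokens to reveal
    glanced = dec_input_ids.clone()
    for b in range(target_ids.size(0)):
        # Randomly choose positions at which to expose the gold token.
        perm = torch.randperm(target_ids.size(1))[: n_glance[b]]
        glanced[b, perm] = target_ids[b, perm]
    loss_mask = glanced.eq(mask_id)  # compute loss only where still masked
    return glanced, loss_mask
```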
Step-unrolled denoising autoencoders for text generation
In this paper we propose a new generative model of text, Step-unrolled Denoising Autoencoder (SUNDAE), that does not rely on autoregressive models. Similarly to denoising …
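A minimal sketch of unrolled denoising training in the SUNDAE style: corrupt the target, denoise it, then feed a sample of the model's own output back in for a further denoising step, accumulating cross entropy at each step. The `model(x)` interface, the 0.5 corruption rate, and the two-step unroll are assumptions for illustration.

```python
import torch
import torch.nn.functional as F

def sundae_loss(model, target_ids, vocab_size, unroll_steps=2):
    """Sketch: unrolled denoising objective for a non-autoregressive model."""
    # Corrupt: replace each position with a uniform random token w.p. 0.5.
    corrupt = torch.randint_like(target_ids, vocab_size)
    keep = torch.rand(target_ids.shape) < 0.5
    x = torch.where(keep, target_ids, corrupt)
    loss = 0.0
    for _ in range(unroll_steps):
        logits = model(x)                                   # (batch, seq, vocab)
        loss = loss + F.cross_entropy(logits.transpose(1, 2), target_ids)
        # Unroll: the next step denoises a sample of the model's own output.
        x = torch.distributions.Categorical(logits=logits).sample().detach()
    return loss / unroll_steps
```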
Deep encoder, shallow decoder: Reevaluating non-autoregressive machine translation
Much recent effort has been invested in non-autoregressive neural machine translation, which appears to be an efficient alternative to state-of-the-art autoregressive machine …
Non-autoregressive machine translation with latent alignments
This paper presents two strong methods, CTC and Imputer, for non-autoregressive machine translation that model latent alignments with dynamic programming. We revisit CTC for …
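A minimal sketch of CTC training for a non-autoregressive decoder, using `torch.nn.CTCLoss` to perform the dynamic-programming marginalization over latent alignments mentioned in the abstract. Treating index 0 as the blank symbol and having the decoder emit a sequence longer than the target are conventional assumptions, not details from the paper.

```python
import torch
import torch.nn.functional as F

# CTC marginalizes over all monotonic alignments between the (longer)
# decoder output sequence and the target, via dynamic programming.
ctc = torch.nn.CTCLoss(blank=0, zero_infinity=True)

def ctc_nat_loss(logits, target_ids, target_lens):
    """logits: (batch, T, vocab) decoder outputs, with index 0 as blank."""
    log_probs = F.log_softmax(logits, dim=-1).transpose(0, 1)  # (T, batch, vocab)
    input_lens = torch.full((logits.size(0),), logits.size(1), dtype=torch.long)
    return ctc(log_probs, target_ids, input_lens, target_lens)
```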
Order-agnostic cross entropy for non-autoregressive machine translation
We propose a new training objective named order-agnostic cross entropy (OaXE) for fully non-autoregressive translation (NAT) models. OaXE improves the standard cross-entropy …
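A minimal sketch of an OaXE-style objective: rather than forcing the reference token order, find the lowest-cost one-to-one assignment of target tokens to output positions (Hungarian algorithm) and take cross entropy under that best ordering. Equal output and target lengths, and the per-sample Python loop, are simplifying assumptions for illustration.

```python
import torch
import torch.nn.functional as F
from scipy.optimize import linear_sum_assignment

def oaxe_loss(logits, target_ids):
    """logits: (batch, T, vocab); target_ids: (batch, T)."""
    log_probs = F.log_softmax(logits, dim=-1)
    losses = []
    for b in range(logits.size(0)):
        # cost[i, j] = -log P(position i emits target token j)
        cost = -log_probs[b][:, target_ids[b]]              # (T, T)
        rows, cols = linear_sum_assignment(cost.detach().cpu().numpy())
        best = cost[torch.as_tensor(rows), torch.as_tensor(cols)]
        losses.append(best.mean())
    return torch.stack(losses).mean()
```

The Hungarian step only selects the ordering; gradients still flow through the chosen `cost` entries, so the model is trained toward its best-matching target permutation.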