Attention mechanism in neural networks: where it comes and where it goes
D Soydaner - Neural Computing and Applications, 2022 - Springer
A long time ago in the machine learning literature, the idea of incorporating a mechanism
inspired by the human visual system into neural networks was introduced. This idea is …
A survey on document-level neural machine translation: Methods and evaluation
Machine translation (MT) is an important task in natural language processing (NLP), as it
automates the translation process and reduces the reliance on human translators. With the …
ETC: Encoding long and structured inputs in transformers
Transformer models have advanced the state of the art in many Natural Language
Processing (NLP) tasks. In this paper, we present a new Transformer architecture, Extended …
A survey on green deep learning
In recent years, larger and deeper models are springing up and continuously pushing state-
of-the-art (SOTA) results across various fields like natural language processing (NLP) and …
Adaptively sparse transformers
Attention mechanisms have become ubiquitous in NLP. Recent architectures, notably the
Transformer, learn powerful context-aware word representations through layered, multi …
Scientific credibility of machine translation research: A meta-evaluation of 769 papers
This paper presents the first large-scale meta-evaluation of machine translation (MT). We
annotated MT evaluations conducted in 769 research papers published from 2010 to 2020 …
A study on ReLU and softmax in Transformer
The Transformer architecture consists of self-attention and feed-forward networks (FFNs)
which can be viewed as key-value memories according to previous works. However, FFN …
G-Transformer for document-level machine translation
Document-level MT models are still far from satisfactory. Existing work extends the translation unit
from a single sentence to multiple sentences. However, studies show that when we further …
A simple and effective unified encoder for document-level machine translation
Most of the existing models for document-level machine translation adopt dual-encoder
structures. The representations of the source sentences and the document-level contexts are …
Towards making the most of context in neural machine translation
Document-level machine translation manages to outperform sentence-level models by a
small margin but has failed to be widely adopted. We argue that previous research did not …