Visual attention methods in deep learning: An in-depth survey

M Hassanin, S Anwar, I Radwan, FS Khan, A Mian - Information Fusion, 2024 - Elsevier
Inspired by the human cognitive system, attention is a mechanism that imitates human
cognitive awareness of specific information, amplifying critical details to focus more on …
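(For reference, the mechanism covered by this survey reduces, in its most common form, to scaled dot-product attention. The sketch below is a minimal PyTorch rendering of that general formulation; the function name, tensor names, and shapes are illustrative and are not drawn from the survey itself.)

import torch
import torch.nn.functional as F

def scaled_dot_product_attention(q, k, v, mask=None):
    # q, k, v: (batch, heads, seq_len, d_k); mask broadcastable to the score matrix
    d_k = q.size(-1)
    scores = q @ k.transpose(-2, -1) / d_k ** 0.5    # similarity of each query to every key
    if mask is not None:
        scores = scores.masked_fill(mask == 0, float("-inf"))
    weights = F.softmax(scores, dim=-1)              # normalized attention distribution over keys
    return weights @ v, weights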

Attention, please! A survey of neural attention models in deep learning

A de Santana Correia, EL Colombini - Artificial Intelligence Review, 2022 - Springer
In humans, Attention is a core property of all perceptual and cognitive operations. Given our
limited ability to process competing sources, attention mechanisms select, modulate, and …

Incorporating BERT into neural machine translation

J Zhu, Y Xia, L Wu, D He, T Qin, W Zhou, H Li… - arxiv preprint arxiv …, 2020 - arxiv.org
The recently proposed BERT has shown great power on a variety of natural language
understanding tasks, such as text classification, reading comprehension, etc. However, how …

Variational attention-based interpretable transformer network for rotary machine fault diagnosis

Y Li, Z Zhou, C Sun, X Chen… - IEEE transactions on …, 2022 - ieeexplore.ieee.org
Deep learning technology provides a promising approach for rotary machine fault diagnosis
(RMFD), where vibration signals are commonly utilized as input of a deep network model to …

The elephant in the interpretability room: Why use attention as explanation when we have saliency methods?

J Bastings, K Filippova - arxiv preprint arxiv:2010.05607, 2020 - arxiv.org
There is a recent surge of interest in using attention as explanation of model predictions,
with mixed evidence on whether attention can be used as such. While attention conveniently …

GMNN: Graph Markov neural networks

M Qu, Y Bengio, J Tang - International conference on …, 2019 - proceedings.mlr.press
This paper studies semi-supervised object classification in relational data, which is a
fundamental problem in relational data modeling. The problem has been extensively studied …

Fixup initialization: Residual learning without normalization

H Zhang, YN Dauphin, T Ma - arxiv preprint arxiv:1901.09321, 2019 - arxiv.org
Normalization layers are a staple in state-of-the-art deep neural network architectures. They
are widely believed to stabilize training, enable higher learning rate, accelerate …
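(A rough sketch of the core idea, assuming a standard two-convolution residual branch: Fixup removes normalization layers and instead rescales the initialization so that each residual branch starts near the identity. The block below shows only the rescaling and zero-initialization rules; the paper's additional scalar bias and multiplier parameters are omitted, and the layer sizes are illustrative.)

import torch.nn as nn

class FixupStyleBlock(nn.Module):
    # Normalization-free residual block in the spirit of Fixup:
    # the last layer of the branch starts at zero and the first is
    # down-scaled using the total number of residual branches L, so
    # the branch contributes (almost) nothing at initialization.
    def __init__(self, channels, num_blocks_L, m=2):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        nn.init.kaiming_normal_(self.conv1.weight)
        self.conv1.weight.data.mul_(num_blocks_L ** (-1.0 / (2 * m - 2)))  # rescale inner layers
        nn.init.zeros_(self.conv2.weight)                                   # zero the last layer
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.relu(x + self.conv2(self.relu(self.conv1(x))))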

Adaptively sparse transformers

GM Correia, V Niculae, AFT Martins - arxiv preprint arxiv:1909.00015, 2019 - arxiv.org
Attention mechanisms have become ubiquitous in NLP. Recent architectures, notably the
Transformer, learn powerful context-aware word representations through layered, multi …
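(The mapping behind this line of work is alpha-entmax, which replaces softmax in the attention weights and can assign exactly zero probability to irrelevant keys. As a stand-in, the sketch below implements only sparsemax, the closed-form alpha = 2 member of that family from Martins & Astudillo (2016); the adaptively sparse Transformer additionally learns alpha per attention head, which this sketch does not do.)

import torch

def sparsemax(scores, dim=-1):
    # Euclidean projection of the scores onto the probability simplex.
    # Unlike softmax, entries below a data-dependent threshold get weight 0.
    z, _ = torch.sort(scores, dim=dim, descending=True)
    cum = z.cumsum(dim) - 1
    k = torch.arange(1, scores.size(dim) + 1, device=scores.device, dtype=scores.dtype)
    shape = [1] * scores.dim()
    shape[dim] = -1
    k = k.view(shape)                                   # reshape for broadcasting along `dim`
    support = (k * z > cum).to(scores.dtype)
    k_z = support.sum(dim=dim, keepdim=True)            # size of the support set
    tau = cum.gather(dim, k_z.long() - 1) / k_z         # threshold value
    return torch.clamp(scores - tau, min=0)

Dropped into the attention sketch above, this amounts to replacing F.softmax(scores, dim=-1) with sparsemax(scores, dim=-1).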

Sequential latent knowledge selection for knowledge-grounded dialogue

B Kim, J Ahn, G Kim - arxiv preprint arxiv:2002.07510, 2020 - arxiv.org
Knowledge-grounded dialogue is a task of generating an informative response based on
both discourse context and external knowledge. As we focus on better modeling the …

Explicit sparse transformer: Concentrated attention through explicit selection

G Zhao, J Lin, Z Zhang, X Ren, Q Su, X Sun - arxiv preprint arxiv …, 2019 - arxiv.org
Self-attention-based Transformers have demonstrated state-of-the-art performance in a
number of natural language processing tasks. Self-attention is able to model long-term …
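(The "explicit selection" in the title refers to keeping, for each query, only the k largest attention scores and masking out the rest before the softmax. A minimal sketch of that idea follows; the function name and the default k are illustrative and make no claim to match the paper's exact formulation.)

import torch
import torch.nn.functional as F

def topk_sparse_attention(q, k, v, top_k=8):
    # Each query attends only to its top-k highest-scoring keys;
    # all other scores are masked out before the softmax.
    d_k = q.size(-1)
    scores = q @ k.transpose(-2, -1) / d_k ** 0.5                             # (..., seq_q, seq_k)
    kth = scores.topk(min(top_k, scores.size(-1)), dim=-1).values[..., -1:]   # k-th largest score per query
    scores = scores.masked_fill(scores < kth, float("-inf"))
    return F.softmax(scores, dim=-1) @ v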