Attention mechanism in neural networks: where it comes and where it goes

D Soydaner - Neural Computing and Applications, 2022 - Springer
A long time ago in the machine learning literature, the idea of incorporating a mechanism
inspired by the human visual system into neural networks was introduced. This idea is …

Efficient transformers: A survey

Y Tay, M Dehghani, D Bahri, D Metzler - ACM Computing Surveys, 2022 - dl.acm.org
Transformer model architectures have garnered immense interest lately due to their
effectiveness across a range of domains like language, vision, and reinforcement learning …

Hierarchically gated recurrent neural network for sequence modeling

Z Qin, S Yang, Y Zhong - Advances in Neural Information …, 2023 - proceedings.neurips.cc
Transformers have surpassed RNNs in popularity due to their superior abilities in parallel
training and long-term dependency modeling. Recently, there has been a renewed interest …

Are transformers more robust than CNNs?

Y Bai, J Mei, AL Yuille, C Xie - Advances in neural …, 2021 - proceedings.neurips.cc
Transformer emerges as a powerful tool for visual recognition. In addition to demonstrating
competitive performance on a broad range of visual benchmarks, recent works also argue …