A survey of transformers

T Lin, Y Wang, X Liu, X Qiu - AI Open, 2022 - Elsevier
Transformers have achieved great success in many artificial intelligence fields, such as
natural language processing, computer vision, and audio processing. Therefore, it is natural …

Transformers in time-series analysis: A tutorial

S Ahmed, IE Nielsen, A Tripathi, S Siddiqui… - Circuits, Systems, and …, 2023 - Springer
Transformer architectures have widespread applications, particularly in Natural Language
Processing and Computer Vision. Recently, Transformers have been employed in various …

Transformers in time series: A survey

Q Wen, T Zhou, C Zhang, W Chen, Z Ma, J Yan… - arXiv preprint arXiv …, 2022 - arxiv.org
Transformers have achieved superior performance on many tasks in natural language
processing and computer vision, which has also triggered great interest in the time series …

An empirical study of training end-to-end vision-and-language transformers

ZY Dou, Y Xu, Z Gan, J Wang, S Wang… - Proceedings of the …, 2022 - openaccess.thecvf.com
Vision-and-language (VL) pre-training has proven to be highly effective on various
VL downstream tasks. While recent work has shown that fully transformer-based VL models …

Autoformer: Searching transformers for visual recognition

M Chen, H Peng, J Fu, H Ling - Proceedings of the IEEE …, 2021 - openaccess.thecvf.com
Recently, pure transformer-based models have shown great potential for vision tasks such
as image classification and detection. However, the design of transformer networks is …

Gshard: Scaling giant models with conditional computation and automatic sharding

D Lepikhin, HJ Lee, Y Xu, D Chen, O Firat… - arXiv preprint arXiv …, 2020 - arxiv.org
Neural network scaling has been critical for improving the model quality in many real-world
machine learning applications with vast amounts of training data and compute. Although this …

Deep modular co-attention networks for visual question answering

Z Yu, J Yu, Y Cui, D Tao, Q Tian - Proceedings of the IEEE …, 2019 - openaccess.thecvf.com
Visual Question Answering (VQA) requires a fine-grained and simultaneous
understanding of both the visual content of images and the textual content of questions …

Learning deep transformer models for machine translation

Q Wang, B Li, T Xiao, J Zhu, C Li, DF Wong… - arXiv preprint arXiv …, 2019 - arxiv.org
The Transformer is the state-of-the-art model in recent machine translation evaluations. Two
strands of research are promising to improve models of this kind: the first uses wide …

Improving massively multilingual neural machine translation and zero-shot translation

B Zhang, P Williams, I Titov, R Sennrich - arXiv preprint arXiv:2004.11867, 2020 - arxiv.org
Massively multilingual models for neural machine translation (NMT) are theoretically
attractive, but often underperform bilingual models and deliver poor zero-shot translations. In …

Attention in natural language processing

A Galassi, M Lippi, P Torroni - IEEE Transactions on Neural …, 2020 - ieeexplore.ieee.org
Attention is an increasingly popular mechanism used in a wide range of neural
architectures. The mechanism itself has been realized in a variety of formats. However …
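
Every entry above builds on the same core operation. As a reference point while skimming them, here is a minimal NumPy sketch of scaled dot-product attention in the standard formulation of Vaswani et al. (2017); it is an illustrative reconstruction rather than code from any of the listed papers, and the toy shapes and names are assumptions for the example.

import numpy as np

def scaled_dot_product_attention(Q, K, V):
    # Q: (n_q, d_k) queries; K: (n_k, d_k) keys; V: (n_k, d_v) values.
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                  # pairwise similarity, scaled by sqrt(d_k)
    scores -= scores.max(axis=-1, keepdims=True)     # stabilize the softmax numerically
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)   # row-wise softmax over the keys
    return weights @ V                               # attention-weighted mix of the values

# Toy usage: 3 queries attending over 4 key/value pairs, d_k = d_v = 8.
rng = np.random.default_rng(0)
Q = rng.normal(size=(3, 8))
K = rng.normal(size=(4, 8))
V = rng.normal(size=(4, 8))
print(scaled_dot_product_attention(Q, K, V).shape)  # -> (3, 8)

This is the single-head, unmasked form; the surveys listed above cover the many variants (multi-head, masked, sparse, and efficient approximations) built on top of it.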