A comprehensive survey on applications of transformers for deep learning tasks

S Islam, H Elmekki, A Elsebai, J Bentahar… - Expert Systems with …, 2024 - Elsevier
Abstract Transformers are Deep Neural Networks (DNN) that utilize a self-attention
mechanism to capture contextual relationships within sequential data. Unlike traditional …

A review of deep learning techniques for speech processing

A Mehrish, N Majumder, R Bharadwaj, R Mihalcea… - Information …, 2023 - Elsevier
The field of speech processing has undergone a transformative shift with the advent of deep
learning. The use of multiple processing layers has enabled the creation of models capable …

RWKV: Reinventing RNNs for the transformer era

B Peng, E Alcaide, Q Anthony, A Albalak… - arxiv preprint arxiv …, 2023 - arxiv.org
Transformers have revolutionized almost all natural language processing (NLP) tasks but
suffer from memory and computational complexity that scales quadratically with sequence …

Resurrecting recurrent neural networks for long sequences

A Orvieto, SL Smith, A Gu, A Fernando… - International …, 2023 - proceedings.mlr.press
Abstract Recurrent Neural Networks (RNNs) offer fast inference on long sequences but are
hard to optimize and slow to train. Deep state-space models (SSMs) have recently been …

iTransformer: Inverted transformers are effective for time series forecasting

Y Liu, T Hu, H Zhang, H Wu, S Wang, L Ma… - arxiv preprint arxiv …, 2023 - arxiv.org
The recent boom of linear forecasting models questions the ongoing passion for
architectural modifications of Transformer-based forecasters. These forecasters leverage …

Are transformers effective for time series forecasting?

A Zeng, M Chen, L Zhang, Q Xu - … of the AAAI conference on artificial …, 2023 - ojs.aaai.org
Recently, there has been a surge of Transformer-based solutions for the long-term time
series forecasting (LTSF) task. Despite the growing performance over the past few years, we …

Visual attention network

MH Guo, CZ Lu, ZN Liu, MM Cheng, SM Hu - Computational Visual Media, 2023 - Springer
While originally designed for natural language processing tasks, the self-attention
mechanism has recently taken various computer vision areas by storm. However, the 2D …

Time-LLM: Time series forecasting by reprogramming large language models

M Jin, S Wang, L Ma, Z Chu, JY Zhang, X Shi… - arxiv preprint arxiv …, 2023 - arxiv.org
Time series forecasting holds significant importance in many real-world dynamic systems
and has been extensively studied. Unlike natural language processing (NLP) and computer …

DSTAGNN: Dynamic spatial-temporal aware graph neural network for traffic flow forecasting

S Lan, Y Ma, W Huang, W Wang… - … on machine learning, 2022 - proceedings.mlr.press
As a typical problem in time series analysis, traffic flow prediction is one of the most
important application fields of machine learning. However, achieving highly accurate traffic …

Efficiently modeling long sequences with structured state spaces

A Gu, K Goel, C Ré - arxiv preprint arxiv:2111.00396, 2021 - arxiv.org
A central goal of sequence modeling is designing a single principled model that can
address sequence data across a range of modalities and tasks, particularly on long-range …