In-depth insights into the application of recurrent neural networks (RNNs) in traffic prediction: A comprehensive review

Y He, P Huang, W Hong, Q Luo, L Li, KL Tsui - Algorithms, 2024 - mdpi.com
Traffic prediction is crucial for transportation management and user convenience. With the
rapid development of deep learning techniques, numerous models have emerged for traffic …

From pixels to patients: the evolution and future of deep learning in cancer diagnostics

Y Yang, H Shen, K Chen, X Li - Trends in Molecular Medicine, 2024 - cell.com
Deep learning has revolutionized cancer diagnostics, shifting from pixel-based image
analysis to more comprehensive, patient-centric care. This opinion article explores recent …

LION: Linear group RNN for 3D object detection in point clouds

Z Liu, J Hou, X Wang, X Ye, J Wang, H Zhao… - arXiv preprint arXiv …, 2024 - arxiv.org
The benefit of transformers in large-scale 3D point cloud perception tasks, such as 3D object
detection, is limited by their quadratic computation cost when modeling long-range …

Synthetic continued pretraining

Z Yang, N Band, S Li, E Candès… - arXiv preprint arXiv …, 2024 - arxiv.org
Pretraining on large-scale, unstructured internet text enables language models to acquire a
significant amount of world knowledge. However, this knowledge acquisition is data …

Longhorn: State space models are amortized online learners

B Liu, R Wang, L Wu, Y Feng, P Stone, Q Liu - arXiv preprint arXiv …, 2024 - arxiv.org
Modern large language models are built on sequence modeling via next-token prediction.
While the Transformer remains the dominant architecture for sequence modeling, its …
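
A minimal sketch of the "recurrence as online learning" idea (not Longhorn's exact objective or closed-form update, which the paper derives): treat the recurrent state as fast weights and take one gradient step per token on a reconstruction loss. All names and the step size `beta` below are illustrative.

```python
import numpy as np

def online_state_update(S, k, v, beta=0.5):
    """One recurrent step derived as online learning: take a gradient
    step on the per-token reconstruction loss
        L(S) = 0.5 * ||S @ k - v||^2,
    which gives the delta-rule-style update
        S <- S - beta * (S @ k - v) k^T
           = S @ (I - beta * k k^T) + beta * v k^T.
    S has shape (d_v, d_k); k is (d_k,); v is (d_v,).
    """
    return S - beta * np.outer(S @ k - v, k)

# Toy usage: the state S acts as a fast-weight key -> value map.
d = 4
S = np.zeros((d, d))
k = np.eye(d)[0]               # unit-norm key
v = np.ones(d)                 # value to associate with k
for _ in range(10):            # repeated steps drive S @ k toward v
    S = online_state_update(S, k, v)
print(np.allclose(S @ k, v, atol=1e-2))  # True
```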

Curse of attention: A kernel-based perspective for why transformers fail to generalize on time series forecasting and beyond

Y Ke, Y Liang, Z Shi, Z Song, C Yang - arXiv preprint arXiv:2412.06061, 2024 - arxiv.org
The application of transformer-based models to time series forecasting (TSF) tasks has long
been a popular subject of study. However, many of these works fail to beat the simple linear residual …
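
The "simple linear residual" baseline alluded to in the snippet can be illustrated as a least-squares model that predicts the change from the last observed value; the sizes, names, and exact parameterization below are assumptions for illustration, not the paper's model.

```python
import numpy as np

def fit_linear_residual_forecaster(series, lookback=24, horizon=8):
    """Least-squares linear model predicting the *change* from the last
    observed value (a residual connection):
        forecast = last + [window, 1] @ W
    """
    X, Y = [], []
    for t in range(len(series) - lookback - horizon + 1):
        window = series[t:t + lookback]
        X.append(np.append(window, 1.0))  # lookback window plus bias term
        Y.append(series[t + lookback:t + lookback + horizon] - window[-1])
    W, *_ = np.linalg.lstsq(np.array(X), np.array(Y), rcond=None)
    return W

# Toy usage on a noisy sine wave.
rng = np.random.default_rng(0)
t = np.arange(500)
series = np.sin(0.1 * t) + 0.05 * rng.normal(size=500)
W = fit_linear_residual_forecaster(series)
forecast = series[-1] + np.append(series[-24:], 1.0) @ W  # next 8 steps
print(forecast.shape)  # (8,)
```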

Efficiently learning at test-time: Active fine-tuning of LLMs

J Hübotter, S Bongni, I Hakimi, A Krause - arXiv preprint arXiv:2410.08020, 2024 - arxiv.org
Recent efforts in fine-tuning language models often rely on automatic data selection,
commonly using Nearest Neighbors retrieval from large datasets. However, we theoretically …
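
The Nearest Neighbors selection baseline the snippet describes can be sketched in a few lines; the random embeddings stand in for a real encoder, and this shows the retrieval approach the paper argues against, not its proposed active selection criterion.

```python
import numpy as np

def select_finetuning_data(query_emb, pool_embs, k=8):
    """Return indices of the k pool examples most cosine-similar
    to the query embedding (Nearest Neighbors data selection)."""
    q = query_emb / np.linalg.norm(query_emb)
    P = pool_embs / np.linalg.norm(pool_embs, axis=1, keepdims=True)
    return np.argsort(-(P @ q))[:k]   # top-k by cosine similarity

# Toy usage with random embeddings standing in for a real encoder.
rng = np.random.default_rng(0)
pool = rng.normal(size=(10_000, 64))  # candidate fine-tuning examples
query = rng.normal(size=64)           # embedding of the test-time task
print(select_finetuning_data(query, pool))  # indices of 8 neighbors
```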

Unlocking State-Tracking in Linear RNNs Through Negative Eigenvalues

R Grazzi, J Siems, JKH Franke, A Zela, F Hutter… - arXiv preprint arXiv …, 2024 - arxiv.org
Linear Recurrent Neural Networks (LRNNs) such as Mamba, RWKV, GLA, mLSTM, and
DeltaNet have emerged as efficient alternatives to Transformers in large language …
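
A toy illustration of the state-tracking point: parity requires the product of input-dependent transition values to flip sign, so a linear RNN whose (here one-dimensional) state transition can be negative solves it, while transitions confined to [0, 1] cannot. This is an illustrative example, not the paper's construction.

```python
def parity_via_linear_rnn(bits):
    """Track parity with the linear recurrence h_t = a_t * h_{t-1},
    where a_t = 1 - 2*x_t is -1 on a one and +1 on a zero.
    h stays +1 for even parity and flips to -1 for odd parity; this
    requires a_t < 0, i.e. a negative eigenvalue of the state transition.
    """
    h = 1.0
    for x in bits:
        h = (1.0 - 2.0 * x) * h
    return int(h < 0)  # 1 = odd number of ones, 0 = even

assert parity_via_linear_rnn([1, 0, 1, 1]) == 1  # three ones -> odd
assert parity_via_linear_rnn([1, 1, 0, 0]) == 0  # two ones  -> even
```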

Gated slot attention for efficient linear-time sequence modeling

Y Zhang, S Yang, R Zhu, Y Zhang, L Cui… - arXiv preprint arXiv …, 2024 - arxiv.org
Linear attention Transformers and their gated variants, celebrated for enabling parallel
training and efficient recurrent inference, still fall short in recall-intensive tasks compared to …
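
The underlying gated linear attention recurrence that this line of work refines can be written as a short scan; the sketch below shows the generic gated matrix-state family, not GSA's specific bounded-slot parameterization, and all shapes are illustrative.

```python
import numpy as np

def gated_linear_attention(Q, K, V, G):
    """Recurrent form of the gated linear attention family.

    State:  S_t = S_{t-1} @ diag(g_t) + v_t k_t^T   (d_v x d_k fast weights)
    Output: o_t = S_t @ q_t
    Q, K, G have shape (T, d_k); V has shape (T, d_v);
    G entries in (0, 1) set how fast each key dimension is forgotten.
    """
    T, d_k = Q.shape
    d_v = V.shape[1]
    S = np.zeros((d_v, d_k))
    out = np.zeros((T, d_v))
    for t in range(T):
        S = S * G[t][None, :] + np.outer(V[t], K[t])  # gated decay + write
        out[t] = S @ Q[t]                              # read with the query
    return out

# Toy usage with random projections standing in for learned ones.
rng = np.random.default_rng(0)
T, d_k, d_v = 16, 8, 8
Q, K = rng.normal(size=(2, T, d_k))
V = rng.normal(size=(T, d_v))
G = 1.0 / (1.0 + np.exp(-rng.normal(size=(T, d_k))))  # sigmoid gates
print(gated_linear_attention(Q, K, V, G).shape)  # (16, 8)
```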

Gated Delta Networks: Improving Mamba2 with Delta Rule

S Yang, J Kautz, A Hatamizadeh - arXiv preprint arXiv:2412.06464, 2024 - arxiv.org
Linear Transformers have gained attention as efficient alternatives to standard Transformers,
but their performance in retrieval and long-context tasks has been limited. To address these …
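
A single step of the gated delta rule named in the title can be sketched as below; `alpha` (forget gate) and `beta` (write strength) are per-token scalars, and this is one reading of the state update in isolation, not the paper's full architecture.

```python
import numpy as np

def gated_delta_step(S, k, v, alpha, beta):
    """One step of a gated delta rule on fast-weight state S (d_v x d_k):

        S_t = alpha_t * S_{t-1} @ (I - beta_t * k k^T) + beta_t * v k^T

    alpha in (0, 1] uniformly decays the old state (the Mamba2-style gate);
    the delta-rule factor erases whatever was stored under key k, and the
    last term writes value v under k with strength beta in (0, 1].
    """
    d_k = k.shape[0]
    erase = np.eye(d_k) - beta * np.outer(k, k)
    return alpha * (S @ erase) + beta * np.outer(v, k)

# Toy usage: write an association, then retrieve it by key.
rng = np.random.default_rng(0)
d_k, d_v = 8, 8
k = rng.normal(size=d_k); k /= np.linalg.norm(k)  # unit-norm key
v = rng.normal(size=d_v)
S = gated_delta_step(np.zeros((d_v, d_k)), k, v, alpha=1.0, beta=1.0)
print(np.allclose(S @ k, v))  # True: reading by key recovers the value
```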