A review of large language models and autonomous agents in chemistry

MC Ramos, CJ Collison, AD White - Chemical Science, 2025 - pubs.rsc.org
Large language models (LLMs) have emerged as powerful tools in chemistry, significantly
impacting molecule design, property prediction, and synthesis optimization. This review …

A state-of-the-art review of long short-term memory models with applications in hydrology and water resources

Z Feng, J Zhang, W Niu - Applied Soft Computing, 2024 - Elsevier
Long Short-Term Memory (LSTM) has recently emerged as a crucial tool for
scientific research in hydrology and water resources. Despite its widespread use, a …

RULER: What's the Real Context Size of Your Long-Context Language Models?

CP Hsieh, S Sun, S Kriman, S Acharya… - arXiv preprint arXiv …, 2024 - arxiv.org
The needle-in-a-haystack (NIAH) test, which examines the ability to retrieve a piece of
information (the "needle") from long distractor texts (the "haystack"), has been widely …

Learning to (learn at test time): RNNs with expressive hidden states

Y Sun, X Li, K Dalal, J Xu, A Vikram, G Zhang… - arXiv preprint arXiv …, 2024 - arxiv.org
Self-attention performs well in long context but has quadratic complexity. Existing RNN
layers have linear complexity, but their performance in long context is limited by the …
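
For orientation, the standard asymptotic comparison behind this trade-off (a generic back-of-the-envelope estimate, not a result from this paper) is

\[
C_{\text{attention}}(L) = O(L^{2} d), \qquad C_{\text{RNN}}(L) = O(L\, d^{2}),
\]

where L is the sequence length and d the model (state) dimension; for L much larger than d, the quadratic term dominates, which is why linear-time recurrent layers become attractive at long context.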

FlashAttention-3: Fast and accurate attention with asynchrony and low-precision

J Shah, G Bikshandi, Y Zhang, V Thakkar… - arXiv preprint arXiv …, 2024 - arxiv.org
Attention, as a core layer of the ubiquitous Transformer architecture, is the bottleneck for
large language models and long-context applications. FlashAttention elaborated an …

Transformers are multi-state RNNs

M Oren, M Hassid, N Yarden, Y Adi… - arXiv preprint arXiv …, 2024 - arxiv.org
Transformers are considered conceptually different from the previous generation of state-of-
the-art NLP models, recurrent neural networks (RNNs). In this work, we demonstrate that …

Transformers are SSMs: Generalized models and efficient algorithms through structured state space duality

T Dao, A Gu - arXiv preprint arXiv:2405.21060, 2024 - arxiv.org
While Transformers have been the main architecture behind deep learning's success in
language modeling, state-space models (SSMs) such as Mamba have recently been shown …

Advanced stock price prediction with xLSTM-based models: Improving long-term forecasting

X Fan, C Tao, J Zhao - 2024 11th International Conference on …, 2024 - ieeexplore.ieee.org
Stock price prediction has long been a critical area of research in financial modeling. The
inherent complexity of financial markets, characterized by both short-term fluctuations and …

The Mamba in the Llama: Distilling and accelerating hybrid models

J Wang, D Paliotta, A May, AM Rush, T Dao - arXiv preprint arXiv …, 2024 - arxiv.org
Linear RNN architectures, like Mamba, can be competitive with Transformer models in
language modeling while having advantageous deployment characteristics. Given the focus …

Inference scaling for long-context retrieval augmented generation

Z Yue, H Zhuang, A Bai, K Hui, R Jagerman… - arXiv preprint arXiv …, 2024 - arxiv.org
The scaling of inference computation has unlocked the potential of long-context large
language models (LLMs) across diverse settings. For knowledge-intensive tasks, the …