A review of large language models and autonomous agents in chemistry
Large language models (LLMs) have emerged as powerful tools in chemistry, significantly
impacting molecule design, property prediction, and synthesis optimization. This review …
A state-of-the-art review of long short-term memory models with applications in hydrology and water resources
Z Feng, J Zhang, W Niu - Applied Soft Computing, 2024 - Elsevier
Long Short-Term Memory (LSTM) has recently emerged as a crucial tool for
scientific research in hydrology and water resources. Despite its widespread use, a …
RULER: What's the Real Context Size of Your Long-Context Language Models?
The needle-in-a-haystack (NIAH) test, which examines the ability to retrieve a piece of
information (the "needle") from long distractor texts (the "haystack"), has been widely …
Learning to (learn at test time): RNNs with expressive hidden states
Self-attention performs well in long context but has quadratic complexity. Existing RNN
layers have linear complexity, but their performance in long context is limited by the …
FlashAttention-3: Fast and accurate attention with asynchrony and low-precision
Attention, as a core layer of the ubiquitous Transformer architecture, is the bottleneck for
large language models and long-context applications. FlashAttention elaborated an …
Transformers are multi-state RNNs
Transformers are considered conceptually different from the previous generation of state-of-the-art
NLP models, recurrent neural networks (RNNs). In this work, we demonstrate that …
Transformers are SSMs: Generalized models and efficient algorithms through structured state space duality
While Transformers have been the main architecture behind deep learning's success in
language modeling, state-space models (SSMs) such as Mamba have recently been shown …
Advanced stock price prediction with xLSTM-based models: Improving long-term forecasting
X Fan, C Tao, J Zhao - 2024 11th International Conference on …, 2024 - ieeexplore.ieee.org
Stock price prediction has long been a critical area of research in financial modeling. The
inherent complexity of financial markets, characterized by both short-term fluctuations and …
The Mamba in the Llama: Distilling and accelerating hybrid models
Linear RNN architectures, like Mamba, can be competitive with Transformer models in
language modeling while having advantageous deployment characteristics. Given the focus …
Inference scaling for long-context retrieval augmented generation
The scaling of inference computation has unlocked the potential of long-context large
language models (LLMs) across diverse settings. For knowledge-intensive tasks, the …