xLSTM: Extended Long Short-Term Memory
In the 1990s, the constant error carousel and gating were introduced as the central ideas of
the Long Short-Term Memory (LSTM). Since then, LSTMs have stood the test of time and …
What Makes a High-Quality Training Dataset for Large Language Models: A Practitioners' Perspective
Large Language Models (LLMs) have demonstrated remarkable performance in various
application domains, largely due to their self-supervised pre-training on extensive high …
Data contamination report from the 2024 CONDA shared task
The 1st Workshop on Data Contamination (CONDA 2024) focuses on all relevant aspects of
data contamination in natural language processing, where data contamination is understood …
Beyond Perplexity: Multi-Dimensional Safety Evaluation of LLM Compression
Increasingly, model compression techniques enable large language models (LLMs) to be
deployed in real-world applications. As a result of this momentum towards local deployment …
How to Synthesize Text Data without Model Collapse?
Model collapse in synthetic data indicates that iterative training on self-generated data leads
to a gradual decline in performance. With the proliferation of AI models, synthetic data will …