xLSTM: Extended Long Short-Term Memory

M Beck, K Pöppel, M Spanring, A Auer… - arXiv preprint arXiv …, 2024 - arxiv.org
In the 1990s, the constant error carousel and gating were introduced as the central ideas of
the Long Short-Term Memory (LSTM). Since then, LSTMs have stood the test of time and …
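
The "constant error carousel" named in the abstract is the additive cell-state update at the heart of the LSTM. Below is a minimal NumPy sketch of the classic LSTM cell, not the paper's xLSTM; weight names and shapes are illustrative.

    # Minimal sketch of the classic LSTM cell. The update
    # c_t = f_t * c_{t-1} + i_t * g_t is the constant error carousel:
    # an additive path that lets gradients flow across time steps.
    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def lstm_cell(x, h_prev, c_prev, W, b):
        # W projects [x; h_prev] onto the four pre-activations
        # (input gate, forget gate, cell candidate, output gate).
        z = W @ np.concatenate([x, h_prev]) + b
        i, f, g, o = np.split(z, 4)
        i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)  # gates
        g = np.tanh(g)                                # candidate cell input
        c = f * c_prev + i * g    # constant error carousel
        h = o * np.tanh(c)        # gated hidden output
        return h, c

    # One step with input size 3 and hidden size 4.
    rng = np.random.default_rng(0)
    d_in, d_h = 3, 4
    W = 0.1 * rng.normal(size=(4 * d_h, d_in + d_h))
    b = np.zeros(4 * d_h)
    h, c = lstm_cell(rng.normal(size=d_in), np.zeros(d_h), np.zeros(d_h), W, b)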

What Makes a High-Quality Training Dataset for Large Language Models: A Practitioners' Perspective

X Yu, Z Zhang, F Niu, X Hu, X Xia… - Proceedings of the 39th …, 2024 - dl.acm.org
Large Language Models (LLMs) have demonstrated remarkable performance in various
application domains, largely due to their self-supervised pre-training on extensive high …
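
The self-supervised pre-training the abstract credits is, for most LLMs, next-token prediction: minimizing the negative log-likelihood of each token given its prefix. A toy sketch, with made-up logits and targets purely for illustration:

    import numpy as np

    def next_token_loss(logits, targets):
        # logits: (seq_len, vocab) unnormalized scores; targets: (seq_len,) token ids.
        logits = logits - logits.max(axis=-1, keepdims=True)  # numerical stability
        log_probs = logits - np.log(np.exp(logits).sum(axis=-1, keepdims=True))
        return -log_probs[np.arange(len(targets)), targets].mean()

    logits = np.random.default_rng(0).normal(size=(5, 10))  # 5 positions, vocab of 10
    targets = np.array([3, 1, 4, 1, 5])
    print(next_token_loss(logits, targets))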

Data contamination report from the 2024 CONDA shared task

O Sainz, I García-Ferrero, A Jacovi, JA Campos… - arXiv preprint arXiv …, 2024 - arxiv.org
The 1st Workshop on Data Contamination (CONDA 2024) focuses on all relevant aspects of
data contamination in natural language processing, where data contamination is understood …
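
One common heuristic for probing contamination, not necessarily the shared task's protocol, is checking n-gram overlap between evaluation examples and the training corpus. A generic sketch; the default n-gram length is an illustrative choice, not taken from the report:

    def ngrams(tokens, n):
        return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

    def is_contaminated(eval_tokens, train_tokens, n=13):
        # Flag the eval example if any of its n-grams occurs in the training text.
        return bool(ngrams(eval_tokens, n) & ngrams(train_tokens, n))

    train = "the quick brown fox jumps over the lazy dog".split()
    sample = "a quick brown fox jumps over the lazy cat".split()
    print(is_contaminated(sample, train, n=5))  # True: the texts share a 5-gram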

Beyond perplexity: Multi-dimensional safety evaluation of LLM compression

Z Xu, A Gupta, T Li, O Bentham, V Srikumar - arXiv preprint arXiv …, 2024 - arxiv.org
Increasingly, model compression techniques enable large language models (LLMs) to be
deployed in real-world applications. As a result of this momentum towards local deployment …
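
For reference, the perplexity the title proposes to go beyond is the exponentiated average negative log-likelihood per token. A small worked sketch with made-up per-token probabilities:

    import math

    def perplexity(token_probs):
        # token_probs: probability the model assigned to each observed token.
        nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
        return math.exp(nll)

    print(perplexity([0.25, 0.5, 0.1, 0.4]))  # larger when the model is more surprised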

How to Synthesize Text Data without Model Collapse?

X Zhu, D Cheng, H Li, K Zhang, E Hua, X Lv… - arXiv preprint arXiv …, 2024 - arxiv.org
Model collapse in synthetic data indicates that iterative training on self-generated data leads
to a gradual decline in performance. With the proliferation of AI models, synthetic data will …
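
The iterative loop under which collapse arises can be shown on a toy distribution: each generation is fit to samples drawn from the previous generation's model, and the fitted spread tends to drift downward. This schematic stands in for full LLM training and decoding; nothing here reproduces the paper's method.

    import numpy as np

    def fit_gaussian(data):            # stands in for "train a model on the data"
        return data.mean(), data.std()

    rng = np.random.default_rng(0)
    data = rng.normal(0.0, 1.0, size=50)             # real data, generation 0
    for gen in range(1, 11):
        mean, std = fit_gaussian(data)
        data = rng.normal(mean, std, size=50)        # replace data with model output
        print(f"gen {gen}: fitted std = {std:.3f}")  # spread tends to shrink over generations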