In-context learning and Occam's razor

E Elmoznino, T Marty, T Kasetty, L Gagnon… - arXiv preprint arXiv …, 2024 - arxiv.org
A central goal of machine learning is generalization. While the No Free Lunch Theorem
states that we cannot obtain theoretical guarantees for generalization without further …

Feature forgetting in continual representation learning

X Zhang, D Dou, J Wu - arXiv preprint arXiv:2205.13359, 2022 - arxiv.org
In continual and lifelong learning, good representation learning can help increase
performance and reduce sample complexity when learning new tasks. There is evidence …

Larger Language Models Provably Generalize Better

MA Finzi, S Kapoor, D Granziol, A Gu… - … Conference on Learning … - openreview.net
Why do larger language models generalize better? To explore this question, we develop
generalization bounds on the pretraining objective of large language models (LLMs) in the …

Information distance for neural network functions

X Zhang, D Dou, J Wu - openreview.net
We provide a practical distance measure in the space of functions parameterized by neural
networks. It is based on the classical information distance, and we propose to replace the …

Model information as an analysis tool in deep learning

X Zhang, D Hu, X Li, D Dou, J Wu - openreview.net
Information-theoretic perspectives can provide an alternative dimension for analyzing the
learning process and complement the usual performance metrics. Recently several works …