Feature contamination: Neural networks learn uncorrelated features and fail to generalize

T Zhang, C Zhao, G Chen, Y Jiang, F Chen - arXiv preprint arXiv …, 2024 - arxiv.org
Learning representations that generalize under distribution shifts is critical for building
robust machine learning models. However, despite significant efforts in recent years …

All or none: Identifiable linear properties of next-token predictors in language modeling

E Marconato, S Lachapelle, S Weichwald… - arXiv preprint arXiv …, 2024 - arxiv.org
We analyze identifiability as a possible explanation for the ubiquity of linear properties
across language models, such as the vector difference between the representations of …

Generalization from Starvation: Hints of Universality in LLM Knowledge Graph Learning

DD Baek, Y Li, M Tegmark - arXiv preprint arXiv:2410.08255, 2024 - arxiv.org
Motivated by interpretability and reliability, we investigate how neural networks represent
knowledge during graph learning. We find hints of universality, where equivalent …

Harmonic Loss Trains Interpretable AI Models

DD Baek, Z Liu, R Tyagi, M Tegmark - arXiv preprint arXiv:2502.01628, 2025 - arxiv.org
In this paper, we introduce **harmonic loss** as an alternative to the standard cross-entropy
loss for training neural networks and large language models (LLMs). Harmonic loss enables …

Representational Analysis of Binding in Language Models

Q Dai, B Heinzerling, K Inui - arXiv preprint arXiv:2409.05448, 2024 - arxiv.org
Entity tracking is essential for complex reasoning. To perform in-context entity tracking,
language models (LMs) must bind an entity to its attribute (e.g., bind a container to its content) …

On Representational Dissociation of Language and Arithmetic in Large Language Models

R Kisako, T Kuribayashi, R Sasano - arXiv preprint arXiv:2502.11932, 2025 - arxiv.org
The association between language and (non-linguistic) thinking ability in humans has long
been debated, and recently, neuroscientific evidence of brain activity patterns has been …

Think-to-Talk or Talk-to-Think? When LLMs Come Up with an Answer in Multi-Step Reasoning

K Kudo, Y Aoki, T Kuribayashi, S Sone… - arXiv preprint arXiv …, 2024 - arxiv.org
This study investigates the internal reasoning mechanism of language models during
symbolic multi-step reasoning, motivated by the question of whether chain-of-thought (CoT) …