Pretrained transformer efficiently learns low-dimensional target functions in-context
Transformers can efficiently learn in-context from example demonstrations. Most existing
theoretical analyses studied the in-context learning (ICL) ability of transformers for linear …
The computational complexity of learning Gaussian single-index models
Single-Index Models are high-dimensional regression problems with planted structure,
whereby labels depend on an unknown one-dimensional projection of the input via a …
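A minimal sketch of the single-index setup the snippet above describes: labels depend on the input only through one unknown one-dimensional projection. The dimension, sample count, and link function below are illustrative choices, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 50                                # ambient dimension (illustrative)
theta = rng.standard_normal(d)
theta /= np.linalg.norm(theta)        # unknown unit direction (the "planted structure")

def sigma(z):
    # illustrative link function: the degree-2 Hermite polynomial He_2
    return z**2 - 1

X = rng.standard_normal((1000, d))    # standard Gaussian inputs
y = sigma(X @ theta)                  # labels depend only on the 1-D projection <x, theta>
```

Any direction orthogonal to `theta` is invisible to the labels, which is what makes recovering `theta` from samples the core statistical problem.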
On the complexity of learning sparse functions with statistical and gradient queries
The goal of this paper is to investigate the complexity of gradient algorithms when learning
sparse functions (juntas). We introduce a type of Statistical Queries ($\mathsf{SQ}$), which …
Repetita iuvant: Data repetition allows SGD to learn high-dimensional multi-index functions
Neural networks can identify low-dimensional relevant structures within high-dimensional
noisy data, yet our mathematical understanding of how they do so remains scarce. Here, we …
Learning sum of diverse features: computational hardness and efficient gradient-based training for ridge combinations
We study the computational and sample complexity of learning a target function $f_*:\mathbb{R}^d\to\mathbb{R}$ with additive structure, that is, $f_*(x)=\frac{1}{\sqrt …
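The snippet above truncates the formula, but an additive "ridge combination" target of the kind it names can be sketched as a normalized sum of single-index terms; the number of directions, the link function, and the $1/\sqrt{M}$ normalization below are illustrative assumptions, not quoted from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)
d, M = 50, 4                                     # dimension and number of ridge directions (illustrative)
V = rng.standard_normal((M, d))
V /= np.linalg.norm(V, axis=1, keepdims=True)    # M unknown unit directions

def f_star(X):
    sigma = np.tanh                              # illustrative link function
    # additive structure: each label mixes M one-dimensional projections
    return sigma(X @ V.T).sum(axis=1) / np.sqrt(M)

X = rng.standard_normal((200, d))
y = f_star(X)
```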
Learning orthogonal multi-index models: A fine-grained information exponent analysis
The information exponent (Ben Arous et al. [2021]) -- which is equivalent to the lowest degree in the Hermite expansion of the link function for Gaussian single-index models -- has played …
A random matrix theory perspective on the spectrum of learned features and asymptotic generalization capabilities
A key property of neural networks is their capacity to adapt to data during training. Yet,
our current mathematical understanding of feature learning and its relationship to …
Learning Gaussian multi-index models with gradient flow: Time complexity and directional convergence
This work focuses on the gradient flow dynamics of a neural network model that uses
correlation loss to approximate a multi-index function on high-dimensional standard …
Neural network learns low-dimensional polynomials with SGD near the information-theoretic limit
We study the problem of gradient descent learning of a single-index target function $f_*(\boldsymbol{x})=\textstyle\sigma_*\left(\langle\boldsymbol{x},\boldsymbol …
Gradient dynamics for low-rank fine-tuning beyond kernels
LoRA has emerged as one of the de facto methods for fine-tuning foundation models with
low computational cost and memory footprint. The idea is to only train a low-rank …