SGD learning on neural networks: leap complexity and saddle-to-saddle dynamics
We investigate the time complexity of SGD learning on fully-connected neural networks with
isotropic data. We put forward a complexity measure, the leap, which measures how …
How far can Transformers reason? The globality barrier and inductive scratchpad
Can Transformers predict new syllogisms by composing established ones? More generally,
what type of targets can be learned by such models from scratch? Recent works show that …
Generalization on the unseen, logic reasoning and degree curriculum
This paper considers the learning of logical (Boolean) functions with a focus on the
generalization on the unseen (GOTU) setting, a strong case of out-of-distribution …
Provable guarantees for neural networks via gradient feature learning
Neural networks have achieved remarkable empirical performance, while the current
theoretical analysis is not adequate for understanding their success, e.g., the Neural Tangent …
Towards better out-of-distribution generalization of neural algorithmic reasoning tasks
Transfer learning beyond bounded density ratios
We study the fundamental problem of transfer learning where a learning algorithm collects
data from some source distribution $P$ but needs to perform well with respect to a different …
VarBench: Robust language model benchmarking through dynamic variable perturbation
As large language models achieve impressive scores on traditional benchmarks, an
increasing number of researchers are becoming concerned about benchmark data leakage …
Boolformer: Symbolic regression of logic functions with transformers
In this work, we introduce Boolformer, the first Transformer architecture trained to perform
end-to-end symbolic regression of Boolean functions. First, we show that it can predict …
The Buffer Mechanism for Multi-Step Information Reasoning in Language Models
Z Wang, Y Wang, Z Zhang, Z Zhou, H **, T Hu… - arXiv preprint arXiv …, 2024 - arxiv.org
Large language models have consistently struggled with complex reasoning tasks, such as
mathematical problem-solving. Investigating the internal reasoning mechanisms of these …