What can transformers learn in-context? a case study of simple function classes
In-context learning is the ability of a model to condition on a prompt sequence consisting of
in-context examples (input-output pairs corresponding to some task) along with a new query …
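The prompt format this entry describes — a sequence of input-output pairs for some function, followed by a query input — can be sketched minimally. This is an illustrative construction for a linear function class, not the paper's exact experimental setup; `make_prompt` and its parameters are hypothetical names.

```python
# Illustrative sketch (not the paper's setup): build an in-context prompt
# of k (input, output) examples for f(x) = <w, x>, plus one query input
# whose output an in-context learner would be asked to predict.
import random

def make_prompt(w, k, n, rng):
    """Sample k in-context examples for the linear task f(x) = <w, x>."""
    examples = []
    for _ in range(k):
        x = [rng.uniform(-1, 1) for _ in range(n)]
        y = sum(wi * xi for wi, xi in zip(w, x))  # exact label under w
        examples.append((x, y))
    query = [rng.uniform(-1, 1) for _ in range(n)]  # unlabeled query point
    return examples, query

rng = random.Random(0)
w = [1.0, -2.0]
examples, query = make_prompt(w, k=3, n=2, rng=rng)
```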
The staircase property: How hierarchical structure can guide deep learning
This paper identifies a structural property of data distributions that enables deep neural
networks to learn hierarchically. We define the ``staircase'' property for functions over the …
Computational complexity of learning neural networks: Smoothness and degeneracy
Understanding when neural networks can be learned efficiently is a fundamental question in
learning theory. Existing hardness results suggest that assumptions on both the input …
Connecting interpretability and robustness in decision trees through separation
Recent research has recognized interpretability and robustness as essential properties of
trustworthy classification. Curiously, a connection between robustness and interpretability …
Top-down induction of decision trees: rigorous guarantees and inherent limitations
Consider the following heuristic for building a decision tree for a function $f:\{0,1\}^n\to\{\pm 1\}$.
Place the most influential variable $x_i$ of $f$ at the root, and recurse on the …
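The greedy heuristic this entry describes — split on the most influential variable, then recurse on the two restrictions — can be sketched directly. This is a minimal illustration of that idea over the full truth table (exponential in $n$, fine for tiny examples), not the paper's exact procedure; `influence` here is the standard Boolean influence, the fraction of inputs where flipping bit $i$ changes the output.

```python
# Minimal sketch of top-down decision tree induction for f: {0,1}^n -> {+1,-1}:
# put the most influential variable at the root, recurse on the restrictions.
from itertools import product

def influence(f, n, i):
    """Fraction of inputs x where flipping bit i changes f(x)."""
    changed = 0
    for x in product((0, 1), repeat=n):
        y = list(x)
        y[i] = 1 - y[i]
        changed += f(x) != f(tuple(y))
    return changed / 2 ** n

def build_tree(f, n, depth):
    """Greedy top-down induction; returns a label or (var, left, right)."""
    values = {f(x) for x in product((0, 1), repeat=n)}
    if len(values) == 1:          # f is constant on this subcube: a leaf
        return values.pop()
    if depth == 0:                # out of budget: majority-vote leaf
        outs = [f(x) for x in product((0, 1), repeat=n)]
        return max(set(outs), key=outs.count)
    i = max(range(n), key=lambda j: influence(f, n, j))
    # Recurse on the restrictions f|_{x_i = 0} and f|_{x_i = 1}; note the
    # child trees index variables within their own (n-1)-bit subcube.
    def restrict(b):
        return lambda x: f(x[:i] + (b,) + x[i:])
    return (i,
            build_tree(restrict(0), n - 1, depth - 1),
            build_tree(restrict(1), n - 1, depth - 1))

# Example: majority of three bits; every variable has influence 1/2.
maj3 = lambda x: 1 if sum(x) >= 2 else -1
tree = build_tree(maj3, 3, 3)
```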
Harnessing the power of choices in decision tree learning
We propose a simple generalization of standard and empirically successful decision tree
learning algorithms such as ID3, C4.5, and CART. These algorithms, which have been …
Provable guarantees for decision tree induction: the agnostic setting
We give strengthened provable guarantees on the performance of widely employed and
empirically successful {\sl top-down decision tree learning heuristics}. While prior works …
Intelligent Heuristics Are the Future of Computing
Back in 1988, the partial game trees explored by computer chess programs were among the
largest search structures in real-world computing. Because the game tree is too large to be …
The implications of local correlation on learning some deep functions
It is known that learning deep neural-networks is computationally hard in the worst-case. In
fact, the proofs of such hardness results show that even weakly learning deep networks is …
Decision tree heuristics can fail, even in the smoothed setting
Greedy decision tree learning heuristics are mainstays of machine learning practice, but
theoretical justification for their empirical success remains elusive. In fact, it has long been …