Towards understanding ensemble, knowledge distillation and self-distillation in deep learning

Z Allen-Zhu, Y Li - arXiv preprint arXiv:2012.09816, 2020 - arxiv.org
We formally study how ensembles of deep learning models can improve test accuracy, and
how the superior performance of an ensemble can be distilled into a single model using …
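
The snippet names knowledge distillation from an ensemble into a single model. As a rough illustration of that general technique (a minimal sketch, not the paper's own analysis or code), one might average the logits of several trained models and train a student against the temperature-smoothed ensemble output; the function names and the temperature value here are illustrative assumptions:

```python
import torch
import torch.nn.functional as F

def ensemble_logits(models, x):
    # One simple ensemble "teacher": average the logits of K
    # independently trained models on the same input batch.
    return torch.stack([m(x) for m in models]).mean(dim=0)

def distillation_loss(student_logits, teacher_logits, temperature=4.0):
    # KL divergence between temperature-smoothed teacher and student outputs.
    soft_targets = F.softmax(teacher_logits / temperature, dim=-1)
    log_student = F.log_softmax(student_logits / temperature, dim=-1)
    # The T^2 factor keeps gradient magnitudes comparable across temperatures.
    return F.kl_div(log_student, soft_targets, reduction="batchmean") * temperature**2
```

In practice this soft-label term is usually combined with the ordinary hard-label cross-entropy on the training data.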

Orthogonal representations for robust context-dependent task performance in brains and neural networks

T Flesch, K Juechems, T Dumbalska, A Saxe… - Neuron, 2022 - cell.com
How do neural populations code for multiple, potentially conflicting tasks? Here we used
computational simulations involving neural networks to define "lazy" and "rich" coding …
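
The snippet contrasts "lazy" and "rich" coding regimes in trained networks. In simulations, a common way to move between these regimes is to vary the magnitude of the initial weights: large initial weights keep features close to initialization (lazy), while small initial weights let representations reorganize around the task (rich). The toy sketch below assumes this initialization-scale knob and a hypothetical two-layer network; it is not the paper's exact setup:

```python
import torch
import torch.nn as nn

def make_mlp(in_dim=10, width=512, out_dim=2, init_scale=1.0):
    # Two-layer ReLU network whose initial weight magnitude sets the regime:
    # large init_scale -> "lazy" dynamics (features barely move from init),
    # small init_scale -> "rich" dynamics (task-specific feature learning).
    net = nn.Sequential(
        nn.Linear(in_dim, width),
        nn.ReLU(),
        nn.Linear(width, out_dim),
    )
    with torch.no_grad():
        for p in net.parameters():
            p.mul_(init_scale)
    return net

lazy_net = make_mlp(init_scale=5.0)   # near-linear dynamics around init
rich_net = make_mlp(init_scale=0.1)   # representations reorganize during training
```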