Imanol Schlag
ETH AI Center
Verified email at ethz.ch
Title
Cited by
Year
Solving quantitative reasoning problems with language models
A Lewkowycz, A Andreassen, D Dohan, E Dyer, H Michalewski, ...
Advances in Neural Information Processing Systems 35, 3843-3857, 2022
Cited by: 670 (2022)
Linear Transformers are Secretly Fast Weight Programmers
I Schlag*, K Irie*, J Schmidhuber
International Conference on Machine Learning, 9355-9366, 2021
Cited by: 265* (2021)
Block-Recurrent Transformers
DL Hutchins*, I Schlag*, Y Wu, E Dyer, B Neyshabur
arXiv preprint arXiv:2203.07852, 2022
Cited by: 119 (2022)
Learning to reason with third order tensor products
I Schlag, J Schmidhuber
Advances in Neural Information Processing Systems 31, 9981-9993, 2018
Cited by: 85 (2018)
Going beyond linear transformers with recurrent fast weight programmers
K Irie*, I Schlag*, R Csordás, J Schmidhuber
Advances in Neural Information Processing Systems 34, 2021
Cited by: 77 (2021)
Enhancing the transformer with explicit relational encoding for math problem solving
I Schlag, P Smolensky, R Fernandez, N Jojic, J Schmidhuber, J Gao
arXiv preprint arXiv:1910.06611, 2019
Cited by: 77 (2019)
Mindstorms in Natural Language-Based Societies of Mind
M Zhuge, H Liu, F Faccio, DR Ashley, R Csordás, A Gopalakrishnan, ...
arXiv preprint arXiv:2305.17066, 2023
Cited by: 68 (2023)
Learning Associative Inference Using Fast Weight Memory
I Schlag, T Munkhdalai, J Schmidhuber
International Conference on Learning Representations, 2021
Cited by: 50 (2021)
Ancient Roman coin recognition in the wild using deep learning based recognition of artistically depicted face profiles
I Schlag, O Arandjelovic
Proceedings of the IEEE International Conference on Computer Vision …, 2017
Cited by: 43 (2017)
Solving quantitative reasoning problems with language models, 2022
A Lewkowycz, A Andreassen, D Dohan, E Dyer, H Michalewski, ...
URL https://arxiv.org/abs/2206.14858
Cited by: 43*
A Modern Self-Referential Weight Matrix That Learns to Modify Itself
K Irie, I Schlag, R Csordás, J Schmidhuber
Deep RL Workshop NeurIPS 2021, 2021
Cited by: 38 (2021)
Gated fast weights for on-the-fly neural program generation
I Schlag, J Schmidhuber
NIPS Metalearning Workshop, 2017
Cited by: 33 (2017)
Large Language Model Programs
I Schlag, S Sukhbaatar, A Celikyilmaz, W Yih, J Weston, J Schmidhuber, ...
arXiv preprint arXiv:2305.05364, 2023
Cited by: 21 (2023)
The Languini Kitchen: Enabling Language Modelling Research at Different Scales of Compute
A Stanić, D Ashley, O Serikov, L Kirsch, F Faccio, J Schmidhuber, ...
arXiv preprint arXiv:2309.11197, 2023
Cited by: 8 (2023)
Understanding and Minimising Outlier Features in Neural Network Training
B He, L Noci, D Paliotta, I Schlag, T Hofmann
arXiv preprint arXiv:2405.19279, 2024
Cited by: 4 (2024)
Block-recurrent transformers (2022)
DL Hutchins, I Schlag, Y Wu, E Dyer, B Neyshabur
URL https://arxiv.org/abs/2203.07852
Cited by: 4
Language Imbalance Can Boost Cross-lingual Generalisation
A Schäfer, S Ravfogel, T Hofmann, T Pimentel, I Schlag
arXiv preprint arXiv:2404.07982, 2024
Cited by: 3 (2024)
Navigating Scaling Laws: Accelerating Vision Transformer's Training via Adaptive Strategies
S Anagnostidis, G Bachmann, I Schlag, T Hofmann
arXiv preprint arXiv:2311.03233, 2024
Cited by: 3 (2024)
Improving Baselines in the Wild
K Irie, I Schlag, R Csordás, J Schmidhuber
NeurIPS 2021 Workshop on Distribution Shifts: Connecting Methods and …, 2021
Cited by: 3 (2021)
INCLUDE: Evaluating Multilingual Language Understanding with Regional Knowledge
A Romanou, N Foroutan, A Sotnikova, Z Chen, SH Nelaturu, S Singh, ...
arXiv preprint arXiv:2411.19799, 2024
Cited by: 2 (2024)