Imanol Schlag
ETH AI Center
Verified email at ethz.ch
Title
Cited by
Year
Solving quantitative reasoning problems with language models
A Lewkowycz, A Andreassen, D Dohan, E Dyer, H Michalewski, ...
Advances in Neural Information Processing Systems 35, 3843-3857, 2022
Cited by: 727; Year: 2022
Linear Transformers are Secretly Fast Weight Programmers
I Schlag*, K Irie*, J Schmidhuber
International Conference on Machine Learning, 9355-9366, 2021
Cited by: 269*; Year: 2021
Block-Recurrent Transformers
DL Hutchins*, I Schlag*, Y Wu, E Dyer, B Neyshabur
arXiv preprint arXiv:2203.07852, 2022
Cited by: 121; Year: 2022
Learning to reason with third order tensor products
I Schlag, J Schmidhuber
Advances in Neural Information Processing Systems 31, 9981-9993, 2018
Cited by: 84; Year: 2018
Going beyond linear transformers with recurrent fast weight programmers
K Irie*, I Schlag*, R Csordás, J Schmidhuber
Advances in Neural Information Processing Systems 34, 2021
Cited by: 77; Year: 2021
Enhancing the transformer with explicit relational encoding for math problem solving
I Schlag, P Smolensky, R Fernandez, N Jojic, J Schmidhuber, J Gao
arXiv preprint arXiv:1910.06611, 2019
Cited by: 76; Year: 2019
Mindstorms in Natural Language-Based Societies of Mind
M Zhuge, H Liu, F Faccio, DR Ashley, R Csordás, A Gopalakrishnan, ...
arXiv preprint arXiv:2305.17066, 2023
Cited by: 72; Year: 2023
Learning Associative Inference Using Fast Weight Memory
I Schlag, T Munkhdalai, J Schmidhuber
International Conference on Learning Representations, 2021
Cited by: 49; Year: 2021
A Modern Self-Referential Weight Matrix That Learns to Modify Itself
K Irie, I Schlag, R Csordás, J Schmidhuber
Deep RL Workshop NeurIPS 2021, 2021
Cited by: 42; Year: 2021
Ancient Roman coin recognition in the wild using deep learning based recognition of artistically depicted face profiles
I Schlag, O Arandjelovic
Proceedings of the IEEE International Conference on Computer Vision …, 2017
Cited by: 42; Year: 2017
Gated fast weights for on-the-fly neural program generation
I Schlag, J Schmidhuber
NIPS Metalearning Workshop, 2017
Cited by: 33; Year: 2017
Large Language Model Programs
I Schlag, S Sukhbaatar, A Celikyilmaz, W Yih, J Weston, J Schmidhuber, ...
arXiv preprint arXiv:2305.05364, 2023
Cited by: 25; Year: 2023
The Languini Kitchen: Enabling Language Modelling Research at Different Scales of Compute
A Stanić, D Ashley, O Serikov, L Kirsch, F Faccio, J Schmidhuber, ...
arXiv preprint arXiv:2309.11197, 2023
Cited by: 9; Year: 2023
Solving quantitative reasoning problems with language models, 2022
A Lewkowycz, A Andreassen, D Dohan, E Dyer, H Michalewski, ...
URL https://arxiv.org/abs/2206.14858
Cited by: 8
Understanding and Minimising Outlier Features in Neural Network Training
B He, L Noci, D Paliotta, I Schlag, T Hofmann
arXiv preprint arXiv:2405.19279, 2024
Cited by: 5; Year: 2024
Block-recurrent transformers (2022)
DL Hutchins, I Schlag, Y Wu, E Dyer, B Neyshabur
URL https://arxiv.org/abs/2203.07852
Cited by: 5
INCLUDE: Evaluating Multilingual Language Understanding with Regional Knowledge
A Romanou, N Foroutan, A Sotnikova, Z Chen, SH Nelaturu, S Singh, ...
arXiv preprint arXiv:2411.19799, 2024
Cited by: 4; Year: 2024
The Role of Language Imbalance in Cross-lingual Generalisation: Insights from Cloned Language Experiments
A Schäfer, S Ravfogel, T Hofmann, T Pimentel, I Schlag
arXiv preprint arXiv:2404.07982, 2024
Cited by: 3*; Year: 2024
Navigating Scaling Laws: Accelerating Vision Transformer's Training via Adaptive Strategies
S Anagnostidis, G Bachmann, I Schlag, T Hofmann
arXiv preprint arXiv:2311.03233, 2024
Cited by: 3; Year: 2024
Improving Baselines in the Wild
K Irie, I Schlag, R Csordás, J Schmidhuber
NeurIPS 2021 Workshop on Distribution Shifts: Connecting Methods and …, 2021
Cited by: 1; Year: 2021