Data-free quantization through weight equalization and bias correction
M Nagel, M Baalen, T Blankevoort, M Welling
Proceedings of the IEEE/CVF international conference on computer vision …, 2019
Cited by: 635 | Year: 2019

A white paper on neural network quantization
M Nagel, M Fournarakis, RA Amjad, Y Bondarenko, M Van Baalen, ...
arXiv preprint arXiv:2106.08295, 2021
Cited by: 622 | Year: 2021

Up or down? Adaptive rounding for post-training quantization
M Nagel, RA Amjad, M Van Baalen, C Louizos, T Blankevoort
International conference on machine learning, 7197-7206, 2020
Cited by: 584 | Year: 2020

Bayesian bits: Unifying quantization and pruning
M Van Baalen, C Louizos, M Nagel, RA Amjad, Y Wang, T Blankevoort, ...
Advances in neural information processing systems 33, 5741-5752, 2020
Cited by: 144 | Year: 2020

FP8 quantization: The power of the exponent
A Kuzmin, M Van Baalen, Y Ren, M Nagel, J Peters, T Blankevoort
Advances in Neural Information Processing Systems 35, 14651-14662, 2022
Cited by: 73 | Year: 2022

Gradient Regularization for Quantization Robustness
M Alizadeh, A Behboodi, M Van Baalen, C Louizos, T Blankevoort, ...
arXiv preprint arXiv:2002.07520, 2020
Cited by: 64 | Year: 2020

Pruning vs quantization: Which is better?
A Kuzmin, M Nagel, M Van Baalen, A Behboodi, T Blankevoort
Advances in neural information processing systems 36, 62414-62427, 2023
Cited by: 50 | Year: 2023

A white paper on neural network quantization. arXiv 2021
M Nagel, M Fournarakis, RA Amjad, Y Bondarenko, M van Baalen, ...
arXiv preprint arXiv:2106.08295 4, 0
Cited by: 39

FP8 versus INT8 for efficient deep learning inference
M Van Baalen, A Kuzmin, SS Nair, Y Ren, E Mahurin, C Patel, ...
arXiv preprint arXiv:2303.17951, 2023
Cited by: 38 | Year: 2023

The LLM surgeon
TFA van der Ouderaa, M Nagel, M Van Baalen, YM Asano, T Blankevoort
arXiv preprint arXiv:2312.17244, 2023
Cited by: 30 | Year: 2023

Deep matrix factorization for recommendation
M van Baalen
Master's Thesis, Univ. of Amsterdam, Sep 30, 2016
Cited by: 22 | Year: 2016

GPTVQ: The blessing of dimensionality for LLM quantization
M Van Baalen, A Kuzmin, M Nagel, P Couperus, C Bastoul, E Mahurin, ...
arXiv preprint arXiv:2402.15319, 2024
Cited by: 19 | Year: 2024

Cyclical pruning for sparse neural networks
S Srinivas, A Kuzmin, M Nagel, M van Baalen, A Skliar, T Blankevoort
Proceedings of the IEEE/CVF conference on computer vision and pattern …, 2022
Cited by: 19 | Year: 2022

A practical mixed precision algorithm for post-training quantization
NP Pandey, M Nagel, M van Baalen, Y Huang, C Patel, T Blankevoort
arXiv preprint arXiv:2302.05397, 2023
Cited by: 12 | Year: 2023

Simulated quantization, real power savings
M van Baalen, B Kahne, E Mahurin, A Kuzmin, A Skliar, M Nagel, ...
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2022
Cited by: 10 | Year: 2022

FP8 versus INT8 for efficient deep learning inference
M Baalen, A Kuzmin, SS Nair, Y Ren, E Mahurin, C Patel, S Subramanian, ...
arXiv preprint arXiv:2303.17951, 2023
Cited by: 6 | Year: 2023

QBitOpt: Fast and accurate bitwidth reallocation during training
J Peters, M Fournarakis, M Nagel, M Van Baalen, T Blankevoort
Proceedings of the IEEE/CVF international conference on computer vision …, 2023
Cited by: 6 | Year: 2023

Quantized sparse weight decomposition for neural network compression
A Kuzmin, M van Baalen, M Nagel, A Behboodi
arXiv preprint arXiv:2207.11048, 2022
Cited by: 3 | Year: 2022

Mixture of cache-conditional experts for efficient mobile device inference
A Skliar, T van Rozendaal, R Lepert, T Boinovski, M van Baalen, M Nagel, ...
arXiv preprint arXiv:2412.00099, 2024
Cited by: 2 | Year: 2024

Rapid switching and multi-adapter fusion via sparse high rank adapters
K Bhardwaj, NP Pandey, S Priyadarshi, V Ganapathy, R Esteves, ...
arXiv preprint arXiv:2407.16712, 2024
Cited by: 2 | Year: 2024