Mingze Wang
School of Mathematical Sciences, Peking University
Verified email at stu.pku.edu.cn
Title
Cited by
Year
The Alignment Property of SGD Noise and How it Helps Select Flat Minima: A Stability Analysis
L Wu, M Wang, WJ Su
Advances in Neural Information Processing Systems (NeurIPS 2022), 1-25, 2022
Cited by: 43*, Year: 2022
Understanding Multi-phase Optimization Dynamics and Rich Nonlinear Behaviors of ReLU Networks
M Wang, C Ma
Advances in Neural Information Processing Systems (NeurIPS 2023, Spotlight …, 2023
Cited by: 14, Year: 2023
Generalization Error Bounds for Deep Neural Networks Trained by SGD
M Wang, C Ma
arXiv: 2206.03299, 1-32, 2022
Cited by: 14, Year: 2022
Early Stage Convergence and Global Convergence of Training Mildly Parameterized Neural Networks
M Wang, C Ma
Advances in Neural Information Processing Systems (NeurIPS 2022), 1-73, 2022
Cited by: 7, Year: 2022
Are AI-Generated Text Detectors Robust to Adversarial Perturbations?
G Huang, Y Zhang, Z Li, Y You, M Wang, Z Yang
Annual Meeting of the Association for Computational Linguistics (ACL 2024), 1-20, 2024
Cited by: 5, Year: 2024
A Theoretical Analysis of Noise Geometry in Stochastic Gradient Descent
M Wang, L Wu
NeurIPS 2023 Workshop on M3L, 1-30, 2023
Cited by: 5*, Year: 2023
Understanding the Expressive Power and Mechanisms of Transformer for Sequence Modeling
M Wang, W E
Advances in Neural Information Processing Systems (NeurIPS 2024), 1-76, 2024
Cited by: 4, Year: 2024
Improving Generalization and Convergence by Enhancing Implicit Regularization
M Wang, J Wang, H He, Z Wang, G Huang, F Xiong, Z Li, W E, L Wu
Advances in Neural Information Processing Systems (NeurIPS 2024), 1-44, 2024
Cited by: 3, Year: 2024
Loss Symmetry and Noise Equilibrium of Stochastic Gradient Descent
L Ziyin, M Wang, H Li, L Wu
Advances in Neural Information Processing Systems (NeurIPS 2024), 1-26, 2024
Cited by: 3*, Year: 2024
How Transformers Get Rich: Approximation and Dynamics Analysis
M Wang, R Yu, W E, L Wu
arXiv preprint arXiv:2410.11474, 1-46, 2024
Cited by: 2*, Year: 2024
Achieving Margin Maximization Exponentially Fast via Progressive Norm Rescaling
M Wang, Z Min, L Wu
International Conference on Machine Learning (ICML 2024), 1-38, 2023
Cited by: 2, Year: 2023
Sharpness-Aware Minimization Efficiently Selects Flatter Minima Late in Training
Z Zhou*, M Wang*, Y Mao, B Li, J Yan
International Conference on Learning Representations (ICLR 2025, Spotlight …, 2024
Year: 2024