Hunyuan-Large: An open-source MoE model with 52 billion activated parameters by Tencent

X Sun, Y Chen, Y Huang, R Xie, J Zhu, K Zhang… - arXiv preprint arXiv …, 2024 - arxiv.org
In this paper, we introduce Hunyuan-Large, which is currently the largest open-source
Transformer-based mixture of experts model, with a total of 389 billion parameters and 52 …
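As a rough illustration of why only a fraction of an MoE model's parameters are activated per token, here is a minimal NumPy sketch of top-k expert routing; the expert count, layer sizes, and top_k value are illustrative placeholders, not Hunyuan-Large's actual configuration.

```python
# Minimal sketch of top-k mixture-of-experts (MoE) routing in NumPy.
# All sizes below are illustrative assumptions, not Hunyuan-Large's config.
import numpy as np

rng = np.random.default_rng(0)

d_model, d_ff = 64, 256      # illustrative hidden sizes (assumption)
num_experts, top_k = 8, 2    # route each token to 2 of 8 experts (assumption)

# Each expert is an independent feed-forward block: W1 (d_model x d_ff), W2 (d_ff x d_model).
experts = [
    (rng.standard_normal((d_model, d_ff)) * 0.02,
     rng.standard_normal((d_ff, d_model)) * 0.02)
    for _ in range(num_experts)
]
router = rng.standard_normal((d_model, num_experts)) * 0.02  # gating projection


def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def moe_layer(tokens):
    """tokens: (n_tokens, d_model). Only top_k experts run per token,
    so the activated parameter count is a fraction of the total."""
    gate_logits = tokens @ router                       # (n_tokens, num_experts)
    probs = softmax(gate_logits)
    top_idx = np.argsort(-probs, axis=-1)[:, :top_k]    # chosen experts per token
    out = np.zeros_like(tokens)
    for t, x in enumerate(tokens):
        chosen = top_idx[t]
        gates = probs[t, chosen] / probs[t, chosen].sum()   # renormalized gate weights
        for gate, e_idx in zip(gates, chosen):
            W1, W2 = experts[e_idx]
            out[t] += gate * (np.maximum(x @ W1, 0.0) @ W2)  # ReLU FFN expert
    return out


tokens = rng.standard_normal((4, d_model))
print(moe_layer(tokens).shape)  # (4, 64): same output shape, only 2/8 experts ran per token
```

The design point the sketch makes is that total parameter count scales with the number of experts while per-token compute scales only with top_k.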

On Scaling Up 3D Gaussian Splatting Training

H Zhao, H Weng, D Lu, A Li, J Li, A Panda… - arXiv preprint arXiv …, 2024 - arxiv.org
3D Gaussian Splatting (3DGS) is increasingly popular for 3D reconstruction due to its
superior visual quality and rendering speed. However, 3DGS training currently occurs on a …
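For context on what 3DGS computes per pixel, below is a minimal sketch of front-to-back alpha compositing over depth-sorted Gaussians, reduced to already-projected 2D Gaussians and a single pixel; the projection, tile scheduling, gradient machinery, and the multi-GPU scaling that the paper addresses are all omitted.

```python
# Minimal sketch of the alpha-compositing step at the core of Gaussian Splatting,
# simplified to already-projected 2D Gaussians and one pixel (assumption: the
# real pipeline's 3D->2D projection and tiling are out of scope here).
import numpy as np


def gaussian_weight(pixel, mean, cov):
    """Evaluate an unnormalized 2D Gaussian at a pixel position."""
    d = pixel - mean
    return float(np.exp(-0.5 * d @ np.linalg.inv(cov) @ d))


def composite_pixel(pixel, gaussians):
    """Front-to-back alpha compositing over depth-sorted Gaussians.
    Each Gaussian: dict with 2D mean, 2x2 cov, opacity, RGB color, depth."""
    color = np.zeros(3)
    transmittance = 1.0
    for g in sorted(gaussians, key=lambda g: g["depth"]):    # nearest first
        alpha = g["opacity"] * gaussian_weight(pixel, g["mean"], g["cov"])
        color += transmittance * alpha * g["color"]
        transmittance *= (1.0 - alpha)
        if transmittance < 1e-4:                              # early termination
            break
    return color


gaussians = [
    {"mean": np.array([5.0, 5.0]), "cov": np.eye(2) * 4.0,
     "opacity": 0.8, "color": np.array([1.0, 0.2, 0.2]), "depth": 1.0},
    {"mean": np.array([6.0, 4.0]), "cov": np.eye(2) * 9.0,
     "opacity": 0.5, "color": np.array([0.2, 0.2, 1.0]), "depth": 2.0},
]
print(composite_pixel(np.array([5.0, 5.0]), gaussians))
```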

Exploring Variance Reduction in Importance Sampling for Efficient DNN Training

T Kutsuna - arXiv preprint arXiv:2501.13296, 2025 - arxiv.org
Importance sampling is widely used to improve the efficiency of deep neural network (DNN)
training by reducing the variance of gradient estimators. However, efficiently assessing the …
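As a rough sketch of the general idea, the example below draws training examples with non-uniform probabilities p_i and reweights their gradients by 1/(N p_i) so the minibatch gradient estimator stays unbiased; the linear model, squared loss, and loss-proportional sampling scores are illustrative assumptions, not the scheme proposed in the paper.

```python
# Minimal sketch of importance-sampled SGD: sample example i with probability p_i,
# reweight its gradient by 1/(N * p_i) so the batch estimate of the mean-loss
# gradient remains unbiased. Loss-proportional p_i is an illustrative choice.
import numpy as np

rng = np.random.default_rng(0)
N, d = 1000, 10
X = rng.standard_normal((N, d))
y = X @ rng.standard_normal(d) + 0.1 * rng.standard_normal(N)
w = np.zeros(d)

lr, batch_size = 0.05, 32
print("initial MSE:", float(np.mean((X @ w - y) ** 2)))

for step in range(500):
    # Per-example squared residual as the (unnormalized) importance score.
    # Note: scoring the full dataset each step is exactly the cost that
    # efficient schemes try to avoid; it is done naively here for clarity.
    residual = X @ w - y
    scores = residual ** 2 + 1e-8
    p = scores / scores.sum()                          # sampling distribution

    idx = rng.choice(N, size=batch_size, p=p)          # importance-sampled batch
    iw = 1.0 / (N * p[idx])                            # unbiasedness correction

    per_ex_grad = X[idx] * residual[idx][:, None]      # grad of 0.5*(x.w - y)^2
    grad_est = (iw[:, None] * per_ex_grad).mean(axis=0)  # unbiased estimate of mean-loss grad
    w -= lr * grad_est

print("final MSE:", float(np.mean((X @ w - y) ** 2)))
```

The correction weights matter: sampling hard examples more often without dividing by N * p_i would bias the gradient toward high-loss examples rather than merely reducing estimator variance.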