Hunyuan-Large: An Open-Source MoE Model with 52 Billion Activated Parameters by Tencent
In this paper, we introduce Hunyuan-Large, which is currently the largest open-source
Transformer-based mixture of experts model, with a total of 389 billion parameters and 52 …
On Scaling Up 3D Gaussian Splatting Training
3D Gaussian Splatting (3DGS) is increasingly popular for 3D reconstruction due to its
superior visual quality and rendering speed. However, 3DGS training currently occurs on a …
Exploring Variance Reduction in Importance Sampling for Efficient DNN Training
T. Kutsuna - arXiv preprint arXiv:2501.13296, 2025 - arxiv.org
Importance sampling is widely used to improve the efficiency of deep neural network (DNN)
training by reducing the variance of gradient estimators. However, efficiently assessing the …
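As a rough illustration of the variance-reduction idea this entry refers to, the sketch below shows generic importance-sampled SGD, not Kutsuna's specific estimator; the function name `importance_sampled_step` and the toy setup are assumptions for illustration. Examples are drawn with non-uniform probabilities and their losses are re-weighted by 1/(N·p_i) so the mini-batch gradient remains an unbiased estimate of the full-batch gradient.

```python
# Minimal sketch of importance-sampled SGD (illustrative only, not the
# paper's estimator). Examples are drawn with probabilities `probs` and their
# losses re-weighted by 1 / (N * p_i), keeping the gradient estimate unbiased.
import torch


def importance_sampled_step(model, loss_fn, xs, ys, probs, batch_size, opt):
    """One training step with importance sampling over N examples.

    `loss_fn` must return per-example losses (e.g. reduction='none').
    """
    n = xs.shape[0]
    idx = torch.multinomial(probs, batch_size, replacement=True)
    weights = 1.0 / (n * probs[idx])                    # unbiasedness correction
    opt.zero_grad()
    losses = loss_fn(model(xs[idx]), ys[idx])           # shape (batch_size, ...)
    per_example = losses.view(batch_size, -1).mean(dim=1)
    (weights * per_example).mean().backward()           # estimates the full-batch gradient
    opt.step()


# Toy usage (hypothetical setup, not from the paper):
model = torch.nn.Linear(10, 1)
opt = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = torch.nn.MSELoss(reduction="none")
xs, ys = torch.randn(1000, 10), torch.randn(1000, 1)
probs = torch.full((1000,), 1.0 / 1000)                 # uniform sampling = plain SGD
importance_sampled_step(model, loss_fn, xs, ys, probs, batch_size=32, opt=opt)
```

With uniform `probs` this reduces to plain SGD; a common variance-reduction choice (not necessarily the one studied in this paper) is to make `probs` roughly proportional to per-example gradient norms, which is precisely the kind of per-sample quantity that is costly to assess during training.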