Hunyuan-Large: An Open-Source MoE Model with 52 Billion Activated Parameters by Tencent
In this paper, we introduce Hunyuan-Large, which is currently the largest open-source
Transformer-based mixture of experts model, with a total of 389 billion parameters and 52 …
On Scaling Up 3D Gaussian Splatting Training
3D Gaussian Splatting (3DGS) is increasingly popular for 3D reconstruction due to its
superior visual quality and rendering speed. However, 3DGS training currently occurs on a …
Exploring Variance Reduction in Importance Sampling for Efficient DNN Training
T. Kutsuna - arXiv preprint arXiv:2501.13296, 2025 - arxiv.org
Importance sampling is widely used to improve the efficiency of deep neural network (DNN)
training by reducing the variance of gradient estimators. However, efficiently assessing the …
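As a rough illustration of the variance-reduction idea this entry refers to, the sketch below shows generic importance-sampled SGD, not Kutsuna's specific estimator; the function name `importance_sampled_step` and the toy setup are assumptions for illustration. Examples are drawn with non-uniform probabilities and their losses are re-weighted by 1/(N·p_i) so the mini-batch gradient remains an unbiased estimate of the full-batch gradient.

```python
# Minimal sketch of importance-sampled SGD (illustrative only, not the
# paper's estimator). Examples are drawn with probabilities `probs` and their
# losses re-weighted by 1 / (N * p_i), keeping the gradient estimate unbiased.
import torch


def importance_sampled_step(model, loss_fn, xs, ys, probs, batch_size, opt):
    """One training step with importance sampling over N examples.

    `loss_fn` must return per-example losses (e.g. reduction='none').
    """
    n = xs.shape[0]
    idx = torch.multinomial(probs, batch_size, replacement=True)
    weights = 1.0 / (n * probs[idx])                    # unbiasedness correction
    opt.zero_grad()
    losses = loss_fn(model(xs[idx]), ys[idx])           # shape (batch_size, ...)
    per_example = losses.view(batch_size, -1).mean(dim=1)
    (weights * per_example).mean().backward()           # estimates the full-batch gradient
    opt.step()


# Toy usage (hypothetical setup, not from the paper):
model = torch.nn.Linear(10, 1)
opt = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = torch.nn.MSELoss(reduction="none")
xs, ys = torch.randn(1000, 10), torch.randn(1000, 1)
probs = torch.full((1000,), 1.0 / 1000)                 # uniform sampling = plain SGD
importance_sampled_step(model, loss_fn, xs, ys, probs, batch_size=32, opt=opt)
```

With uniform `probs` this reduces to plain SGD; a common variance-reduction choice (not necessarily the one studied in this paper) is to make `probs` roughly proportional to per-example gradient norms, which is precisely the kind of per-sample quantity that is costly to assess during training.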