Shuming Ma
Microsoft Research Asia
Verified email at microsoft.com - Homepage
Title
Cited by
Year
Kosmos-2: Grounding multimodal large language models to the world
Z Peng, W Wang, L Dong, Y Hao, S Huang, S Ma, F Wei
arXiv preprint arXiv:2306.14824, 2023
604 · 2023
SGM: sequence generation model for multi-label classification
P Yang, X Sun, W Li, S Ma, W Wu, H Wang
arXiv preprint arXiv:1806.04822, 2018
501 · 2018
Language is not all you need: Aligning perception with language models
S Huang, L Dong, W Wang, Y Hao, S Singhal, S Ma, T Lv, L Cui, ...
Advances in Neural Information Processing Systems 36, 72096-72109, 2023
478 · 2023
Why can GPT learn in-context? Language models implicitly perform gradient descent as meta-optimizers
D Dai, Y Sun, L Dong, Y Hao, S Ma, Z Sui, F Wei
arXiv preprint arXiv:2212.10559, 2022
355 · 2022
Retentive network: A successor to transformer for large language models
Y Sun, L Dong, S Huang, S Ma, Y Xia, J Xue, J Wang, F Wei
arXiv preprint arXiv:2307.08621, 2023
300 · 2023
meProp: Sparsified back propagation for accelerated deep learning with reduced overfitting
X Sun, X Ren, S Ma, H Wang
International Conference on Machine Learning, 3299-3308, 2017
202 · 2017
Global encoding for abstractive summarization
J Lin, X Sun, S Ma, Q Su
arXiv preprint arXiv:1805.03989, 2018
201 · 2018
A whole-slide foundation model for digital pathology from real-world data
H Xu, N Usuyama, J Bagga, S Zhang, R Rao, T Naumann, C Wong, ...
Nature, 1-8, 2024
171 · 2024
DeepNet: Scaling transformers to 1,000 layers
H Wang, S Ma, L Dong, S Huang, D Zhang, F Wei
IEEE Transactions on Pattern Analysis and Machine Intelligence, 2024
171 · 2024
The era of 1-bit LLMs: All large language models are in 1.58 bits
S Ma, H Wang, L Ma, L Wang, W Wang, S Huang, L Dong, R Wang, J Xue, ...
arXiv preprint arXiv:2402.17764, 2024
166 · 2024
LongNet: Scaling transformers to 1,000,000,000 tokens
J Ding, S Ma, L Dong, X Zhang, S Huang, W Wang, N Zheng, F Wei
arXiv preprint arXiv:2307.02486, 2023
152 · 2023
A length-extrapolatable transformer
Y Sun, L Dong, B Patra, S Ma, S Huang, A Benhaim, V Chaudhary, ...
arXiv preprint arXiv:2212.10554, 2022
149 · 2022
XLM-E: Cross-lingual Language Model Pre-training via ELECTRA
Z Chi
arXiv preprint arXiv:2106.16138, 2021
133 · 2021
Language models are general-purpose interfaces
Y Hao, H Song, L Dong, S Huang, Z Chi, W Wang, S Ma, F Wei
arXiv preprint arXiv:2206.06336, 2022
108 · 2022
A simple and effective unified encoder for document-level machine translation
S Ma, D Zhang, M Zhou
Proceedings of the 58th annual meeting of the association for computational …, 2020
105 · 2020
BitNet: Scaling 1-bit transformers for large language models
H Wang, S Ma, L Dong, S Huang, H Wang, L Ma, F Yang, R Wang, Y Wu, ...
arXiv preprint arXiv:2310.11453, 2023
101 · 2023
Alternating language modeling for cross-lingual pre-training
J Yang, S Ma, D Zhang, S Wu, Z Li, M Zhou
Proceedings of the AAAI Conference on Artificial Intelligence 34 (05), 9386-9393, 2020
95 · 2020
mT6: Multilingual Pretrained Text-to-Text Transformer with Translation Pairs
Z Chi
arXiv preprint arXiv:2104.08692, 2021
84 · 2021
On the representation collapse of sparse mixture of experts
Z Chi, L Dong, S Huang, D Dai, S Ma, B Patra, S Singhal, P Bajaj, X Song, ...
Advances in Neural Information Processing Systems 35, 34600-34613, 2022
83 · 2022
Improving semantic relevance for sequence-to-sequence learning of chinese social media text summarization
S Ma, X Sun, J Xu, H Wang, W Li, Q Su
arXiv preprint arXiv:1706.02459, 2017
83 · 2017
Articles 1–20