Sharan Narang
Research Engineer, Meta AI
Verified email at meta.com
Title · Cited by · Year
Exploring the limits of transfer learning with a unified text-to-text transformer
C Raffel, N Shazeer, A Roberts, K Lee, S Narang, M Matena, Y Zhou, W Li, ...
Journal of Machine Learning Research 21 (140), 1-67, 2020
21479 · 2020
Llama 2: Open foundation and fine-tuned chat models
H Touvron, L Martin, K Stone, P Albert, A Almahairi, Y Babaei, ...
arXiv preprint arXiv:2307.09288, 2023
12949 · 2023
PaLM: Scaling language modeling with pathways
A Chowdhery, S Narang, J Devlin, M Bosma, G Mishra, A Roberts, ...
Journal of Machine Learning Research 24 (240), 1-113, 2023
5641 · 2023
Deep Speech 2: End-to-end speech recognition in English and Mandarin
D Amodei, S Ananthanarayanan, R Anubhai, J Bai, E Battenberg, C Case, ...
International Conference on Machine Learning, 173-182, 2016
3919 · 2016
Scaling instruction-finetuned language models
HW Chung, L Hou, S Longpre, B Zoph, Y Tay, W Fedus, Y Li, X Wang, ...
Journal of Machine Learning Research 25 (70), 1-53, 2024
3388 · 2024
The Llama 3 herd of models
A Dubey, A Jauhri, A Pandey, A Kadian, A Al-Dahle, A Letman, A Mathur, ...
arXiv preprint arXiv:2407.21783, 2024
2899 · 2024
Mixed precision training
P Micikevicius, S Narang, J Alben, G Diamos, E Elsen, D Garcia, ...
arXiv preprint arXiv:1710.03740, 2017
2150 · 2017
Self-consistency improves chain of thought reasoning in language models
X Wang, J Wei, D Schuurmans, Q Le, E Chi, S Narang, A Chowdhery, ...
arXiv preprint arXiv:2203.11171, 2022
1359 · 2022
Deep Voice 3: Scaling text-to-speech with convolutional sequence learning
W Ping, K Peng, A Gibiansky, SO Arik, A Kannan, S Narang, J Raiman, ...
arXiv preprint arXiv:1710.07654, 2017
919* · 2017
Deep learning scaling is predictable, empirically
J Hestness, S Narang, N Ardalani, G Diamos, H Jun, H Kianinejad, ...
arXiv preprint arXiv:1712.00409, 2017
857 · 2017
ByT5: Towards a token-free future with pre-trained byte-to-byte models
L Xue, A Barua, N Constant, R Al-Rfou, S Narang, M Kale, A Roberts, ...
Transactions of the Association for Computational Linguistics 10, 291-306, 2022
474 · 2022
Exploring sparsity in recurrent neural networks
S Narang, E Elsen, G Diamos, S Sengupta
arXiv preprint arXiv:1704.05119, 2017
382 · 2017
DSD: Regularizing deep neural networks with dense-sparse-dense training flow
S Han, J Pool, S Narang, H Mao, S Tang, E Elsen, B Catanzaro, J Tran, ...
arXiv preprint arXiv:1607.04381 3 (6), 2016
350* · 2016
WT5?! Training text-to-text models to explain their predictions
S Narang, C Raffel, K Lee, A Roberts, N Fiedel, K Malkan
arXiv preprint arXiv:2004.14546, 2020
211 · 2020
Effective long-context scaling of foundation models
W Xiong, J Liu, I Molybog, H Zhang, P Bhargava, R Hou, L Martin, ...
arXiv preprint arXiv:2309.16039, 2023
188 · 2023
Scaling up models and data with t5x and seqio
A Roberts, HW Chung, G Mishra, A Levskaya, J Bradbury, D Andor, ...
Journal of Machine Learning Research 24 (377), 1-8, 2023
162 · 2023
Block-sparse recurrent neural networks
S Narang, E Undersander, G Diamos
arXiv preprint arXiv:1711.02782, 2017
161 · 2017
Llama 2: Open foundation and fine-tuned chat models. arXiv 2023
H Touvron, L Martin, K Stone, P Albert, A Almahairi, Y Babaei, ...
arXiv preprint arXiv:2307.09288 10, 2023
152 · 2023
Scale efficiently: Insights from pre-training and fine-tuning transformers
Y Tay, M Dehghani, J Rao, W Fedus, S Abnar, HW Chung, S Narang, ...
arXiv preprint arXiv:2109.10686, 2021
147 · 2021
PaLM: Scaling language modeling with pathways, 2022
A Chowdhery, S Narang, J Devlin, M Bosma, G Mishra, A Roberts, ...
arXiv preprint arXiv:2204.02311, 2022
134 · 2022
Articles 1–20