Llama 2: Open foundation and fine-tuned chat models H Touvron, L Martin, K Stone, P Albert, A Almahairi, Y Babaei, ... arXiv preprint arXiv:2307.09288, 2023 | 12114 | 2023 |
The llama 3 herd of models A Dubey, A Jauhri, A Pandey, A Kadian, A Al-Dahle, A Letman, A Mathur, ... arXiv preprint arXiv:2407.21783, 2024 | 2204 | 2024 |
Beyond english-centric multilingual machine translation A Fan, S Bhosale, H Schwenk, Z Ma, A El-Kishky, S Goyal, M Baines, ... Journal of Machine Learning Research 22 (107), 1-48, 2021 | 856 | 2021 |
No language left behind: Scaling human-centered machine translation MR Costa-jussà, J Cross, O Çelebi, M Elbayad, K Heafield, K Heffernan, ... arXiv preprint arXiv:2207.04672, 2022 | 800 | 2022 |
BASE Layers: Simplifying Training of Large, Sparse Models M Lewis, S Bhosale, T Dettmers, N Goyal, ... International Conference on Machine Learning, 2021 | 248 | 2021 |
Effective long-context scaling of foundation models W Xiong, J Liu, I Molybog, H Zhang, P Bhargava, R Hou, L Martin, ... arXiv preprint arXiv:2309.16039, 2023 | 170 | 2023 |
Efficient Large Scale Language Modeling with Mixtures of Experts M Artetxe, S Bhosale, N Goyal, T Mihaylov, M Ott, S Shleifer, XV Lin, J Du, ... EMNLP 2022, 2021 | 168* | 2021 |
Llama 2: open foundation and fine-tuned chat models. arXiv H Touvron, L Martin, K Stone, P Albert, A Almahairi, Y Babaei, ... arXiv preprint arXiv:2307.09288, 2023 | 152 | 2023 |
Llama 2: Open foundation and fine-tuned chat models, 2023b H Touvron, L Martin, K Stone, P Albert, A Almahairi, Y Babaei, ... URL https://arxiv.org/abs/2307.09288, 2023 | 140 | 2023 |
Llama 2: Open foundation and fine-tuned chat models. arXiv 2023 H Touvron, L Martin, K Stone, P Albert, A Almahairi, Y Babaei, ... arXiv preprint arXiv:2307.09288, 0 | 140 | |
Few-shot learning with multilingual language models XV Lin, T Mihaylov, M Artetxe, T Wang, S Chen, D Simig, M Ott, N Goyal, ... arXiv preprint arXiv:2112.10668, 35-40, 2021 | 106 | 2021 |
Facebook AI’s WMT21 News Translation Task Submission C Tran, S Bhosale, J Cross, P Koehn, S Edunov, A Fan Proceedings of the Sixth Conference on Machine Translation, 205-215, 2021 | 102 | 2021 |
Few-shot learning with multilingual generative language models XV Lin, T Mihaylov, M Artetxe, T Wang, S Chen, D Simig, M Ott, N Goyal, ... Proceedings of the 2022 Conference on Empirical Methods in Natural Language …, 2022 | 75 | 2022 |
The llama 3 herd of models A Grattafiori, A Dubey, A Jauhri, A Pandey, A Kadian, A Al-Dahle, ... arXiv e-prints, arXiv: 2407.21783, 2024 | 62 | 2024 |
Llama 2: Open foundation and fine-tuned chat models H Touvron, L Martin, K Stone, P Albert, A Almahairi, Y Babaei, ... arXiv preprint arXiv:2307.09288, 2023 | 62 | 2023 |
Llama 2: open foundation and fine-tuned chat models. CoRR abs/2307.09288 (2023) H Touvron, L Martin, K Stone, P Albert, A Almahairi, Y Babaei, ... arXiv preprint arXiv:2307.09288 10, 2023 | 61 | 2023 |
Few-shot learning with multilingual language models XV Lin, T Mihaylov, M Artetxe, T Wang, S Chen, D Simig, M Ott, N Goyal, ... EMNLP 2022, 2021 | 58 | 2021 |
Fairscale: A general purpose modular pytorch library for high performance and large scale training M Baines, S Bhosale, V Caggiano, N Goyal, S Goyal, M Ott, B Lefaudeux, ... | 46 | 2021 |
Revisiting machine translation for cross-lingual classification M Artetxe, V Goswami, S Bhosale, A Fan, L Zettlemoyer arXiv preprint arXiv:2305.14240, 2023 | 26 | 2023 |
Multilingual Machine Translation with Hyper-Adapters C Baziotis, M Artetxe, J Cross, S Bhosale EMNLP 2022, 2022 | 25 | 2022 |