No language left behind: Scaling human-centered machine translation MR Costa-Jussà, J Cross, O Çelebi, M Elbayad, K Heafield, K Heffernan, ... arXiv preprint arXiv:2207.04672, 2022 | 767 | 2022 |
SeamlessM4T: Massively Multilingual & Multimodal Machine Translation L Barrault, YA Chung, MC Meglioli, D Dale, N Dong, PA Duquenne, ... arXiv preprint arXiv:2308.11596, 2023 | 113 | 2023 |
Seamless: Multilingual Expressive and Streaming Speech Translation L Barrault, YA Chung, MC Meglioli, D Dale, N Dong, M Duppenthaler, ... arXiv preprint arXiv:2312.05187, 2023 | 109 | 2023 |
No language left behind: Scaling human-centered machine translation N Team, MR Costa-Jussà, J Cross, O Çelebi, M Elbayad, K Heafield, ... arXiv preprint arXiv:2207.04672, 2022 | 81 | 2022 |
No language left behind: Scaling human-centered machine translation, 2022 NLLB Team, MR Costa-jussà, J Cross, O Çelebi, M Elbayad, K Heafield, ... URL https://arxiv.org/abs/2207.04672, 2022 | 23 | 2022 |
BLASER: A text-free speech-to-speech translation evaluation metric M Chen, PA Duquenne, P Andrews, J Kao, A Mourachko, H Schwenk, ... arXiv preprint arXiv:2212.08486, 2022 | 18 | 2022 |
No language left behind: scaling human-centered machine translation MR Costa-Jussà, J Cross, O Çelebi, M Elbayad, K Heafield, K Heffernan, ... arXiv preprint, 2022 | 15 | 2022 |
Findings of the WMT’22 shared task on large-scale machine translation evaluation for African languages D Adelani, MMI Alam, A Anastasopoulos, A Bhagia, MR Costa-jussà, ... Proceedings of the Seventh Conference on Machine Translation (WMT), 773-800, 2022 | 14 | 2022 |
Mutox: Universal multilingual audio-based toxicity dataset and zero-shot detector MR Costa-jussà, MC Meglioli, P Andrews, D Dale, P Hansanti, E Kalbassi, ... arXiv preprint arXiv:2401.05060, 2024 | 10 | 2024 |
xSIM++: An improved proxy to bitext mining performance for low-resource languages M Chen, K Heffernan, O Çelebi, A Mourachko, H Schwenk arXiv preprint arXiv:2306.12907, 2023 | 7 | 2023 |
Large Concept Models: Language Modeling in a Sentence Representation Space L Barrault, PA Duquenne, M Elbayad, A Kozhevnikov, B Alastruey, ... arXiv e-prints, arXiv: 2412.08821, 2024 | 2 | 2024 |
Aligning speech segments beyond pure semantics K Heffernan, A Kozhevnikov, L Barrault, A Mourachko, H Schwenk Findings of the Association for Computational Linguistics ACL 2024, 3626-3635, 2024 | 2 | 2024 |
Sonar expressive: Zero-shot expressive speech-to-speech translation PA Duquenne, K Heffernan, A Mourachko, B Sagot, H Schwenk | 2 | 2023 |
stopes - Modular Machine Translation Pipelines P Andrews, G Wenzek, K Heffernan, O Çelebi, A Sun, A Kamran, Y Guo, ... Proceedings of the 2022 Conference on Empirical Methods in Natural Language …, 2022 | 2 | 2022 |
LCFO: Long context and long form output dataset and benchmarking MR Costa-jussà, P Andrews, MC Meglioli, J Chen, J Chuang, D Dale, ... arXiv preprint arXiv:2412.08268, 2024 | 1 | 2024 |
BOUQuET: dataset, Benchmark and Open initiative for Universal Quality Evaluation in Translation P Andrews, M Artetxe, MC Meglioli, MR Costa-jussà, J Chuang, D Dale, ... arXiv preprint arXiv:2502.04314, 2025 | | 2025 |
Video Seal: Open and Efficient Video Watermarking P Fernandez, H Elsahar, IZ Yalniz, A Mourachko arXiv preprint arXiv:2412.09492, 2024 | | 2024 |
Large Concept Models: Language Modeling in a Sentence Representation Space LCM The, L Barrault, PA Duquenne, M Elbayad, A Kozhevnikov, ... arXiv preprint arXiv:2412.08821, 2024 | | 2024 |
Speech Data from Radio Broadcasts for Low Resource Languages BB Odoom, LPG Perera, P Hansanti, L Barrault, C Ropers, M Wiesner, ... Proceedings of the 21st International Conference on Spoken Language …, 2024 | | 2024 |
Proceedings of the Seventh Conference on Machine Translation (WMT) P Koehn, L Barrault, O Bojar, F Bougares, R Chatterjee, MR Costa-jussà, ... Proceedings of the Seventh Conference on Machine Translation (WMT), 2022 | | 2022 |