BLOOM: A 176B-parameter open-access multilingual language model T Le Scao, A Fan, C Akiki, E Pavlick, S Ilić, D Hesslow, R Castagné, ... | 1747 | 2023 |
Hate speech dataset from a white supremacy forum O De Gibert, N Perez, A García-Pablos, M Cuadros arXiv preprint arXiv:1809.04444, 2018 | 567 | 2018 |
Are multilingual models the best choice for moderately under-resourced languages? A comprehensive assessment for Catalan J Armengol-Estapé, CP Carrino, C Rodriguez-Penagos, OG Bonet, ... arXiv preprint arXiv:2107.07903, 2021 | 47 | 2021 |
On the multilingual capabilities of very large-scale English language models J Armengol-Estapé, OG Bonet, M Melero arXiv preprint arXiv:2108.13349, 2021 | 30 | 2021 |
Spanish biomedical crawled corpus: A large, diverse dataset for Spanish biomedical language models CP Carrino, J Armengol-Estapé, OG Bonet, A Gutiérrez-Fandiño, ... arXiv preprint arXiv:2109.07765, 2021 | 18 | 2021 |
A new massive multilingual dataset for high-performance language technologies O De Gibert, G Nail, N Arefyev, M Bañón, J Van Der Linde, S Ji, ... arXiv preprint arXiv:2403.14009, 2024 | 15 | 2024 |
Spanish biomedical and clinical language embeddings A Gutiérrez-Fandiño, J Armengol-Estapé, CP Carrino, O De Gibert, ... arXiv preprint arXiv:2102.12843, 2021 | 10 | 2021 |
Estrategia multidimensional para la selección de candidatos de traducción automática para posedición N Aranberri, O de Gibert Linguamática 11 (2), 3-16, 2019 | 10 | 2019 |
Four approaches to low-resource multilingual NMT: The Helsinki submission to the AmericasNLP 2023 shared task O De Gibert, R Vázquez, M Aulamo, Y Scherrer, S Virpioja, J Tiedemann Proceedings of the Workshop on Natural Language Processing for Indigenous …, 2023 | 7 | 2023 |
Automatic removal of identifying information in official EU languages for public administrations: The MAPA Project L Gianola, Ē Ajausks, V Arranz, C Bendahman, L Bié, C Borg, A Cerdà, ... Legal Knowledge and Information Systems, 223-226, 2020 | 6 | 2020 |
Quality versus Quantity: Building Catalan-English MT Resources O de Gibert, K Kharitonova, BC Figueras, J Armengol-Estapé, M Melero | 5* | 2022 |
The Catalan language club C Rodriguez-Penagos, C Armentano-Oller, M Villegas, M Melero, ... arXiv preprint arXiv:2112.01894, 2021 | 5 | 2021 |
Spanish Datasets for Sensitive Entity Detection in the Legal Domain O de Gibert, A García-Pablos, M Cuadros, M Melero | 4* | 2022 |
The OPUS-MT dashboard - A toolkit for a systematic evaluation of open machine translation models J Tiedemann, O De Gibert Annual Meeting of the Association for Computational Linguistics: ACL-DEMO …, 2023 | 3 | 2023 |
Hybrid distillation from RBMT and NMT: Helsinki-NLP’s submission to the Shared Task on Translation into Low-Resource Languages of Spain O De Gibert, M Aulamo, Y Scherrer, J Tiedemann Proceedings of the Ninth Conference on Machine Translation, 908-917, 2024 | 2 | 2024 |
Sequence-to-sequence resources for Catalan O de Gibert, K Kharitonova, BC Figueras, J Armengol-Estapé, M Melero arXiv preprint arXiv:2202.06871, 2022 | 2 | 2022 |
Unsupervised Machine Translation in Real-World Scenarios O de Gibert, I Goenaga, J Armengol-Estapé, O Perez-de-Vinaspre | 2* | 2022 |
To post-edit or to translate… That is the question. A case study of a recommender system for Quality Estimation of Machine Translation based on linguistic features O de Gibert Bonet MA Thesis. University of the Basque Country, 2018 | 2 | 2018 |
HPLT's first release of data and models N Arefyev, M Aulamo, P Chen, ODG Bonet, B Haddow, J Helcl, B Malik, ... The 25th Annual Conference of The European Association for Machine …, 2024 | 1 | 2024 |
MAMMOTH: Massively multilingual modular open translation @ Helsinki T Mickus, SA Grönroos, J Attieh, M Boggia, O De Gibert, S Ji, NA Lopi, ... arXiv preprint arXiv:2403.07544, 2024 | 1 | 2024 |