Multitask Prompted Training Enables Zero-Shot Task Generalization V Sanh, A Webson, C Raffel, SH Bach, L Sutawika, Z Alyafeai, A Chaffin, ... ICLR 2022 (Spotlight), 2022 | 1796 | 2022 |
BLOOM: A 176B-Parameter Open-Access Multilingual Language Model BigScience arXiv, 2023 | 1765* | 2023 |
Crosslingual Generalization Through Multitask Finetuning N Muennighoff, T Wang, L Sutawika, A Roberts, S Biderman, TL Scao, ... ACL 2023, 2023 | 707 | 2023 |
PromptSource: An Integrated Development Environment and Repository for Natural Language Prompts SH Bach*, V Sanh*, ZX Yong, A Webson, C Raffel, NV Nayak, A Sharma, ... ACL 2022 System Demo, 2022 | 320 | 2022 |
Low-Resource Languages Jailbreak GPT-4 ZX Yong, C Menghini, SH Bach NeurIPS 2023 SoLaR Workshop (Best Paper Award ⭐️), 2023 | 179 | 2023 |
Aya Model: An Instruction Finetuned Open-Access Multilingual Language Model A Üstün*, V Aryabumi*, ZX Yong*, WY Ko*, D D'souza*, G Onilude, ... ACL 2024 (Best Paper Award ⭐️), 2024 | 146 | 2024 |
What Language Model To Train if You Have One Million GPU Hours? TL Scao, T Wang, D Hesslow, L Saulnier, S Bekman, MS Bari, S Biderman, ... EMNLP 2023 Findings, 2023 | 119 | 2023 |
BLOOM+1: Adding Language Support to BLOOM for Zero-Shot Prompting ZX Yong, H Schoelkopf, N Muennighoff, AF Aji, DI Adelani, K Almubarak, ... ACL 2023, 2023 | 67 | 2023 |
The Decades Progress on Code-Switching Research in NLP: A Systematic Survey on Trends and Challenges GI Winata, AF Aji, ZX Yong, T Solorio ACL 2023 Findings, 2023 | 43 | 2023 |
A Safe Harbor for AI Evaluation and Red Teaming S Longpre, S Kapoor, K Klyman, A Ramaswami, R Bommasani, ... ICML 2024 Position Paper, 2024 | 36 | 2024 |
Prompting Multilingual Large Language Models To Generate Code-Mixed Texts: The Case of South East Asian Languages ZX Yong, R Zhang, J Zosa Forde, S Wang, S Cahyawijaya, H Lovenia, ... EMNLP 2023 CALCS Workshop, 2023 | 36* | 2023 |
CVQA: Culturally-diverse Multilingual Visual Question Answering Benchmark D Romero, C Lyu, HA Wibowo, T Lynn, I Hamed, AN Kishore, A Mandal, ... NeurIPS 2024 Datasets and Benchmarks (Oral), 2024 | 20 | 2024 |
Semi-Supervised Deep Embedded Clustering With Anomaly Detection for Semantic Frame Induction ZX Yong, TT Torrent LREC 2020, 2020 | 15 | 2020 |
Representativeness as a Forgotten Lesson for Multilingual and Code-Switched Data Collection and Preparation AS Doğruöz, S Sitaram, ZX Yong EMNLP 2023 Findings, 2023 | 12 | 2023 |
Preference Tuning For Toxicity Mitigation Generalizes Across Languages X Li*, ZX Yong*, SH Bach EMNLP 2024 Findings, 2024 | 10 | 2024 |
SEACrowd: A Multilingual Multimodal Data Hub and Benchmark Suite for Southeast Asian Languages H Lovenia, R Mahendra, SM Akbar, LJV Miranda, J Santoso, E Aco, ... EMNLP 2024, 2024 | 3 | 2024 |
Frame Shift Prediction ZX Yong, PD Watson, TT Torrent, O Czulo, CF Baker LREC 2022, 2022 | 3 | 2022 |
Humanity's Last Exam L Phan, A Gatti, Z Han, N Li, J Hu, H Zhang, S Shi, M Choi, A Agrawal, ... arXiv preprint arXiv:2501.14249, 2025 | 2 | 2025 |
LexC-Gen: Generating Data for Extremely Low-Resource Languages with Large Language Models and Bilingual Lexicons ZX Yong, C Menghini, SH Bach EMNLP 2024 Findings, 2024 | 2 | 2024 |
Towards Understanding the Fragility of Multilingual LLMs against Fine-Tuning Attacks S Poppi, ZX Yong, Y He, B Chern, H Zhao, A Yang, J Chi arXiv preprint arXiv:2410.18210, 2024 | 1 | 2024 |