BLOOM: A 176B-parameter open-access multilingual language model T Le Scao, A Fan, C Akiki, E Pavlick, S Ilić, D Hesslow, R Castagné, ... | 1889* | 2023 |
Representation engineering: A top-down approach to AI transparency A Zou, L Phan, S Chen, J Campbell, P Guo, R Ren, A Pan, X Yin, ... arXiv preprint arXiv:2310.01405, 2023 | 339 | 2023 |
HarmBench: A standardized evaluation framework for automated red teaming and robust refusal M Mazeika, L Phan, X Yin, A Zou, Z Wang, N Mu, E Sakhaee, N Li, ... arXiv preprint arXiv:2402.04249, 2024 | 214 | 2024 |
SciFive: a text-to-text transformer model for biomedical literature LN Phan, JT Anibal, H Tran, S Chanana, E Bahadroglu, A Peltekian, ... arXiv preprint arXiv:2106.03598, 2021 | 167 | 2021 |
CoTexT: Multi-task Learning with Code-Text Transformer L Phan, H Tran, D Le, H Nguyen, J Anibal, A Peltekian, Y Ye ACL NLP4Prog, 2021 | 145 | 2021 |
The WMDP benchmark: Measuring and reducing malicious use with unlearning N Li, A Pan, A Gopal, S Yue, D Berrios, A Gatti, JD Li, AK Dombrowski, ... arXiv preprint arXiv:2403.03218, 2024 | 111 | 2024 |
Improving alignment and robustness with circuit breakers A Zou, L Phan, J Wang, D Duenas, M Lin, M Andriushchenko, JZ Kolter, ... Advances in Neural Information Processing Systems 37, 83345-83373, 2025 | 73* | 2025 |
ViT5: Pretrained Text-to-Text Transformer for Vietnamese Language Generation L Phan, H Tran, H Nguyen, TH Trinh NAACL SRW 2022, 2022 | 64 | 2022 |
Tamper-resistant safeguards for open-weight LLMs R Tamirisa, B Bharathi, L Phan, A Zhou, A Gatti, T Suresh, M Lin, J Wang, ... arXiv preprint arXiv:2408.00761, 2024 | 24 | 2024 |
Safetywashing: Do AI Safety Benchmarks Actually Measure Safety Progress? R Ren, S Basart, A Khoja, A Gatti, L Phan, X Yin, M Mazeika, A Pan, ... Advances in Neural Information Processing Systems 37, 68559-68594, 2025 | 16 | 2025 |
MTet: Multi-domain translation for English and Vietnamese C Ngo, TH Trinh, L Phan, H Tran, T Dang, H Nguyen, M Nguyen, ... arXiv preprint arXiv:2210.05610, 2022 | 11 | 2022 |
Hierarchical transformer encoders for Vietnamese spelling correction H Tran, CV Dinh, L Phan, ST Nguyen Advances and Trends in Artificial Intelligence. Artificial Intelligence …, 2021 | 10 | 2021 |
Enriching biomedical knowledge for low-resource language through large-scale translation L Phan, T Dang, H Tran, TH Trinh, V Phan, LD Chau, MT Luong arXiv preprint arXiv:2210.05598, 2022 | 8 | 2022 |
SPBERT: An efficient pre-training BERT on SPARQL queries for question answering over knowledge graphs H Tran, L Phan, J Anibal, BT Nguyen, TS Nguyen Neural Information Processing: 28th International Conference, ICONIP 2021 …, 2021 | 8 | 2021 |
Humanity's Last Exam L Phan, A Gatti, Z Han, N Li, J Hu, H Zhang, S Shi, M Choi, A Agrawal, ... arXiv preprint arXiv:2501.14249, 2025 | 7 | 2025 |
HAL-X: Scalable hierarchical clustering for rapid and tunable single-cell analysis J Anibal, AG Day, E Bahadiroglu, L O’Neil, L Phan, A Peltekian, A Erez, ... PLoS Computational Biology 18 (10), e1010349, 2022 | 7* | 2022 |
VieSum: how robust are transformer-based models on Vietnamese summarization? H Nguyen, L Phan, J Anibal, A Peltekian, H Tran arXiv preprint arXiv:2110.04257, 2021 | 5 | 2021 |
Utility Engineering: Analyzing and Controlling Emergent Value Systems in AIs M Mazeika, X Yin, R Tamirisa, J Lim, BW Lee, R Ren, L Phan, N Mu, ... arXiv preprint arXiv:2502.08640, 2025 | 1 | 2025 |
Superhuman Automated Forecasting L Phan, A Zeng, M Mazeika, A Khoja, D Hendrycks https://www.safe.ai/blog/forecasting, 2024 | | 2024 |