Do-not-answer: Evaluating safeguards in LLMs Y Wang, H Li, X Han, P Nakov, T Baldwin Findings of the Association for Computational Linguistics: EACL 2024, 896-911, 2024 | 124* | 2024 |
M4: Multi-generator, multi-domain, and multi-lingual black-box machine-generated text detection Y Wang, J Mansurov, P Ivanov, J Su, A Shelmanov, A Tsvigun, ... EACL 2024 (Best Resource Paper), 2023 | 108 | 2023 |
A Survey of Confidence Estimation and Calibration in Large Language Models J Geng, F Cai, Y Wang, H Koeppl, P Nakov, I Gurevych Proceedings of the 2024 Conference of the North American Chapter of the …, 2024 | 61* | 2024 |
SemEval-2024 task 8: Multigenerator, multidomain, and multilingual black-box machine-generated text detection Y Wang, J Mansurov, P Ivanov, J Su, A Shelmanov, A Tsvigun, ... Proceedings of the 18th International Workshop on Semantic Evaluation, SemEval, 2024 | 61* | 2024 |
Factcheck-bench: Fine-grained evaluation benchmark for automatic fact-checkers Y Wang, RG Reddy, Z Mujahid, A Arora, A Rubashevskii, J Geng, ... Findings of the Association for Computational Linguistics: EMNLP 2024, 14199 …, 2024 | 53* | 2024 |
Uncertainty estimation and reduction of pre-trained models for text regression Y Wang, D Beck, T Baldwin, K Verspoor Transactions of the Association for Computational Linguistics 10, 680-696, 2022 | 31 | 2022 |
M4GT-Bench: Evaluation Benchmark for Black-Box Machine-Generated Text Detection Y Wang, J Mansurov, P Ivanov, J Su, A Shelmanov, A Tsvigun, OM Afzal, ... ACL 2024, 2024 | 29 | 2024 |
Learning from unlabelled data for clinical semantic textual similarity Y Wang, K Verspoor, T Baldwin Proceedings of the 3rd Clinical Natural Language Processing Workshop, 227-233, 2020 | 29 | 2020 |
Evaluating the utility of model configurations and data augmentation on clinical semantic textual similarity Y Wang, F Liu, K Verspoor, T Baldwin Proceedings of the 19th SIGBioMed workshop on biomedical language processing …, 2020 | 24 | 2020 |
HW-TSC’s participation at WMT 2021 quality estimation shared task Y Chen, C Su, Y Zhang, Y Wang, X Geng, H Yang, S Tao, G Jiaxin, ... Proceedings of the Sixth Conference on Machine Translation, 890-896, 2021 | 20 | 2021 |
GenAI content detection task 1: English and multilingual machine-generated text detection: AI vs. human Y Wang, A Shelmanov, J Mansurov, A Tsvigun, V Mikhailov, R Xing, Z Xie, ... arXiv preprint arXiv:2501.11012, 2025 | 16 | 2025 |
Factuality of large language models: A survey Y Wang, M Wang, MA Manzoor, F Liu, G Georgiev, R Das, P Nakov Proceedings of the 2024 Conference on Empirical Methods in Natural Language …, 2024 | 15* | 2024 |
Against The Achilles' Heel: A Survey on Red Teaming for Generative Models L Lin, H Mu, Z Zhai, M Wang, Y Wang, R Wang, J Gao, Y Zhang, W Che, ... Journal of Artificial Intelligence Research 82, 687-775, 2025 | 14* | 2025 |
The HW-TSC’s simultaneous speech translation system for IWSLT 2022 evaluation M Wang, J Guo, Y Li, X Qiao, Y Wang, Z Li, C Su, Y Chen, M Zhang, S Tao, ... Proceedings of the 19th International Conference on Spoken Language …, 2022 | 14 | 2022 |
Self-distillation mixup training for non-autoregressive neural machine translation J Guo, M Wang, D Wei, H Shang, Y Wang, Z Li, Z Yu, Z Wu, Y Chen, C Su, ... arXiv preprint arXiv:2112.11640, 2021 | 11 | 2021 |
OpenFactCheck: Building, Benchmarking Customized Fact-Checking Systems and Evaluating the Factuality of Claims and LLMs Y Wang, M Wang, H Iqbal, GN Georgiev, J Geng, I Gurevych, P Nakov Proceedings of the 31st International Conference on Computational …, 2025 | 10* | 2025 |
Capture human disagreement distributions by calibrated networks for natural language inference Y Wang, M Wang, Y Chen, S Tao, J Guo, C Su, M Zhang, H Yang Findings of the Association for Computational Linguistics: ACL 2022, 1524-1535, 2022 | 10 | 2022 |
Diformer: Directional transformer for neural machine translation M Wang, J Guo, Y Wang, D Wei, H Shang, C Su, Y Chen, Y Li, M Zhang, ... arXiv preprint arXiv:2112.11632, 2021 | 9 | 2021 |
How length prediction influence the performance of non-autoregressive translation? M Wang, G Jiaxin, Y Wang, Y Chen, S Chang, H Shang, M Zhang, S Tao, ... Proceedings of the Fourth BlackboxNLP Workshop on Analyzing and Interpreting …, 2021 | 8 | 2021 |
A Chinese Dataset for Evaluating the Safeguards in Large Language Models Y Wang, Z Zhai, H Li, X Han, L Lin, Z Zhang, J Zhao, P Nakov, T Baldwin arXiv preprint arXiv:2402.12193, 2024 | 7 | 2024 |