Cutting off the head ends the conflict: A mechanism for interpreting and mitigating knowledge conflicts in language models | Z Jin, P Cao, H Yuan, Y Chen, J Xu, H Li, X Jiang, K Liu, J Zhao | arXiv preprint arXiv:2402.18154, 2024 | 18 | 2024 |
RWKU: Benchmarking Real-World Knowledge Unlearning for Large Language Models | Z Jin, P Cao, C Wang, Z He, H Yuan, J Li, Y Chen, K Liu, J Zhao | arXiv preprint arXiv:2406.10890, 2024 | 16 | 2024 |
Whispers that Shake Foundations: Analyzing and Mitigating False Premise Hallucinations in Large Language Models | H Yuan, P Cao, Z Jin, Y Chen, D Zeng, K Liu, J Zhao | arXiv preprint arXiv:2402.19103, 2024 | 5 | 2024 |
Towards robust knowledge unlearning: An adversarial framework for assessing and improving unlearning robustness in large language models | H Yuan, Z Jin, P Cao, Y Chen, K Liu, J Zhao | arXiv preprint arXiv:2408.10682, 2024 | 4 | 2024 |
CogKGE: A knowledge graph embedding toolkit and benchmark for representing multi-source and heterogeneous knowledge | Z Jin, T Men, H Yuan, Z He, D Sui, C Wang, Z Xue, Y Chen, J Zhao | Proceedings of the 60th Annual Meeting of the Association for Computational …, 2022 | 4 | 2022 |
RAG-RewardBench: Benchmarking Reward Models in Retrieval Augmented Generation for Preference Alignment | Z Jin, H Yuan, T Men, P Cao, Y Chen, K Liu, J Zhao | arXiv preprint arXiv:2412.13746, 2024 | 1 | 2024 |
CogKTR: A Knowledge-Enhanced Text Representation Toolkit for Natural Language Understanding | Z Jin, T Men, H Yuan, Y Zhou, P Cao, Y Chen, Z Xue, K Liu, J Zhao | Proceedings of the 2022 Conference on Empirical Methods in Natural Language …, 2022 | 1 | 2022 |
Beyond Under-Alignment: Atomic Preference Enhanced Factuality Tuning for Large Language Models | H Yuan, Y Chen, P Cao, Z Jin, K Liu, J Zhao | arXiv preprint arXiv:2406.12416, 2024 | | 2024 |