Cunxiang Wang
Tsinghua University; ZhipuAI
Verified email at westlake.edu.cn - Homepage
Title
Cited by
A survey on evaluation of large language models
Y Chang, X Wang, J Wang, Y Wu, L Yang, K Zhu, H Chen, X Yi, C Wang, ...
ACM Transactions on Intelligent Systems and Technology 15 (3), 1-45, 2024
Cited by: 2090; Year: 2024
Survey on factuality in large language models: Knowledge, retrieval and domain-specificity
C Wang, X Liu, Y Yue, X Tang, T Zhang, C Jiayang, Y Yao, W Gao, X Hu, ...
arXiv preprint arXiv:2310.07521, 2023
Cited by: 184; Year: 2023
PandaLM: An automatic evaluation benchmark for LLM instruction tuning optimization
Y Wang, Z Yu, Z Zeng, L Yang, C Wang, H Chen, C Jiang, R Xie, J Wang, ...
ICLR 2024, 2023
Cited by: 179; Year: 2023
Does It Make Sense? And Why? A Pilot Study for Sense Making and Explanation
C Wang, S Liang, Y Zhang, X Li, T Gao
ACL 2019, 4020–4026, 2019
Cited by: 120; Year: 2019
SemEval-2020 task 4: Commonsense validation and explanation
C Wang, S Liang, Y Jin, Y Wang, X Zhu, Y Zhang
SemEval-2020 Task track, 2020
Cited by: 112; Year: 2020
Can generative pre-trained language models serve as knowledge bases for closed-book QA?
C Wang, P Liu, Y Zhang
ACL 2021, 2021
Cited by: 83; Year: 2021
A survey on evaluation of large language models. arXiv
Y Chang, X Wang, J Wang, Y Wu, L Yang, K Zhu, H Chen, X Yi, C Wang, ...
Preprint posted online on Dec 29, 2023
Cited by: 62*; Year: 2023
Knowledge conflicts for LLMs: A survey
R Xu, Z Qi, Z Guo, C Wang, H Wang, Y Zhang, W Xu
arXiv preprint arXiv:2403.08319, 2024
Cited by: 53; Year: 2024
Evaluating Open-QA Evaluation
C Wang, S Cheng, Q Guo, Y Yue, B Ding, Z Xu, Y Wang, X Hu, Z Zhang, ...
Advances in Neural Information Processing Systems 36, 2023
Cited by: 51*; Year: 2023
A survey on evaluation of large language models (2023)
Y Chang, X Wang, J Wang, Y Wu, L Yang, K Zhu, H Chen, X Yi, C Wang, ...
Cited by: 31*
A survey on evaluation of large language models. arXiv
Y Chang, X Wang, J Wang, Y Wu, L Yang, K Zhu, H Chen, X Yi, C Wang, ...
Preprint posted online on Dec 29, 2023
Cited by: 24; Year: 2023
LLMs with chain-of-thought are non-causal reasoners
G Bao, H Zhang, L Yang, C Wang, Y Zhang
arXiv preprint arXiv:2402.16048, 2024
Cited by: 23; Year: 2024
Exploring generalization ability of pretrained language models on arithmetic and logical reasoning
C Wang, B Zheng, Y Niu, Y Zhang
Natural Language Processing and Chinese Computing: 10th CCF International …, 2021
Cited by: 21; Year: 2021
RFiD: Towards Rational Fusion-in-Decoder for Open-Domain Question Answering
C Wang, H Yu, Y Zhang
Findings of the Association for Computational Linguistics: ACL 2023, 2023
Cited by: 16; Year: 2023
A survey on evaluation of large language models. arXiv 2023
Y Chang, X Wang, J Wang, Y Wu, K Zhu, H Chen, L Yang, X Yi, C Wang, ...
arXiv preprint arXiv:2307.03109 10, 2023
Cited by: 16*; Year: 2023
Self-DC: When to retrieve and When to generate? Self Divide-and-Conquer for Compositional Unknown Questions
H Wang, B Xue, B Zhou, T Zhang, C Wang, G Chen, H Wang, K Wong
arXiv preprint arXiv:2402.13514, 2024
Cited by: 13; Year: 2024
PandaLM: An Automatic Evaluation Benchmark for LLM Instruction Tuning Optimization
Y Wang, Z Yu, Z Zeng, L Yang, C Wang, H Chen, C Jiang, R Xie, J Wang, ...
Cited by: 13*; Year: 2023
SHIELD: Evaluation and Defense Strategies for Copyright Compliance in LLM Text Generation
X Liu, T Sun, T Xu, F Wu, C Wang, X Wang, J Gao
arXiv preprint arXiv:2406.12975, 2024
Cited by: 10; Year: 2024
NovelQA: A benchmark for long-range novel question answering
C Wang, R Ning, B Pan, T Wu, Q Guo, C Deng, G Bao, Q Wang, Y Zhang
arXiv preprint arXiv:2403.12766, 2024
Cited by: 10; Year: 2024
RAGChecker: A fine-grained framework for diagnosing retrieval-augmented generation
D Ru, L Qiu, X Hu, T Zhang, P Shi, S Chang, C Jiayang, C Wang, S Sun, ...
arXiv preprint arXiv:2408.08067, 2024
Cited by: 8; Year: 2024