Jiaxin Wen

Cited by

	All	Since 2020
Citations	362	362
h-index	10	10
i10-index	10	10

Citations per year: 2021: 9, 2022: 36, 2023: 93, 2024: 191, 2025: 32

Public access

2 articles available
0 articles not available

Based on funding mandates

Co-authors

Pei Ke, Tsinghua University, verified email at mail.tsinghua.edu.cn
Jian Guan, Ant Group, verified email at antgroup.com
Minlie Huang, Tsinghua University, verified email at tsinghua.edu.cn
Jacob Steinhardt, Stanford University, verified email at cs.stanford.edu
Yuxian Gu, Tsinghua University, verified email at mails.tsinghua.edu.cn
Ethan Perez, Anthropic; New York University, verified email at anthropic.com
Akbir Khan, Anthropic; University College London, verified email at cantab.ac.uk
He He, New York University, verified email at cs.nyu.edu
Shi Feng, The George Washington University, verified email at gwu.edu
Ruiqi Zhong, University of California, Berkeley, verified email at berkeley.edu
Samuel R. Bowman, Anthropic and NYU, verified email at anthropic.com
Zhihong Shao, Tsinghua University, verified email at mails.tsinghua.edu.cn

Jiaxin Wen

Tsinghua University

Verified email at mails.tsinghua.edu.cn

Natural Language Processing, AI safety, Alignment


Title	Authors	Venue	Cited by	Year
Unveiling the implicit toxicity in large language models	J Wen, P Ke, H Sun, Z Zhang, C Li, J Bai, M Huang	arXiv preprint arXiv:2311.17391, 2023	59	2023
Robustness testing of language understanding in task-oriented dialog	J Liu, R Takanobu, J Wen, D Wan, H Li, W Nie, C Li, W Peng, M Huang	arXiv preprint arXiv:2012.15262, 2020	53	2020
Augesc: Dialogue augmentation with large language models for emotional support conversation	C Zheng, S Sabour, J Wen, Z Zhang, M Huang	arXiv preprint arXiv:2202.13047, 2022	50	2022
A chatbot for mental health support: exploring the impact of Emohaa on reducing mental distress in China	S Sabour, W Zhang, X Xiao, Y Zhang, Y Zheng, J Wen, J Zhao, M Huang	Frontiers in digital health 5, 1133987, 2023	49	2023
Eva2.0: Investigating open-domain Chinese dialogue systems with large-scale pre-training	Y Gu, J Wen, H Sun, Y Song, P Ke, C Zheng, Z Zhang, J Yao, L Liu, X Zhu, ...	Machine Intelligence Research 20 (2), 207-219, 2023	45	2023
Ethicist: Targeted training data extraction through loss smoothed soft prompting and calibrated confidence estimation	Z Zhang, J Wen, M Huang	arXiv preprint arXiv:2307.04401, 2023	29	2023
Persona-guided planning for controlling the protagonist's persona in story generation	Z Zhang, J Wen, J Guan, M Huang	arXiv preprint arXiv:2204.10703, 2022	23	2022
Augesc: Large-scale data augmentation for emotional support conversation with pre-trained language models	C Zheng, S Sabour, J Wen, M Huang	arXiv preprint arXiv:2202.13047, 2022	17	2022
Autocad: Automatically generating counterfactuals for mitigating shortcut learning	J Wen, Y Zhu, J Zhang, J Zhou, M Huang	arXiv preprint arXiv:2211.16202, 2022	13	2022
Language models learn to mislead humans via rlhf	J Wen, R Zhong, A Khan, E Perez, J Steinhardt, M Huang, SR Bowman, ...	arXiv preprint arXiv:2409.12822, 2024	12	2024
Learning task decomposition to assist humans in competitive programming	J Wen, R Zhong, P Ke, Z Shao, H Wang, M Huang	Proceedings of the 62nd Annual Meeting of the Association for Computational …, 2024	4	2024
Adaptivebackdoor: Backdoored language model agents that detect human overseers	H Wang, R Zhong, J Wen, J Steinhardt	ICML 2024 Next Generation of AI Safety Workshop, 2024	3	2024
Robustness testing of language understanding in dialog systems	J Liu, R Takanobu, J Wen, D Wan, W Nie, H Li, C Li, W Peng, M Huang	CoRR, abs, 2012	3	2012
Codeplan: Unlocking reasoning potential in large langauge models by scaling code-form planning	J Wen, J Guan, H Wang, W Wu, M Huang	CoRR, 2024	2	2024
Adaptive Deployment of Untrusted LLMs Reduces Distributed Threats	J Wen, V Hebbar, C Larson, A Bhatt, A Radhakrishnan, M Sharma, ...	arXiv preprint arXiv:2411.17693, 2024		2024
Unlocking Reasoning Potential in Large Langauge Models by Scaling Code-form Planning	J Wen, J Guan, H Wang, W Wu, M Huang	arXiv preprint arXiv:2409.12452, 2024		2024
Re3Dial: Retrieve, Reorganize and Rescale Conversations for Long-Turn Open-Domain Dialogue Pre-training	J Wen, H Zhou, J Guan, J Zhou, M Huang	Proceedings of the 2023 Conference on Empirical Methods in Natural Language …, 2023		2023
SmartBackdoor: Malicious Language Model Agents that Avoid Being Caught	H Wang, R Zhong, J Wen, J Steinhardt
