Active example selection for in-context learning Y Zhang, S Feng, C Tan arXiv preprint arXiv:2211.04486, 2022 | 166 | 2022 |
Effective Prompt Extraction from Language Models Y Zhang, N Carlini, D Ippolito arXiv preprint arXiv:2307.06865, 2023 | 64* | 2023 |
Conversations gone alright: Quantifying and predicting prosocial outcomes in online conversations J Bao, J Wu, Y Zhang, E Chandrasekharan, D Jurgens Proceedings of the Web Conference 2021, 1134-1145, 2021 | 43 | 2021 |
Selective explanations: Leveraging human input to align explainable AI V Lai, Y Zhang, C Chen, QV Liao, C Tan Proceedings of the ACM on Human-Computer Interaction 7 (CSCW2), 1-35, 2023 | 38 | 2023 |
FLamE: Few-shot learning from natural language explanations Y Zhou, Y Zhang, C Tan arXiv preprint arXiv:2306.08042, 2023 | 11 | 2023 |
Llama Guard 3 Vision: Safeguarding human-AI image understanding conversations J Chi, U Karn, H Zhan, E Smith, J Rando, Y Zhang, K Plawiak, ZD Coudert, ... arXiv preprint arXiv:2411.10414, 2024 | 10 | 2024 |
Backtracking improves generation safety Y Zhang, J Chi, H Nguyen, K Upasani, DM Bikel, J Weston, EM Smith arXiv preprint arXiv:2409.14586, 2024 | 8 | 2024 |
BiasX: "Thinking slow" in toxic content moderation with explanations of implied social biases Y Zhang, S Nanduri, L Jiang, T Wu, M Sap arXiv preprint arXiv:2305.13589, 2023 | 7 | 2023 |
OpenHEXAI: An open-source framework for human-centered evaluation of explainable machine learning J Ma, V Lai, Y Zhang, C Chen, P Hamilton, D Ljubenkov, H Lakkaraju, ... arXiv preprint arXiv:2403.05565, 2024 | 4 | 2024 |
Persistent Pre-Training Poisoning of LLMs Y Zhang, J Rando, I Evtimov, J Chi, EM Smith, N Carlini, F Tramèr, ... arXiv preprint arXiv:2410.13722, 2024 | 2 | 2024 |
Learning to ignore adversarial attacks Y Zhang, Y Zhou, S Carton, C Tan arXiv preprint arXiv:2205.11551, 2022 | 2 | 2022 |
Human-aligned chess with a bit of search Y Zhang, AP Jacob, V Lai, D Fried, D Ippolito arXiv preprint arXiv:2410.03893, 2024 | 1 | 2024 |
Forcing diffuse distributions out of language models Y Zhang, A Schwarzschild, N Carlini, Z Kolter, D Ippolito arXiv preprint arXiv:2404.10859, 2024 | 1 | 2024 |
Building a Flexible Knowledge Graph to Capture Real-World Events. L Burdick, O Ignat, Y Zhang, R Mihalcea, M Wang, S Wilson, Y Wei, ... TAC, 2019 | 1 | 2019 |