LegalBench: A collaboratively built benchmark for measuring legal reasoning in large language models N Guha, J Nyarko, D Ho, C Ré, A Chilton, A Chohlas-Wood, A Peters, ... Advances in Neural Information Processing Systems 36, 44123-44279, 2023 | 186 | 2023 |
Hallucination-free? Assessing the reliability of leading AI legal research tools V Magesh, F Surani, M Dahl, M Suzgun, CD Manning, DE Ho arXiv preprint arXiv:2405.20362, 2024 | 68* | 2024 |
AI regulation has its own alignment problem: The technical and institutional feasibility of disclosure, registration, licensing, and auditing N Guha, CM Lawrence, LA Gailmard, KT Rodolfa, F Surani, R Bommasani, ... Geo. Wash. L. Rev. 92, 1473, 2024 | 34* | 2024 |
PRESTO: A multilingual dataset for parsing realistic task-oriented dialogs R Goel, W Ammar, A Gupta, S Vashishtha, M Sano, F Surani, M Chang, ... arXiv preprint arXiv:2303.08954, 2023 | 14 | 2023 |
LegalBench: A Collaboratively Built Benchmark for Measuring Legal Reasoning in Large Language Models N Guha, J Nyarko, DE Ho, C Ré, A Chilton, A Narayana, A Chohlas-Wood, ... arXiv preprint arXiv:2308.11462, 2023 | 5 | 2023 |