A survey of large language models

WX Zhao, K Zhou, J Li, T Tang… - arxiv preprint arxiv …, 2023 - paper-notes.zhjwpku.com
Ever since the Turing Test was proposed in the 1950s, humans have explored the mastering
of language intelligence by machine. Language is essentially a complex, intricate system of …

LegalBench: A collaboratively built benchmark for measuring legal reasoning in large language models

N Guha, J Nyarko, D Ho, C Ré… - Advances in …, 2023 - proceedings.neurips.cc
The advent of large language models (LLMs) and their adoption by the legal community has
given rise to the question: what types of legal reasoning can LLMs perform? To enable …

Inadequacies of large language model benchmarks in the era of generative artificial intelligence

TR McIntosh, T Susnjak, N Arachchilage, T Liu… - arxiv preprint arxiv …, 2024 - arxiv.org
The rapid rise in popularity of Large Language Models (LLMs) with emerging capabilities
has spurred public curiosity to evaluate and compare different LLMs, leading many …

A reasoning and value alignment test to assess advanced GPT reasoning

TR McIntosh, T Liu, T Susnjak, P Watters… - ACM Transactions on …, 2024 - dl.acm.org
In response to diverse perspectives on artificial general intelligence (AGI), ranging from
potential safety and ethical concerns to more extreme views about the threats it poses to …

Lextreme: A multi-lingual and multi-task benchmark for the legal domain

J Niklaus, V Matoshi, P Rani, A Galassi… - arxiv preprint arxiv …, 2023 - arxiv.org
Lately, propelled by the phenomenal advances around the transformer architecture, the
legal NLP field has enjoyed spectacular growth. To measure progress, well curated and …

ChatGPT as an artificial lawyer?

J Tan, H Westermann, K Benyekhlef - AI4AJ@ ICAIL, 2023 - ceur-ws.org
Lawyers can analyze and understand specific situations of their clients to provide them with
relevant legal information and advice. We qualitatively investigate to which extent ChatGPT …

Can ChatGPT Perform Reasoning Using the IRAC Method in Analyzing Legal Scenarios Like a Lawyer?

X Kang, L Qu, LK Soon, A Trakic, TY Zhuo… - arxiv preprint arxiv …, 2023 - arxiv.org
Large Language Models (LLMs), such as ChatGPT, have drawn a lot of attention recently in
the legal domain due to their emergent ability to tackle a variety of legal tasks. However, it is …

Evalverse: Unified and accessible library for large language model evaluation

J Kim, W Song, D Kim, Y Kim, Y Kim, C Park - arxiv preprint arxiv …, 2024 - arxiv.org
This paper introduces Evalverse, a novel library that streamlines the evaluation of Large
Language Models (LLMs) by unifying disparate evaluation tools into a single, user-friendly …

Embroid: Unsupervised prediction smoothing can improve few-shot classification

N Guha, M Chen, K Bhatia… - Advances in Neural …, 2023 - proceedings.neurips.cc
Recent work has shown that language models' (LMs) prompt-based learning capabilities
make them well suited for automating data labeling in domains where manual annotation is …