Google Scholar

A survey of large language models

WX Zhao, K Zhou, J Li, T Tang… - arXiv preprint arXiv …, 2023 - paper-notes.zhjwpku.com

Ever since the Turing Test was proposed in the 1950s, humans have explored the mastering
of language intelligence by machine. Language is essentially a complex, intricate system of …

Cited by 3800; 6 versions

[PDF] neurips.cc

LegalBench: A collaboratively built benchmark for measuring legal reasoning in large language models

N Guha, J Nyarko, D Ho, C Ré… - Advances in …, 2023 - proceedings.neurips.cc

The advent of large language models (LLMs) and their adoption by the legal community has
given rise to the question: what types of legal reasoning can LLMs perform? To enable …

Cited by 181; 10 versions

[HTML] sciencedirect.com

From COBIT to ISO 42001: Evaluating cybersecurity frameworks for opportunities, risks, and regulatory compliance in commercializing large language models

TR McIntosh, T Susnjak, T Liu, P Watters, D Xu… - Computers & …, 2024 - Elsevier

This study investigated the integration readiness of four predominant cybersecurity
Governance, Risk and Compliance (GRC) frameworks–NIST CSF 2.0, COBIT 2019, ISO …

Cited by 70; 10 versions

[PDF] arxiv.org

Inadequacies of large language model benchmarks in the era of generative artificial intelligence

TR McIntosh, T Susnjak, N Arachchilage, T Liu… - arXiv preprint arXiv …, 2024 - arxiv.org

The rapid rise in popularity of Large Language Models (LLMs) with emerging capabilities
has spurred public curiosity to evaluate and compare different LLMs, leading many …

Cited by 115; 3 versions

[PDF] acm.org

A reasoning and value alignment test to assess advanced GPT reasoning

TR McIntosh, T Liu, T Susnjak, P Watters… - ACM Transactions on …, 2024 - dl.acm.org

In response to diverse perspectives on artificial general intelligence (AGI), ranging from
potential safety and ethical concerns to more extreme views about the threats it poses to …

Cited by 58; 5 versions

[PDF] arxiv.org

LEXTREME: A multi-lingual and multi-task benchmark for the legal domain

J Niklaus, V Matoshi, P Rani, A Galassi… - arXiv preprint arXiv …, 2023 - arxiv.org

Lately, propelled by the phenomenal advances around the transformer architecture, the
legal NLP field has enjoyed spectacular growth. To measure progress, well curated and …

Cited by 48; 10 versions

[PDF] ceur-ws.org

ChatGPT as an artificial lawyer?

J Tan, H Westermann, K Benyekhlef - AI4AJ@ICAIL, 2023 - ceur-ws.org

Lawyers can analyze and understand specific situations of their clients to provide them with
relevant legal information and advice. We qualitatively investigate to which extent ChatGPT …

Cited by 49; 2 versions

[PDF] arxiv.org

Can ChatGPT Perform Reasoning Using the IRAC Method in Analyzing Legal Scenarios Like a Lawyer?

X Kang, L Qu, LK Soon, A Trakic, TY Zhuo… - arXiv preprint arXiv …, 2023 - arxiv.org

Large Language Models (LLMs), such as ChatGPT, have drawn a lot of attentions recently in
the legal domain due to its emergent ability to tackle a variety of legal tasks. However, it is …

Cited by 16; 9 versions

[PDF] arxiv.org

Evalverse: Unified and accessible library for large language model evaluation

J Kim, W Song, D Kim, Y Kim, Y Kim, C Park - arXiv preprint arXiv …, 2024 - arxiv.org

This paper introduces Evalverse, a novel library that streamlines the evaluation of Large
Language Models (LLMs) by unifying disparate evaluation tools into a single, user-friendly …

Cited by 6; 3 versions

[PDF] neurips.cc

Embroid: Unsupervised prediction smoothing can improve few-shot classification

N Guha, M Chen, K Bhatia… - Advances in Neural …, 2023 - proceedings.neurips.cc

Recent work has shown that language models' (LMs) prompt-based learning capabilities
make them well suited for automating data labeling in domains where manual annotation is …

Cited by 6; 6 versions

LegalBench: Prototyping a collaborative benchmark for legal reasoning