A survey of language model confidence estimation and calibration

J Geng, F Cai, Y Wang, H Koeppl, P Nakov… - arXiv preprint arXiv …, 2023 - arxiv.org
Language models (LMs) have demonstrated remarkable capabilities across a wide range of
tasks in various domains. Despite their impressive performance, the reliability of their output …

A Survey of Confidence Estimation and Calibration in Large Language Models

J Geng, F Cai, Y Wang, H Koeppl… - Proceedings of the …, 2024 - aclanthology.org
Large language models (LLMs) have demonstrated remarkable capabilities across a wide
range of tasks in various domains. Despite their impressive performance, they can be …

Adaptation with self-evaluation to improve selective prediction in LLMs

J Chen, J Yoon, S Ebrahimi, SO Arik, T Pfister… - arXiv preprint arXiv …, 2023 - arxiv.org
Large language models (LLMs) have recently shown great advances in a variety of tasks,
including natural language understanding and generation. However, their use in high …

Mitigating temporal misalignment by discarding outdated facts

MJQ Zhang, E Choi - arXiv preprint arXiv:2305.14824, 2023 - arxiv.org
While large language models are able to retain vast amounts of world knowledge seen
during pretraining, such knowledge is prone to going out of date and is nontrivial to update …

Do LLMs know when to not answer? Investigating abstention abilities of large language models

N Madhusudhan, ST Madhusudhan, V Yadav… - arXiv preprint arXiv …, 2024 - arxiv.org
Abstention Ability (AA) is a critical aspect of Large Language Model (LLM) reliability,
referring to an LLM's capability to withhold responses when uncertain or lacking a definitive …

Can NLP Models 'Identify', 'Distinguish', and 'Justify' Questions that Don't have a Definitive Answer?

A Agarwal, N Patel, N Varshney, M Parmar… - arXiv preprint arXiv …, 2023 - arxiv.org
Though state-of-the-art (SOTA) NLP systems have achieved remarkable performance on a
variety of language understanding tasks, they primarily focus on questions that have a …

Crowd-Calibrator: Can Annotator Disagreement Inform Calibration in Subjective Tasks?

U Khurana, E Nalisnick, A Fokkens… - arXiv preprint arXiv …, 2024 - arxiv.org
Subjective tasks in NLP have been mostly relegated to objective standards, where the gold
label is decided by taking the majority vote. This obfuscates annotator disagreement and the …

Accelerating LLM inference by enabling intermediate layer decoding

N Varshney, A Chatterjee, M Parmar… - arXiv preprint arXiv …, 2023 - arxiv.org
Large Language Models (LLMs) have achieved remarkable performance across a wide
variety of natural language tasks; however, their large size makes their inference slow and …

Ambiguity meets uncertainty: Investigating uncertainty estimation for word sense disambiguation

Z Liu, Y Liu - arXiv preprint arXiv:2305.13119, 2023 - arxiv.org
Word sense disambiguation (WSD), which aims to determine an appropriate sense for a
target word given its context, is crucial for natural language understanding. Existing …

LLMs' Reading Comprehension Is Affected by Parametric Knowledge and Struggles with Hypothetical Statements

V Basmov, Y Goldberg, R Tsarfaty - arXiv preprint arXiv:2404.06283, 2024 - arxiv.org
The task of reading comprehension (RC), often implemented as context-based question
answering (QA), provides a primary means to assess language models' natural language …