VALTEST: Automated Validation of Language Model Generated Test Cases

H Taherkhani, H Hemmati - arXiv preprint arXiv:2411.08254, 2024 - arxiv.org
Large Language Models (LLMs) have demonstrated significant potential in automating
software testing, specifically in generating unit test cases. However, the validation of LLM …

Representation Engineering for Large-Language Models: Survey and Research Challenges

L Bartoszcze, S Munshi, B Sukidi, J Yen, Z Yang… - arXiv preprint arXiv …, 2025 - arxiv.org
Large-language models are capable of completing a variety of tasks, but remain
unpredictable and intractable. Representation engineering seeks to resolve this problem …

Think More, Hallucinate Less: Mitigating Hallucinations via Dual Process of Fast and Slow Thinking

X Cheng, J Li, WX Zhao, JR Wen - arXiv preprint arXiv:2501.01306, 2025 - arxiv.org
Large language models (LLMs) demonstrate exceptional capabilities, yet still face the
hallucination issue. Typical text generation approaches adopt an auto-regressive generation …

HalluCana: Fixing LLM Hallucination with A Canary Lookahead

T Li, E Dayanik, S Tyagi, A Pierleoni - arXiv preprint arXiv:2412.07965, 2024 - arxiv.org
In this paper, we present HalluCana, a canary lookahead to detect and correct factuality
hallucinations of Large Language Models (LLMs) in long-form generation. HalluCana …

VideoICL: Confidence-based Iterative In-context Learning for Out-of-Distribution Video Understanding

K Kim, G Park, Y Lee, W Yeo, SJ Hwang - arXiv preprint arXiv:2412.02186, 2024 - arxiv.org
Recent advancements in video large multimodal models (LMMs) have significantly improved
their video understanding and reasoning capabilities. However, their performance drops on …

CoCo-CoLa: Evaluating Language Adherence in Multilingual LLMs

E Rahmati, AS Ziabari, M Dehghani - arXiv preprint arXiv:2502.12476, 2025 - arxiv.org
Multilingual Large Language Models (LLMs) develop cross-lingual abilities despite being
trained on limited parallel data. However, they often struggle to generate responses in the …

Attention-guided Self-reflection for Zero-shot Hallucination Detection in Large Language Models

Q Liu, X Chen, Y Ding, S Xu, S Wu, L Wang - arXiv preprint arXiv …, 2025 - arxiv.org
Hallucination has emerged as a significant barrier to the effective application of Large
Language Models (LLMs). In this work, we introduce a novel Attention-Guided SElf …

CoT-UQ: Improving Response-wise Uncertainty Quantification in LLMs with Chain-of-Thought

B Zhang, R Zhang - arXiv preprint arXiv:2502.17214, 2025 - arxiv.org
Large language models (LLMs) excel in many tasks but struggle to accurately quantify
uncertainty in their generated responses. This limitation makes it challenging to detect …

Sample, Scrutinize and Scale: Effective Inference-Time Search by Scaling Verification

E Zhao, P Awasthi, S Gollapudi - arXiv preprint arXiv:2502.01839, 2025 - arxiv.org
Sampling-based search, a simple paradigm for utilizing test-time compute, involves
generating multiple candidate responses and selecting the best one, typically by verifying …

Collaborative Instance Navigation: Leveraging Agent Self-Dialogue to Minimize User Input

F Taioli, E Zorzi, G Franchi, A Castellini… - arXiv preprint arXiv …, 2024 - arxiv.org
Existing embodied instance goal navigation tasks, driven by natural language, assume
human users to provide complete and nuanced instance descriptions prior to the navigation …