On reinforcement learning and distribution matching for fine-tuning language models with no catastrophic forgetting

T Korbak, H Elsahar, G Kruszewski… - Advances in Neural …, 2022 - proceedings.neurips.cc
The availability of large pre-trained models is changing the landscape of Machine Learning
research and practice, moving from a "training from scratch" to a "fine-tuning" paradigm …

Oops i took a gradient: Scalable sampling for discrete distributions

W Grathwohl, K Swersky, M Hashemi… - International …, 2021 - proceedings.mlr.press
We propose a general and scalable approximate sampling strategy for probabilistic models
with discrete variables. Our approach uses gradients of the likelihood function with respect …
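For orientation, a hedged sketch of the gradient-based proposal this line of work is known for (stated from general knowledge, not from the truncated snippet): for binary $x$, the change in the unnormalized log-likelihood $f$ under flipping coordinate $i$ is approximated by a first-order Taylor expansion and used to weight an informed Metropolis-Hastings proposal,
\[
  f\big(x^{(i)}\big) - f(x) \;\approx\; \big(x^{(i)} - x\big)^{\top} \nabla_x f(x),
  \qquad
  q(i \mid x) \;\propto\; \exp\!\Big(\tfrac{1}{2}\big(x^{(i)} - x\big)^{\top} \nabla_x f(x)\Big),
\]
where $x^{(i)}$ is $x$ with coordinate $i$ flipped, so promising flips are proposed more often while the accept/reject step corrects the approximation.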

On the calibration of pre-trained language models using mixup guided by area under the margin and saliency

SY Park, C Caragea - arXiv preprint arXiv:2203.07559, 2022 - arxiv.org
A well-calibrated neural model produces confidence estimates (probability outputs) that closely
approximate the expected accuracy. While prior studies have shown that mixup training …
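As a hedged reminder of the notion of calibration this snippet invokes (a standard definition, not taken from the abstract): a model with prediction $\hat{y}$ and confidence $\hat{p}$ is perfectly calibrated when
\[
  \mathbb{P}\big(\hat{y} = y \,\big|\, \hat{p} = p\big) \;=\; p \qquad \text{for all } p \in [0,1],
\]
i.e., among all predictions made with confidence $p$, a fraction $p$ are correct.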

Building minimal and reusable causal state abstractions for reinforcement learning

Z Wang, C Wang, X Xiao, Y Zhu, P Stone - Proceedings of the AAAI …, 2024 - ojs.aaai.org
Two desiderata of reinforcement learning (RL) algorithms are the ability to learn from
relatively little experience and the ability to learn policies that generalize to a range of …

On the Calibration of Multilingual Question Answering LLMs

Y Yang, S Dan, D Roth, I Lee - arXiv preprint arXiv:2311.08669, 2023 - arxiv.org
Multilingual pre-trained Large Language Models (LLMs) are incredibly effective at Question
Answering (QA), a core task in Natural Language Understanding, achieving high accuracies …

Triple-Hybrid Energy-based Model Makes Better Calibrated Natural Language Understanding Models

H Xu, Y Zhang - Proceedings of the 17th Conference of the …, 2023 - aclanthology.org
Though pre-trained language models achieve notable success in many applications, they are
often criticized for over-confident predictions. Specifically, the in-distribution (ID) …

Energy-based models with applications to speech and language processing

Z Ou - Foundations and Trends® in Signal Processing, 2024 - nowpublishers.com
Energy-Based Models (EBMs) are an important class of probabilistic models, also
known as random fields and undirected graphical models. EBMs are un-normalized and …
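For context on the "un-normalized" remark, the standard way an EBM is written (a textbook form, not quoted from this monograph) is
\[
  p_\theta(x) \;=\; \frac{\exp\big(-E_\theta(x)\big)}{Z(\theta)},
  \qquad
  Z(\theta) \;=\; \sum_{x'} \exp\big(-E_\theta(x')\big),
\]
so the model assigns only unnormalized scores $\exp(-E_\theta(x))$, and the partition function $Z(\theta)$ is generally intractable to compute.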

Consistent and efficient long document understanding

Q Zeng - 2023 - ideals.illinois.edu
In the age of information overload, people's information needs from long documents are
rapidly emerging, while people's patience for careful reading and reasoning is gradually …

Improving NMT Models by Retrofitting Quality Estimators into Trainable Energy Loss

G Yoo, JY Lee - Proceedings of the 31st International Conference …, 2025 - aclanthology.org
Reinforcement learning has shown great promise in aligning language models with human
preferences in a variety of text generation tasks, including machine translation. For …

Consistent training via energy-based gflownets for modeling discrete joint distributions

C Ekbote, M Jain, P Das, Y Bengio - arXiv preprint arXiv:2211.00568, 2022 - arxiv.org
Generative Flow Networks (GFlowNets) have demonstrated significant performance
improvements for generating diverse discrete objects $x$ given a reward function $R(x)$ …
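As a hedged note on the sampling target the snippet alludes to (a standard GFlowNet property, not taken from the truncated abstract): a trained GFlowNet samples terminal objects with probability proportional to the reward,
\[
  P_\theta(x) \;\propto\; R(x),
\]
which is what encourages diverse high-reward samples rather than collapse onto a single mode.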