- Academic Search

Rank-DETR for high quality object detection

Y Pu, W Liang, Y Hao, Y Yuan… - Advances in …, 2024 - proceedings.neurips.cc

Modern detection transformers (DETRs) use a set of object queries to predict a list of
bounding boxes, sort them by their classification confidence scores, and select the top …

Cited by 59

Efficient diffusion transformer with step-wise dynamic attention mediators

Y Pu, Z Xia, J Guo, D Han, Q Li, D Li, Y Yuan… - … on Computer Vision, 2024 - Springer

This paper identifies significant redundancy in the query-key interactions within self-attention
mechanisms of diffusion transformer models, particularly during the early stages of …

Cited by 9

Train once, get a family: State-adaptive balances for offline-to-online reinforcement learning

S Wang, Q Yang, J Gao, M Lin… - Advances in …, 2024 - proceedings.neurips.cc

Offline-to-online reinforcement learning (RL) is a training paradigm that combines
pre-training on a pre-collected dataset with fine-tuning in an online environment. However, the …

Cited by 13

Understanding, predicting and better resolving Q-value divergence in offline-RL

Y Yue, R Lu, B Kang, S Song… - Advances in Neural …, 2024 - proceedings.neurips.cc

The divergence of the Q-value estimation has been a prominent issue in offline reinforcement
learning (offline RL), where the agent has no access to real dynamics. Traditional beliefs …

Cited by 11

Counterfactual-augmented importance sampling for semi-offline policy evaluation

S Tang, J Wiens - Advances in Neural Information …, 2023 - proceedings.neurips.cc

In applying reinforcement learning (RL) to high-stakes domains, quantitative and qualitative
evaluation using observational data can help practitioners understand the generalization …

Cited by 5

QFAE: Q-Function guided Action Exploration for offline deep reinforcement learning

T Pang, G Wu, Y Zhang, B Wang, Y Yin - Pattern Recognition, 2025 - Elsevier

Offline reinforcement learning (RL) expects to get an optimal policy by utilizing offline data.
During policy learning, one typical method often constrains the target policy by offline data to …


Adaptive Advantage-Guided Policy Regularization for Offline Reinforcement Learning

T Liu, Y Li, Y Lan, H Gao, W Pan, X Xu - arXiv preprint arXiv:2405.19909, 2024 - arxiv.org

In offline reinforcement learning, the challenge of out-of-distribution (OOD) is pronounced.
To address this, existing methods often constrain the learned policy through policy …

Cited by 1

ExID: Offline RL with Intuitive Expert Insights in Limited-Data Settings

B Gangopadhyay, Z Wang, JF Yeh, S Takamatsu - 2024 - openreview.net

With the ability to learn from static datasets, Offline Reinforcement Learning (RL) emerges as
a compelling avenue for real-world applications. However, state-of-the-art offline RL …

Towards Clinically Applicable Reinforcement Learning

S Tang - 2024 - deepblue.lib.umich.edu

In healthcare, clinicians constantly make decisions about when and how to treat each
patient. These decisions are based on medical training and clinical experience, but they …

Interactive Terrain Affordance Learning via VAE Query Selection & Data Manipulation

J Sinclair, B Reily, CM Reardon - The 1st InterAI Workshop: Interactive AI … - openreview.net

Terrain preference learning from trajectory queries allows complex reward structures to be
obtained for robot navigation without the need for manual specification. However, traditional …

Boosting offline reinforcement learning with action preference query

Rank-DETR for high quality object detection
