Rank-DETR for high quality object detection
Modern detection transformers (DETRs) use a set of object queries to predict a list of
bounding boxes, sort them by their classification confidence scores, and select the top …
Efficient diffusion transformer with step-wise dynamic attention mediators
This paper identifies significant redundancy in the query-key interactions within self-attention
mechanisms of diffusion transformer models, particularly during the early stages of …
Train once, get a family: State-adaptive balances for offline-to-online reinforcement learning
Offline-to-online reinforcement learning (RL) is a training paradigm that combines
pre-training on a pre-collected dataset with fine-tuning in an online environment. However, the …
Understanding, predicting and better resolving Q-value divergence in offline-RL
The divergence of the Q-value estimation has been a prominent issue in offline reinforcement
learning (offline RL), where the agent has no access to real dynamics. Traditional beliefs …
Counterfactual-augmented importance sampling for semi-offline policy evaluation
In applying reinforcement learning (RL) to high-stakes domains, quantitative and qualitative
evaluation using observational data can help practitioners understand the generalization …
QFAE: Q-Function guided Action Exploration for offline deep reinforcement learning
Offline reinforcement learning (RL) expects to get an optimal policy by utilizing offline data.
During policy learning, one typical method often constrains the target policy by offline data to …
Adaptive Advantage-Guided Policy Regularization for Offline Reinforcement Learning
In offline reinforcement learning, the challenge of out-of-distribution (OOD) is pronounced.
To address this, existing methods often constrain the learned policy through policy …
ExID: Offline RL with Intuitive Expert Insights in Limited-Data Settings
B Gangopadhyay, Z Wang, JF Yeh, S Takamatsu - 2024 - openreview.net
With the ability to learn from static datasets, Offline Reinforcement Learning (RL) emerges as
a compelling avenue for real-world applications. However, state-of-the-art offline RL …
Towards Clinically Applicable Reinforcement Learning
S Tang - 2024 - deepblue.lib.umich.edu
In healthcare, clinicians constantly make decisions about when and how to treat each
patient. These decisions are based on medical training and clinical experience, but they …
Interactive Terrain Affordance Learning via VAE Query Selection & Data Manipulation
Terrain preference learning from trajectory queries allows complex reward structures to be
obtained for robot navigation without the need for manual specification. However, traditional …