- Academic Search

A simple unified uncertainty-guided framework for offline-to-online reinforcement learning

S Guo, Y Sun, J Hu, S Huang, H Chen, H Piao… - arXiv preprint arXiv …, 2023 - arxiv.org

Offline reinforcement learning (RL) provides a promising solution to learning an agent fully
relying on a data-driven paradigm. However, constrained by the limited quality of the offline …

Cited by 11

[HTML] sciencedirect.com

[HTML] Selective imitation for efficient online reinforcement learning with pre-collected data

C Eom, D Lee, M Kwon - ICT Express, 2024 - Elsevier

Deep reinforcement learning (RL) has emerged as a promising solution for autonomous
devices requiring sequential decision-making. In the online RL framework, the agent must …

Cited by 2

[PDF] arxiv.org

Temporal logic specification-conditioned decision transformer for offline safe reinforcement learning

Z Guo, W Zhou, W Li - arXiv preprint arXiv:2402.17217, 2024 - arxiv.org

Offline safe reinforcement learning (RL) aims to train a constraint satisfaction policy from a
fixed dataset. Current state-of-the-art approaches are based on supervised learning with a …

Cited by 1

Adversarial Conservative Alternating Q-Learning for Credit Card Debt Collection

W Liu, J Zhu, L Ni, J Bi, Z Wu, J Long… - … on Knowledge and …, 2025 - ieeexplore.ieee.org

Debt collection is utilized for risk control after credit card delinquency. The existing rule-
based method tends to be myopic and non-adaptive due to the delayed feedback …

An offline-to-online reinforcement learning approach based on multi-action evaluation with policy extension

X Cheng, X Huang, Z Huang, N Jiang - Applied Intelligence, 2024 - Springer

Offline Reinforcement Learning (Offline RL) is able to learn from pre-collected
offline data without real-time interaction with the environment by policy regularization via …

Transformer-based reinforcement learning for optical cavity temperature control system

H Zhang, Y Lu, C Wang, W Dou, S Liu, C Huang… - Applied …, 2025 - Springer

The accuracy of laser gas detection technology is influenced by the temperature of the
optical cavity. Traditional control methods suffer from inadequacies in fully considering the …

[PDF] arxiv.org

Goal-Conditioned Data Augmentation for Offline Reinforcement Learning

X Huang, D Wu, B Boulet - arXiv preprint arXiv:2412.20519, 2024 - arxiv.org

Offline reinforcement learning (RL) enables policy learning from pre-collected offline
datasets, relaxing the need to interact directly with the environment. However, limited by the …

[PDF] springer.com

Towards online training for RL-based query optimizer

M Ramadan, HMO Mokhtar, I Sobh… - International Journal of …, 2024 - Springer

Join query optimization aims to find the best join order for tables in a query, which is critical
for query processing performance. Recently, reinforcement learning models have been …

Cited by 1

[PDF] arxiv.org

DRDT3: Diffusion-Refined Decision Test-Time Training Model

X Huang, D Wu, B Boulet - arXiv preprint arXiv:2501.06718, 2025 - arxiv.org

Decision Transformer (DT), a trajectory modeling method, has shown competitive
performance compared to traditional offline reinforcement learning (RL) approaches on …

[PDF] arxiv.org

Adapting to Non-Stationary Environments: Multi-Armed Bandit Enhanced Retrieval-Augmented Generation on Knowledge Graphs

X Tang, J Li, N Du, S Xie - arXiv preprint arXiv:2412.07618, 2024 - arxiv.org

Despite the superior performance of Large language models on many NLP tasks, they still
face significant limitations in memorizing extensive world knowledge. Recent studies have …
