Open Access System for Information Sharing


Conference

Belief Projection-Based Reinforcement Learning for Environments with Delayed Feedback

Title
Belief Projection-Based Reinforcement Learning for Environments with Delayed Feedback
Authors
Kim, Jangwon; Kim, Hangyeol; Kang, Jiwook; Baek, Jongchan; Han, Soohee
Date Issued
2023-12-13
Publisher
NeurIPS Foundation
Abstract
We present a novel actor-critic algorithm for environments with delayed feedback that addresses the state-space explosion problem of conventional approaches. Conventional approaches use an augmented state constructed from the last observed state and the actions executed since that state was observed. Using the augmented state space, the correct Markov decision process for delayed environments can be constructed; however, this causes the state space to explode as the number of delayed timesteps increases, leading to slow convergence. Our proposed algorithm, called Belief-Projection-Based Q-learning (BPQL), addresses the state-space explosion problem by evaluating the values of a critic whose input state size equals that of the original state space rather than that of the augmented one. We compare BPQL to traditional approaches in continuous control tasks and demonstrate that it significantly outperforms other algorithms in terms of asymptotic performance and sample efficiency. We also show that BPQL solves long-delayed environments, which conventional approaches are unable to do.
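The augmented-state construction described in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function names `augmented_state` and `augmented_dim` and the example sizes are assumptions chosen for illustration. It only shows how the critic input grows when the last observed state is concatenated with the actions taken since then.

```python
from collections import deque


def augmented_state(last_observed_state, pending_actions):
    """Concatenate the last observed state with the actions executed since it.

    Hypothetical helper for illustration; not taken from the paper.
    """
    aug = list(last_observed_state)
    for action in pending_actions:
        aug.extend(action)
    return aug


def augmented_dim(state_dim, action_dim, delay):
    """Input size of the augmented state for a delay of `delay` timesteps."""
    return state_dim + delay * action_dim


if __name__ == "__main__":
    state_dim, action_dim = 3, 2  # assumed sizes, for illustration only
    for delay in (1, 5, 20):
        # Buffer holding the `delay` most recent actions not yet reflected
        # in any observation.
        pending = deque(([0.0] * action_dim for _ in range(delay)), maxlen=delay)
        x = augmented_state([0.0] * state_dim, pending)
        print(delay, len(x), augmented_dim(state_dim, action_dim, delay))
```

In this sketch the input dimension grows linearly with the delay, so for discrete spaces the number of distinct augmented states grows exponentially in the delay. A critic defined on the original state space, as the abstract describes for BPQL, would keep its input size at `state_dim` regardless of the delay.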
URI
https://oasis.postech.ac.kr/handle/2014.oak/122357
Article Type
Conference
Citation
37th Conference on Neural Information Processing Systems (NeurIPS 2023), 2023-12-13
Files in This Item:
There are no files associated with this item.


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.
