Open Access System for Information Sharing

Thesis
Cited 0 times in Web of Science · Cited 0 times in Scopus
Full metadata record
Files in This Item:
There are no files associated with this item.
DC Field: Value (Language)
dc.contributor.author: 김한결
dc.date.accessioned: 2023-08-31T16:35:28Z
dc.date.available: 2023-08-31T16:35:28Z
dc.date.issued: 2023
dc.identifier.other: OAK-2015-10227
dc.identifier.uri: http://postech.dcollection.net/common/orgView/200000690720 (ko_KR)
dc.identifier.uri: https://oasis.postech.ac.kr/handle/2014.oak/118424
dc.description: Master
dc.description.abstract: Behavior cloning (BC) has been considered a practical policy constraint for alleviating the value-overestimation problem caused by out-of-distribution (OOD) actions in the offline reinforcement learning (RL) setting. However, BC has been reported to suffer from insignificant policy updates due to low-quality data. To overcome this problem, this paper proposes a data-selective approach that prescreens favorable data before learning a policy. Data with positive advantage is first selected to exploit the advantage function, and an advantage-weighted method is then applied to further refine the policy. Finally, we present a new RL+BC algorithm that combines RL with the proposed method, and we suggest practical implementation techniques to resolve the quality-quantity dilemma. The proposed algorithm outperforms state-of-the-art algorithms on continuous-control offline RL benchmarks.
dc.language: eng
dc.publisher: 포항공과대학교 (Pohang University of Science and Technology)
dc.title: Data-selective Advantage-weighted Method for Offline Reinforcement Learning
dc.type: Thesis
dc.contributor.college: IT융합공학과 (Department of Convergence IT Engineering)
dc.date.degree: 2023-8
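The abstract describes two steps: prescreen transitions with positive advantage, then apply advantage weighting to refine the policy. As an illustration only (the thesis's exact formulation, temperature, and normalization are not given here and are assumptions), such a selection-and-weighting step might be sketched as:

```python
import numpy as np

def select_and_weight(advantages, beta=1.0):
    """Hypothetical sketch: keep only positive-advantage data, then assign
    exponential advantage weights (temperature `beta`) for weighted BC."""
    advantages = np.asarray(advantages, dtype=float)
    mask = advantages > 0.0               # data selection: favorable data only
    weights = np.zeros_like(advantages)   # non-selected data gets zero weight
    if mask.any():
        # advantage weighting on the selected subset, normalized to sum to 1
        weights[mask] = np.exp(advantages[mask] / beta)
        weights[mask] /= weights[mask].sum()
    return mask, weights
```

The returned weights could then scale a per-sample BC loss, so that higher-advantage transitions contribute more to the policy update while OOD-prone low-quality data is excluded entirely.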

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.
