Open Access System for Information Sharing

Department of Industrial & Management Engineering (산업경영공학과) 3. Theses_Ph.D.

Thesis

Reinforcement Learning for constrained optimization problems in industrial systems

Authors: 박형준

Date Issued: 2023

Publisher: Pohang University of Science and Technology (포항공과대학교)

Abstract: Reinforcement learning (RL) has gained popularity for solving industrial system optimization problems owing to its ability to handle complex problems with uncertainty. However, RL methods struggle to optimize policies in industrial systems subject to constraints, and well-established approaches for satisfying system constraints in RL are still lacking. To address this limitation, this thesis proposes three approaches that support RL in solving constrained optimization problems in industrial systems. Chapter 2 proposes an approach that combines constrained optimization with RL so that multiple system constraints can be taken into account. It introduces a distance-based value update technique for value-based RL, which mitigates the tendency to become trapped in local optima within constrained search spaces. A novel penalty cost model is also analyzed to effectively exclude infeasible actions during the distance-based value update. The approach is validated in a case study on microgrid operation optimization, where it derives a near-optimal policy. Chapter 3 analyzes policy structures applicable to constrained industrial system problems and proposes an approach that exploits them in RL. Based on this approach, the study develops a structured RL algorithm that efficiently optimizes the constrained policy. The approach not only speeds up policy learning but also achieves strong adaptive performance of the RL policy. A case study verifies that it attains operational efficiency in inventory management systems. Chapter 4 proposes an approach that integrates the heuristic paradigm into RL to handle the constraints that make industrial system optimization NP-hard. It leverages two heuristic properties, decomposition and evolution, so that RL can optimize policies while satisfying these difficult constraints. A case study on integrated circuit design optimization demonstrates the effectiveness of the approach and suggests its potential for solving hard problems in real industrial systems.

URI: http://postech.dcollection.net/common/orgView/200000690360
https://oasis.postech.ac.kr/handle/2014.oak/118442

Article Type: Thesis

Files in This Item: There are no files associated with this item.
