Open Access System for Information Sharing

Department of Computer Science & Engineering (컴퓨터공학과) 3. Theses_Ph.D.

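Processor Performance Modeling Using Representative Stall-Event Stacks

Title: Processor Performance Modeling Using Representative Stall-Event Stacks (대표 스톨 이벤트 스택을 이용한 프로세서 성능 모델링)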

Authors: 장한휘

Date Issued: 2019

Publisher: Pohang University of Science and Technology (포항공과대학교)

Abstract: Processor microarchitectures have grown increasingly sophisticated to meet rising compute demands. Processor designs have moved from simple in-order cores to complex out-of-order cores that exploit the hidden instruction-level parallelism of workloads. In addition, the advent of multi-core and heterogeneous processors, introduced to overcome the power wall and the end of Dennard scaling, has further increased the complexity of processor designs. Designing a high-performance processor is therefore an extremely challenging task. To find an optimal design, computer architects need to evaluate many candidate designs. Because each design's overall performance is determined by many performance-critical events that interact inside the processor, architects rely on accurate but slow cycle-level simulators. From a series of slow simulations, they identify the critical bottlenecks of the current design, choose a next design that resolves those bottlenecks, and repeat the process until they find the optimal one. However, the simulations are so slow that architects cannot evaluate every design point within the available development time. To reduce the simulation overhead, they can use performance models that predict the performance of designs and thereby cut the number of simulations required. Nonetheless, existing methods are still either slow or inaccurate, so architects end up with sub-optimal designs. In this thesis, we propose a fast and accurate performance modeling methodology, Representative Stall-Event Stacks. The key idea is to consider all key performance bottlenecks of the current design, including the hidden secondary bottlenecks as well as the critical one, when predicting the performance of design changes. Based on this model, we introduce two design exploration methods, RpStacks and RpStacks-MT, for out-of-order superscalar processors and multi-core processors, respectively. When exploring 1,000 design points, RpStacks achieves a 26-fold speedup over a cycle-level simulator. RpStacks-MT is 88 times faster than the simulator on multi-core design space exploration over 10,000 design points.
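
The key idea in the abstract, that a hidden secondary bottleneck can take over once the critical one is relieved, can be illustrated with a minimal sketch. This is not the thesis's formulation. It assumes, purely for illustration, that each stack is a table of per-event counts and that predicted cycles equal the largest count-times-latency sum over all stacks. The function name, event names, and numbers are hypothetical.

```python
# Illustrative sketch only: the stack layout, event names, and numbers are
# assumptions made for this example, not data or definitions from the thesis.

def predict_cycles(stacks, latencies):
    """Return the largest weighted sum over all stacks.

    Each stack maps an event name to how often it occurs on one execution
    path; latencies maps an event name to its cost in cycles.
    """
    return max(
        sum(count * latencies[event] for event, count in stack.items())
        for stack in stacks
    )

# Two hypothetical execution paths summarized as event counts.
stacks = [
    {"l2_miss": 10, "alu": 100},  # critical path under the baseline design
    {"l2_miss": 2, "alu": 400},   # hidden secondary bottleneck
]

baseline = {"l2_miss": 100, "alu": 1}
faster_memory = {"l2_miss": 20, "alu": 1}  # hypothetical design change

print(predict_cycles(stacks, baseline))       # 1100
print(predict_cycles(stacks, faster_memory))  # 440
```

In this toy case, a model that tracked only the baseline's critical path would predict 300 cycles for the faster-memory design, while keeping the secondary stack gives 440, because the second path becomes the new bottleneck.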

URI: http://postech.dcollection.net/common/orgView/200000175795
https://oasis.postech.ac.kr/handle/2014.oak/111840

Article Type: Thesis

Files in This Item: There are no files associated with this item.
