Open Access System for Information Sharing

Heterogeneous Main Memory Sub-System

Title
Heterogeneous Main Memory Sub-System
Authors
김동기
Date Issued
2016
Publisher
Pohang University of Science and Technology (POSTECH)
Abstract
High-bandwidth multi-core architectures, such as single-chip CPU/GPU designs, are commonly used in high-end embedded systems. Because DRAM suffers from scaling difficulties and idle power consumption (e.g., refresh operations), a hybrid configuration of DRAM and an emerging non-volatile memory such as phase-change RAM (PRAM) is considered for the main memory sub-system. In the hybrid main memory sub-system, DRAM is often used as a cache while PRAM serves as main memory. However, such a DRAM cache has two problems. First, caching data from PRAM into DRAM at large granularity (i.e., a DRAM page) incurs timing and bandwidth overheads. Second, with a DRAM cache, the processor can use only the DRAM channel, while the PRAM channel is accessed only on behalf of the DRAM cache; in other words, the memory bandwidth available to the processor is limited to that of the DRAM channel. Thus, a DRAM cache is not suitable for high-bandwidth multi-core architectures. In this dissertation, we optimize the performance of the hybrid main memory sub-system by treating both DRAM and PRAM as main memory. Since PRAM is slower than DRAM, we map CPU applications to DRAM to meet the CPU's tight latency requirements. Graphics programs running on the GPU have relatively tolerant latency requirements, so we place GPU applications (especially graphics) in PRAM. Because PRAM has very poor write performance, we propose, in order to guarantee the GPU's write performance, an in-DRAM write buffer that absorbs GPU write traffic, dynamic hot-data management that improves the write buffer's efficiency, runtime-adaptive adjustment of the write buffer size to meet a given CPU performance bound, and CPU-aware DRAM access scheduling to guarantee CPU performance. Experimental results show that the in-DRAM write buffer improves GPU performance by 1.02-44.2 times with negligible CPU performance overhead when compute-intensive CPU programs run.
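The idea of the in-DRAM write buffer can be sketched as follows. This is a hypothetical, highly simplified model (not the dissertation's implementation): GPU writes are staged in a fixed-capacity buffer held in DRAM and drained to slow PRAM only when the buffer is full, so most writes complete at DRAM speed. The class name, capacity parameter, and relative latency constants are illustrative assumptions.

```python
# Simplified sketch (illustrative only) of an in-DRAM write buffer:
# GPU writes land in a DRAM-resident buffer and are drained to PRAM
# lazily, so writes usually cost DRAM latency rather than PRAM latency.

DRAM_WRITE_CYCLES = 1    # assumed relative write latencies
PRAM_WRITE_CYCLES = 10

class InDramWriteBuffer:
    def __init__(self, capacity):
        self.capacity = capacity
        self.buffer = {}          # address -> data staged in DRAM
        self.pram = {}            # backing PRAM store
        self.cycles = 0           # total cycles spent on writes

    def write(self, addr, data):
        if addr not in self.buffer and len(self.buffer) >= self.capacity:
            self._drain_one()     # buffer full: pay PRAM latency once
        self.buffer[addr] = data  # absorb the write at DRAM speed
        self.cycles += DRAM_WRITE_CYCLES

    def _drain_one(self):
        addr, data = self.buffer.popitem()
        self.pram[addr] = data
        self.cycles += PRAM_WRITE_CYCLES

    def flush(self):
        while self.buffer:
            self._drain_one()
```

Repeated writes to the same hot address coalesce in the buffer and never reach PRAM individually, which is why the abstract pairs the write buffer with dynamic hot-data management: keeping hot write targets in the buffer maximizes this coalescing.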
Recently, GPUs process not only graphics but also general-purpose programs (GPGPU). When a latency-critical program runs on the GPGPU, the in-DRAM write buffer does not hide the poor read latency of PRAM and thus cannot guarantee that program's performance, so a caching mechanism is required. We propose a method called the in-DRAM selective cache, which merges a DRAM cache into the in-DRAM write buffer. In this setting, the in-DRAM selective cache inherits the DRAM cache problems explained above. We solve them based on our observation that the data queue of the memory controller is the performance bottleneck of the DRAM cache; in other words, the performance and fairness of the DRAM cache are limited by the stall time at the data queue. We propose a novel method that caches data selectively so as to reduce this stall time. Selective caching reduces the overhead of data caching and enables the processor to access both the DRAM and PRAM channels. Our experiments show that the in-DRAM selective cache yields significant improvements in performance (by 21% on average), fairness (by 2.1 times), and energy consumption (by 10%) over the best existing method in a multi-core system consisting of multiple CPUs and a GPU.
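The selective-caching decision described above can be illustrated with a small sketch. This is an assumed simplification, not the dissertation's actual policy: a PRAM read miss is filled into the DRAM cache only when the DRAM-side data queue is not congested; under congestion, the request is served directly over the PRAM channel, keeping both channels usable by the processor. The function name and the queue threshold are hypothetical.

```python
# Hypothetical sketch of selective caching: when the DRAM data queue is
# congested (the bottleneck the abstract identifies), a missing read is
# served directly from PRAM instead of being filled into the DRAM cache.

QUEUE_LIMIT = 4  # assumed congestion threshold (queue entries)

def route_read(addr, dram_cache, dram_queue_depth):
    """Return (serving channel, whether the line is filled into DRAM)."""
    if addr in dram_cache:
        return ("DRAM", False)          # cache hit: DRAM channel serves it
    if dram_queue_depth >= QUEUE_LIMIT:
        return ("PRAM", False)          # congested: skip the cache fill
    dram_cache.add(addr)                # idle queue: fill into DRAM cache
    return ("PRAM", True)
```

The design point this illustrates is that caching is worthwhile only when the fill traffic does not lengthen the data-queue stall time; otherwise bypassing the cache preserves both bandwidth and fairness.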
URI
http://postech.dcollection.net/jsp/common/DcLoOrgPer.jsp?sItemId=000002223461
https://oasis.postech.ac.kr/handle/2014.oak/93228
Article Type
Thesis