Open Access System for Information Sharing

Department of Computer Science & Engineering (컴퓨터공학과) 2. Conference Papers

Conference


Full metadata record

Files in This Item: There are no files associated with this item.

DC Field	Value	Language
dc.contributor.author	Kim, Youngsok	-
dc.contributor.author	JO, JAEEON	-
dc.contributor.author	JANG, HANHWI	-
dc.contributor.author	RHU, MINSOO	-
dc.contributor.author	KIM, HANJUN	-
dc.contributor.author	Kim, Jangwoo	-
dc.date.accessioned	2018-05-11T00:38:38Z	-
dc.date.available	2018-05-11T00:38:38Z	-
dc.date.created	2017-09-18	-
dc.date.issued	2017-10-18	-
dc.identifier.uri	https://oasis.postech.ac.kr/handle/2014.oak/42832	-
dc.description.abstract	Graphics Processing Unit (GPU) vendors have been scaling single-GPU architectures to satisfy the ever-increasing user demands for faster graphics processing. However, as it gets extremely difficult to further scale single-GPU architectures, the vendors are aiming to achieve the scaled performance by simultaneously using multiple GPUs connected with newly developed, fast inter-GPU networks (e.g., NVIDIA NVLink, AMD XDMA). With fast inter-GPU networks, it is now promising to employ split frame rendering (SFR) which improves both frame rate and single-frame latency by assigning disjoint regions of a frame to different GPUs. Unfortunately, the scalability of current SFR implementations is seriously limited as they suffer from a large amount of redundant computation among GPUs. This paper proposes GPUpd, a novel multi-GPU architecture for fast and scalable SFR. With small hardware extensions, GPUpd introduces a new graphics pipeline stage called Cooperative Projection & Distribution (C-PD) where all GPUs cooperatively project 3D objects to 2D screen and efficiently redistribute the objects to their corresponding GPUs. C-PD not only eliminates the redundant computation among GPUs, but also incurs minimal inter-GPU network traffic by transferring object IDs instead of mid-pipeline outcomes between GPUs. To further reduce the redistribution overheads, GPUpd minimizes inter-GPU synchronizations by implementing batching and runahead-execution of draw commands. Our detailed cycle-level simulations with 8 real-world game traces show that GPUpd achieves a geomean speedup of 4.98X in single-frame latency with 16 GPUs, whereas the current SFR implementations achieve only 3.07X geomean speedup which saturates on 4 or more GPUs.	-
dc.publisher	IEEE/ACM	-
dc.relation.isPartOf	International Symposium on Microarchitecture	-
dc.relation.isPartOf	Proceedings of the 50th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO)	-
dc.title	GPUpd: A Fast and Scalable Multi-GPU Architecture Using Cooperative Projection and Distribution	-
dc.type	Conference	-
dc.type.rims	CONF	-
dc.identifier.bibliographicCitation	International Symposium on Microarchitecture	-
dc.citation.conferenceDate	2017-10-14	-
dc.citation.conferencePlace	US	-
dc.citation.title	International Symposium on Microarchitecture	-
dc.contributor.affiliatedAuthor	JO, JAEEON	-
dc.contributor.affiliatedAuthor	JANG, HANHWI	-
dc.contributor.affiliatedAuthor	RHU, MINSOO	-
dc.contributor.affiliatedAuthor	KIM, HANJUN	-
dc.identifier.scopusid	2-s2.0-85034065832	-
dc.description.journalClass	1	-


Communities & Collection

Department of Computer Science & Engineering (컴퓨터공학과)

Department of Creative IT Engineering (창의IT융합공학과)

Related Researcher

KIM, HANJUN (김한준): Dept. Convergence IT Engineering
