Open Access System for Information Sharing

Department of Creative IT Engineering (창의IT융합공학과) 3. Theses_Ph.D.



A Study on Hardware Accelerator for Graph Convolutional Neural Network

Title: A Study on Hardware Accelerator for Graph Convolutional Neural Network

Authors: 이경준

Date Issued: 2023

Publisher: Pohang University of Science and Technology (포항공과대학교)

Abstract: This dissertation presents two studies on hardware accelerators for graph convolutional neural networks (GCNs): i) A 384G Output NonZeros/J Graph Convolutional Neural Network Accelerator and ii) Joint Optimization of Cache Management and Graph Reordering for GCN Acceleration. First, the dissertation presents the first IC implementation of a GCN accelerator. It introduces a sparsity-aware dataflow optimized for sub-block-wise processing of the three different matrices in a GCN, aiming to raise the utilization of computational resources while reducing redundant off-chip memory access. The accelerator, implemented in 28nm CMOS, produces 384G non-zero outputs/J for the extremely sparse matrix multiplications of the GCN. On the benchmark graph datasets Cora, Citeseer, and Pubmed, it achieves 58k-to-143k, 38k-to-92k, and 5k-to-13k Graph/J, respectively, a 4-to-11× and 8-to-25× improvement in energy efficiency over 8b FPGA and 32b FPGA implementations, respectively.
Second, the dissertation introduces a software/hardware co-optimized platform for processing general GCNs. Although GCNs have proven effective in real-world applications such as social networks and recommendation systems, accelerating them differs from accelerating deep neural networks because of the large number of nodes and the sparsity of connections. Existing techniques for reducing re-access, primarily degree-based sorting and graph partitioning, prove inadequate because of the significant variability in node connections. This research therefore proposes a hybrid graph-reordering scheme that combines sorting and clustering into an adaptively optimized two-way partitioning strategy. The two-way partitioning scheme facilitates efficient allocation of on-chip cache memory space, reducing off-chip memory access by 4-to-12%. The implemented accelerator, again using a 28nm process, demonstrates full functionality and improves energy efficiency by 2.2-to-3.7× compared with previous GCN accelerators.

URI: http://postech.dcollection.net/common/orgView/200000690323
https://oasis.postech.ac.kr/handle/2014.oak/118461

Article Type: Thesis

