Open Access System for Information Sharing


Thesis

A Study on Hardware Accelerator for Graph Convolutional Neural Network

Title
A Study on Hardware Accelerator for Graph Convolutional Neural Network
Authors
이경준
Date Issued
2023
Publisher
포항공과대학교 (Pohang University of Science and Technology, POSTECH)
Abstract
This dissertation presents a study on hardware accelerators for graph convolutional neural networks (GCNs) in two parts: i) A 384G Output NonZeros/J Graph Convolutional Neural Network Accelerator and ii) Joint Optimization of Cache Management and Graph Reordering for GCN Acceleration. First, the dissertation presents the first IC implementation of a GCN accelerator. It introduces a sparsity-aware dataflow optimized for sub-block-wise processing of the three different matrices in a GCN, which raises the utilization of computational resources while reducing redundant off-chip memory access. The accelerator, implemented in 28nm CMOS, produces 384G non-zero outputs/J for the extremely sparse matrix multiplications of the GCN. On the benchmark graph datasets Cora, Citeseer, and Pubmed, it achieves 58k-to-143k, 38k-to-92k, and 5k-to-13k Graph/J, respectively, a 4-to-11× and 8-to-25× improvement in energy efficiency over 8b and 32b FPGA implementations, respectively.
Second, the dissertation introduces a software/hardware co-optimized platform for processing general GCNs. Although GCNs have proven effective in real-world applications such as social networks and recommendation systems, accelerating them differs from accelerating deep neural networks because of the large number of nodes and the sparsity of connections. Existing techniques for reducing data re-access, primarily degree-based sorting and graph partitioning, prove inadequate because the number of connections varies significantly from node to node. This research therefore proposes a hybrid graph-reordering scheme that combines sorting and clustering into an adaptively optimized two-way partitioning strategy. The two-way partitioning scheme enables efficient allocation of on-chip cache memory space, reducing off-chip memory access by 4-to-12%. The implemented accelerator, again in a 28nm process, demonstrates full functionality and improves energy efficiency by 2.2-to-3.7× compared with previous GCN accelerators.
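For readers unfamiliar with the terms in the abstract, the following is a minimal Python sketch of two ideas it mentions: a GCN layer as a product of three matrices (adjacency, features, weights) and degree-based sorting as a baseline graph reordering. It is not the dissertation's dataflow or its hybrid two-way partitioning scheme. All function names, the toy 4-node graph, and the omission of normalization and activation are assumptions made for illustration.

```python
# Illustrative sketch only: dense lists stand in for sparse matrices,
# and every name below is hypothetical (not taken from the dissertation).

def matmul(a, b):
    """Plain matrix product of two lists-of-lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def count_nonzeros(m):
    """Number of non-zero entries, the unit behind a 'non-zero outputs' metric."""
    return sum(1 for row in m for v in row if v != 0)

def gcn_layer(adj, feats, weights):
    """One simplified GCN layer: A (X W), without normalization or activation."""
    return matmul(adj, matmul(feats, weights))

def degree_sort_permutation(adj):
    """Node ids ordered by descending degree; ties are broken by node id."""
    degrees = [count_nonzeros([row]) for row in adj]
    return sorted(range(len(adj)), key=lambda i: (-degrees[i], i))

def reorder(adj, perm):
    """Apply the same permutation to the rows and columns of the adjacency matrix."""
    return [[adj[i][j] for j in perm] for i in perm]

# Toy star-shaped graph with edges 0-2, 1-2, 2-3 (an assumed example).
A = [[0, 0, 1, 0],
     [0, 0, 1, 0],
     [1, 1, 0, 1],
     [0, 0, 1, 0]]
X = [[1, 0], [0, 0], [0, 2], [0, 0]]   # sparse node features
W = [[1, 0], [0, 1]]                   # identity weights keep the arithmetic easy

perm = degree_sort_permutation(A)      # the high-degree node 2 moves to the front
A_sorted = reorder(A, perm)
X_sorted = [X[i] for i in perm]

out = gcn_layer(A, X, W)
out_sorted = gcn_layer(A_sorted, X_sorted, W)
```

Reordering only relabels the nodes, so `out_sorted` contains the same rows as `out` in permuted order. What changes is where the non-zeros sit in the adjacency matrix, which is the property that a cache-aware accelerator can exploit.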
URI
http://postech.dcollection.net/common/orgView/200000690323
https://oasis.postech.ac.kr/handle/2014.oak/118461
Article Type
Thesis
Files in This Item:
There are no files associated with this item.

