Open Access System for Information Sharing

Department of Mathematics (수학과) 1. Journal Papers

Article

Cited 1 time in Web of Science

Cited 0 times in Scopus

Full metadata record

Files in This Item: There are no files associated with this item.

DC Field	Value	Language
dc.contributor.author	Moon, Seunghyun	-
dc.contributor.author	Mun, Han-Gyeol	-
dc.contributor.author	Son, Hyunwoo	-
dc.contributor.author	Sim, Jae-Yoon	-
dc.date.accessioned	2024-02-19T06:20:07Z	-
dc.date.available	2024-02-19T06:20:07Z	-
dc.date.created	2024-02-19	-
dc.date.issued	2024-01	-
dc.identifier.issn	0018-9200	-
dc.identifier.uri	https://oasis.postech.ac.kr/handle/2014.oak/120276	-
dc.description.abstract	Various pruning and quantization heuristics have been proposed to compress recent deep-learning models. However, the rapid development of new optimization techniques makes it difficult for domain-specific accelerators to efficiently process various models showing irregularly stored parameters or nonlinear quantization. This article presents a scalable-precision deep-learning accelerator that supports multiply-and-accumulate operations (MACs) with two arbitrarily quantized data sequences. The proposed accelerator includes three main features. To minimize logic overhead when processing arbitrarily quantized 8-bit precision data, a lookup table (LUT)-based runtime reconfiguration is proposed. The use of bit-serial execution without unnecessary computations enables the multiplication of data with non-equal precision while minimizing logic and latency waste. Furthermore, two distinct data formats, raw and run-length compressed, are supported by a zero-eliminator (ZE) and runtime-density detector (RDD) that are compatible with both formats, delivering enhanced storage and performance. For a precision range of 1-8 bit and fixed sparsity of 30%, the accelerator implemented in 28 nm low-power (LP) CMOS shows a peak performance of 0.87-5.55 TOPS and a power efficiency of 15.1-95.9 TOPS/W. The accelerator supports processing with arbitrary quantization (AQ) while achieving state-of-the-art (SOTA) power efficiency.	-
dc.language	English	-
dc.publisher	Institute of Electrical and Electronics Engineers	-
dc.relation.isPartOf	IEEE Journal of Solid-State Circuits	-
dc.title	Multipurpose Deep-Learning Accelerator for Arbitrary Quantization With Reduction of Storage, Logic, and Latency Waste	-
dc.type	Article	-
dc.identifier.doi	10.1109/jssc.2023.3312615	-
dc.type.rims	ART	-
dc.identifier.bibliographicCitation	IEEE Journal of Solid-State Circuits, v.59, no.1, pp.143 - 156	-
dc.identifier.wosid	001088286600001	-
dc.citation.endPage	156	-
dc.citation.number	1	-
dc.citation.startPage	143	-
dc.citation.title	IEEE Journal of Solid-State Circuits	-
dc.citation.volume	59	-
dc.contributor.affiliatedAuthor	Sim, Jae-Yoon	-
dc.identifier.scopusid	2-s2.0-85174825344	-
dc.description.journalClass	1	-
dc.description.journalClass	1	-
dc.description.isOpenAccess	N	-
dc.type.docType	Article	-
dc.subject.keywordAuthor	Arbitrary quantization (AQ)	-
dc.subject.keywordAuthor	bit-serial processing	-
dc.subject.keywordAuthor	deep neural network (DNN) accelerator	-
dc.subject.keywordAuthor	lookup table (LUT)	-
dc.subject.keywordAuthor	precision scalability	-
dc.subject.keywordAuthor	run-length compression (RLC)	-
dc.relation.journalWebOfScienceCategory	Engineering, Electrical & Electronic	-
dc.description.journalRegisteredClass	scie	-
dc.description.journalRegisteredClass	scopus	-
dc.relation.journalResearchArea	Engineering	-
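
The abstract describes multiply-and-accumulate operations over two arbitrarily quantized sequences, with lookup-table decoding, zero elimination, and run-length compressed storage. The following is a minimal software sketch of those ideas only, not the accelerator's hardware design. The function names (`rlc_encode`, `rlc_decode`, `lut_mac`), the `(zero_run, code)` pair format, and the example lookup tables are assumptions made for illustration and do not come from the paper.

```python
from typing import List, Sequence, Tuple


def rlc_encode(codes: Sequence[int], zero_code: int = 0) -> Tuple[List[Tuple[int, int]], int]:
    """Run-length compress a code sequence (hypothetical format).

    Returns a list of (zero_run, nonzero_code) pairs plus the number of
    trailing zero codes that follow the last nonzero code.
    """
    pairs: List[Tuple[int, int]] = []
    run = 0
    for c in codes:
        if c == zero_code:
            run += 1
        else:
            pairs.append((run, c))
            run = 0
    return pairs, run


def rlc_decode(pairs: Sequence[Tuple[int, int]], trailing: int, zero_code: int = 0) -> List[int]:
    """Expand the (zero_run, nonzero_code) pairs back into the raw code sequence."""
    out: List[int] = []
    for run, c in pairs:
        out.extend([zero_code] * run)
        out.append(c)
    out.extend([zero_code] * trailing)
    return out


def lut_mac(w_codes: Sequence[int], a_codes: Sequence[int],
            w_lut: Sequence[int], a_lut: Sequence[int]) -> int:
    """Multiply-and-accumulate over two arbitrarily quantized sequences.

    Each code is an index into its own lookup table, so the quantization
    levels need not be uniformly spaced. Pairs with a zero operand are
    skipped, mimicking zero elimination.
    """
    acc = 0
    for w, a in zip(w_codes, a_codes):
        wv, av = w_lut[w], a_lut[a]
        if wv == 0 or av == 0:
            continue  # skip products that cannot contribute
        acc += wv * av
    return acc


# Illustrative 2-bit codebooks with nonlinear levels (assumed values).
w_lut = [0, 1, 2, 4]
a_lut = [0, 1, 3, 7]

w_codes = [0, 3, 1, 0, 2]
a_codes = [2, 1, 0, 3, 3]

pairs, trailing = rlc_encode(w_codes)
restored = rlc_decode(pairs, trailing)
print(lut_mac(restored, a_codes, w_lut, a_lut))  # 4*1 + 2*7 = 18
```

In this sketch the weights are stored in run-length form and expanded before the MAC; whether the accelerator decodes or operates directly on the compressed stream is not stated in this record.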

Related Researcher

SIM, JAE YOON (심재윤), Dept. of Electrical Engineering
