Open Access System for Information Sharing


Thesis

Sparsity-Aware Transformer Accelerator Using Mixed-Length Vector Pruning

Title
Sparsity-Aware Transformer Accelerator Using Mixed-Length Vector Pruning (혼합 길이 벡터 프루닝을 이용한 희소성 인식 Transformer 가속기)
Authors
류은지
Date Issued
2024
Publisher
Pohang University of Science and Technology (POSTECH)
Abstract
We present TF-MVP, an energy-efficient, sparsity-aware transformer accelerator built on novel algorithm-hardware co-optimization techniques. First, we develop the direction strength, a metric derived from a conventional fine-grained pruning map that, for the first time, analyzes pruning patterns quantitatively by indicating the major pruning direction and size of each layer. We then propose mixed-length vector pruning (MVP) to generate a hardware-friendly pruned transformer model, which is fully supported by the TF-MVP accelerator through its reconfigurable processing element (PE) structure. Implemented in a 28 nm CMOS technology with 4096 multiply-accumulate operators, TF-MVP achieves 377 GOPs/W when accelerating the GPT-2 small model, which is 2.09 times better than the state-of-the-art sparsity-aware transformer accelerator.
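For readers unfamiliar with vector-wise pruning, the following is a minimal sketch of the general idea only. The abstract does not specify how MVP scores or selects vectors, so everything here is an assumption made for illustration: the function name `vector_prune`, its parameters, the squared-L2 saliency score, and the toy weight matrix. It groups a layer's weights into one-dimensional vectors of a chosen length along a chosen direction and zeroes the weakest groups. Calling it with a different length and direction per layer loosely mirrors the "mixed-length" idea.

```python
def vector_prune(weights, length, direction, sparsity):
    """Zero out the weakest 1-D weight vectors of a 2-D matrix (illustrative only).

    weights   : list of equal-length rows (list of lists of floats)
    length    : number of weights per vector (hypothetical parameter)
    direction : "row" groups along a row, "col" groups down a column
    sparsity  : fraction of vectors to prune, in [0, 1]
    """
    rows, cols = len(weights), len(weights[0])
    if direction == "row":
        if cols % length:
            raise ValueError("number of columns must be divisible by length")
        groups = [[(r, c + k) for k in range(length)]
                  for r in range(rows) for c in range(0, cols, length)]
    elif direction == "col":
        if rows % length:
            raise ValueError("number of rows must be divisible by length")
        groups = [[(r + k, c) for k in range(length)]
                  for c in range(cols) for r in range(0, rows, length)]
    else:
        raise ValueError("direction must be 'row' or 'col'")

    # Assumed saliency measure: squared L2 norm of each vector.
    ranked = sorted(groups, key=lambda g: sum(weights[r][c] ** 2 for r, c in g))
    n_prune = int(len(groups) * sparsity)

    pruned = [row[:] for row in weights]  # leave the input untouched
    for group in ranked[:n_prune]:
        for r, c in group:
            pruned[r][c] = 0.0
    return pruned


# Toy 4x4 "layer" (made-up values, not from the thesis).
layer = [
    [0.9, 0.8, 0.1, 0.0],
    [0.7, 0.6, 0.0, 0.1],
    [0.1, 0.0, 0.5, 0.4],
    [0.0, 0.1, 0.3, 0.2],
]

# Short horizontal vectors for one layer, long vertical vectors for another.
row_pruned = vector_prune(layer, length=2, direction="row", sparsity=0.5)
col_pruned = vector_prune(layer, length=4, direction="col", sparsity=0.5)
```

Because whole vectors are removed rather than individual weights, the surviving nonzeros fall into regular runs, which is the property that generally makes this kind of sparsity easier for hardware to exploit than fine-grained pruning.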
URI
http://postech.dcollection.net/common/orgView/200000732423
https://oasis.postech.ac.kr/handle/2014.oak/123313
Article Type
Thesis
Files in This Item:
There are no files associated with this item.


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.
