Pruning-Based Deep Neural Network Compression Techniques for Edge Devices

Authors
박종민
Date Issued
2021
Publisher
Pohang University of Science and Technology (POSTECH)
Abstract
On resource-limited intelligent mobile systems, the two most significant factors limiting the deployment of deep neural networks (DNNs) are computation cost and memory usage. This thesis studies pruning techniques that reduce both. The proposed techniques account for the DNN structures that are representative of two fields: vision and speech. In the vision domain, filter-wise pruning and multi-level indexing are proposed to compress multiple convolutional neural networks (CNNs) to extreme levels. Multi-level indexing encodes the model with minimal index data by exploiting sparsity at both the filter level and the element level. Furthermore, the removed filter groups are retrained for multiple types of inputs. After retraining, the retrained filters are merged into the existing pruned model in a stacked form. At inference time, the stacked model is dynamically decoded with the filters best suited to the input type, so as to maximize accuracy. In the speech domain, singular value decomposition (SVD) based model compression is applied to obtain a lightweight sequence-to-sequence (seq2seq) model. The primary computation in LSTM-based seq2seq structures is large matrix multiplication. The proposed SVD-based compression removes less necessary computations by using a low-rank approximation. The seq2seq model is divided into N subsets, each with a well-optimized pruning threshold, to reduce the computation cost and the model size while maintaining model accuracy. As a result, the two proposed pruning methods enable aggressive optimization by reflecting the characteristics of the models used in each domain, and they lead to practical DNN architectures for edge devices.
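The low-rank approximation mentioned for the speech domain can be illustrated with a minimal NumPy sketch. This is a generic truncated SVD of a single weight matrix, not the thesis's actual method: the function name `low_rank_factors`, the matrix sizes, and the chosen rank are all assumptions made for illustration, and the partition into N subsets with per-subset thresholds is not reproduced here.

```python
import numpy as np


def low_rank_factors(W, rank):
    """Factor W (m x n) into A (m x rank) and B (rank x n) via truncated SVD.

    Hypothetical helper for illustration: keeps only the `rank` largest
    singular values, so A @ B is the best rank-`rank` approximation of W
    in the least-squares sense.
    """
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    A = U[:, :rank] * s[:rank]  # scale each kept column by its singular value
    B = Vt[:rank, :]
    return A, B


rng = np.random.default_rng(0)
m, n, true_rank = 64, 48, 8  # assumed sizes, chosen only for the demo

# Build a weight matrix that is exactly rank 8, so truncation at 8 is lossless.
W = rng.standard_normal((m, true_rank)) @ rng.standard_normal((true_rank, n))
A, B = low_rank_factors(W, rank=true_rank)

x = rng.standard_normal(n)
y_full = W @ x       # one large matrix-vector product: m * n multiplies
y_low = A @ (B @ x)  # two small products: rank * (m + n) multiplies

params_full = W.size
params_low = A.size + B.size
```

With these assumed sizes, the factored form stores 896 values instead of 3072. On a real trained weight matrix the truncation is generally lossy, so the rank trades accuracy against model size and computation.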
URI
http://postech.dcollection.net/common/orgView/200000507114
https://oasis.postech.ac.kr/handle/2014.oak/114154
Article Type
Thesis