Open Access System for Information Sharing

Department of Electrical Engineering (전자전기공학과) 4. Theses_Master

Thesis

Pruning-Based Deep Neural Network Compression Techniques for Edge Devices

Title: Pruning-Based Deep Neural Network Compression Techniques for Edge Devices

Authors: 박종민

Date Issued: 2021

Publisher: Pohang University of Science and Technology (포항공과대학교)

Abstract: For resource-limited intelligent mobile systems, computation cost and memory usage are the two most significant factors limiting the use of deep neural networks (DNNs). This thesis studies novel pruning techniques that reduce both. The proposed techniques take into account the DNN structures that are representative of two fields: vision and speech. In the vision domain, filter-wise pruning and multi-level indexing are proposed to compress multiple convolutional neural networks (CNNs) to extreme levels. Multi-level indexing encodes the model with minimal index data by exploiting both filter-level and element-level sparsity. Furthermore, the removed filter groups are retrained for multiple types of inputs. After retraining, the retrained filters are merged into the existing pruned model in a stacked form. At inference time, the stacked model is dynamically decoded with the optimal filters for the input type so as to maximize accuracy. In the speech domain, singular value decomposition (SVD) based model compression is applied to obtain a lightweight sequence-to-sequence (seq2seq) model. In LSTM-based seq2seq structures, large matrix multiplications are the model's primary computation. The proposed SVD-based compression removes less necessary computations through low-rank approximation. The seq2seq model is divided into N subsets with well-optimized pruning thresholds to reduce the computation cost and the model size while maintaining accuracy. As a result, the two proposed pruning methods enable aggressive optimization by reflecting the characteristics of the models used in each domain, leading to practical DNN architectures for edge devices.
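
The low-rank approximation mentioned in the abstract can be illustrated with a generic truncated SVD. The following is a minimal sketch, not the thesis's implementation: the function name `low_rank_factors`, the matrix sizes, and the chosen rank are all assumptions made for illustration, and the division into N subsets with per-subset thresholds is not modeled. It shows how one m x n weight matrix can be replaced by two thin factors, so that a matrix-vector product costs roughly r(m + n) multiplications instead of mn.

```python
import numpy as np

def low_rank_factors(weight, rank):
    """Return (a, b) such that a @ b is the best rank-`rank` approximation of `weight`."""
    u, s, vt = np.linalg.svd(weight, full_matrices=False)
    a = u[:, :rank] * s[:rank]  # shape (m, rank); singular values folded into the columns
    b = vt[:rank, :]            # shape (rank, n)
    return a, b

rng = np.random.default_rng(0)

# Hypothetical 64 x 128 weight matrix built to have rank 8, purely for illustration.
weight = rng.standard_normal((64, 8)) @ rng.standard_normal((8, 128))
a, b = low_rank_factors(weight, rank=8)

x = rng.standard_normal(128)
dense_out = weight @ x       # one large matrix-vector product
low_rank_out = a @ (b @ x)   # two thin products in its place

dense_params = weight.size         # 64 * 128
low_rank_params = a.size + b.size  # 64 * 8 + 8 * 128
```

Because the example matrix has rank 8 by construction, the two outputs agree up to floating-point error. For a trained weight matrix the truncation is lossy, and the rank trades accuracy against the parameter and computation savings.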

URI: http://postech.dcollection.net/common/orgView/200000507114
https://oasis.postech.ac.kr/handle/2014.oak/114154

Article Type: Thesis

Files in This Item: There are no files associated with this item.
