DC Field | Value | Language |
---|---|---|
dc.contributor.author | BUM, JUN KIM | - |
dc.contributor.author | HYEYEON, CHOI | - |
dc.contributor.author | JANG, HYEONAH | - |
dc.contributor.author | KIM, SANG WOO | - |
dc.date.accessioned | 2023-02-22T01:20:44Z | - |
dc.date.available | 2023-02-22T01:20:44Z | - |
dc.date.created | 2023-02-21 | - |
dc.date.issued | 2022-10 | - |
dc.identifier.issn | 0924-669X | - |
dc.identifier.uri | https://oasis.postech.ac.kr/handle/2014.oak/115387 | - |
dc.description.abstract | Deep neural network optimization is challenging. Large gradients in their chaotic loss landscape lead to unstable behavior during gradient descent. In this paper, we investigate a stable gradient descent algorithm. We revisit the mathematical derivations of the Momentum optimizer and discuss a potential problem on steep walls. Inspired by the physical motion of a mass, we propose Smooth Momentum, a new optimizer that improves the behavior on steep walls. We mathematically analyze the characteristics of the proposed optimizer and prove that Smooth Momentum exhibits improved Lipschitz properties and convergence, which allows stable and faster convergence in gradient descent. We also demonstrate how Smooth Gradient, a component of the proposed optimizer, can be plugged into other optimizers, like Adam. The proposed method offers a regularization effect comparable to batch normalization or weight decay. Experiments demonstrate that our proposed optimizer significantly improves the optimization of transformers, convolutional neural networks, and non-convex functions for various tasks and datasets. | - |
dc.language | English | - |
dc.publisher | Kluwer Academic Publishers | - |
dc.relation.isPartOf | Applied Intelligence | - |
dc.title | Smooth momentum: improving lipschitzness in gradient descent | - |
dc.type | Article | - |
dc.identifier.doi | 10.1007/s10489-022-04207-7 | - |
dc.type.rims | ART | - |
dc.identifier.bibliographicCitation | Applied Intelligence | - |
dc.identifier.wosid | 000871171700003 | - |
dc.citation.title | Applied Intelligence | - |
dc.contributor.affiliatedAuthor | BUM, JUN KIM | - |
dc.contributor.affiliatedAuthor | HYEYEON, CHOI | - |
dc.contributor.affiliatedAuthor | JANG, HYEONAH | - |
dc.contributor.affiliatedAuthor | KIM, SANG WOO | - |
dc.identifier.scopusid | 2-s2.0-85140446045 | - |
dc.description.journalClass | 1 | - |
dc.description.isOpenAccess | N | - |
dc.type.docType | Article; Early Access | - |
dc.subject.keywordAuthor | Gradient descent | - |
dc.subject.keywordAuthor | Momentum | - |
dc.subject.keywordAuthor | Lipschitz | - |
dc.subject.keywordAuthor | Non-convex optimization | - |
dc.subject.keywordAuthor | Deep learning | - |
dc.relation.journalWebOfScienceCategory | Computer Science, Artificial Intelligence | - |
dc.description.journalRegisteredClass | scie | - |
dc.description.journalRegisteredClass | scopus | - |
dc.relation.journalResearchArea | Computer Science | - |
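The abstract above revisits the classical Momentum optimizer before introducing Smooth Momentum. For background only, the sketch below shows the standard heavy-ball Momentum update on a toy quadratic; the record does not specify the Smooth Momentum update rule itself, so this is the conventional baseline the paper builds on, not the proposed method.

```python
def momentum_step(theta, velocity, grad, lr=0.1, beta=0.9):
    """One classical Momentum (heavy-ball) step:
    v <- beta * v + g;  theta <- theta - lr * v."""
    velocity = beta * velocity + grad
    theta = theta - lr * velocity
    return theta, velocity

# Minimize f(x) = 0.5 * x^2, whose gradient is simply x, starting at x = 5.
x, v = 5.0, 0.0
for _ in range(200):
    x, v = momentum_step(x, v, grad=x)
```

On a smooth convex function like this, the iterate spirals toward the minimum; on the steep walls discussed in the abstract, the accumulated velocity can instead overshoot, which is the instability Smooth Momentum is proposed to address.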