Open Access System for Information Sharing

Department of Industrial & Management Engineering (산업경영공학과) 4. Theses_Master

Thesis

Title: Personalized Modelling Using Similarity Space Learning via Supervised Autoencoder

Authors: 조현재

Date Issued: 2022

Publisher: 포항공과대학교 (Pohang University of Science and Technology)

Abstract: Data with heterogeneous characteristics have recently become more common in many fields, which makes it harder for a predictive model to predict a target variable. Complex global models with more parameters have been developed to achieve high predictive performance, but their complexity makes the results difficult to interpret. Personalized modelling is being studied to overcome this. It trains a predictive model using only information from observations similar to a new point; in other words, the model is "specific to the point" or "personalized to the point". Personalized modelling thus aims for interpretable results together with high predictive performance. Existing methods, however, have two limitations. First, the similarity may be distorted because it is often calculated without regard to how important each predictor is to the target variable. Second, the density of neighbors may vary with the location of the new point, but existing methods do not account for density. This thesis proposes a new personalized modelling method that maps the input space into a latent space and calculates the similarity between observations using the latent variables. The method makes two contributions relative to existing methods. First, calculating the similarity from the latent vector reflects the importance of the predictors to the target variable. Second, setting a threshold on the similarity lets the number of sampled neighbors vary with the density around each new point. Experiments were conducted on data from various fields, comparing the proposed method with a global logistic regression model, K-Nearest Neighbors, and an existing personalized model. The results show that the proposed method yields interpretable results and outperforms the other predictive models.
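The threshold-based neighbor selection described in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the latent vectors are assumed to come from an already trained supervised autoencoder (not shown), the similarity function `exp(-distance)`, the threshold of 0.7, and all function names are hypothetical choices for illustration, and the fraction of positive labels among the neighbors stands in for the local predictive model.

```python
import math

def similarity(z_a, z_b):
    """Assumed similarity in (0, 1]: 1 for identical latent vectors,
    decaying with Euclidean distance. The thesis may use another measure."""
    return math.exp(-math.dist(z_a, z_b))

def select_neighbors(latent_train, z_new, threshold):
    """Indices of training points whose similarity to z_new meets the threshold.
    The neighbor count is not fixed; it depends on the local density."""
    return [i for i, z in enumerate(latent_train)
            if similarity(z, z_new) >= threshold]

def local_positive_rate(labels, neighbor_idx):
    """Stand-in for the local model: share of positive labels among neighbors."""
    return sum(labels[i] for i in neighbor_idx) / len(neighbor_idx)

# Hypothetical latent vectors: a dense cluster near (0, 0), a sparse area near (3, 3).
latent_train = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1),
                (3.0, 3.0), (3.6, 3.0)]
labels = [1, 1, 1, 0, 0, 0]
threshold = 0.7  # illustrative value, not taken from the thesis

dense_idx = select_neighbors(latent_train, (0.05, 0.05), threshold)
sparse_idx = select_neighbors(latent_train, (3.1, 3.0), threshold)

print(len(dense_idx), len(sparse_idx))         # neighbor counts differ by density
print(local_positive_rate(labels, dense_idx))  # local estimate for the dense point
```

Unlike a fixed-K nearest-neighbor rule, the same threshold returns four neighbors for the point in the dense region and one for the point in the sparse region, which is the density-dependent behavior the abstract describes.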

URI: http://postech.dcollection.net/common/orgView/200000598902
https://oasis.postech.ac.kr/handle/2014.oak/112334

Article Type: Thesis

Files in This Item: There are no files associated with this item.
