Open Access System for Information Sharing

Department of Computer Science & Engineering (컴퓨터공학과) 3. Theses_Ph.D.

Thesis

Conversational Agents Using External Resources (외부 자원을 활용한 대화형 에이젠트)

Title: Conversational Agents Using External Resources (외부 자원을 활용한 대화형 에이젠트)

Authors: 이규송

Date Issued: 2017

Publisher: Pohang University of Science and Technology (포항공과대학교)

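Abstract: Spoken dialog systems are a technology that enables humans and machines to communicate in natural language, and they have been studied for a variety of purposes such as restaurant search, weather lookup, and bus schedule lookup. The hardest part of building a spoken dialog system, however, is that curating the data needed for a stable system is very costly. Data scarcity is one of the biggest problems in natural language processing and in machine learning generally. Dialog systems are no exception: it takes experts considerable time and effort to build data and write scenarios by hand. As a result, most systems developed in academia are toy systems that can respond only within the proposed domain. Dialog systems developed by companies that can afford the cost of data collection have therefore been the most successful; representative examples include Apple's Siri, Microsoft's Cortana, Samsung's S-Voice, Google Now, and Amazon's Alexa. Because most commercial systems rely on rule-based methods, however, they fall well short of human-level conversation. This doctoral thesis proposes methods for building dialog systems from heterogeneous external resources, improving performance by making good use of data with different characteristics. Three kinds of resources are used: first, labeled and unlabeled data; second, a knowledge base and text data; and third, external agents developed by other research teams together with an internally developed agent. We propose a method that uses labeled and unlabeled data to automatically augment the training data and thereby improve text classification, an essential component of dialog system development. We also use a knowledge base and Wikipedia text, together with entity and relation similarity, to generate system responses that match the user's current interests. Finally, we propose a method for building a single conversational agent from multiple external systems and an internal system, which makes it possible to collect a large amount of multi-domain data.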
Conversational agents, also known as dialog systems or interactive systems, have been studied extensively as a way to communicate with machines in natural language, for example to ask about the weather, restaurants, or bus information. However, one of the major challenges in developing a conversational agent is collecting enough human–computer interaction data to train a robust system. Therefore, only a few systems have previously been developed in academia, e.g., single-domain or slot-filling dialog systems with a simple goal. After automatic speech recognition improved significantly, various commercial personal assistants were launched, such as Apple's Siri, Microsoft's Cortana, Samsung's S-Voice, Google Now, and Amazon's Alexa. Although these companies have the resources to collect large amounts of data from existing services and devices, their conversational agents are still far from human-level intelligence. To handle various types of user input, conversational agents should be constructed automatically from large amounts of heterogeneous external resources. This thesis uses several publicly available heterogeneous external resources: labeled and unlabeled data, a structured knowledge base and unstructured Wikipedia text, and external remote agents together with an internal master agent. We improved the accuracy of text classification for a conversational agent by augmenting the training data using labeled and unlabeled data. We pre-trained on the augmented data and fine-tuned on the manually labeled data; pre-training on the augmented data improved performance significantly. Moreover, we used a structured knowledge base and unstructured text to generate system actions and to trigger the user's interest, using knowledge-graph-based entity similarity and relation similarity techniques. Finally, we introduced a new approach for integrating various remote external agents with an internal master agent, providing a single entry point for collecting multi-domain dialog data from real users.

URI: http://postech.dcollection.net/jsp/common/DcLoOrgPer.jsp?sItemId=000002326519
https://oasis.postech.ac.kr/handle/2014.oak/93533

Article Type: Thesis

Files in This Item: There are no files associated with this item.
