Open Access System for Information Sharing


Thesis

Clustering Based on Sentence Structure and Words in Statistical Machine Translation

Title
Clustering Based on Sentence Structure and Words in Statistical Machine Translation (통계기계번역에서 문장구조와 단어에 기반한 클러스터링)
Authors
김한경
Date Issued
2010
Publisher
포항공과대학교 (Pohang University of Science and Technology, POSTECH)
Abstract
Clustering a training corpus by sentence type or by document genre is a technique for improving the translation quality of statistical machine translation (SMT) through domain-specific translation. However, no previous research has used sentence type and document genre information simultaneously. In this paper, we propose an integrated clustering method that classifies sentence type by syntactic structure similarity and document genre by word similarity. Both similarities are computed with the cosine measure and then interpolated. Using the interpolated similarity, we cluster the training corpus with the K-means algorithm. We then interpolate the domain-specific models built from the clusters with general models to improve the translation quality of the SMT system. On a Japanese-English patent translation corpus, this approach achieved a 14% relative improvement in translation quality over the previous approach.
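The clustering step described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: the abstract does not say how syntactic structure or words are represented, so the sketch assumes bag-of-POS-tag counts for structure and bag-of-words counts for word similarity. The interpolation weight `lam`, the helper names (`unit`, `embed`, `dot`, `kmeans`), the farthest-first initialization, and the toy sentences are all hypothetical. Each sentence is mapped to a joint unit vector whose dot product with another sentence equals `lam * cos(structure) + (1 - lam) * cos(words)`, so a cosine-based (spherical) K-means on these vectors clusters by the interpolated similarity.

```python
import math
from collections import Counter


def unit(counts):
    """L2-normalize a sparse vector given as a mapping feature -> weight."""
    norm = math.sqrt(sum(v * v for v in counts.values()))
    return {k: v / norm for k, v in counts.items()} if norm else {}


def embed(pos_tags, words, lam=0.5):
    """Joint vector: dot(embed(a), embed(b)) equals
    lam * cos(POS counts) + (1 - lam) * cos(word counts).
    Both feature choices and lam are assumptions of this sketch."""
    vec = {("pos", k): math.sqrt(lam) * v
           for k, v in unit(Counter(pos_tags)).items()}
    vec.update({("word", k): math.sqrt(1 - lam) * v
                for k, v in unit(Counter(words)).items()})
    return vec


def dot(a, b):
    """Sparse dot product."""
    return sum(v * b.get(k, 0.0) for k, v in a.items())


def kmeans(vectors, k, iters=20):
    """Spherical K-means: assign each vector to its most similar centroid."""
    # Deterministic farthest-first initialization (chosen for reproducibility).
    centroids = [vectors[0]]
    while len(centroids) < k:
        centroids.append(
            min(vectors, key=lambda v: max(dot(v, c) for c in centroids)))
    labels = None
    for _ in range(iters):
        new_labels = [max(range(k), key=lambda c: dot(v, centroids[c]))
                      for v in vectors]
        if new_labels == labels:  # converged
            break
        labels = new_labels
        for c in range(k):
            total = Counter()
            for v, label in zip(vectors, labels):
                if label == c:
                    total.update(v)  # adds weights feature by feature
            if total:
                centroids[c] = unit(total)
    return labels


# Toy corpus: (POS tags, words) per sentence; purely illustrative data.
sentences = [
    (["DT", "NN", "VBZ", "DT", "NN"], ["the", "device", "comprises", "a", "sensor"]),
    (["DT", "NN", "VBZ", "DT", "NN"], ["the", "sensor", "detects", "a", "signal"]),
    (["WRB", "VBZ", "DT", "NN"], ["where", "is", "the", "station"]),
    (["WRB", "VBZ", "DT", "NN"], ["where", "is", "the", "hotel"]),
]
vectors = [embed(pos, words) for pos, words in sentences]
labels = kmeans(vectors, k=2)
print(labels)
```

In this toy run the two declarative, patent-style sentences fall into one cluster and the two questions into the other. Each resulting cluster would then supply the training data for one domain-specific model, which the abstract says is interpolated with a general model.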
URI
http://postech.dcollection.net/jsp/common/DcLoOrgPer.jsp?sItemId=000000547198
https://oasis.postech.ac.kr/handle/2014.oak/607
Article Type
Thesis
Files in This Item:
There are no files associated with this item.
