Open Access System for Information Sharing


Department of Computer Science & Engineering (컴퓨터공학과) 4. Theses_Master

Thesis


Full metadata record

Files in This Item: There are no files associated with this item.

DC Field	Value	Language
dc.contributor.author	전은영	-
dc.date.accessioned	2023-04-07T16:34:39Z	-
dc.date.available	2023-04-07T16:34:39Z	-
dc.date.issued	2022	-
dc.identifier.other	OAK-2015-09840	-
dc.identifier.uri	http://postech.dcollection.net/common/orgView/200000602421	ko_KR
dc.identifier.uri	https://oasis.postech.ac.kr/handle/2014.oak/117294	-
dc.description	Master	-
dc.description.abstract	Text-to-image synthesis aims to generate a photo-realistic image from a given natural language description. Unlike a label condition, a text description includes many constraints, which makes the synthesis task challenging. Although Generative Adversarial Networks (GANs) have made significant progress in generating visually realistic images, current text-to-image synthesis models ignore some of the constraints in the text. In this paper, we address this text-image consistency problem by adopting an image captioning task. Image captioning is the inverse problem of text-to-image synthesis, so it helps enforce cycle consistency. To this end, we propose a Recaptioning Discriminator (RecapD), which not only computes the adversarial logits but also redescribes the input image. RecapD contains an internal captioning model that is trained together with the discriminator, which makes it more efficient than adopting an extra pre-trained captioning model. Furthermore, by using the internal captioning model, RecapD encourages the generator to produce realistic, text-aligned images that can be redescribed well. Experiments on the MS-COCO dataset show that the proposed method outperforms recent text-to-image synthesis models, and an ablation study demonstrates the effectiveness of RecapD. We use FID to measure image quality and R-precision to evaluate text-image consistency. RecapD significantly improves both image quality and text-image consistency.	-
dc.language	eng	-
dc.publisher	포항공과대학교 (Pohang University of Science and Technology)	-
dc.title	Improving Text-to-Image Generation by Discriminator with Recaption Ability	-
dc.title.alternative	이미지 캡션 재생성 판별자를 이용한 텍스트 대 이미지 생성 모델 개선	-
dc.type	Thesis	-
dc.contributor.college	컴퓨터공학과 (Department of Computer Science & Engineering)	-
dc.date.degree	2022-2	-
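
The abstract names R-precision as the measure of text-image consistency but does not describe how it is computed. The following is a minimal sketch, assuming the protocol commonly used in text-to-image work: each generated image is scored against its own caption and a set of mismatched captions by cosine similarity in a shared embedding space, and the image counts as a hit when its own caption ranks first. The function names, the toy embeddings, and the number of distractors are illustrative assumptions and are not taken from the thesis.

```python
import math
from typing import Sequence

Vector = Sequence[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def r_precision(
    image_embs: Sequence[Vector],
    true_text_embs: Sequence[Vector],
    distractor_text_embs: Sequence[Sequence[Vector]],
) -> float:
    """Fraction of images whose own caption outscores every mismatched caption.

    Hypothetical helper: the embeddings are assumed to come from some
    pre-trained image and text encoders that share one embedding space.
    """
    if not image_embs:
        raise ValueError("at least one image embedding is required")
    hits = 0
    for img, true_txt, distractors in zip(
        image_embs, true_text_embs, distractor_text_embs
    ):
        true_score = cosine_similarity(img, true_txt)
        best_other = max(
            (cosine_similarity(img, d) for d in distractors),
            default=-math.inf,
        )
        if true_score > best_other:
            hits += 1
    return hits / len(image_embs)


# Toy 2-D embeddings: the first image matches its caption, the second does not.
images = [[1.0, 0.0], [0.0, 1.0]]
true_captions = [[0.9, 0.1], [1.0, 0.0]]
distractors = [[[0.0, 1.0]], [[0.1, 0.9]]]
score = r_precision(images, true_captions, distractors)
print(score)
```

In practice the distractor set is usually much larger than in this toy example (99 mismatched captions per image is a common choice), and the embeddings come from encoders trained for image-text matching rather than hand-written vectors.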


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.


library@postech.ac.kr Tel: 054-279-2548

Copyright © 2017 Pohang University of Science and Technology. All rights reserved.
