Open Access System for Information Sharing


Thesis
Full metadata record
Files in This Item:
There are no files associated with this item.
DC Field | Value | Language
dc.contributor.author | 이예하 | en_US
dc.date.accessioned | 2014-12-01T11:48:00Z | -
dc.date.available | 2014-12-01T11:48:00Z | -
dc.date.issued | 2012 | en_US
dc.identifier.other | OAK-2014-00989 | en_US
dc.identifier.uri | http://postech.dcollection.net/jsp/common/DcLoOrgPer.jsp?sItemId=000001218603 | en_US
dc.identifier.uri | https://oasis.postech.ac.kr/handle/2014.oak/1491 | -
dc.description | Doctor | en_US
dc.description.abstract | Since its advent, the Internet has become one of the most important channels for communicating information among users, including individuals and news organizations. Many news organizations have started to distribute news stories on the Internet, and a large number of news stories are published by various news channels on a daily basis. This makes it difficult to keep track of important news stories. As a result, users' need to identify top news stories has increased, and news story search has played an increasingly important role in users' Internet activity. The objective of this dissertation is to identify important news stories for a given date using the blogosphere. Blogs consist of blog posts, which are user-generated content, and reflect the diverse opinions of users about news stories. Therefore, a news story that attracts much attention in the blogosphere is likely to be important. In this dissertation, we define the popularity of a news story as the amount of attention it receives from users within the blogosphere. We first evaluate the popularity of a news story in terms of the content similarity between the story and the blog posts published on a given date. For this purpose, we propose several approaches to estimating language models for the story and for the blog posts. We also generate a temporal profile of a news story by analyzing the distribution of the number of blog posts relevant to the story over several days, and evaluate the popularity of the story based on this temporal profile. Experimental results on the TREC 2009 and 2010 Blog Track show that our approach is effective in identifying important news stories. In particular, the proposed approach achieved state-of-the-art performance. Furthermore, we propose a simple but effective approach to dealing with noisy information in blog posts. In general, blog posts include several types of noisy information, including blog templates, advertisements, and navigation panels. This noisy information is not user-generated content, and it degrades our system for identifying important news stories. The motivation for our approach is that most noisy content does not change across several consecutive posts within the same blog. To eliminate the noisy information, we compare two consecutive posts belonging to the same blog, treat the parts common to both posts as noisy content, and remove them. Experimental results on the TREC Blog Track show that a retrieval system using the proposed method improves MAP (Mean Average Precision) by about 10% over the baseline system. | en_US
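dc.language | eng | en_US
dc.publisher | 포항공과대학교 (Pohang University of Science and Technology) | en_US
dc.rights | BY_NC_ND | en_US
dc.rights.uri | http://creativecommons.org/licenses/by-nc-nd/2.0/kr | en_US
dc.title | News Story Ranking Using Blogosphere | en_US
dc.type | Thesis | en_US
dc.contributor.college | 일반대학원 컴퓨터공학과 (Graduate School, Department of Computer Science and Engineering) | en_US
dc.date.degree | 2012- 2 | en_US
dc.type.docType | Thesis | -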
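
The noise-removal idea described in the abstract (compare two consecutive posts from the same blog and drop what they share) can be illustrated with a minimal sketch. This is not the thesis's implementation: the abstract does not say at what granularity posts are compared, so the line-level comparison here is an assumption, and the function name `strip_shared_lines` and the sample posts are hypothetical.

```python
def strip_shared_lines(post: str, previous_post: str) -> str:
    """Return `post` without the lines that also appear in `previous_post`.

    Assumption: template text, ads, and navigation panels repeat verbatim,
    line by line, across consecutive posts of the same blog.
    """
    # Non-empty lines of the earlier post, normalized by stripping whitespace.
    shared = {line.strip() for line in previous_post.splitlines() if line.strip()}
    # Keep only non-empty lines of the current post that are not shared.
    kept = [
        line
        for line in post.splitlines()
        if line.strip() and line.strip() not in shared
    ]
    return "\n".join(kept)


# Hypothetical consecutive posts from one blog.
previous_post = "My Blog\nHome | About\nFirst post text\nCopyright 2009"
current_post = "My Blog\nHome | About\nElection results are in\nCopyright 2009"

print(strip_shared_lines(current_post, previous_post))
```

Exact line matching is the simplest choice for a sketch; text that repeats with small variations (dates, counters) would survive it, so a real system would likely need a looser comparison.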


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.
