Open Access System for Information Sharing


Article
Cited 86 times in Web of Science; cited 115 times in Scopus
Full metadata record
DC Field | Value | Language
dc.contributor.author | Kwon, OW | -
dc.contributor.author | Lee, JH | -
dc.date.accessioned | 2016-03-31T12:56:02Z | -
dc.date.available | 2016-03-31T12:56:02Z | -
dc.date.created | 2010-01-11 | -
dc.date.issued | 2003-01 | -
dc.identifier.issn | 0306-4573 | -
dc.identifier.other | 2003-OAK-0000003143 | -
dc.identifier.uri | https://oasis.postech.ac.kr/handle/2014.oak/18730 | -
dc.description.abstract | Automatic categorization is a viable method to deal with the scaling problem on the World Wide Web. For Web site classification, this paper proposes using the Web pages linked from the home page, in contrast to previous research, which used home pages alone. To implement the proposed method, we derive a scheme for Web site classification based on the k-nearest neighbor (k-NN) approach. It consists of three phases: Web page selection (connectivity analysis), Web page classification, and Web site classification. Given a Web site, the Web page selection phase chooses several representative Web pages using connectivity analysis. The k-NN classifier then classifies each of the selected Web pages. Finally, the page-level classifications are extended to a classification of the entire Web site. To improve performance, we supplement the k-NN approach with a feature selection method and a term weighting scheme using markup tags, and also reform its document-document similarity measure. In our experiments on a Korean commercial Web directory, the proposed system, using both a home page and its linked pages, improved the micro-averaged breakeven point by 30.02% compared with an ordinary classification that uses the home page only. © 2002 Elsevier Science Ltd. All rights reserved. | -
dc.description.statementofresponsibility | X | -
dc.language | English | -
dc.publisher | PERGAMON-ELSEVIER SCIENCE LTD | -
dc.relation.isPartOf | INFORMATION PROCESSING & MANAGEMENT | -
dc.subject | text categorization | -
dc.subject | Web site classification | -
dc.subject | Web page classification | -
dc.subject | k-nearest neighbor approach | -
dc.subject | machine learning | -
dc.title | Text categorization based on k-nearest neighbor approach for Web site classification | -
dc.type | Article | -
dc.contributor.college | Department of Computer Engineering | -
dc.identifier.doi | 10.1016/S0306-4573(02)00022-5 | -
dc.author.google | Kwon, OW | -
dc.author.google | Lee, JH | -
dc.relation.volume | 39 | -
dc.relation.issue | 1 | -
dc.relation.startpage | 25 | -
dc.relation.lastpage | 44 | -
dc.contributor.id | 10083961 | -
dc.relation.journal | INFORMATION PROCESSING & MANAGEMENT | -
dc.relation.index | SCI-level, SCOPUS-indexed paper | -
dc.relation.sci | SCIE | -
dc.collections.name | Journal Papers | -
dc.type.rims | ART | -
dc.identifier.bibliographicCitation | INFORMATION PROCESSING & MANAGEMENT, v.39, no.1, pp.25 - 44 | -
dc.identifier.wosid | 000180495500002 | -
dc.date.tcdate | 2019-01-01 | -
dc.citation.endPage | 44 | -
dc.citation.number | 1 | -
dc.citation.startPage | 25 | -
dc.citation.title | INFORMATION PROCESSING & MANAGEMENT | -
dc.citation.volume | 39 | -
dc.contributor.affiliatedAuthor | Lee, JH | -
dc.identifier.scopusid | 2-s2.0-0037213443 | -
dc.description.journalClass | 1 | -
dc.description.wostc | 69 | -
dc.type.docType | Article | -
dc.subject.keywordAuthor | text categorization | -
dc.subject.keywordAuthor | Web site classification | -
dc.subject.keywordAuthor | Web page classification | -
dc.subject.keywordAuthor | k-nearest neighbor approach | -
dc.subject.keywordAuthor | machine learning | -
dc.relation.journalWebOfScienceCategory | Computer Science, Information Systems | -
dc.relation.journalWebOfScienceCategory | Information Science & Library Science | -
dc.description.journalRegisteredClass | scie | -
dc.description.journalRegisteredClass | scopus | -
dc.relation.journalResearchArea | Computer Science | -
dc.relation.journalResearchArea | Information Science & Library Science | -
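
The three phases named in the abstract (Web page selection, Web page classification, Web site classification) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not specify the connectivity analysis, the feature selection, the markup-tag term weighting, or the reformed similarity measure. As stand-ins, the sketch assumes plain term-frequency vectors, cosine similarity, selection of the first few pages linked from the home page, and summation of page-level scores. All function and parameter names (`select_pages`, `knn_scores`, `classify_site`, `limit`, `k`) are hypothetical.

```python
from collections import Counter
from math import sqrt


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(weight * b[term] for term, weight in a.items() if term in b)
    norm_a = sqrt(sum(w * w for w in a.values()))
    norm_b = sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def select_pages(home: str, links: dict[str, list[str]], limit: int = 3) -> list[str]:
    """Phase 1 (assumed stand-in for connectivity analysis):
    keep the home page plus the first `limit` distinct pages it links to."""
    linked = [url for url in links.get(home, []) if url != home]
    return [home] + list(dict.fromkeys(linked))[:limit]


def knn_scores(text: str, training: list[tuple[str, str]], k: int = 3) -> Counter:
    """Phase 2: similarity-weighted votes per category from the k nearest
    training documents (plain term frequency, an assumption)."""
    query = Counter(text.lower().split())
    ranked = sorted(
        ((cosine(query, Counter(doc.lower().split())), label) for doc, label in training),
        reverse=True,
    )
    scores: Counter = Counter()
    for similarity, label in ranked[:k]:
        scores[label] += similarity
    return scores


def classify_site(home, links, pages, training, k=3):
    """Phase 3 (assumed aggregation): sum page-level scores over the selected
    pages and return the best category, or None if nothing matched."""
    total: Counter = Counter()
    for url in select_pages(home, links):
        total.update(knn_scores(pages.get(url, ""), training, k))
    best = total.most_common(1)
    return best[0][0] if best and best[0][1] > 0 else None


# Toy data, invented purely for illustration.
training = [
    ("cheap flights hotel booking travel", "travel"),
    ("hotel resort travel tours", "travel"),
    ("laptop computer hardware store", "computers"),
    ("computer software download", "computers"),
]
pages = {
    "h": "welcome to our company",
    "h/a": "book flights and hotel",
    "h/b": "travel tours resort",
}
links = {"h": ["h/a", "h/b"]}
site_label = classify_site("h", links, pages, training)
```

In this toy data the home page shares no terms with any training document, so only the linked pages contribute evidence, which mirrors the motivation stated in the abstract for looking beyond the home page.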


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.

Related Researcher

이종혁 (LEE, JONG HYEOK)
Grad. School of AI
