Integrating incremental feature weighting into naïve bayes text classifier

Han Joon Kim, Jaeyoung Chang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

4 Scopus citations

Abstract

In real-world operational environments, text classification systems must cope with incomplete training sets and with no prior knowledge of the feature space. For this setting, Naïve Bayes is the most appropriate algorithm, since both its pre-learned classification model and its feature space are easy to update incrementally. Our work focuses on improving the Naïve Bayes classifier through a feature weighting strategy. The basic idea is that parameter estimation in Naïve Bayes can take into account the degree of feature importance as well as the feature distribution. In addition, we extend a conventional algorithm for incremental feature update to develop a dynamic feature space in an operational environment. Through experiments on the Reuters-21578 and 20 Newsgroups benchmark collections, we show that the traditional multinomial Naïve Bayes classifier can be significantly improved by χ²-statistic based feature weighting.
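The abstract does not state the weighted estimator itself, so the following is only a minimal sketch of the general idea, not the authors' formulation. It assumes that each raw term count is scaled by a weight of 1 + χ²(term, class) before Laplace smoothing, and that weights are recomputed from raw counts at prediction time, so that documents and previously unseen features can be added one at a time. The class name `WeightedIncrementalNB`, the `alpha` parameter, and the "1 + χ²" weighting are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def chi_square(a, b, c, d):
    """Chi-square statistic of a 2x2 term/class table.

    a: docs in the class with the term     b: docs outside the class with it
    c: docs in the class without the term  d: docs outside the class without it
    """
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    return n * (a * d - b * c) ** 2 / denom if denom else 0.0


class WeightedIncrementalNB:
    """Multinomial Naive Bayes with chi-square weighted counts (illustrative)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha                       # Laplace smoothing constant
        self.n_docs = 0
        self.class_docs = defaultdict(int)       # class -> number of documents
        self.term_count = defaultdict(lambda: defaultdict(int))  # class -> term -> occurrences
        self.doc_freq = defaultdict(lambda: defaultdict(int))    # class -> term -> documents
        self.vocab = set()                       # grows as new features arrive

    def update(self, tokens, label):
        """Incrementally absorb one labelled document; only raw counts are stored."""
        self.n_docs += 1
        self.class_docs[label] += 1
        for t in tokens:
            self.term_count[label][t] += 1
        for t in set(tokens):
            self.doc_freq[label][t] += 1
        self.vocab.update(tokens)

    def weight(self, term, label):
        """Assumed weighting: 1 + chi-square, so uninformative terms keep weight 1."""
        a = self.doc_freq[label].get(term, 0)
        b = sum(df.get(term, 0) for other, df in self.doc_freq.items() if other != label)
        c = self.class_docs[label] - a
        d = self.n_docs - self.class_docs[label] - b
        return 1.0 + chi_square(a, b, c, d)

    def predict(self, tokens):
        best_label, best_score = None, -math.inf
        for label, n_c in self.class_docs.items():
            # Weighted counts are derived on demand from the current raw counts.
            weighted = {t: self.weight(t, label) * f
                        for t, f in self.term_count[label].items()}
            denom = sum(weighted.values()) + self.alpha * len(self.vocab)
            score = math.log(n_c / self.n_docs)  # class prior
            for t in tokens:
                if t in self.vocab:              # terms never seen in training are skipped
                    score += math.log((weighted.get(t, 0.0) + self.alpha) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label
```

Storing only raw counts and deriving the weights when they are needed keeps each update cheap and means a newly observed term enters the feature space without retraining. A production system would more likely cache the weights and refresh them periodically.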

Original language: English
Title of host publication: Proceedings of the Sixth International Conference on Machine Learning and Cybernetics, ICMLC 2007
Pages: 1137-1143
Number of pages: 7
State: Published - 2007
Event: 6th International Conference on Machine Learning and Cybernetics, ICMLC 2007 - Hong Kong, China
Duration: 19 Aug 2007 to 22 Aug 2007

Publication series

Name: Proceedings of the Sixth International Conference on Machine Learning and Cybernetics, ICMLC 2007
Volume: 2

Conference

Conference: 6th International Conference on Machine Learning and Cybernetics, ICMLC 2007
Country/Territory: China
City: Hong Kong
Period: 19/08/07 to 22/08/07

Keywords

  • Feature selection
  • Feature weighting
  • Naïve Bayes classifier
  • Text classification
  • χ²-statistic
