Variable selection for naive Bayes semisupervised learning

Byoung Jeong Choi, Kwang Rae Kim, Kyu Dong Cho, Changyi Park, Ja Yong Koo

Research output: Contribution to journal › Article › peer-review

Abstract

This article deals with semisupervised learning based on the naive Bayes assumption. A univariate Gaussian mixture density is used for each continuous input variable, whereas a histogram-type density is adopted for each discrete input variable. With the number of mixing components fixed for each continuous input variable, the EM algorithm is used to compute maximum likelihood estimates of the model parameters. We then carry out model selection based on an information criterion to choose a parsimonious model among the fitted models. A common density method is proposed for selecting significant input variables. Simulated and real datasets are used to illustrate the performance of the proposed method.
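
The abstract combines naive Bayes class-conditional densities with EM over partially labeled data. Below is a minimal sketch of that general idea, assuming a single Gaussian component per (class, feature) instead of the univariate Gaussian mixtures used in the article, and omitting the BIC-based model selection and the common density variable selection; the function name fit_semisupervised_nb and all implementation details are illustrative, not taken from the paper.

```python
# Minimal sketch of semisupervised naive Bayes fitted with EM.
# Assumption: one Gaussian per (class, feature), not the univariate Gaussian
# mixtures with BIC selection described in the abstract.
import numpy as np

def log_gaussian(x, mean, var):
    """Elementwise log N(x | mean, var)."""
    return -0.5 * (np.log(2.0 * np.pi * var) + (x - mean) ** 2 / var)

def fit_semisupervised_nb(X, y, n_classes, n_iter=50, var_floor=1e-6):
    """EM for a Gaussian naive Bayes model; y == -1 marks unlabeled rows."""
    n, d = X.shape
    labeled = y >= 0

    # Responsibilities: one-hot for labeled rows, uniform for unlabeled rows.
    R = np.full((n, n_classes), 1.0 / n_classes)
    R[labeled] = 0.0
    R[labeled, y[labeled]] = 1.0

    for _ in range(n_iter):
        # M-step: class priors and per-feature Gaussian parameters.
        Nk = R.sum(axis=0) + 1e-12                     # soft class counts
        priors = Nk / n
        means = (R.T @ X) / Nk[:, None]                # shape (n_classes, d)
        sq = (R.T @ (X ** 2)) / Nk[:, None]
        variances = np.maximum(sq - means ** 2, var_floor)

        # E-step: class posteriors under the naive Bayes factorization.
        log_post = np.log(priors)[None, :] + np.stack(
            [log_gaussian(X, means[k], variances[k]).sum(axis=1)
             for k in range(n_classes)], axis=1)
        log_post -= log_post.max(axis=1, keepdims=True)
        post = np.exp(log_post)
        post /= post.sum(axis=1, keepdims=True)

        # Labeled rows stay clamped to their observed class.
        R[~labeled] = post[~labeled]

    return priors, means, variances

# Toy usage: two classes, two features, 90% of the labels hidden.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (100, 2)), rng.normal(3, 1, (100, 2))])
y = np.r_[np.zeros(100, int), np.ones(100, int)]
y[rng.random(200) < 0.9] = -1
priors, means, variances = fit_semisupervised_nb(X, y, n_classes=2)
print(np.round(means, 2))
```

In the article's fuller setting, each continuous variable would instead get its own univariate Gaussian mixture whose number of components is chosen by an information criterion, and variables whose estimated class-conditional densities are essentially common across classes would be dropped.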

Original language: English
Pages (from-to): 2702-2713
Number of pages: 12
Journal: Communications in Statistics Part B: Simulation and Computation
Volume: 43
Issue number: 10
DOIs
State: Published - 2014

Keywords

  • BIC
  • Common density
  • Density estimation
  • EM algorithm
  • Model selection
  • Naive Bayes
  • Semisupervised learning
  • Variable selection
