Semantic text classification with tensor space model-based naïve Bayes

Han Joon Kim, Jiyun Kim, Jinseog Kim

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

6 Scopus citations

Abstract

This paper presents a semantic naïve Bayes classification technique based on our tensor space model for text representation. In our work, each Wikipedia article is defined as a single concept, and a document is represented as a 2nd-order tensor. Our method extends conventional naïve Bayes by incorporating semantic concept features into term feature statistics under the tensor space model. Through extensive experiments on three popular document collections, we show that the proposed method significantly outperforms conventional naïve Bayes. Surprisingly, classification performance reaches almost 100% in terms of F1-measure on the Reuters-21578 and 20Newsgroups document collections.
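The abstract does not give the model's equations, so the following is only a minimal sketch of the general idea, not the authors' formulation. It assumes the 2nd-order tensor is a term-by-concept matrix and that a term-to-concept association matrix (here the hypothetical `assoc`, with made-up weights) has already been derived from Wikipedia. The names `doc_tensor` and `TensorNaiveBayes`, the toy vocabulary, and the per-cell Laplace smoothing are all illustrative assumptions.

```python
import numpy as np


def doc_tensor(tokens, vocab, assoc):
    """Build a term-by-concept matrix for one document.

    assoc[i, j] is an assumed weight linking term i to concept j.
    """
    counts = np.zeros(len(vocab))
    for tok in tokens:
        if tok in vocab:
            counts[vocab[tok]] += 1
    # Spread each term count across the concepts it is associated with.
    return counts[:, None] * assoc


class TensorNaiveBayes:
    """Multinomial-style naive Bayes over (term, concept) cells (a sketch)."""

    def fit(self, tensors, labels, alpha=1.0):
        self.classes = sorted(set(labels))
        self.log_prior = {}
        self.log_like = {}
        for c in self.classes:
            members = [t for t, y in zip(tensors, labels) if y == c]
            self.log_prior[c] = np.log(len(members) / len(labels))
            # Sum the class's tensors cell by cell, then smooth every cell.
            total = np.sum(members, axis=0) + alpha
            self.log_like[c] = np.log(total / total.sum())
        return self

    def predict(self, tensor):
        scores = {
            c: self.log_prior[c] + np.sum(tensor * self.log_like[c])
            for c in self.classes
        }
        return max(scores, key=scores.get)


# Toy data: four terms, two concepts (columns: "Sport", "Finance").
vocab = {"goal": 0, "match": 1, "stock": 2, "market": 3}
assoc = np.array([[0.9, 0.1], [0.7, 0.3], [0.1, 0.9], [0.2, 0.8]])

train_docs = [["goal", "match", "match"], ["stock", "market", "market"]]
train_labels = ["sport", "finance"]
train_tensors = [doc_tensor(d, vocab, assoc) for d in train_docs]

model = TensorNaiveBayes().fit(train_tensors, train_labels)
prediction = model.predict(doc_tensor(["match", "goal"], vocab, assoc))
```

With this toy data, `prediction` is the class whose smoothed (term, concept) statistics best match the test document. Setting every row of `assoc` to a single 1 in one column would collapse the sketch back to a term-only naive Bayes, which is one way to see what the concept dimension adds.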

Original language: English
Title of host publication: 2016 IEEE International Conference on Systems, Man, and Cybernetics, SMC 2016 - Conference Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 4206-4210
Number of pages: 5
ISBN (Electronic): 9781509018970
DOIs
State: Published - 6 Feb 2017
Event: 2016 IEEE International Conference on Systems, Man, and Cybernetics, SMC 2016 - Budapest, Hungary
Duration: 9 Oct 2016 to 12 Oct 2016

Publication series

Name: 2016 IEEE International Conference on Systems, Man, and Cybernetics, SMC 2016 - Conference Proceedings

Conference

Conference: 2016 IEEE International Conference on Systems, Man, and Cybernetics, SMC 2016
Country/Territory: Hungary
City: Budapest
Period: 9/10/16 to 12/10/16

Keywords

  • Concepts
  • Naïve Bayes
  • Semantics
  • Tensor space
  • Text classification
  • Vector space
  • Wikipedia
