WordNet-based feature engineering for text classification systems

Han Joon Kim, Jae Young Chang

Research output: Contribution to journal › Article › peer-review

Abstract

This paper proposes a new WordNet-based feature engineering method that can help improve text classification systems. In general, machine learning-based classification systems can be enhanced by augmenting their set of model features. To this end, we identify significant features in the training data and extract their synonyms or hyponyms from WordNet; to isolate the more significant features, we devise a special function that computes the similarity between a given word and each class. To evaluate the proposed method, we apply it to the Naive Bayes text classifier, using the Reuters-21578 collection as a test set. Our experiments show that the proposed method can improve the Naive Bayes classifier even without modifying its core algorithm.
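The pipeline described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration, not the authors' implementation: the `RELATED` table is a hand-written stand-in for WordNet synonym and hyponym lookups, `word_class_score` is a simple guess at a word-to-class similarity (the abstract does not define the actual function), and the threshold, toy training data, and function names are hypothetical. The classifier is a plain multinomial Naive Bayes with add-one smoothing whose core algorithm is left untouched; only the training features are augmented.

```python
from collections import Counter, defaultdict
import math

# Hypothetical stand-in for WordNet synonym/hyponym lookups.
# A real system would query WordNet instead of this hand-written table.
RELATED = {"wheat": ["cereal", "durum"], "oil": ["petroleum", "fuel"]}

# Toy training set of (tokens, label) pairs; illustrative only.
TRAIN = [
    (["wheat", "harvest", "export"], "grain"),
    (["wheat", "price", "farm"], "grain"),
    (["oil", "barrel", "export"], "crude"),
    (["oil", "price", "opec"], "crude"),
]

def class_word_counts(docs):
    """Count word occurrences per class."""
    counts = defaultdict(Counter)
    for tokens, label in docs:
        counts[label].update(tokens)
    return counts

def word_class_score(word, label, counts):
    """Assumed similarity: share of the word's occurrences that fall in `label`."""
    total = sum(c[word] for c in counts.values())
    return counts[label][word] / total if total else 0.0

def significant_words(counts, threshold=0.8):
    """Map each word strongly tied to one class (score >= threshold) to that class."""
    vocab = {w for c in counts.values() for w in c}
    return {w: label for w in vocab for label in counts
            if word_class_score(w, label, counts) >= threshold}

def augment(docs, selected):
    """Append related terms of significant words to each training document."""
    out = []
    for tokens, label in docs:
        extra = [r for t in tokens if selected.get(t) == label
                 for r in RELATED.get(t, [])]
        out.append((tokens + extra, label))
    return out

def train_nb(docs):
    """Collect the statistics a multinomial Naive Bayes classifier needs."""
    counts = class_word_counts(docs)
    priors = Counter(label for _, label in docs)
    vocab = {w for c in counts.values() for w in c}
    return counts, priors, vocab

def predict(tokens, model):
    """Return the most probable class, using add-one (Laplace) smoothing."""
    counts, priors, vocab = model
    n_docs = sum(priors.values())
    best, best_lp = None, -math.inf
    for label in priors:
        total = sum(counts[label].values())
        lp = math.log(priors[label] / n_docs)
        for t in tokens:
            if t in vocab:  # ignore words never seen in training
                lp += math.log((counts[label][t] + 1) / (total + len(vocab)))
        if lp > best_lp:
            best, best_lp = label, lp
    return best

selected = significant_words(class_word_counts(TRAIN))
model = train_nb(augment(TRAIN, selected))
print(predict(["petroleum", "price"], model))
```

In this toy setting, "petroleum" never appears in the raw training documents, so a classifier trained on them cannot use it as evidence; after augmentation the related terms of "oil" enter the feature set of the "crude" class, which is the kind of effect the abstract attributes to feature augmentation.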

Original language: English
Pages (from-to): 8161-8168
Number of pages: 8
Journal: Information
Volume: 16
Issue number: 11
State: Published - Nov 2013

Keywords

  • Feature engineering
  • Machine learning
  • Naïve Bayes
  • Text classification
  • WordNet
