Cluster-based hierarchical topic trees for topic detection

Man Xuan, Han Joon Kim, Jae Young Chang

Research output: Contribution to journal › Article › peer-review

Abstract

Extracting topic keywords from online text documents is an important task in text mining applications. In our work, extracted keywords are represented as a hierarchical topic tree. To build the tree, we apply an incremental clustering technique to incoming online documents. We also define a cluster-based measure similar to tf-idf and a probabilistic inequality to determine subsumption relationships among keywords. Using Google News data, we empirically analyze the proposed method with respect to the threshold value of the incremental clustering algorithm, the range of the keyword extraction measure, and the amount of text data, and demonstrate its superiority.
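The two steps named in the abstract, threshold-based incremental clustering and a probabilistic subsumption test, can be illustrated with a minimal sketch. The abstract does not give the actual formulas, so everything below is an assumption: `incremental_cluster` is a generic single-pass clustering over bag-of-words vectors with cosine similarity, and `subsumes` uses a co-occurrence rule common in the subsumption literature (keyword x subsumes y when P(x | y) is high and exceeds P(y | x)), not necessarily the inequality defined in the paper. The function names, the default thresholds, and the toy documents are hypothetical.

```python
from collections import Counter
import math


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def incremental_cluster(docs, threshold=0.3):
    """Single-pass clustering (assumed variant): each incoming document joins
    the most similar existing cluster if the similarity reaches `threshold`,
    otherwise it starts a new cluster. Returns document indices per cluster."""
    centroids = []  # summed term counts per cluster
    members = []    # document indices per cluster
    for i, doc in enumerate(docs):
        vec = Counter(doc.lower().split())
        best, best_sim = None, 0.0
        for c, centroid in enumerate(centroids):
            sim = cosine(vec, centroid)
            if sim > best_sim:
                best, best_sim = c, sim
        if best is not None and best_sim >= threshold:
            centroids[best].update(vec)  # fold the document into the centroid
            members[best].append(i)
        else:
            centroids.append(vec)
            members.append([i])
    return members


def subsumes(x, y, docs, threshold=0.8):
    """Assumed rule, not the paper's: x subsumes y if
    P(x | y) >= threshold and P(y | x) < P(x | y),
    with probabilities estimated from document co-occurrence."""
    tokens = [set(d.lower().split()) for d in docs]
    with_x = sum(1 for t in tokens if x in t)
    with_y = sum(1 for t in tokens if y in t)
    both = sum(1 for t in tokens if x in t and y in t)
    if not with_x or not with_y:
        return False
    p_x_given_y = both / with_y
    p_y_given_x = both / with_x
    return p_x_given_y >= threshold and p_y_given_x < p_x_given_y


# Hypothetical toy stream of documents, in arrival order.
docs = [
    "election vote senate",
    "election vote poll",
    "football match goal",
    "football goal league",
    "election debate",
]
clusters = incremental_cluster(docs, threshold=0.3)
parent_ok = subsumes("election", "vote", docs)
```

In this sketch a lower clustering threshold merges more documents into fewer, broader clusters, which is the kind of sensitivity the abstract says the experiments examine. The cluster-based tf-idf-like measure is left out because the abstract does not define it.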

Original language: English
Article number: 102
Pages (from-to): 706-710
Number of pages: 5
Journal: Life Science Journal
Volume: 11
Issue number: 7
State: Published - 2014

Keywords

  • Clustering
  • Text mining
  • Topic keywords
  • Topic trees
