News keyword extraction for topic tracking

Sungjick Lee, Han Joon Kim

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

84 Scopus citations

Abstract

This paper presents a keyword extraction technique that can be used for tracking topics over time. In our work, keywords are a set of significant words in an article that give a high-level description of its contents to readers. Identifying keywords from a large amount of on-line news data is very useful in that it can produce a short summary of news articles. As on-line text documents rapidly increase in volume with the growth of the WWW, keyword extraction has become a basis of several text mining applications such as search engines, text categorization, summarization, and topic detection. Manual keyword extraction is an extremely difficult and time-consuming task; in fact, it is almost impossible to extract keywords manually even from the news articles published in a single day, due to their volume. For rapid use of keywords, we need to establish an automated process that extracts keywords from news articles. We propose an unsupervised keyword extraction technique that includes several variants of the conventional TF-IDF model with reasonable heuristics.
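For orientation, the conventional TF-IDF baseline that the abstract refers to can be sketched as follows. This is a minimal illustration only: the abstract does not describe the paper's variants or heuristics, so none of them are reproduced here. The function names `tokenize` and `tfidf_keywords`, the whitespace tokenizer, and the `top_k` parameter are assumptions made for the example.

```python
import math
from collections import Counter

PUNCT = ".,;:!?\"'()"

def tokenize(text):
    # Assumed tokenizer: lowercase, split on whitespace, strip edge punctuation.
    tokens = (w.strip(PUNCT).lower() for w in text.split())
    return [t for t in tokens if t]

def tfidf_keywords(docs, top_k=3):
    """Return the top_k highest-scoring terms per document (plain TF-IDF)."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)

    # Document frequency: number of documents in which each term appears.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    results = []
    for tokens in tokenized:
        tf = Counter(tokens)
        total = len(tokens)
        # score = (relative term frequency) * log(N / document frequency)
        scores = {t: (c / total) * math.log(n_docs / df[t]) for t, c in tf.items()}
        # Sort by descending score; break ties alphabetically for determinism.
        ranked = sorted(scores, key=lambda t: (-scores[t], t))
        results.append(ranked[:top_k])
    return results
```

Terms that occur in every document (such as "the" in a small English corpus) receive an IDF of zero and fall to the bottom of the ranking, which is why no explicit stop-word list is used in this sketch.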

Original language: English
Title of host publication: Proceedings - 4th International Conference on Networked Computing and Advanced Information Management, NCM 2008
Pages: 554-559
Number of pages: 6
State: Published - 2008
Event: 4th International Conference on Networked Computing and Advanced Information Management, NCM 2008 - Gyeongju, Korea, Republic of
Duration: 2 Sep 2008 to 4 Sep 2008

Publication series

Name: Proceedings - 4th International Conference on Networked Computing and Advanced Information Management, NCM 2008
Volume: 2

Conference

Conference: 4th International Conference on Networked Computing and Advanced Information Management, NCM 2008
Country/Territory: Korea, Republic of
City: Gyeongju
Period: 2/09/08 to 4/09/08
