TY - GEN
T1 - News keyword extraction for topic tracking
AU - Lee, Sungjick
AU - Kim, Han Joon
PY - 2008
Y1 - 2008
N2 - This paper presents a keyword extraction technique that can be used for tracking topics over time. In our work, keywords are a set of significant words in an article that give a high-level description of its contents to readers. Identifying keywords from a large amount of on-line news data is very useful in that it can produce a short summary of news articles. As on-line text documents rapidly increase in size with the growth of the WWW, keyword extraction has become a basis for several text mining applications such as search engines, text categorization, summarization, and topic detection. Manual keyword extraction is an extremely difficult and time-consuming task; in fact, it is almost impossible to extract keywords manually from the news articles published in a single day, owing to their volume. For rapid use of keywords, we need to establish an automated process that extracts keywords from news articles. We propose an unsupervised keyword extraction technique that includes several variants of the conventional TF-IDF model with reasonable heuristics.
AB - This paper presents a keyword extraction technique that can be used for tracking topics over time. In our work, keywords are a set of significant words in an article that give a high-level description of its contents to readers. Identifying keywords from a large amount of on-line news data is very useful in that it can produce a short summary of news articles. As on-line text documents rapidly increase in size with the growth of the WWW, keyword extraction has become a basis for several text mining applications such as search engines, text categorization, summarization, and topic detection. Manual keyword extraction is an extremely difficult and time-consuming task; in fact, it is almost impossible to extract keywords manually from the news articles published in a single day, owing to their volume. For rapid use of keywords, we need to establish an automated process that extracts keywords from news articles. We propose an unsupervised keyword extraction technique that includes several variants of the conventional TF-IDF model with reasonable heuristics.
UR - http://www.scopus.com/inward/record.url?scp=57849123256&partnerID=8YFLogxK
U2 - 10.1109/NCM.2008.199
DO - 10.1109/NCM.2008.199
M3 - Conference contribution
AN - SCOPUS:57849123256
SN - 9780769533223
T3 - Proceedings - 4th International Conference on Networked Computing and Advanced Information Management, NCM 2008
SP - 554
EP - 559
BT - Proceedings - 4th International Conference on Networked Computing and Advanced Information Management, NCM 2008
T2 - 4th International Conference on Networked Computing and Advanced Information Management, NCM 2008
Y2 - 2 September 2008 through 4 September 2008
ER -