Abstract
A tweet has many facets: it is created at the speed of thought, propagated in real time, and produces social interchange on an international scale. As a result, users want map-based Twitter mining to search for trending topics or find out what other users are talking about. Because location information is sparse, however, analysis that depends on position is difficult. To run Twitter mining on all Korean users, this study used firehose-level data, that is, 100% of Twitter data, together with a new spatial indicator that overcomes the sparsity of location information. The study also proposes an algorithm for processing firehose data and solutions to the study's limitations. The conventional approach, which combines spritzer-level data with a supervised method, inferred 44 times more tweet positions than the geotag-based method, whereas the method used in this study increased inferences 680-fold. Among the clustering algorithms, K-Center Clustering inferred the largest number of user residential locations. The ultimate goal of the study is for Twitter data, including the massive volume of location information inferred and created in real time, to serve as a means of city monitoring once the study's remaining limitation is overcome: the automated removal of unnecessary words for profile location information and Twitter mining.
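The abstract names K-Center Clustering but does not describe the implementation. As a rough illustration only, the following sketch uses the standard greedy farthest-first heuristic for k-center with great-circle (haversine) distance. The function names, the choice of heuristic, the distance measure, and the sample coordinates are all assumptions for illustration and are not taken from the study.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an approximation


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def k_center(points, k):
    """Greedy farthest-first k-center (hypothetical helper, not the paper's code).

    Returns (centers, assign): indices of the chosen center points, and for
    each point the position in `centers` of its nearest center.
    """
    if not points or k < 1:
        raise ValueError("need at least one point and k >= 1")
    centers = [0]  # start from the first point (arbitrary choice)
    nearest = [haversine_km(p, points[0]) for p in points]
    assign = [0] * len(points)
    while len(centers) < min(k, len(points)):
        # The next center is the point farthest from all current centers.
        nxt = max(range(len(points)), key=nearest.__getitem__)
        if nearest[nxt] == 0.0:
            break  # every remaining point coincides with a center
        centers.append(nxt)
        for i, p in enumerate(points):
            d = haversine_km(p, points[nxt])
            if d < nearest[i]:
                nearest[i] = d
                assign[i] = len(centers) - 1
    return centers, assign


# Illustrative coordinates only: two points near Seoul, two near Busan.
tweet_points = [(37.5665, 126.9780), (37.57, 126.98),
                (35.1796, 129.0756), (35.18, 129.08)]
centers, assign = k_center(tweet_points, 2)
```

With these sample points, the two Seoul-area coordinates end up in one cluster and the two Busan-area coordinates in the other. In a residential-location setting, each cluster center could stand in for a candidate home location of a user, though how the study maps clusters to residences is not stated in the abstract.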
Original language | English |
---|---|
Pages (from-to) | 421-435 |
Number of pages | 15 |
Journal | Spatial Information Research |
Volume | 24 |
Issue number | 4 |
State | Published - 1 Aug 2016 |
Keywords
- City monitoring
- Location inference
- SNS
- Stream data
- Twitter mining