TY - JOUR
T1 - Evaluating residential location inference of twitter users at district level
T2 - focused on Seoul city
AU - Kim, Moon Gie
AU - Kang, Young Ok
AU - Koh, June Hwan
N1 - Publisher Copyright: © 2016, Korean Spatial Information Society.
PY - 2016/8/1
Y1 - 2016/8/1
AB - Smartphones have made it easy for people to post content through social network services (SNS). If the residential location of Twitter users can be inferred, it becomes possible to support sentiment analysis, population movement analysis, disease tracking, research on political and social discourse, online issue monitoring, and consumer risk management. Because of privacy concerns, however, only a small amount of Twitter user location information is available. In this research, we use Twitter firehose-level data, spatial indicators, and several clustering algorithms to overcome the small amount of location information in tweets. We also infer residential location at the district level to improve accuracy. We selected Seoul, South Korea, which has a large population and many Twitter users, as the study area. We applied various clustering algorithms and compared inference accuracy by distance range. The results indicate that using spatial indicators from groups A (point-type geotag), B (SNS), C (geocode), and D (polygon-type geotag), rather than from groups A and C alone, together with the convex hull with onion peeling clustering algorithm, yields a higher residential location inference probability. We hope this research contributes to algorithm research on Twitter user location inference, on overcoming data sparsity, and on automated residential location inference.
KW - Clustering
KW - Firehose API
KW - Residential location inference
KW - SNS
KW - Spatial indicator
KW - Twitter
UR - http://www.scopus.com/inward/record.url?scp=85091758853&partnerID=8YFLogxK
U2 - 10.1007/s41324-016-0039-5
DO - 10.1007/s41324-016-0039-5
M3 - Article
AN - SCOPUS:85091758853
SN - 2366-3294
VL - 24
SP - 493
EP - 502
JO - Spatial Information Research
JF - Spatial Information Research
IS - 4
ER -