Classification of Twitter accounts into automated agents and human users

Gilani, Z 
Kochmar, E 
Crowcroft, Jonathon

Conference Object

© 2017 Association for Computing Machinery. Online social networks (OSNs) have seen a remarkable rise in the presence of surreptitious automated accounts. The massive human user base and business-supportive operating model of social networks such as Twitter facilitate the creation of automated agents. In this paper we outline a systematic methodology and train a classifier to categorise Twitter accounts into ‘automated’ and ‘human’ users. To improve classification accuracy we employ a set of novel steps. First, we divide the dataset into four popularity bands to compensate for differences in types of accounts. Second, we create a large ground truth dataset using human annotations and extract relevant features from raw tweets. To judge the accuracy of the procedure we calculate agreement among human annotators as well as with a bot detection research tool. We then apply a Random Forests classifier that achieves an accuracy close to human agreement. Finally, we perform tests to measure the efficacy of our results.
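
Two of the steps named in the abstract, splitting accounts into four popularity bands and measuring agreement between annotators, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the abstract does not state the band boundaries, so the follower-count thresholds in `BAND_EDGES` are hypothetical, and Cohen's kappa is used as one common agreement measure, not necessarily the one the authors report. The feature extraction and Random Forests steps are not covered.

```python
from collections import Counter

# Hypothetical follower-count thresholds separating four popularity bands.
# The abstract does not give the actual boundaries used in the paper.
BAND_EDGES = [1_000, 10_000, 100_000]


def popularity_band(followers: int) -> int:
    """Return a band index from 0 (least popular) to 3 (most popular)."""
    return sum(followers >= edge for edge in BAND_EDGES)


def cohen_kappa(labels_a, labels_b) -> float:
    """Cohen's kappa for two annotators labelling the same accounts.

    Assumed here as an illustrative agreement measure; the abstract only
    says that agreement among annotators was calculated.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and of equal length")
    n = len(labels_a)
    # Fraction of accounts on which both annotators gave the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected by chance from each annotator's label frequencies.
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    expected = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used a single identical label throughout
    return (observed - expected) / (1 - expected)


# Made-up labels for four accounts, purely for demonstration.
annotator_1 = ["automated", "human", "human", "automated"]
annotator_2 = ["automated", "human", "automated", "automated"]
kappa = cohen_kappa(annotator_1, annotator_2)
```

Computing agreement separately within each band, rather than over the pooled dataset, would show whether annotators find some popularity ranges harder to label than others.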

4605 Data Management and Data Science, 46 Information and Computing Sciences, 4608 Human-Centred Computing
Journal Title
Proceedings of the 2017 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, ASONAM 2017
All rights reserved