dc.contributor.author: Caines, Andrew (en)
dc.contributor.author: Pastrana, S (en)
dc.contributor.author: Hutchings, Alice (en)
dc.contributor.author: Buttery, Paula (en)
dc.date.accessioned: 2019-01-03T00:31:16Z
dc.date.available: 2019-01-03T00:31:16Z
dc.date.issued: 2018-12-01 (en)
dc.identifier.issn: 2193-7680
dc.identifier.uri: https://www.repository.cam.ac.uk/handle/1810/287520
dc.description.abstract: The automatic classification of posts from hacking-related online forums is of potential value for the understanding of user behaviour in social networks relating to cybercrime. We designed annotation schema to label forum posts for three properties: post type, author intent, and addressee. The post type indicates whether the text is a question, a comment, and so on. The author’s intent in writing the post could be positive, negative, moderating discussion, showing gratitude to another user, etc. The addressee of a post tends to be a general audience (e.g. other forum users) or individual users who have already contributed to a threaded discussion. We manually annotated a sample of posts and returned substantial agreement for post type and addressee, and fair agreement for author intent. We trained rule-based (logical) and machine learning (statistical) classification models to predict these labels automatically, and found that a hybrid logical–statistical model performs best for post type and author intent, whereas a purely statistical model is best for addressee. We discuss potential applications for this data, including the analysis of thread conversations in forum data and the identification of key actors within social networks.
dc.publisher: Springer
dc.rights: Attribution 4.0 International
dc.rights.uri: https://creativecommons.org/licenses/by/4.0/
dc.title: Automatically identifying the function and intent of posts in underground forums (en)
dc.type: Article
prism.issueIdentifier: 1 (en)
prism.publicationDate: 2018 (en)
prism.publicationName: Crime Science (en)
prism.volume: 7 (en)
dc.identifier.doi: 10.17863/CAM.34825
dcterms.dateAccepted: 2018-11-17 (en)
rioxxterms.versionofrecord: 10.1186/s40163-018-0094-4 (en)
rioxxterms.version: VoR
rioxxterms.licenseref.uri: http://www.rioxx.net/licenses/all-rights-reserved (en)
rioxxterms.licenseref.startdate: 2018-12-01 (en)
dc.contributor.orcid: Caines, Andrew [0000-0001-9647-4902]
dc.contributor.orcid: Hutchings, Alice [0000-0003-3037-2684]
dc.identifier.eissn: 2193-7680
rioxxterms.type: Journal Article/Review (en)
pubs.funder-project-id: EPSRC (EP/M020320/1)
pubs.funder-project-id: Alan Turing Institute (DS_SDS_1718_4)


Except where otherwise noted, this item's licence is described as Attribution 4.0 International