Topic or style? Exploring the most useful features for authorship attribution
Authors
Sari, Y
Stevenson, M
Vlachos, A
Publication Date
2018-01-01
Journal Title
COLING 2018 - 27th International Conference on Computational Linguistics, Proceedings
Conference Name
The 27th International Conference on Computational Linguistics
ISBN
9781948087506
Publisher
Association for Computational Linguistics
Pages
343-353
Type
Conference Object
This Version
VoR (Version of Record)
Citation
Sari, Y., Stevenson, M., & Vlachos, A. (2018). Topic or style? Exploring the most useful features for authorship attribution. COLING 2018 - 27th International Conference on Computational Linguistics, Proceedings, 343-353. https://doi.org/10.17863/CAM.78746
Abstract
Approaches to authorship attribution, the task of identifying the author of a document, are based on analysis of individuals’ writing style and/or preferred topics. Although the problem has been widely explored, no previous studies have analysed the relationship between dataset characteristics and effectiveness of different types of features. This study carries out an analysis of four widely used datasets to explore how different types of features affect authorship attribution accuracy under varying conditions. The results of the analysis are applied to authorship attribution models based on both discrete and continuous representations. We apply the conclusions from our analysis to an extension of an existing approach to authorship attribution and outperform the prior state-of-the-art on two out of the four datasets used.
Identifiers
External DOI: https://doi.org/10.17863/CAM.78746
This record's URL: https://www.repository.cam.ac.uk/handle/1810/331299