An Open-source Benchmark of Deep Learning Models for Audio-visual Apparent and Self-reported Personality Recognition

Accepted version
Peer-reviewed

Type

Article

Authors

Liao, R 
Song, S 

Abstract

Personality determines a wide variety of daily and working human behaviours and is crucial for understanding human internal and external states. In recent years, a large number of automatic personality computing approaches have been developed to predict either the apparent or the self-reported personality of a subject from non-verbal audio-visual behaviours. However, the majority of them rely on complex, dataset-specific pre-processing steps and model training tricks. In the absence of a standardized benchmark with consistent experimental settings, it is not only impossible to fairly compare the real performance of these personality computing models but also difficult to reproduce them. In this paper, we present the first reproducible audio-visual benchmarking framework, which provides a fair and consistent evaluation of eight existing personality computing models (audio, visual and audio-visual) and seven standard deep learning models on both self-reported and apparent personality recognition tasks. Building upon this set of benchmarked models, we also investigate the impact on personality computing results of two previously used long-term modelling strategies that summarise short-term/frame-level predictions. We conduct a comprehensive investigation of all the benchmarked models to demonstrate their capability for modelling personality traits on two publicly available datasets: the audio-visual apparent personality (ChaLearn First Impression) dataset and the self-reported personality (UDIVA) dataset. The experimental results show that: (i) apparent personality traits, as inferred from facial behaviours by most benchmarked deep learning models, are more reliably predicted than self-reported ones; (ii) visual models frequently outperform audio models on personality recognition; (iii) non-verbal behaviours contribute differently to the prediction of different personality traits; and (iv) our reproduced personality computing models generally achieve worse performance than their originally reported results. We make all the code and settings of this personality computing benchmark publicly available at https://github.com/liaorongfan/DeepPersonality.
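
As a minimal, hypothetical illustration of one such long-term modelling strategy (the function name and array shapes below are our own sketch, not taken from the paper or its repository): frame-level trait predictions can be averaged over a video to obtain a single video-level score per Big-Five trait.

import numpy as np

def aggregate_frame_predictions(frame_preds: np.ndarray) -> np.ndarray:
    # Average per-frame predictions of shape (num_frames, num_traits)
    # into one video-level prediction of shape (num_traits,).
    return frame_preds.mean(axis=0)

# Example: 120 frames, 5 traits (OCEAN), random scores in [0, 1].
rng = np.random.default_rng(0)
video_level = aggregate_frame_predictions(rng.random((120, 5)))
print(video_level)  # one aggregated score per trait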

Keywords

46 Information and Computing Sciences, 4608 Human-Centred Computing

Journal Title

IEEE Transactions on Affective Computing

Journal ISSN

1949-3045

Volume Title

abs/2210.09138

Publisher

Institute of Electrical and Electronics Engineers (IEEE)

Sponsorship

Engineering and Physical Sciences Research Council (EP/R030782/1)