Dataset Bias in Deception Detection.

Mambreyan, Ara; Punskaya, Elena; Gunes, Hatice

doi:10.17863/CAM.85707

Accepted version

Peer-reviewed

Repository URI

https://www.repository.cam.ac.uk/handle/1810/338298

Repository DOI

https://doi.org/10.17863/CAM.85707

Files

Accepted version (3.65 MB)

Type

Conference Object

Authors

Mambreyan, Ara

Punskaya, Elena

Gunes, Hatice

https://orcid.org/0000-0003-2407-3012

Abstract

With advances in machine learning, lie detection technology has gained significant attention. In recent years, several multi-modal techniques have reported accuracies as high as 99% on the Real-life Trial dataset, which contains only 121 data points. This has led to considerable media hype and research interest in lie detection with machine learning. In this paper, we analyze the effect of dataset bias in deception detection. More specifically, we train a classifier to predict the sex of the identity appearing in the video. On a test data point, we use the sex predictor to predict sex, which we then use as a proxy for predicting deception: lie for females and truth for males. This lie predictor simulates a classifier that uses nothing but dataset bias. Nevertheless, we find that the performance of this biased classifier is comparable to that of state-of-the-art papers. More specifically, when using IDT features, our biased classifier achieves 64.6% accuracy and 59.3% AUC, while a classifier trained normally on truth/lie labels achieves 57.4% accuracy and 69.3% AUC. We perform similar experiments on the Bag-of-Lies dataset and show that it too is biased with respect to sex. In addition, we apply the state-of-the-art techniques to an unbiased dataset and show that their performance is no better than chance. Our experiments strongly suggest that the results of recent deception detection techniques can be explained by the bias inherent in the datasets.
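
The proxy predictor described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `proxy_lie_predictor` and `accuracy` are hypothetical, the sex predictions are assumed to come from some separately trained classifier, and the six toy labels are invented for illustration and are not drawn from the Real-life Trial or Bag-of-Lies datasets.

```python
# Illustrative sketch only: toy labels, not data from either dataset.

def proxy_lie_predictor(predicted_sex):
    """Map each predicted sex to a deception label, using nothing but bias."""
    return ["lie" if s == "female" else "truth" for s in predicted_sex]


def accuracy(predictions, labels):
    """Fraction of predictions that match the reference labels."""
    matches = sum(p == y for p, y in zip(predictions, labels))
    return matches / len(labels)


# Hypothetical output of a sex classifier on six test videos.
predicted_sex = ["female", "female", "female", "male", "male", "male"]
# Hypothetical ground truth in which deception correlates with sex.
true_labels = ["lie", "lie", "truth", "truth", "truth", "lie"]

proxy_predictions = proxy_lie_predictor(predicted_sex)
proxy_accuracy = accuracy(proxy_predictions, true_labels)
print(proxy_accuracy)
```

On this toy split the proxy predictor matches four of the six labels without ever looking at deception cues, which is the kind of effect the paper attributes to dataset bias.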

Journal Title

ICPR

Conference Name

The 26th International Conference on Pattern Recognition

Publisher

IEEE

Publisher DOI

https://doi.org/10.17863/CAM.85707

Rights

Attribution 4.0 International

Sponsorship

Engineering and Physical Sciences Research Council (EP/R030782/1)
European Commission Horizon 2020 (H2020) Societal Challenges (826232)

Collections

University of Cambridge Research Outputs (Articles and Conferences)
