Multimodal Gender Fairness in Depression Prediction: Insights on Data from the USA & China.
Accepted version
Peer-reviewed
Abstract
Social agents and robots are increasingly being used in wellbeing settings. However, a key challenge is that these agents and robots typically rely on machine learning (ML) algorithms to detect and analyse an individual’s mental wellbeing. Bias and fairness in ML algorithms are a growing source of concern. At the same time, existing literature indicates that mental health conditions can manifest differently across gender and culture. We hypothesise that the representation of features (acoustic, textual, and visual) and their inter-modal relations vary among subjects of different cultures and genders, thus impacting the performance and fairness of various ML models. We present the first evaluation of multimodal gender fairness in depression manifestation by undertaking a study on two datasets, one from the USA and one from China. We undertake thorough statistical and ML experimentation and repeat the experiments for several different algorithms to ensure that the results are not algorithm-dependent. Our findings indicate that, although there are differences between the two datasets, it is not conclusive whether these are due to differences in depression manifestation, as hypothesised, or to other external factors such as differences in data collection methodology. Our findings further motivate a call for more consistent and culturally aware data collection processes in order to address the problem of ML bias in depression detection and to promote the development of fairer agents and robots for wellbeing.