
Multiple Holdouts With Stability: Improving the Generalizability of Machine Learning Analyses of Brain-Behavior Relationships.

Published version





Mihalik, Agoston 
Ferreira, Fabio S 
Moutoussis, Michael 
Ziegler, Gabriel 
Adams, Rick A 


BACKGROUND: In 2009, the National Institute of Mental Health launched the Research Domain Criteria, an attempt to move beyond diagnostic categories and ground psychiatry within neurobiological constructs that combine different levels of measures (e.g., brain imaging and behavior). Statistical methods that can integrate such multimodal data, however, are often vulnerable to overfitting, poor generalization, and difficulties in interpreting the results.

METHODS: We propose an innovative machine learning framework combining multiple holdouts and a stability criterion with regularized multivariate techniques, such as sparse partial least squares and kernel canonical correlation analysis, for identifying hidden dimensions of cross-modality relationships. To illustrate the approach, we investigated structural brain-behavior associations in an extensively phenotyped developmental sample of 345 participants (312 healthy and 33 with clinical depression). The brain data consisted of whole-brain voxel-based gray matter volumes, and the behavioral data included item-level self-report questionnaires and IQ and demographic measures.

RESULTS: Both sparse partial least squares and kernel canonical correlation analysis captured two hidden dimensions of brain-behavior relationships: one related to age and drinking and the other one related to depression. The applied machine learning framework indicates that these results are stable and generalize well to new data. Indeed, the identified brain-behavior associations are in agreement with previous findings in the literature concerning age, alcohol use, and depression-related changes in brain volume.

CONCLUSIONS: Multivariate techniques (such as sparse partial least squares and kernel canonical correlation analysis) embedded in our novel framework are promising tools to link behavior and/or symptoms to neurobiology and thus have great potential to contribute to a biologically grounded definition of psychiatric disorders.
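
The general idea of repeated holdouts combined with a stability check can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the procedure, so every detail here is an assumption. Plain (non-sparse) partial least squares, computed from the SVD of the cross-covariance matrix, stands in for sparse partial least squares; the data are synthetic; the 20% test fraction, the number of holdouts, and the stability measure (mean absolute cosine similarity of the weight vectors across splits) are illustrative choices; and the function names `first_pls_weights` and `multiple_holdouts` are hypothetical.

```python
import numpy as np


def first_pls_weights(X, Y):
    """Leading weight pair of plain PLS: top singular vectors of X^T Y.

    Assumption: used here as a simple stand-in for sparse PLS.
    """
    U, _, Vt = np.linalg.svd(X.T @ Y, full_matrices=False)
    return U[:, 0], Vt[0]


def multiple_holdouts(X, Y, n_holdouts=10, test_frac=0.2, seed=0):
    """Fit on several random train splits and score each on its holdout.

    Returns the out-of-sample correlation of the latent scores per split
    and a stability score for the X-side weights (illustrative definition).
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    n_test = int(round(test_frac * n))
    corrs, x_weights = [], []
    for _ in range(n_holdouts):
        perm = rng.permutation(n)
        test, train = perm[:n_test], perm[n_test:]
        # Standardize with training statistics only, to avoid leakage.
        mx, sx = X[train].mean(0), X[train].std(0) + 1e-12
        my, sy = Y[train].mean(0), Y[train].std(0) + 1e-12
        Xtr, Ytr = (X[train] - mx) / sx, (Y[train] - my) / sy
        Xte, Yte = (X[test] - mx) / sx, (Y[test] - my) / sy
        u, v = first_pls_weights(Xtr, Ytr)
        # Generalization: correlation of latent scores on unseen subjects.
        corrs.append(np.corrcoef(Xte @ u, Yte @ v)[0, 1])
        x_weights.append(u)
    W = np.array(x_weights)  # rows are unit-norm weight vectors
    # Absolute value because the sign of a singular vector is arbitrary.
    sim = np.abs(W @ W.T)
    pairs = np.triu_indices(n_holdouts, k=1)
    return np.array(corrs), sim[pairs].mean()


# Synthetic data: one shared latent dimension plus independent noise.
rng = np.random.default_rng(42)
n, p, q = 300, 20, 8
z = rng.standard_normal(n)
X = np.outer(z, rng.standard_normal(p)) + rng.standard_normal((n, p))
Y = np.outer(z, rng.standard_normal(q)) + rng.standard_normal((n, q))

corrs, stability = multiple_holdouts(X, Y)
print("mean out-of-sample correlation:", corrs.mean())
print("weight stability across splits:", stability)
```

In this sketch, a high mean out-of-sample correlation suggests that the association generalizes to unseen participants, and a stability value near 1 suggests that the weight vectors barely change from split to split. A full analysis would also need hyperparameter selection and significance testing, neither of which is attempted here.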



Adolescence, Brain–behavior relationship, Depression, Framework, RDoC, SPLS, Brain, Gray Matter, Humans, Machine Learning, Mood Disorders, National Institute of Mental Health (U.S.), United States

Journal Title

Biol Psychiatry


Elsevier BV
Wellcome Trust (095844/Z/11/Z)
MQ: Transforming Mental Health (MQ17-24 Vertes)
Medical Research Council (MC_G0802534)