
Development of a differential treatment selection model for depression on consolidated and transformed clinical trial datasets.

Published version
Peer-reviewed

Authors

Mehltretter, Joseph 
Armstrong, Caitrin 

Abstract

Major depressive disorder (MDD) is the leading cause of disability worldwide, yet treatment selection still proceeds via "trial and error". Given the varied presentation of MDD and heterogeneity of treatment response, the use of machine learning to understand complex, non-linear relationships in data may be key for treatment personalization. Well-organized, structured data from clinical trials with standardized outcome measures are useful for training machine learning models; however, combining data across trials poses numerous challenges. There is also persistent concern that machine learning models can propagate harmful biases. We have created a methodology for organizing and preprocessing depression clinical trial data such that transformed variables harmonized across disparate datasets can be used as input for feature selection. Using Bayesian optimization, we identified an optimal multi-layer dense neural network that used 21 clinical and sociodemographic features as input to perform differential treatment benefit prediction. The model was trained on this combined dataset of 5032 individuals and 6 drugs. Our model generalized well to the held-out test set and produced similar accuracy metrics in the test and validation sets, with an AUC of 0.7 when predicting binary remission. To address the potential for bias propagation, we used a bias testing performance metric to evaluate the model for harmful biases related to ethnicity, age, or sex. We present a full pipeline from data preprocessing to model validation that was employed to create the first differential treatment benefit prediction model for MDD containing 6 treatment options.

Description

Acknowledgements: We would like to thank GlaxoSmithKline and Eli Lilly for providing the de-identified individual patient raw data and information for the clinical trials.


Funder: ERA-Permed Vision 2020 supporting IMADAPT

Keywords

Humans; Depressive Disorder, Major; Machine Learning; Bayes Theorem; Clinical Trials as Topic; Female; Male; Antidepressive Agents; Adult; Middle Aged; Neural Networks, Computer

Journal Title

Transl Psychiatry

Journal ISSN

2158-3188

Volume Title

14

Publisher

Springer Science and Business Media LLC