Data Mining and Causal Analyses: A Healthcare Operations Approach to Understanding the Patient Journey

Bobroske, Katherine

This thesis intersects traditional operations management and data analytics with clinical applications to provide insight into the patient journey – the process of how patients seek and receive care from the healthcare system. In the first section of the thesis, we empirically analyze the beginning stage of the patient journey of opioid use. In the United States alone, the opioid epidemic contributes to tens of thousands of deaths each year, and the government has funded billions of dollars of interventions including prescription drug monitoring programs and tamper-resistant opioid formulations. As most research is focused on chronic opioid users, little is known about how operational interventions shortly after opioid initiation can impact a patient’s likelihood of long-term opioid use. We combine medical and pharmaceutical claims to investigate the care delivery process for patients who initiate opioid prescriptions in the primary care setting. Of the patients who return to the primary care setting for a follow-up appointment, we estimate the impact of provider discordance (i.e., seeing a clinician other than the original opioid prescriber for the follow-up appointment) on long-term opioid use. A series of controlled logistic regressions, instrumental variable analyses, and propensity score matching (with a bias minimization method) establish a significant causal effect in the presence of unobserved confounders. Additionally, we analyze a potential mechanism underlying the main effect and investigate the factors that could help inform a potential process intervention. The analyses provide robust evidence that provider discordance during opioid initiation could be a promising and hitherto untapped opportunity to reduce the influx of patients afflicted by the opioid epidemic.
In the second section of this thesis, we take a broader view of the patient journey and develop a methodology to extract a clinically meaningful representation of patient journey diagnosis and treatment patterns from claims data. While medical and pharmaceutical claims are a rich source of data in healthcare research, they are also notoriously difficult to analyze. We first process claims into strings of characters that represent the series of events (e.g., distinct interactions with the healthcare system) in the patient journey. Then we adapt the dynamic program underlying the Levenshtein edit distance sequence alignment algorithm to evaluate the level of similarity between each pair of episodes. The similarity scores serve as inputs to an unsupervised clustering algorithm that extracts the main patterns of the patient journey. When the methodology is applied to the context of mechanical back and neck pain, it yields an overview of the main treatment patterns that had not been previously captured in the medical literature. The methodology also enables investigation into self-triage: how the patient’s first decision of where to seek care is associated with longer-term variation in treatment. We identify clear differences in patient-level socio-economic and clinical characteristics across both the choice of initial site of care and treatment patterns, highlighting the potential public health impact of this methodology. Overall, we show that applying data mining and econometrics to claims data yields insights that contribute to both the medical and operations fields.
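The sequence comparison step described above can be illustrated with a minimal sketch. It implements the standard, unmodified Levenshtein dynamic program, not the adapted version developed in the thesis. The single-letter event codes, the patient identifiers, and the length normalization are all assumptions made for illustration only.

```python
from itertools import combinations


def levenshtein(a: str, b: str) -> int:
    """Standard Levenshtein edit distance, computed row by row."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty prefix of b
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # match or substitute
            ))
        prev = curr
    return prev[-1]


def normalized_distance(a: str, b: str) -> float:
    """Edit distance scaled to [0, 1] by the longer string (an assumption)."""
    longest = max(len(a), len(b))
    return levenshtein(a, b) / longest if longest else 0.0


# Hypothetical event codes, not taken from the thesis:
# P = primary care visit, I = imaging, T = physical therapy,
# E = emergency department visit, S = surgery.
journeys = {
    "patient_1": "PIT",
    "patient_2": "PITT",
    "patient_3": "EIS",
}

# Pairwise distances between all episodes.
distances = {
    (p, q): normalized_distance(journeys[p], journeys[q])
    for p, q in combinations(journeys, 2)
}

for pair, d in distances.items():
    print(pair, round(d, 3))
```

A pairwise distance table of this kind is the sort of input an unsupervised clustering method (for example, hierarchical clustering) could consume; which clustering algorithm the thesis uses is not stated in this abstract.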

Scholtes, Stefan
healthcare operations, patient journey, medical claims, pharmaceutical claims, data mining, econometrics
Doctor of Philosophy (PhD)
Awarding Institution
University of Cambridge
Cambridge International Trust Scholarship, Cambridge Judge Business School, Cambridge School of Technology Fieldwork Fund