Investigating reinforcement learning processes in depression and substance use disorder: translational, computational and neuroimaging approaches.

Zuhlsdorff, Katharina

Reinforcement learning (RL) is the process by which an animal uses its previous experience to improve the outcomes of future choices, maximising reward and minimising punishment. This thesis investigates how RL processes are altered in psychiatric disorders such as major depressive disorder (MDD) and substance use disorder (SUD). The neural basis of RL is investigated using neuroimaging techniques and translational approaches in both rats and humans. Given the importance of RL, and of cognitive impairments such as cognitive inflexibility that are implicated in psychiatric disorders, this PhD thesis sets out to integrate the relevant computational and neurobiological substrates, an objective that has not hitherto been widely researched.

Chapter 3 presents the findings of a longitudinal study investigating the behavioural and neural consequences of early-life maternal separation in rats, used as a way of simulating early-life stress (ELS) in humans. The question addressed was whether early stress is necessary and sufficient for the development of stress-related behaviours relevant to depression. Animals underwent behavioural testing, including probabilistic reversal learning (PRL) to assess behavioural flexibility, and sequential fMRI to evaluate resting-state functional connectivity. Computational analyses revealed differences in reward and punishment learning rates in males arising from maternal separation (MS) and adulthood stress. In contrast, MS female rats showed differences in the 'stickiness' parameter, a latent variable aligned with a loss of flexibility and habit-like behaviour. Finally, MS females and MS males showed changes in connectivity in opposite directions: females showed lower functional connectivity from the amygdala to the anterior cingulate cortex, infralimbic cortex and insular cortex compared to males.

The subsequent chapter uses a computational approach to investigate latent vulnerability variables in cocaine addiction. A longitudinal dataset acquired in rats was analysed, which involved behavioural phenotyping for several addiction vulnerability traits, including behavioural inflexibility, together with high-resolution MRI brain scans. Future drug-related compulsivity was predicted by higher values of the stickiness parameter, reflecting an increase in perseverative responding commonly found in stimulant-dependent individuals. Structurally, the volume of the anterior insular cortex correlated positively with a parameter relating to how subjects explore versus exploit reward options.

The remaining results chapters involve the analysis of three datasets collected from human participants. Chapter 5 includes data from a study in which PRL was run concurrently with fMRI scanning. The participants in this study included healthy controls (HCs), as well as individuals with cocaine use disorder (CUD) and gambling disorder (GD). Contrary to previously published findings, no significant differences in alpha, beta or kappa were observed between controls and the CUD group. However, in pathological gamblers, a significant increase in side stickiness was found, showing that gamblers tend to repeat responding in the same spatial location regardless of the outcome on previous trials. Neurally, there was an altered balance in the tracking of reward and punishment expected value (EV) in GD, as well as a shifted balance in processing positive and negative punishment prediction errors (PPE) in CUD. Reward EV tracking in GD involved greater activity in the middle temporal gyrus, cingulate gyrus, precuneus cortex and amygdala, whereas during punishment EV tracking there was lower activity in the postcentral gyrus, superior parietal lobule and precuneus cortex compared to HCs. In response to positive PPEs, the frontal pole, superior frontal gyrus and cingulate gyrus showed lower activity in patients with CUD than in controls, but the same group showed greater activity following negative PPEs in the superior and middle frontal gyrus.
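
For readers unfamiliar with this class of model, the following is a minimal sketch of how parameters of this kind could enter a trial-by-trial RL model of a two-choice PRL task. It is not the model fitted in the thesis. It assumes, following common usage rather than anything stated here, that alpha denotes a learning rate (split into hypothetical `alpha_rew` and `alpha_pun` for rewarded and unrewarded trials), beta an inverse temperature governing exploration versus exploitation, and kappa a stickiness bonus added to the previously chosen option. The function names, outcome coding (1 for reward, 0 otherwise), reward probabilities and default parameter values are all illustrative assumptions.

```python
import math
import random


def softmax_choice_probs(q, prev_choice, beta, kappa):
    """Choice probabilities from values q, with a stickiness bonus (kappa)
    added to the option chosen on the previous trial (assumed formulation)."""
    logits = [
        beta * qi + (kappa if i == prev_choice else 0.0)
        for i, qi in enumerate(q)
    ]
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def update_value(q_chosen, outcome, alpha_rew, alpha_pun):
    """Prediction-error update with separate learning rates for rewarded
    (outcome > 0) and unrewarded trials (assumed outcome coding: 1 or 0)."""
    delta = outcome - q_chosen  # prediction error
    alpha = alpha_rew if outcome > 0 else alpha_pun
    return q_chosen + alpha * delta


def simulate_prl(n_trials=200, p_good=0.8, reversal_at=100,
                 alpha_rew=0.4, alpha_pun=0.2, beta=5.0, kappa=0.5, seed=0):
    """Simulate a two-option probabilistic reversal learning session.
    All default values are illustrative, not estimates from any dataset."""
    rng = random.Random(seed)
    q = [0.0, 0.0]   # expected value of each option
    prev = None      # no previous choice on the first trial
    good = 0         # index of the currently better option
    choices = []
    for t in range(n_trials):
        if t == reversal_at:
            good = 1 - good  # contingencies reverse
        probs = softmax_choice_probs(q, prev, beta, kappa)
        choice = 0 if rng.random() < probs[0] else 1
        p_reward = p_good if choice == good else 1.0 - p_good
        outcome = 1.0 if rng.random() < p_reward else 0.0
        q[choice] = update_value(q[choice], outcome, alpha_rew, alpha_pun)
        prev = choice
        choices.append(choice)
    return choices, q
```

In this sketch, a larger kappa makes the simulated agent more likely to repeat its previous choice regardless of outcome, which is one way perseverative responding can be expressed; other formulations (for example, stickiness tied to spatial location rather than stimulus) differ in what `prev_choice` refers to.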

Chapter 6 includes behavioural and clinical data from samples of patients with SUD and/or MDD as well as healthy individuals. The main findings of this chapter were that patients with SUD have reduced reinforcement sensitivity and increased stimulus stickiness, as do patients diagnosed with both disorders. No evidence for an association between computationally derived variables and clinical measures (e.g., the Inventory of Depressive Symptomatology – IDS) was found.

The final results chapter presents a novel behavioural task that measures a different subtype of proactive cognitive flexibility: specifically, how healthy participants make decisions in the face of uncertainty and whether they shift their response when given the opportunity to repeat their choice following presentation of unreliable feedback. Participants changed their response more frequently following negative than positive feedback. Significant fMRI activations in the frontal pole, anterior cingulate cortex, frontal orbital cortex, and superior frontal gyrus were found when the response was changed rather than repeated. Furthermore, stronger connectivity between the anterior insula and parts of the occipital cortex was found during repeat trials. Finally, a multivariate pattern analysis of the fMRI data showed that behavioural responses on the next trial could be successfully predicted.

The results in this thesis demonstrate the importance of RL in preclinical and clinical psychiatric cohorts. The parameter kappa is identified as a key behavioural marker across species. This parameter is altered as a result of ELS in rodents and can help predict which rats show highly compulsive behaviour in cocaine self-administration paradigms. In humans, kappa is affected in individuals with GD as well as SUD. Brain regions underlying RL parameters, including kappa, are identified in both rodents and humans, particularly highlighting the involvement of the cingulate gyrus in reinforcement learning across species. The results from the reversal learning task studies are then compared with findings from the behavioural and fMRI analyses of a new flexibility task, which extend current understanding of cognitive flexibility.

Dalley, Jeffrey
Cognitive Flexibility, Major Depressive Disorder, Reinforcement Learning, Reversal Learning, Substance Use Disorder
Doctor of Philosophy (PhD)
Awarding Institution
Cambridge University
Wellcome Trust (104631/Z/14/Z)
Institute for Neuroscience, University of Cambridge; Alan Turing Institute