The Deconstruction of Reinforcement Learning in Human Substance Use Disorder
Individuals diagnosed with substance use disorder (SUD) often behave in ways detrimental to their own interest and well-being. The mechanisms behind such maladaptive behaviour in human SUD remain unclear, but may be explained by disruptions to the reinforcement learning processes that under normal circumstances shape behaviour adaptively. This perspective has led to two different, but not mutually exclusive, hypotheses: (1) reinforcement learning is impaired in drug-addicted individuals, as they are unable to learn from the consequences of their actions, and (2) learned behaviour in drug-addicted individuals reflects an imbalance between two regulatory systems: the goal-directed system and the habit system. Recently, trial-by-trial computational modelling has emerged as a promising tool for deconstructing the latent cognitive processes that underpin learning, and can thereby provide mechanistic insights into these impairments. Thus, using multiple learning paradigms, this thesis has two objectives: (1) to characterise, with computational modelling, the cognitive profile related to impaired reinforcement learning and its supporting processes in SUD; and (2) to clarify the relationship between impaired reinforcement learning and habit learning in SUD.
The first part of the thesis describes the computational analyses of task performance in probabilistic reinforcement learning. In two independent cohorts of stimulant-addicted individuals, these analyses identified a selectively reduced learning rate from punishment, suggesting that their behaviour may be less amenable to negative feedback. In one of these cohorts, participants underwent pharmacological manipulations with dopamine D2/3 receptor agents. Both the dopamine D2/3 receptor antagonist (400 mg amisulpride) and the agonist (0.5 mg pramipexole) modulated behaviour differently in the two groups: while both agents impaired performance in control participants, they improved learning from negative feedback in stimulant-addicted individuals – confirming the link between aberrant learning and dopamine dysfunction in SUD. Next, I investigated the integrity of declarative and non-declarative memory systems in patients with cocaine use disorder using a category learning task, as these systems are thought to complement reinforcement learning. I found that these patients showed clear deficits in both declarative and non-declarative memory. Analyses of their response strategies revealed that they were more likely than control participants to adopt a simple but suboptimal memorisation strategy during learning, as opposed to a more complex integrative strategy, which supports the notion of an aberrant engagement of memory systems during reinforcement learning.
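To illustrate what a "selectively reduced learning rate from punishment" means in a trial-by-trial model, the following is a minimal sketch and not the model fitted in this thesis. It assumes a simple two-option task, a delta-rule value update with separate learning rates for positive and negative feedback, and a softmax choice rule. All names (`alpha_reward`, `alpha_punish`, `beta`, `p_correct`) and the outcome coding (+1 for positive, -1 for negative feedback) are hypothetical choices made for illustration.

```python
import math
import random


def choice_probabilities(q_values, beta):
    """Softmax over option values; beta is an assumed inverse temperature."""
    highest = max(q_values)  # subtract the maximum for numerical stability
    weights = [math.exp(beta * (q - highest)) for q in q_values]
    total = sum(weights)
    return [w / total for w in weights]


def update_value(q, outcome, alpha_reward, alpha_punish):
    """Delta-rule update with a separate learning rate per feedback valence.

    Assumed coding: outcome is +1.0 (positive) or -1.0 (negative feedback).
    """
    alpha = alpha_reward if outcome > 0 else alpha_punish
    return q + alpha * (outcome - q)


def simulate(n_trials, alpha_reward, alpha_punish, beta, p_correct=0.8, seed=0):
    """Simulate one agent and return the proportion of choices of option 0.

    Option 0 yields positive feedback with probability p_correct,
    option 1 with probability 1 - p_correct (an illustrative contingency).
    """
    rng = random.Random(seed)
    q_values = [0.0, 0.0]
    n_best = 0
    for _ in range(n_trials):
        probs = choice_probabilities(q_values, beta)
        choice = 0 if rng.random() < probs[0] else 1
        p_reward = p_correct if choice == 0 else 1.0 - p_correct
        outcome = 1.0 if rng.random() < p_reward else -1.0
        q_values[choice] = update_value(
            q_values[choice], outcome, alpha_reward, alpha_punish
        )
        n_best += choice == 0
    return n_best / n_trials
```

In this sketch, lowering `alpha_punish` while holding `alpha_reward` fixed means that negative feedback shifts the value of the chosen option less on each trial, which is one way to formalise behaviour that is less amenable to negative feedback. Fitting such parameters to observed choices would require a likelihood-based or hierarchical Bayesian procedure, which is not shown here.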
Given that SUD is associated with enhanced habit formation, I then tested the hypothesis that reinforcement learning impairments exacerbate subsequent habit formation in cocaine use disorder, by reanalysing prior data from an appetitive instrumental learning task with computational methods. Contrary to the hypothesis, I found that impaired reinforcement learning in cocaine use disorder, in the form of a reduced learning rate, is insufficient to account for enhanced habit formation in these patients, suggesting that other modulatory factors are at play. I subsequently used self-report questionnaires to address the question of whether patients with cocaine use disorder have insight into their behavioural tendencies. These data revealed evidence for a predilection for automatic habits and reduced goal-directed actions in their daily lives. Finally, I expanded my work by measuring instrumental learning in a community sample of individuals, recruited online, who consumed alcohol hazardously (as measured with the Alcohol Use Disorders Identification Test questionnaire) but had not been formally diagnosed with alcohol use disorder. I tested this sample with a novel task paradigm that measures goal-directed and habitual responses in a conflict situation, but did not find any evidence for impaired goal-directed or augmented habitual control associated with harmful alcohol use.
Taken together, the study of reinforcement learning with multiple paradigms has refined our understanding of maladaptive behaviours in severe SUD, which may be characterised by attenuated effects of negative feedback on behaviour, as well as by aberrant non-declarative and declarative memory systems. Impaired reinforcement learning, however, cannot fully account for the habit predominance associated with SUD. Instead, this predominance might be modulated differentially by different drugs of abuse, drug use severity and individual differences in habitual tendencies.