Confirmatory reinforcement learning changes with age during adolescence.

Published version

Understanding how learning changes during human development has been one of the long-standing objectives of developmental science. Recently, advances in computational biology have demonstrated that humans display a bias when learning to navigate novel environments through rewards and punishments: they learn more from outcomes that confirm their expectations than from outcomes that disconfirm them. Here, we ask whether confirmatory learning is stable across development, or whether it might be attenuated in developmental stages in which exploration is beneficial, such as in adolescence. In a reinforcement learning (RL) task, 77 participants aged 11-32 years (four men, mean age = 16.26) attempted to maximize monetary rewards by repeatedly sampling different pairs of novel options, which varied in their reward/punishment probabilities. Mixed-effect models showed an age-related increase in accuracy as long as learning contingencies remained stable across trials, but less so when they reversed halfway through the trials. Age was also associated with a greater tendency to stay with an option that had just delivered a reward, more than to switch away from an option that had just delivered a punishment. At the computational level, a confirmation model provided increasingly better fit with age. This model showed that age differences are captured by decreases in noise or exploration, rather than in the magnitude of the confirmation bias. These findings provide new insights into how learning changes during development and could help better tailor learning environments to people of different ages.

RESEARCH HIGHLIGHTS:
Reinforcement learning shows age-related improvement during adolescence, but more in stable learning environments compared with volatile learning environments.
People tend to stay with an option after a win more than they shift from an option after a loss, and this asymmetry increases with age during adolescence.
Computationally, these changes are captured by a developing confirmatory learning style, in which people learn more from outcomes that confirm rather than disconfirm their choices.
Age-related differences in confirmatory learning are explained by decreases in stochasticity, rather than changes in the magnitude of the confirmation bias.
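
The abstract does not give the model equations, so the following is only a minimal sketch of how a confirmation-biased learner is commonly written, not the authors' implementation. It assumes a two-option task with outcomes of +1 or -1, feedback on the chosen option only, a softmax choice rule, and two learning rates. All names and parameter values (alpha_conf, alpha_disc, beta, the reward probabilities) are hypothetical. In this form, a larger gap between alpha_conf and alpha_disc corresponds to a stronger confirmation bias, while a lower beta corresponds to noisier, more exploratory choices.

```python
import math
import random


def softmax_choice_prob(q_a, q_b, beta):
    """Probability of choosing option A under a two-option softmax.

    beta is an inverse temperature (assumed parameter): lower values
    mean more stochastic, exploratory choices.
    """
    return 1.0 / (1.0 + math.exp(-beta * (q_a - q_b)))


def confirmation_update(q_chosen, outcome, alpha_conf, alpha_disc):
    """Update the value of the chosen option with asymmetric learning rates.

    A positive prediction error confirms the choice and is weighted by
    alpha_conf; a negative one disconfirms it and is weighted by alpha_disc.
    """
    prediction_error = outcome - q_chosen
    rate = alpha_conf if prediction_error > 0 else alpha_disc
    return q_chosen + rate * prediction_error


def simulate_accuracy(beta, alpha_conf=0.4, alpha_disc=0.1,
                      p_reward=(0.75, 0.25), n_trials=200, seed=0):
    """Simulate one stable block and return the share of choices of option 0.

    Option 0 is the better option under the illustrative p_reward values.
    """
    rng = random.Random(seed)
    q = [0.0, 0.0]
    n_correct = 0
    for _ in range(n_trials):
        p_a = softmax_choice_prob(q[0], q[1], beta)
        choice = 0 if rng.random() < p_a else 1
        outcome = 1.0 if rng.random() < p_reward[choice] else -1.0
        q[choice] = confirmation_update(q[choice], outcome,
                                        alpha_conf, alpha_disc)
        n_correct += (choice == 0)
    return n_correct / n_trials
```

Calling simulate_accuracy with different beta values while holding the two learning rates fixed is one way to explore how changes in choice stochasticity alone, without any change in the size of the bias, could affect accuracy in a stable environment.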


Funder: Jacobs Foundation

Funder: Agence Nationale de la Recherche


adolescence, computational modelling, confirmation bias, exploration, learning rates, reinforcement learning, Male, Humans, Adolescent, Reinforcement, Psychology, Learning, Reward, Punishment

Journal Title

Dev Sci

Wellcome Trust (via University of Oxford) (107496/Z/15/Z?)
Wellcome Trust (104908/Z/14/Z)