Orbitofrontal cortex neurons code utility changes during natural reward consumption as correlates of relative reward-specific satiety

Natural, on-going reward consumption can differentially reduce the subjective value (‘utility’) of specific rewards, which indicates relative, reward-specific satiety. Two-dimensional choice indifference curves (IC) represent the utility of choice options with two distinct reward components (‘bundles’) according to Revealed Preference Theory. We estimated two-dimensional ICs from stochastic choices and found that natural on-going consumption of two bundle rewards induced specific IC distortions that indicated differential reduction of reward utility indicative of relative reward-specific satiety. Licking changes confirmed satiety in a mechanism-independent manner. Neuronal signals in orbitofrontal cortex (OFC) that coded the value of the chosen option followed closely the consumption-induced IC distortions within recording periods of individual neurons. A neuronal classifier predicted well the changed utility inferred from the altered behavioral choices. Neuronal signals for more conventional single-reward choice options showed similar relationships to utility alterations from on-going consumption. These results demonstrate a neuronal substrate for the differential, reward-specific alteration of utility by on-going reward consumption reflecting reward-specific satiety.

Significance

Repeated delivery reduces the subjective value (‘utility’) of rewards to different degrees depending on their individual properties, a phenomenon commonly referred to as sensory-specific satiety. We tested monkeys during economic choice of two-component options. On-going consumption differentially reduced reward utility in a way that suggested relative reward-specific satiety between the two components. Neurons in the orbitofrontal cortex (OFC) changed their responses in close correspondence to the differential utility reduction, thus representing a neuronal correlate of relative reward-specific satiety. Control experiments with conventional single-component choice showed similar satiety-induced differential response reductions. These results are compatible with the notion of OFC neurons coding crucial decision variables robustly across different satiety levels.


We presented the monkey simultaneously with two composite stimuli on a horizontally mounted touch screen (binary choice task with two discrete, mutually exclusive and collectively exhaustive options; Figure 1A, B). Two rectangles in each stimulus represented a bundle with two reward components whose individual amounts were indicated by a vertical bar (higher was more). The two components were blackcurrant juice or blackcurrant juice with added monosodium glutamate (MSG) in all bundle types as Reward A, and grape juice, strawberry juice, mango juice, water, apple juice, peach juice or grape juice with added inosine monophosphate (IMP) as Reward B.

Basic behavioral design

Our study followed the notions that subjective reward value (utility) can be inferred from observable economic choice, that altered choice would indicate a change in utility, and that a reduction of utility from natural, on-going consumption reflects satiety. The assessment of differential, reward-specific utility change requires at least two rewards. We tested choices between bundles that each had two liquid rewards whose independently variable amounts were represented at the axes and interior of two-dimensional graphs (Figure 1C). We investigated neuronal activity in repeated trials for reasons of statistics and thus tested stochastic choices rather than the single-shot choices that are often used with humans.

Pilot tests of all rewards had indicated that blackcurrant juice was least prone to satiety, possibly reflecting taste and/or sugar content differences. Therefore, we designated blackcurrant juice as Reward A for the y-axis of the two-dimensional graph, whereas all other liquids constituted Reward B and were plotted on the x-axis. This convention allowed us to estimate the relative value of all rewards in the common currency of blackcurrant juice at choice indifference.
In choice between two bundles, relative reward utility is inferred from the amount of the

At the onset of a daily experiment, the black and green bundles of Figure 1C were chosen with equal probability. When choosing the green bundle, the animal gave up 0.5 ml of blackcurrant juice (from 0.6 ml to 0.1 ml) to gain 0.3 ml of grape juice. With on-going consumption of both juices, the value ratio between the rewards (trade-off amount) changed: to gain the same 0.3 ml of grape juice, the animal gave up progressively less blackcurrant juice, from 0.45 ml via 0.38 ml and 0.25 ml to finally only 0.18 ml (Figure 1C; upward arrow, from violet via blue and orange to red). Thus, the slope between the two bundles on the two-dimensional graph changed as the animal 'paid' progressively less blackcurrant juice for the same amount of grape juice.

We set both rewards in the Reference Bundle, and one reward of the Variable Bundle, to  grape juice; apparently, grape juice had lost more value compared to blackcurrant juice. As each IP was estimated psychophysically in 80 trials, satiety as studied here progressed in test blocks rather than on a trial-by-trial basis. The initial two IPs were close together (green and violet within green 95% confidence interval, CI), suggesting initially maintained reward value, whereas the next IPs outside the CI were considerably higher and indicated substantial value loss (blue, yellow and red IPs). In other words, the MRS declined with on-going consumption, as schematized in Figure 1C. We assumed that the value change inferred from IP positions outside the CI indicated satiety.
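The trade-off described for Figure 1C can be restated as an exchange ratio: millilitres of blackcurrant juice given up per millilitre of grape juice gained. The following is a minimal sketch of that arithmetic, using only the amounts quoted in the text for the onset and the first three consumption steps. The function and variable names are hypothetical and are not taken from the authors' analysis code.

```python
# Amounts (ml) quoted in the text for the Figure 1C example.
GRAPE_GAINED_ML = 0.3

# Blackcurrant juice given up at each step (labels are illustrative).
BLACKCURRANT_GIVEN_UP_ML = {
    "onset (green)": 0.50,
    "violet": 0.45,
    "blue": 0.38,
    "orange": 0.25,
}


def exchange_ratio(given_up_ml, gained_ml):
    """Return ml of blackcurrant juice given up per ml of grape juice gained."""
    return given_up_ml / gained_ml


ratios = {
    step: exchange_ratio(ml, GRAPE_GAINED_ML)
    for step, ml in BLACKCURRANT_GIVEN_UP_ML.items()
}

for step, ratio in ratios.items():
    print(f"{step}: {ratio:.2f} ml blackcurrant per ml grape")
```

A falling ratio across steps corresponds to the flattening slope between the two bundles: each millilitre of grape juice is worth less blackcurrant juice as consumption proceeds.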

At choice indifference between the two bundles, the amounts of the two Variable Bundle rewards defined an IP (Figure 1E). A new IP was obtained by setting the Reference Bundle to a previously estimated IP position, then setting one reward of the Variable Bundle to a specific amount, varying its other reward psychophysically and estimating choice indifference from curve fitting. Repetition of this procedure, in pseudorandomly alternating directions to avoid local distortions (Knetsch, 1989), resulted in a series of equally preferred IPs. We used these IPs to fit two-dimensional indifference curves (IC) whose slope and curvature reflected the utility of one bundle reward relative to the other bundle reward (Figure 1E; see Methods; Eq. 1). Thus, on-going reward consumption resulted not only in slope change (Figure 1C) but also in a more informative, monotonic IC curvature change from convex (green) via near-linear (blue) to concave (red), which provided systematic evidence for the animal's increasing reluctance to give up blackcurrant juice unless receiving more substantial amounts of grape juice. Both IC changes characterized in a systematic manner the differential reduction of utility of grape juice relative to blackcurrant juice during on-going consumption of both juices, which suggested relative reward-specific satiety for grape juice. These two-dimensional changes were measured during recording periods of individual neurons and constituted our test scheme for behavioral and neuronal correlates of satiety.
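The psychophysical estimation of a single IP described above can be illustrated with a small curve-fitting sketch. It assumes a two-parameter logistic psychometric function fitted by Newton-Raphson maximum likelihood; the text only states that indifference was estimated "from curve fitting", so the choice of function, the function name `fit_indifference_point` and the example counts are all assumptions made for illustration. The 5 x 16 trial layout is chosen only to match the 80 trials per IP mentioned in the text.

```python
import math


def fit_indifference_point(amounts_ml, n_chosen, n_trials, iterations=50):
    """Fit P(choose Variable Bundle) = 1 / (1 + exp(-(a + b * x))).

    x is the amount of the psychophysically varied reward. Returns the
    amount at which P = 0.5, i.e. the indifference point -a / b.
    Illustrative only; not the authors' fitting routine.
    """
    a, b = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, k, n in zip(amounts_ml, n_chosen, n_trials):
            p = 1.0 / (1.0 + math.exp(-(a + b * x)))
            w = n * p * (1.0 - p)      # binomial variance weight
            r = k - n * p              # observed minus expected choices
            g0 += r
            g1 += r * x
            h00 += w
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        if det == 0.0:
            break
        # Newton step: solve the 2x2 system H * delta = g.
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return -a / b


# Hypothetical block: 5 amounts x 16 trials = 80 trials for one IP.
amounts = [0.1, 0.2, 0.3, 0.4, 0.5]
chosen = [1, 4, 8, 12, 15]
trials = [16] * 5
ip_ml = fit_indifference_point(amounts, chosen, trials)
print(f"estimated indifference amount: {ip_ml:.3f} ml")
```

Because the hypothetical counts are symmetric around 0.3 ml, the fitted IP lands at 0.3 ml. With real data, the IP shifts along the varied axis as satiety progresses, and a series of such IPs would then be used to fit the IC.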

For simpler numeric value assessment, we positioned single-component bundles on the x- and y-axes and studied only the ratio between equally preferred rewards, which was graphically represented as two-dimensional slope change (anchor trials). We held blackcurrant juice constant

On-going reward consumption induced IC shape changes with all eight bundles in both animals (Figure 2). Stronger satiety for 6 of the 8 liquids (x-axis) relative to blackcurrant (y-axis) resulted in flattening of IC slopes and transition from convex to linear and concave curvature (Figure S1G, H). However, monkey B seemed to become less sated on peach juice compared to blackcurrant juice, as suggested by steeper ICs (Figure 2H)

Figure 3H, not included in Figure 3I).

Thus, the licking changes confirmed in a mechanism-independent manner the relative reward-specific utility changes inferred from bundle choices.

Neuronal test design

We used the IC changes with on-going reward consumption observed in a large variety of bundles to investigate altered value coding in OFC reward neurons. Given the shallower slopes and the less convex and more concave curvatures, we placed bundles on specific segments of the ICs that would change with on-going consumption, such that the physically unaltered bundles would end up on different ICs or IC parts. We subjected most neurons to two tests: (i) during choice over zero-bundle; both rewards were set to zero in one bundle, and the animal unfailingly chose the alternative, non-zero bundle; (ii) during choice between two non-zero bundles; at least one reward was set to non-zero in both bundles, and the animal chose either bundle.

The same consumption-induced neuronal changes occurred in choice between two non-zero

consumption. The responses continued to discriminate well the amount of blackcurrant juice whose utility had changed relatively less (Figures 4A, 4C, S2A, S2C) but were altered for grape juice whose relative utility had dropped more (Figures 4B, 4D, S2B, S2D). The altered OFC signals reflected the reward-specific relative utility change induced by on-going consumption as inferred from the altered ICs.

Neuronal population

We investigated the effects of on-going reward consumption in a total of 272 task-related OFC

Table 2).

We tested averaged z-scored neuronal population responses with the same scheme of bundle

Table 2). These response changes reflected the differential

In demonstrating substantial accuracy changes, these tests suggested that the neuronal responses followed the substantial IC changes that reflected the utility changes from on-going reward consumption indicative of satiety.

Figure 7C). However, with the satiety-induced IC change, the large water amount was now positioned much further below the highest IC than before (Figure 7D, red on x-axis) and on about the same IC as the small blackcurrant amount (blue on y-axis). Correspondingly, the neuronal activity with the large water amount lost its peak (reduction by 50%) and was now very similar to the activity with the small blackcurrant amount (Figure 7C, D, red dotted vs. blue solid arrows). Further, the position of the small water amount was now below its original IC (blue on x-axis), and the neuron, with its lost response, failed to distinguish between the two water amounts. Thus, the neuronal changes with single-reward bundles followed the satiety-induced IC changes, demonstrating that the neuronal satiety changes reported above occurred also with single rewards (degenerated bundles).

Next, we used single-reward bundles to quantify neuronal response changes with on-going reward consumption in relation to utility changes inferred from behavioral choices. We established vector plots that display the ratio of reward weights (b's) for behavioral choice (Eq. 1a; Figure 7E

This study tested binary choice between bundles of two rewards and found response changes in OFC reward neurons that suggested a differential loss of reward utility indicative of relative reward-specific satiety from on-going reward consumption. The choices were captured by graphic ICs that represented the relative utilities of the two bundle rewards in a conceptually rigorous manner. The ICs changed in an orderly and characteristic manner with on-going reward consumption, without requiring unnatural reward bolus administration (Figures 1, 2, S1). The ICs flattened progressively and showed gradual curvature changes from convexity to concavity, which indicated gradual utility

However, general satiety cannot explain our asymmetric neuronal changes that correlate with relative reward-specific utility changes.

The study used the same 2 male adult rhesus monkeys as previously (Pastor-Bernier et al.,

To test whether the animal's choice reflected the amount of the bundle rewards during satiety, rather than other, unintended variables such as spatial bias, we used the logistic regression:

To assess neuronal compliance with the two-dimensional IC scheme, we used a two-factor ANOVA on each task-related response that was significant for both regressors in Eq. 3. Neuronal responses following the IC scheme were significant across-ICs (factor 1: P < 0.05) but insignificant

axis. Hence, the IC slope would be steeper than the diagonal line (see Figure 1C, D

indicating the revealed preference relation between the two rewards of a bundle, and thus the value for neuronal tests), rather than reading it from fitted ICs. indifference.

We assessed the coding of chosen value and unchosen value in all neurons that followed the revealed preference scheme, using the following regression: with UCV as value of the unchosen option that was not further considered here, and e as compound