Introduction
The capacity to adequately respond to environmental rewards is vital for adaptive functioning, and therefore an essential feature of mental health. Preclinical evidence consistently shows that reward cues and outcomes, particularly if they violate the prediction, are registered by striatal dopamine (DA) neurons, providing the learning signal that guides future incentive-based choices (Schultz et al. 1997; Schultz, 2010, 2016).
We recently translated these findings to humans and detected significant DA release in response to reward learning in the caudate nucleus, putamen, and ventral striatum (VST) of healthy volunteers (HV) (Kasanova et al. 2017). Moreover, the striatal reward-induced DA release was associated with accuracy in the acquisition of reward contingencies, and in the right caudate and VST also with reward-oriented behavior in daily life.
Reward deficits have been linked to motivational impairments in psychotic disorders (Strauss et al. 2011; Gold et al. 2012; Strauss et al. 2014). Importantly, the common genetic risk variants that increase liability to schizophrenia have been associated with abnormal neural activity during reward learning in the frontal pole and VST of healthy individuals (Lancaster et al. 2016), suggesting that reward deficits might be a vulnerability trait marker for psychosis. First-degree relatives (FDR) of patients with psychosis show a significant enrichment effect of such risk variants (Bigdeli et al. 2016), conferring potential predisposition to aberrant neural signaling of reward contingencies. Reward dysfunction has not, however, been decisively corroborated in FDR; functional magnetic resonance imaging (fMRI) studies of striatal neural activation to rewards have yielded inconsistent findings (Grimm et al. 2014; de Leeuw et al. 2015; Hanssen et al. 2015), while increased striatal DA synthesis, believed to undermine reward sensitivity (Boehme et al. 2015), has been reported to be present (Huttunen et al. 2008) or absent (Shotbolt et al. 2011) in FDR. These inconsistencies warrant exploration of the striatal dopaminergic modulation of reward learning and its behavioral correlates in individuals with a familial liability to psychosis.
We therefore added 16 FDR to the existing sample of 16 HV (Kasanova et al. 2017) to compare the two groups on striatal DA release during reward learning using DA D2/3 receptor [18F]fallypride positron emission tomography (PET). Additionally, both groups underwent a 6-day ecological momentary assessment (EMA) study of reward-oriented behavior in everyday life. Concretely, in addition to the sample described previously, we recruited 16 individuals who had a sibling or a parent diagnosed with a non-affective psychotic disorder [as determined by the Family Interview for Genetic Studies (FIGS) (Nurnberger et al. 1994)], but were free of any Axis I or II diagnosis themselves [as determined by the Mini-International Neuropsychiatric Interview (M.I.N.I.) (Sheehan et al. 1998)].
Methods
Procedures
PET data acquisition and analyses
All participants underwent the same procedures outlined previously (Kasanova et al. 2017). Briefly, on the PET scan day, [18F]fallypride was injected simultaneously with the start of a 180-min dynamic acquisition, consisting of an 80-min active control condition, a 15-min break, a 25-min rest condition without any stimulation, and, starting precisely at 120 min post-injection, a 60-min probabilistic stimulus selection task (PSST) (Frank et al. 2004).
[18F]fallypride PET data were processed and analyzed using the procedures detailed previously (Kasanova et al. 2017). The main outcome measure – the γ parameter – reflects the magnitude of [18F]fallypride displacement, hence indexing DA release. The additional outcome measure is the spatial extent of reward-induced DA release (Christian et al. 2006; Lataster et al. 2011; Ceccarini et al. 2012; Kuepper et al. 2013; Vrieze et al. 2013; Hernaus et al. 2015), calculated as the percentage of statistically significant voxels [surviving a Bonferroni-corrected threshold of p = 0.05/number of total voxels] within each region of interest (ROI). This test is generally more sensitive to extended signals (Poline et al. 1997) expected in the current study.
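The spatial-extent measure can be illustrated with a short sketch. It assumes that each voxel in an ROI already has a p value from the voxel-wise test of γ and that the threshold is the Bonferroni-corrected 0.05 divided by the number of voxels in that ROI; the function name `spatial_extent` and the example p values are hypothetical and are not taken from the study pipeline.

```python
def spatial_extent(p_values, alpha=0.05):
    """Percentage of voxels in one ROI that survive Bonferroni correction.

    p_values: one p value per voxel in the ROI (hypothetical input).
    alpha:    family-wise error rate, divided by the number of voxels.
    """
    n_voxels = len(p_values)
    if n_voxels == 0:
        raise ValueError("ROI contains no voxels")
    threshold = alpha / n_voxels  # Bonferroni-corrected per-voxel threshold
    n_significant = sum(1 for p in p_values if p < threshold)
    return 100.0 * n_significant / n_voxels


# Illustrative ROI with five voxels: the threshold is 0.05 / 5 = 0.01,
# so three of the five voxels count as significant.
print(spatial_extent([0.001, 0.02, 0.009, 0.5, 0.0001]))  # 60.0
```

Because the threshold tightens as the ROI grows, the measure describes how widespread the displacement is within an ROI rather than how strong it is at any single voxel.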
Probabilistic reinforcement learning task
In six learning blocks, participants were presented with three pairs of stimuli and required to select the better one with a button press, after which they received feedback: a smile and 5 Euro cents for the correct choice and a frown and a loss of 5 cents for the incorrect one. The probabilities of reinforcement were 90 : 10, 80 : 20, and 70 : 30. A different set of items was presented in each block. Performance was quantified as the total amount of money won, and the proportion of correct choices (choices of the more frequently rewarded stimulus over its pair) across all blocks.
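The feedback rule and the two performance measures can be sketched as follows. This is a minimal illustration under stated assumptions, not the task software used in the study: it assumes that the ratios 90 : 10, 80 : 20, and 70 : 30 give the probability of positive feedback for the better and the worse stimulus of each pair, and the pair labels and the functions `feedback` and `score` are hypothetical.

```python
import random

REWARD_CENTS = 5  # gain or loss per trial, as described in the text

# Assumed P(positive feedback) for the (better, worse) stimulus of each pair.
PAIRS = {
    "pair_90_10": (0.90, 0.10),
    "pair_80_20": (0.80, 0.20),
    "pair_70_30": (0.70, 0.30),
}


def feedback(pair, chose_better, rng):
    """Return +5 or -5 cents for one choice, drawn probabilistically."""
    p_better, p_worse = PAIRS[pair]
    p_win = p_better if chose_better else p_worse
    return REWARD_CENTS if rng.random() < p_win else -REWARD_CENTS


def score(trials):
    """trials: list of (chose_better, outcome_cents) tuples.

    Returns total winnings in cents and the proportion of choices of the
    more frequently rewarded stimulus.
    """
    total_cents = sum(outcome for _, outcome in trials)
    prop_correct = sum(1 for chose_better, _ in trials if chose_better) / len(trials)
    return total_cents, prop_correct


# Simulate a hypothetical participant who picks the better stimulus on
# roughly 80% of 60 trials of the 80 : 20 pair.
rng = random.Random(0)
trials = []
for _ in range(60):
    chose_better = rng.random() < 0.8
    trials.append((chose_better, feedback("pair_80_20", chose_better, rng)))
print(score(trials))
```

The two measures can diverge: because feedback is probabilistic, a participant who always chooses the better stimulus can still lose money on individual trials, which is why both are reported.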
Ecological momentary assessments
In order to monitor individual daily-life experience and behavior, participants received an electronic portable device (PsyMate®) (Myin-Germeys et al. 2011) programmed to beep 10 times per day at unpredictable moments for 6 days. Each beep was a prompt to fill out a brief questionnaire on the level of engagement in an activity (‘I am engaged in an activity’) and the enjoyment of the current activity (‘I enjoy this activity’), both rated on a seven-point Likert scale (from 1 = not at all to 7 = very much). We conceptualize daily-life reward-oriented behavior as the odds that enjoyment of active engagement in a behavior will increase active engagement at the next measurement moment [detailed EMA procedures are presented elsewhere (Kasanova et al. 2017)].
Statistical analyses
The second-level data analyses were performed in STATA 11.2. To test for group differences in any of the demographics, clinical measures, task performance, and reward-induced DA release parameters, these were entered as separate outcome variables, with group (HV, FDR) as the predictor in regression analyses. Gender was the outcome of a logistic regression with group as the predictor.
To test group differences in the association between reward-induced tracer displacement in all ROIs and performance on the PSST, regression analyses were conducted with total winnings/proportion of correct choices as the outcome variable, and the spatial extent/magnitude of reward-induced tracer displacement in each ROI, group, and their interaction as the predictors.
Lagged multilevel regression analyses were performed on the EMA data, with level of active engagement at t0 as the outcome, and level of active engagement at t − 1 × enjoyment of activities at t − 1 × group as its predictor. To assess group differences in the modulatory effect of reward-induced DA release on daily-life reward-oriented behavior, extent/magnitude of DA release per ROI was entered as an additional predictor. All analyses were controlled for age, gender, and IQ.
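To make the lagged predictor concrete, the sketch below builds the t − 1 terms from raw EMA records. It is an illustration only: the record fields, the function `lagged_rows`, and the choices to lag only within the same day and only to the immediately preceding beep are assumptions, since the detailed EMA procedures are reported elsewhere. The group term and the covariates would then be added when the multilevel model is fitted.

```python
def lagged_rows(records):
    """Pair each beep with the preceding beep of the same subject and day.

    records: list of dicts with keys subject, day, beep, engaged, enjoy
             (ratings on the 1-7 scale); all field names are hypothetical.
    Returns one row per beep whose immediately preceding beep was answered.
    """
    by_key = {(r["subject"], r["day"], r["beep"]): r for r in records}
    rows = []
    for subject, day, beep in sorted(by_key):
        prev = by_key.get((subject, day, beep - 1))
        if prev is None:
            continue  # first beep of the day or a missed beep: no lag available
        now = by_key[(subject, day, beep)]
        rows.append({
            "subject": subject,
            "engaged_t0": now["engaged"],        # outcome
            "engaged_lag": prev["engaged"],      # engagement at t - 1
            "enjoy_lag": prev["enjoy"],          # enjoyment at t - 1
            "engaged_x_enjoy_lag": prev["engaged"] * prev["enjoy"],
        })
    return rows


# Hypothetical records: beep 3 was missed, so only beep 2 yields a lagged row.
example = [
    {"subject": "s1", "day": 1, "beep": 1, "engaged": 5, "enjoy": 6},
    {"subject": "s1", "day": 1, "beep": 2, "engaged": 6, "enjoy": 4},
    {"subject": "s1", "day": 1, "beep": 4, "engaged": 3, "enjoy": 2},
]
print(lagged_rows(example))
```

Restricting the lag to the same day keeps overnight gaps from being treated as consecutive measurement moments; whether the original analysis did the same is not stated here.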
Results
There were no group differences in any of the demographic variables; in subclinical positive, negative, or depressive symptoms; in performance on the task (Table 1); or in daily-life reward-oriented behavior (p = 0.100, β = 0.06, z = 1.64).
Table 1. Group average and comparison on demographics, psychopathology symptoms, PSST performance, EMA, spatial extent, and amplitude of reward-induced tracer displacement (i.e. DA release) per ROI
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20200311170536323-0204:S0033291717003476:S0033291717003476_tab1.gif?pub-status=live)
M, mean; s.d., standard deviation; p, p value; B, β-coefficient; t, t-statistic; R, right; L, left.
Akin to HV reported previously (Kasanova et al. 2017), FDR showed significant reward-induced DA release, as well as its spatial extent, in all ROIs, except for the right putamen (Table 1, Fig. 1). Importantly, there was no group difference in the amplitude of reward-induced DA release or in its spatial extent in any of the ROIs (all p > 0.05, Table 1, Fig. 1).
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20200311170536323-0204:S0033291717003476:S0033291717003476_fig1g.jpeg?pub-status=live)
Fig. 1. Striatal reward-induced dopaminergic release in controls and relatives. Average statistical parametric Z-map per group (controls and relatives) of γ representing the striatal DA release induced by the reward learning task shown in transverse, coronal, and sagittal sections overlaid on a T1-weighted MRI template. The images visualize the comparable reward-induced [18F]fallypride displacement in the caudate nucleus and ventral striatum of controls and relatives of individuals with psychosis. The top bar visualizes the masks used to delineate the ROIs.
As shown in Table 2, in both groups alike, a greater extent of reward-induced DA release in the bilateral caudate, putamen, and right VST was associated with better performance on the task. The amplitude of reward-induced DA release in the right VST was significantly associated with higher winnings, and also with the proportion of correct choices in the task (Table 2, Fig. 2).
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20200311170536323-0204:S0033291717003476:S0033291717003476_fig2g.gif?pub-status=live)
Fig. 2. Associations between reward-induced [18F]fallypride displacement (i.e. DA release) in bilateral caudate nucleus, putamen, and ventral striatum and performance on the reward task. The x-axis represents performance on the probabilistic learning task as the proportion of correct choices of the more frequently rewarded stimulus over its alternative. The y-axis shows the spatial extent of reward-induced DA release as the percentage of total voxels surpassing the Bonferroni-corrected threshold of statistically significant activation. The strength of the associations (R²) and statistical significance level (p) between performance on the task and spatial extent of DA release are collapsed across groups.
Table 2. Associations between reward-induced [18F]fallypride displacement and its spatial extent (i.e. DA release) in all ROIs and reward learning, as measured by the probabilistic stimulus selection task, and daily-life reward-oriented behavior captured by the ecological momentary assessments
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20200311170536323-0204:S0033291717003476:S0033291717003476_tab2.gif?pub-status=live)
R, right; L, left; VST, ventral striatum; p, p value; B, β-coefficient; t, t-statistic; z, z-statistic.
Likewise, more extensive reward-induced DA release in all ROIs was associated with greater propensity for reward-oriented behavior in both groups (Table 2). The amplitude of reward-induced DA release was not significantly associated with reward-oriented behavior in any of the ROIs.
Additionally, reward-oriented behavior in daily life was related to higher winnings (p = 0.007, β = 0.018, z = 2.71) and a higher proportion of correct choices in the task (p = 0.015, β = 0.0024, z = 2.44).
Discussion
We detected unaltered striatal DAergic modulation of responsiveness to experimental and naturalistic rewards in FDR, contradicting the hypothesized DAergic basis for reward dysfunction in liability to psychosis. Collectively, these results, along with findings of normative behavioral performance on the task, intact reward-oriented behavior in everyday life, and the absence of (subclinical) positive, negative, and depressive symptoms in FDR, provide an initial hint at adequate reward function in the FDR group.
The findings of adequate reward learning add to the growing evidence for intact capacity to acquire reward contingencies in FDR (Grimm et al. 2014; Hanssen et al. 2015), and the finding of intact DAergic modulation of this process aligns with all existing, albeit sparse, fMRI studies reporting normal (Hanssen et al. 2015) or supranormal (de Leeuw et al. 2015) striatal BOLD signal to reward feedback in this group. Complementary insight can also be derived from a pharmacological manipulation study using DA-enhancing v. DA-receptor-blocking agents in healthy individuals performing a similar task (Pessiglione et al. 2006). In complete agreement with our findings, pharmacologically increased DA levels were associated with better performance on the task and greater VST activation to prediction errors (Pessiglione et al. 2006), corroborating the essential role of striatal DAergic modulation of reward learning detected in our FDR and HV samples.
Computational modeling has shown that during development, adult learners gradually add goal-directed behavior to their more habitual reinforcement learning repertoire (Decker et al. 2016). Goal-directed control of reinforcement learning, rather than habit, was found to correlate with presynaptic VST DA levels (Deserno et al. 2015), suggesting that the participants in our study might have employed this strategy during the learning task. Another [18F]DOPA study also linked presynaptic VST DA to fluid intelligence, indicating its relevance for real-world functioning.
Furthermore, the current study lends crucial ecological relevance to these findings by suggesting, for the first time, a relationship between reward-induced striatal DA release and reward-driven behavior in everyday life, and a preservation of this mechanism in FDR. These results should, however, be interpreted with due consideration of the limitations of the study. Firstly, the assumptions of the model used to analyze the PET imaging data require that the control condition is always followed by the experimental condition. This could affect the results if the FDR were more sensitive to fatigue than the HV, but this is unlikely, considering that FDR matched HV in psychopathology, performance, IQ, and reward-induced DA release. The other side of this argument is that an FDR group composed of high-functioning adults might not be fully representative of the entire population of FDR. Although we attempted to minimize any selection bias by recruiting all participants via the media, these individuals might have been on the higher end of the spectrum of mental resilience and clinical health. Nevertheless, this also holds true for the controls, giving us solid reasons to believe that the group composition and comparison were justified and the findings merited.
Finally, given that this study does not include a group with a full-blown psychotic disorder, our findings and conclusions are restricted to FDR. Complementary neuromolecular studies in individuals further along the psychosis continuum are warranted to elucidate the precise nature of the putative deviation from optimal modulation of reward function at the synapse in order to facilitate the design of rational strategies to intervene at this level.
Acknowledgements
The authors thank Rayyan Tutunji, Nele Soons, Siamak Mohammadkhani Shali, Ye Rong, Oliver Winz, Wendy Beuken, Bernward Oedekoven, and Ron Mengelers. This work was supported by an ERC consolidator grant (ERC-2012-StG, project 309767-INTERACT) and an FWO Odysseus grant to Inez Myin-Germeys, a Post-doctoral Mandate of the KU Leuven to Zuzana Kasanova, and an FWO (Research Foundation Flanders) postdoctoral fellowship to Dr Jenny Ceccarini.
Conflict of Interest
None.
Ethical Standards
The authors assert that all procedures contributing to this work comply with the ethical standards of the relevant national and institutional committees on human experimentation and with the Helsinki Declaration of 1975, as revised in 2008.