Background
One outstanding feature of obsessive–compulsive disorder (OCD) is the repetitive nature of the compulsive actions that are performed by people suffering from the disorder. Patients report that they cannot feel safe until they have repeatedly performed the compulsive acts according to strict routines (Schwartz, 1999). It seems as if OCD patients cannot use the feedback they receive about the results of their actions to update their action goals. Several researchers therefore proposed the hypothesis that OCD symptoms are due to failures in processing action-related feedback (Pitman, 1987; Otto, 1992; Bohne et al. 2005; Olley et al. 2007). Such a failure would compromise recognition that the goal of an action was achieved, thereby leading to doubt and the urge to repeat actions, instead of being able to plan new action goals (Szechtman & Woody, 2004).
Neurobiological evidence supports this hypothesis. Frontal–subcortical circuits, underlying the learning of new responses on the basis of their outcome, including the orbitofrontal cortex (OFC) and anterior cingulate cortex (ACC), show deviant activity in OCD patients compared to normal controls at rest and during learning tasks (Aouizerate et al. 2004; Remijnse et al. 2006). When OCD patients make errors in cognitive tasks, they show enhanced event-related brain potentials (ERPs) originating in the ACC (Gehring et al. 2000; Endrass et al. 2008) and increased functional magnetic resonance imaging (fMRI) blood oxygen level-dependent (BOLD) activity in the ACC, compared to controls (Fitzgerald et al. 2005).
In a recent review of cognitive dysfunction in OCD (Olley et al. 2007), one hypothesis proposed was that OCD patients have a selective deficit in learning new task rules on the basis of external feedback. This could explain why OCD patients generally perform poorly on set-shifting tasks such as the Wisconsin Card Sorting Test (WCST) and the Intra-/Extradimensional (IED) set-shifting task but normally on most other neuropsychological tests. More direct support for this hypothesis comes from reports of a selective deficit of OCD patients in associative learning tasks, relative to non-OCD anxiety patients (Leplow et al. 2002) and normal controls (Murphy et al. 2004). In these tasks, participants learn arbitrary associations between stimuli on the basis of trial-and-error and verbal feedback. Leplow, Murphy and colleagues explained their finding in terms of increased attention to the feedback stimuli and increased behavioral inhibition, both provoked by amplified error-detection signals. In light of the foregoing evidence, however, a deficit in the ability to use the external feedback to update performance is a plausible alternative explanation. Overall, the evidence suggests that in tasks in which the stimulus–response rules are fixed and practiced in advance, performance is normal (although the error-related ERP is still enhanced in OCD, suggesting that patients feel disproportionately ‘error prone’; Gehring et al. 2000), and that task performance deteriorates in OCD only in tasks in which the stimulus–response rule first has to be learned on the basis of external feedback.
The aim of the present study was to investigate this hypothesis further. We used an associative learning task to test predictions from reinforcement learning theory (e.g. Sutton & Barto, 1998; Braver & Cohen, 2000; Passingham et al. 2000; Holroyd & Coles, 2002). In this theory, learning is based on the difference (called the ‘error signal’) between expected and actual outcome of an action, and takes place in different ways at different times. Learning new stimulus–response rules is initially dominated by error signals elicited by the external feedback stimuli provided after a response. Later, as the new stimulus–response associations are learned, it is increasingly based on internal error signals (e.g. corollary discharge signals) elicited by the associated response. Parallel to this change in error processing, the ACC and frontal cortex show enhanced activation during the initial learning phase, and a decline in activation during later learning phases (Toni & Passingham, 1999). Therefore, if OCD patients have a selective deficit in external feedback processing, they should show this deficit during initial (externally based) learning in a reinforcement learning task, but not during later more internally driven learning. In the studies of Murphy et al. (2004) and Leplow et al. (2002), only total error scores were analyzed. In the present study we therefore analyzed the learning curves of the hit rates in a number of consecutive task blocks, so as to analyze feedback-based learning separately from response-based learning.
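The notion of an error signal can be illustrated with a minimal delta-rule sketch. This is not the model used in the present study or in the cited work; the function name, the learning rate `alpha` and the coding of feedback as 1.0 (correct) are assumptions chosen only for illustration.

```python
def update_expectation(expected, outcome, alpha=0.3):
    """Return (new_expected, error) after one feedback event.

    error = outcome - expected is the 'error signal'.
    alpha is a hypothetical learning rate between 0 and 1.
    """
    error = outcome - expected
    return expected + alpha * error, error


# Start with no expectation for a stimulus-response pair and
# receive positive feedback (coded 1.0) on five consecutive trials.
expected = 0.0
errors = []
for _ in range(5):
    expected, error = update_expectation(expected, outcome=1.0)
    errors.append(error)
```

In this sketch the error shrinks on every trial as the expectation approaches the outcome, which mirrors the idea that external feedback carries most of its information early in learning.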
Actions can be reinforced by providing feedback stimuli that reward correct actions and punish incorrect actions. Although the anxiety-related phenomenology of clinical OCD symptoms (Olley et al. 2007) and hyperactive error monitoring in OCD (Gehring et al. 2000) suggest that learning by punishment would be more affected in patients than learning by reward, Remijnse et al. (2006) found a reverse pattern in set-reversal performance and OFC activation. The performance and OFC activation of patients were normal under punishment but abnormal under reward conditions. To obtain further evidence on whether feedback-based learning by OCD patients in associative learning tasks is influenced by the affective value of the feedback, we used two learning conditions in which the feedback was accompanied by either a monetary reward or a monetary penalty. O'Doherty et al. (2001) showed that both types of learning activate the OFC.
In sum, the current study investigated feedback processing in OCD with an associative learning task. In this task, participants learn new stimulus–response combinations, initially on a trial-and-error basis, and later, as learning from feedback proceeds, on the basis of fixed rules. If OCD patients fail to update performance based on external feedback more often than controls, they would initially be slower in the acquisition of the stimulus–response associations. Furthermore, if OCD patients are impaired in processing the affective meaning of feedback (monetary gain or loss), they may show worse punishment-based learning than reward-based learning (Olley et al. 2007), the reverse pattern (Remijnse et al. 2006), or equal impairment of both types of learning.
Method
Participants
The study was approved by the local ethics committee of the University Hospital Groningen. Twenty-nine patients (20 women) who met DSM-IV (APA, 1994) criteria for OCD participated in the study. These criteria were confirmed in a diagnostic interview by an experienced clinician. Severity of obsessive and compulsive symptoms was assessed with the Yale–Brown Obsessive–Compulsive Scale (YBOCS; Goodman et al. 1989). Healthy adults (n=28, 20 women) were recruited by advertisements and were paid for participation. They were matched to the patient group on age, sex and estimated IQ (short version of the Raven Progressive Matrices, parts B, C and D). To record depressive and anxiety symptoms, participants completed the Hamilton Depression Rating Scale (HAMD) and the Hamilton Anxiety Rating Scale (HAMA) (Hamilton, 1960). All patients scored below the HAMD cut-off indicating major depression (17), and patients who fulfilled criteria for major depression in the clinical interview were excluded. All subjects provided written informed consent after the study procedure had been explained to them. Exclusion criteria for all participants were medical or neurological illness, alcohol or substance abuse (by interview), and a score >16 on the HAMD. Seventeen OCD patients were taking psychotropic medication at the time of the study: 11 used selective serotonergic reuptake inhibitors (SSRIs), five the antidepressant clomipramine, and one used beta-blockers. One healthy participant was removed from the analyses because of extremely low accuracy scores in the reward condition.
As reliable subtypes of OCD that differ not only on clinical variables but also in neuroimaging effects and treatment response (Mataix-Cols et al. 2005) have been defined, we performed separate analyses on the subgroup of patients (n=23) having obsessive/compulsive symptoms and no contamination/cleaning symptoms (the ‘checkers’), as indicated by the YBOCS. Six patients who mainly had contamination/cleaning symptoms and only some obsessive/compulsive symptoms (the ‘washers’) were excluded from this analysis. The ‘checkers’ had no other critical symptom YBOCS scores.
Experimental task
We used a two-choice associative learning task, adapted from Iaboni et al. (1995). It required subjects to learn to associate five different two-digit numbers with a go response (pressing the space bar) and five other numbers with a no-go response (not pressing the space bar). The 10 two-digit numbers were drawn randomly (range 10–90) and were presented, one at a time in random order, for 2000 ms on a black computer screen. During this interval, subjects were required to press the space bar or not. Feedback on the response was provided 3500 ms after stimulus presentation. In case of a correct response ‘YOU WIN!’ was displayed and in case of an error ‘YOU LOSE!’. In the reward condition, 10 cents were added for every correct response and subjects lost 0 cents for an incorrect response. In the punishment condition, subjects started with 200 cents, lost 10 cents for every error, and won 0 cents for a correct response. The monetary gain of the response (either +10, 0 or −10 cents) and the total amount of money gained so far accompanied the feedback message.
This stimulus–response–feedback cycle was repeated 50 times in one task block. A pilot revealed that participants reached a ceiling performance with a response rule after five consecutive task blocks. The two feedback conditions were therefore run in two sessions, each consisting of five task blocks with 50 trials each. Between blocks there was a short break of 30 s. The 10 relevant numbers in the first session were not used in the second session. Order of condition (‘reward’ or ‘punishment’) was counterbalanced between subjects. The task was implemented with Micro Experimental Laboratory (MEL) software version 2.01 (Psychology Software Tools Inc., Pittsburgh, PA, USA).
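The payoff scheme and block structure described above can be sketched as follows. This is an illustrative reconstruction, not the original MEL program: all function names are hypothetical, stimulus timing is not modeled, and sampling stimuli with replacement within a block is an assumption, because the text only states that the numbers appeared in random order.

```python
import random

REWARD, PUNISHMENT = "reward", "punishment"


def make_stimuli(rng):
    """Draw 10 distinct numbers in 10-90 and split them 5 go / 5 no-go."""
    numbers = rng.sample(range(10, 91), 10)
    return numbers[:5], numbers[5:]


def feedback(condition, correct):
    """Return (message, cents) for one trial under the stated payoff scheme."""
    message = "YOU WIN!" if correct else "YOU LOSE!"
    if condition == REWARD:
        cents = 10 if correct else 0
    elif condition == PUNISHMENT:
        cents = 0 if correct else -10
    else:
        raise ValueError(f"unknown condition: {condition!r}")
    return message, cents


def run_block(condition, go, nogo, respond, balance, rng, n_trials=50):
    """Run one block of trials; return (hit_rate, new_balance).

    respond(number) -> True if the simulated subject presses the space bar.
    Assumption: each trial samples one of the 10 numbers with replacement.
    """
    stimuli = go + nogo
    n_correct = 0
    for _ in range(n_trials):
        number = rng.choice(stimuli)
        pressed = respond(number)
        correct = pressed == (number in go)  # hit or correct rejection
        _, cents = feedback(condition, correct)
        balance += cents
        n_correct += correct
    return n_correct / n_trials, balance


rng = random.Random(0)
go, nogo = make_stimuli(rng)
# A simulated subject who already knows the rule, in the reward condition.
perfect = run_block(REWARD, go, nogo, lambda n: n in go, balance=0, rng=rng)
# A simulated subject who always presses, in the punishment condition.
guess_rate, guess_balance = run_block(
    PUNISHMENT, go, nogo, lambda n: True, balance=200, rng=rng
)
```

The starting balance of 0 cents in the reward condition is also an assumption; the text gives a starting amount (200 cents) only for the punishment condition.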
Procedure
Before the experiment started, participants completed the Raven intelligence test and the HAMD and HAMA questionnaires. After explanation of the task, a short practice session (30 trials) was run. Subjects were told that they could keep the total amount of money gained in the task, urging them to do their best. After completion of the tasks, all subjects received the same net amount of money (approximately 11 euros) for participating in the experiment. In all cases, this exceeded the actual amount won in the task. The experiment lasted 2 h, including a 15-min break.
Data analyses
Data were analyzed using SPSS (Nie et al. 1986). Demographic and clinical characteristics of OCD patients and healthy controls were compared with t tests or the χ2 test. Initial inspection of the data showed that there was a strong practice effect from session 1 to session 2, so we included the factor condition-order in the main design. Repeated-measures general linear model (GLM) ANOVAs were used to analyze accuracy and reaction times (RTs). Post-hoc paired or unpaired t tests (two-tailed) were used to examine the nature of potential interaction effects. In each of the two feedback conditions we computed the learning curve, consisting of the increase in hit rate across the five consecutive task blocks.
Pearson product moment correlation coefficients were computed to estimate correlations between clinical characteristics of OCD patients and task performance. We used p<0.05 as the criterion significance level.
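The learning-curve measure used in these analyses can be sketched as below. The trial record layout `(block, is_go, pressed)` and the function name are assumptions made for illustration; a trial counts as correct when a go stimulus was answered with a press (hit) or a no-go stimulus was answered without one (correct rejection).

```python
def block_hit_rates(trials, n_blocks=5):
    """Return the proportion of correct trials per block.

    trials: iterable of (block, is_go, pressed) tuples, block in 1..n_blocks.
    """
    correct = [0] * n_blocks
    total = [0] * n_blocks
    for block, is_go, pressed in trials:
        total[block - 1] += 1
        if is_go == pressed:  # hit or correct rejection
            correct[block - 1] += 1
    # Blocks without trials yield NaN instead of raising ZeroDivisionError.
    return [c / t if t else float("nan") for c, t in zip(correct, total)]


# Toy records invented for illustration (two blocks, two trials each).
toy_trials = [
    (1, True, True),    # hit
    (1, False, True),   # false alarm
    (2, True, True),    # hit
    (2, False, False),  # correct rejection
]
toy_curve = block_hit_rates(toy_trials, n_blocks=2)
```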
Results
Demographic and clinical characteristics
Patients did not differ from normal controls (NC) with respect to age (p=0.315) and intelligence (p=0.557), but differed clearly in levels of depression and anxiety (p<0.05; see Table 1). Mean total YBOCS score for OCD patients was 20.4±5.03, with a mean score of 9.79±2.96 for the obsession subscale and 10.62±3.22 for the compulsion subscale. The OCD checkers subgroup had equivalent scores on the YBOCS, HAMD and HAMA. There were no demographic or clinical differences between medicated and unmedicated patients. In the OCD group, YBOCS obsession scores were positively correlated with HAMD depression scores (r=0.40, p<0.034). The YBOCS compulsion and total scores were not significantly correlated with HAMD and HAMA scores. HAMD and HAMA scores were strongly correlated (r=0.699, p<0.002).
Table 1 abbreviations: YBOCS, Yale–Brown Obsessive–Compulsive Scale; HAMD, Hamilton Depression Rating Scale; HAMA, Hamilton Anxiety Rating Scale; n, sample size; NC, normal controls; OCD, obsessive–compulsive disorder; OCD-Chk, OCD checkers subgroup.
Correlations between clinical symptoms and absolute task performance in separate task blocks were not significant. Correlations between clinical symptoms (YBOCS total, YBOCS obsessions, YBOCS compulsions) and learning rate between blocks (the difference in accuracy between every pair of two consecutive blocks of the task) showed that higher YBOCS (compulsion and total) scores were significantly associated with lower learning rates between blocks 2 and 3 (r=−0.487, p<0.007 and r=−0.403, p<0.03) only (all other p>0.05). The HAMD scores were not associated with learning rates (all p>0.16). Higher HAMA scores were significantly associated with higher learning rates between blocks 4 and 5 (p<0.049). Independent samples t tests showed that medicated and unmedicated patients did not differ in learning rates [all t(27)<1.88, all p>0.077].
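The learning-rate measure and its correlation with symptom scores can be sketched as follows. The function names are hypothetical and the analyses reported here were run in SPSS, so this only illustrates the arithmetic: the learning rate is the difference in accuracy between two consecutive blocks, and the correlation is the Pearson product moment coefficient.

```python
import math


def learning_rates(hit_rates):
    """Differences in accuracy between each pair of consecutive blocks."""
    return [later - earlier for earlier, later in zip(hit_rates, hit_rates[1:])]


def pearson_r(x, y):
    """Pearson product moment correlation between two equal-length lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Toy hit rates for one hypothetical participant across five blocks.
rates = learning_rates([0.50, 0.60, 0.80, 0.90, 0.95])
```

With one learning rate per patient for a given pair of blocks, `pearson_r(symptom_scores, block_2_to_3_rates)` would give the kind of coefficient reported above; both argument names are placeholders.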
Learning performance: accuracy
Hit rates, consisting of the sum of the hits (i.e. correct button presses) and correct rejections (correct no-go responses), were computed for each of the five blocks of the two conditions. These were subjected to a 2×5×2×2 ANOVA with feedback type and block as within-subject factors and condition order and group as between-subject factors. As Fig. 1 shows, hit rates increased as learning progressed from block 1 to block 5 [main effect of blocks, F(4, 50)=126.22, p<0.001, η2=0.910], patients overall produced lower hit rates than controls [main effect of group, F(1, 53)=10.31, p<0.002, η2=0.163], and the difference in hit rates between controls and patients was larger in early blocks than in later blocks (group×blocks interaction [F(4, 50)=2.76, p<0.038, η2=0.181]). Independent samples t tests (two-tailed), carried out for each block, showed that patients only had significantly lower hit rates than controls in blocks 1 and 2 of both feedback conditions [reward block 1: t(55)=3.79, p<0.001; reward block 2: t(55)=3.35, p<0.001; punishment block 1: t(55)=2.23, p<0.030; punishment block 2: t(55)=2.47, p<0.016].
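The text does not state which variant of η2 was computed. The reported values are, however, numerically consistent with partial eta squared recovered from the F ratio and its degrees of freedom, as the following sketch shows; the function name is hypothetical and the interpretation as partial eta squared is an assumption.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared from an F ratio and its degrees of freedom.

    eta_p^2 = (F * df_effect) / (F * df_effect + df_error)
    """
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Effects reported for the accuracy analysis.
eta_blocks = partial_eta_squared(126.22, 4, 50)      # main effect of blocks
eta_group = partial_eta_squared(10.31, 1, 53)        # main effect of group
eta_interaction = partial_eta_squared(2.76, 4, 50)   # group x blocks
```

Rounded to three decimals these reproduce the reported 0.910, 0.163 and 0.181.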
The analysis further showed significant feedback type×condition order [F(1, 53)=7.57, p<0.008, η2=0.125] and feedback×block×order interactions [F(4, 50)=2.60, p<0.047, η2=0.172]. No other effect was significant. The feedback×order interaction indicates that feedback type modulated a repeated-session practice effect, with hit rates being higher in the second than in the first session (Fig. 2), especially in the reward condition (Fig. 3). The three-way feedback×block×order interaction indicates that the feedback×order effect differed as a function of learning progression within a session.
Analyses of the within-subjects contrasts revealed significant quadratic components for this feedback×block×order and a feedback×block×order×group interaction [both F(1, 53)=4.72, p<0.034, η2=0.082]. This indicates that (1) feedback type influenced hit rates in the middle blocks more than in blocks 1 and 5, but only in session 2, and (2) this effect was larger in the patients than in the controls (Fig. 3).
To evaluate learning deficits in a phenomenologically more consistent OCD subtype, we analyzed the data of relatively pure checkers only (n=23) compared to the controls. This analysis showed essentially the same results as for the whole patient group, except for one finding: the factor feedback condition now produced a significant main effect [F(1, 47)=5.20, p<0.027, η2=0.100], indicating that in the reward condition learning accuracy was better than in the punishment condition.
In summary, the results showed that (1) learning the stimulus–response associations in the task improved with repetition of the task, (2) OCD patients displayed lower hit rates than controls in blocks 1 and 2 but not in the later blocks, (3) accuracy was better overall in session 2, and this between-sessions practice effect was influenced by feedback type, more in blocks 2–4 than in blocks 1 and 5, (4) this block-specific practice effect was larger in patients than in controls, especially on the basis of reward, and (5) learning on the basis of reward was better than on the basis of punishment after excluding the washer OCD patients.
Learning performance: RTs
The RTs were analyzed with the same factors as the hit rates in the GLM ANOVAs. As Fig. 4 shows, RTs decreased as learning progressed from block 1 to block 5 [F(4, 50)=18.39, p<0.0005, η2=0.595]. Patients produced slower RTs than controls [F(1, 53)=11.99, p<0.001, η2=0.184]. The group×blocks interaction was not significant. Patients had significantly slower RTs than controls in all blocks of both feedback conditions [t(55, max)=3.20, p<0.002; t(55, min)=2.15, p<0.036]. The only other significant effect was a group×order interaction [F(1, 53)=4.38, p<0.041, η2=0.076]. This indicates a larger improvement in response speed between sessions in the OCD group than in the control group. The within-subjects contrasts in the ANOVA revealed quadratic components of block×group×order [F(1, 53)=4.35, p<0.042, η2=0.076] and of feedback×block×order [F(1, 53)=5.93, p<0.018, η2=0.101]. Thus, as with the accuracy data, the RT data indicate that feedback type influenced a practice effect across sessions during the middle blocks (2–4). In RTs, however, the groups did not differ in feedback-type modulation of the practice effect, but only in its size. To evaluate the effect of feedback type in different learning phases, we performed block-wise paired-samples t tests. Only in block 2 [t(56)=2.33, p<0.024] and only in patients [t(28)=2.48, p<0.019; in controls: p>0.05] did reward increase response speed more than punishment.
We again analyzed the data of the checkers separately (n=23). This analysis showed the same results as with the whole patient group, except that the three-way feedback type×block×order interaction was now significant [F(4, 44)=2.62, p<0.048, η2=0.192], with a highly significant quadratic component in the within-subjects contrasts [F(1, 47)=8.68, p<0.005, η2=0.156]. This underlines that in the middle blocks (2–4), the task repetition effect was larger with reward than with punishment.
In sum, the results indicate that (1) learning the stimulus–response associations in the task increased response speed with repetition of the task, (2) the patients displayed slower RTs than controls in all blocks, (3) RTs were faster in session 2 than in session 1; this between-sessions practice effect was influenced by feedback type, mostly in the middle blocks, and (4) the practice effect in these middle blocks was larger in patients than in controls. Feedback-type modulation of the practice effect was significant when the six washers were not considered.
Discussion and conclusions
The aim of this study was to test the hypothesis that OCD patients have a selective impairment in using external feedback signals during reinforcement learning. We applied an associative learning task in which stimulus–response combinations were learned on the basis of the feedback given after an initial trial-and-error response. The hypothesis predicts lower hit rates in OCD patients compared to controls during externally driven learning in the first blocks of the task, and normal hit rates during internally driven learning in the final blocks (Passingham et al. 2000; Holroyd & Coles, 2002). In addition, we explored whether affective valence of external feedback (monetary gain or loss) would influence learning rate differently in OCD patients than in controls.
The results broadly supported the hypothesis. During performance of the five blocks, hit rates increased and RTs decreased in both controls and patients, indicating improvement in learning the stimulus–response associations in both groups. Compared to the controls, the patients displayed lower hit rates in the initial two blocks but not in the final blocks. Higher YBOCS (compulsion and total) scores were significantly associated with lower learning rates, but only between blocks 2 and 3. In addition, patients displayed longer RTs in all blocks of the task. These findings suggest that patients learned fewer stimulus–response associations and needed more time to select responses during the early, external feedback-driven phase of learning. As in Leplow et al. (2002), no effects were observed of medication, co-morbid anxiety and depression on learning rates.
These findings are consistent with studies showing that OCD patients exhibit a selective deficit in associative learning in the context of preserved performance on standard neuropsychological tests when compared to normal controls (Murphy et al. 2004) and non-OCD anxiety patients (Leplow et al. 2002). They are also consistent with the frequently observed selective impairment of OCD patients in learning new response rules in WCST and set-shifting tasks (Olley et al. 2007). Our results extend these studies by indicating the dynamics of these effects. Learning takes place in two distinct phases (e.g. Holroyd & Coles, 2002). During initial learning, responses are guessed and their outcomes are provided by external feedback signals. During later stages of learning, responses generate their own feedback signals through re-efference (or corollary discharge) systems in the brain, providing internally produced (expected) outcomes. If the actual outcome is worse or better than expected, an ‘adaptive critic’ (associated with basal ganglia function, see Holroyd & Coles, 2002) sends negative or positive error signals respectively to a response selection module (associated with ACC function, see Holroyd & Coles, 2002), which updates its stimulus-related strength accordingly. Our results suggest that in OCD this system is dysfunctional in response selection when it has to process external feedback signals during initial learning of a new task, but not when it has to process internally produced feedback signals during later stages. This may suggest that, in OCD, response selection driven by OFC signals is impaired whereas response selection driven by internal error signals is normal, consistent with the abnormal OFC and ACC activation found in OCD (Gehring et al. 2000; Aouizerate et al. 2004; Fitzgerald et al. 2005; Remijnse et al. 2006; Endrass et al. 2008).
The error signals generated by the learned response during later stages, however, may also not be normal in OCD. Although the block-wise t tests on hit rates showed no differences between the groups in the later blocks, RTs in those blocks were still slower in the OCD group, and electrophysiological evidence indicates that in OCD patients there may be normal task performance but still disrupted error signals (Gehring et al. 2000). Veale et al. (1996) and Purcell et al. (1998) found psychomotor slowing in OCD only in tasks with high response–outcome uncertainty (e.g. Tower of London), suggesting a non-generalized impairment.
An alternative view of cognitive impairment in OCD is that patients have problems with inhibiting intrusive thoughts and behaviors. This seems to be reflected in their symptoms and also in the higher perseverance of an old response rule when external feedback signals that a new rule has to be learned (in WCST and set-shifting tasks; for review see Chamberlain et al. 2005). Although this is an appealing view, the current evidence supports the view that cognitive impairment in OCD is not the result of impaired inhibition of old responses but the result of an impairment of learning new responses. Investigating the unlearning of old task rules in the WCST and other set-shifting tasks is confounded with the effects of learning new response rules. Underperformance of these tasks, therefore, does not discriminate between the two alternatives. The use of an associative learning task, as in the present study and the former studies by Leplow et al. (2002) and Murphy et al. (2004), circumvents this confounding because no unlearning of old rules is required. Furthermore, the psychomotor slowness only seen in OCD with uncertain outcome tasks and the slow RTs of patients in the present study are more consistent with decreased selective motor activation than with decreased motor inhibition. Clinical checking symptoms also seem to be rooted in problems with external feedback-based learning of new responses. Despite excessive checking by patients, the cognitive evidence they obtain does not alleviate the anxiety associated with not checking again. It may be that the uncertainty associated with not checking induces abnormally amplified negative error signals that remain despite performing repetitive checking behaviors.
When we compared a more diagnostically ‘pure’ group of patients (only patients with predominantly checking symptoms) with the controls, the factor effects generally became stronger, supporting the idea that OCD is a multi-dimensional disorder, probably with different neurocognitive impairments associated with different clinical subtypes (Mataix-Cols et al. 2005). This is also clearly a limitation of the generalizability of the present study to other OCD subtypes such as hoarding and contamination/washing.
A second aim was to explore the influence of the affective valence of external feedback signals on learning new responses in OCD. There was no evidence that learning new response rules on the basis of monetary reward or loss was different in patients than in controls. This is inconsistent with the findings of Remijnse et al. (2006) and Olley et al. (2007) but is consistent with the views that, in OCD, there is a primary dysfunction of the OFC–striatal circuits and that reward and punishment both activate this brain area (O'Doherty et al. 2001). Only in the second session did learning in patients increase more with reward than with punishment, also relative to the controls, and in both learning accuracy and response speed. This generalized learning effect is confounded, however, by potential effects of session 1 on the psychological state of the participants. For example, receiving reward may increase overall motivation to learn well and punishment may decrease motivation. Those participants receiving reward in the second session always received punishment in the first (and vice versa), so their increased hit rates in the second session may be due to increased motivation. Nevertheless, this effect was larger in OCD patients than in controls, raising the question why patients were more influenced by affective feedback, and making it worthwhile to further investigate this issue in future studies.
To summarize, we can draw several conclusions from our findings. The first is that, relative to controls, OCD patients are impaired in learning new behavior on the basis of external feedback signals but not on the basis of internal feedback signals. Only external feedback signals seem to result in distorted error signals to the response selection system. Second, OCD patients initially seem not to learn differently than normal controls as a function of the affective value of feedback. During later learning stages, however, patients normalize with reward and remain impaired with punishment.
Finally, understanding of this external feedback-processing deficit in OCD patients may help to improve cognitive behavioral therapies. For instance, emphasis may be given to reinforcement by internally generated feedback. In addition, future research could evaluate therapies by relating the longitudinal observations of the learning task and clinical symptoms with outcome variables and type of treatment.
Acknowledgments
We thank Femke Nijboer and Jacobien van Peer for their assistance in data collection. We are also indebted to Pieter Hoekstra for his help in the recruitment of OCD patients.
Declaration of Interest
None.