Clark's model emphasizes the processing of prediction error, and, in section 4.2, this is applied to an understanding of hallucinations, delusions, and schizophrenia. This commentary emphasizes three points related to these themes, with the overall goal of demonstrating that Clark's view, at present, does not provide a fully adequate heuristic for understanding psychotic phenomena.
Clark's theory emphasizes anti-Hebbian feedforward processing, in which correlated activity across neurons is suppressed, presumably because no deviation from what is expected is present, therefore allowing any signals related to deviation from what is expected (i.e., prediction error) to become relatively more salient. While this would appear to be a useful data-compression strategy for coding invariant background information, it does not account for cases in which it is precisely the correlation between stimulus elements that codes their object properties, thereby signaling stimulus significance. Numerous demonstrations exist (e.g., Kinoshita et al. 2009; Silverstein et al. 2009; Singer 1995) wherein increasing the correlation between an aspect of elements (e.g., stimulus orientation in contour integration paradigms) leads to increased signal strength. Of course, it is possible to argue, as Clark does, that this is due to a cancellation of the activity in error units and subsequent enhancement of the signal coding the contour or shape. However, it is not clear how these competing hypotheses could be pitted against each other in a definitive study.
Consistent with Clark's view, evidence exists that, for example, as random orientational jitter is applied to disconnected contour elements, increases in fMRI BOLD signal are observed (Silverstein et al. 2009). Clark's view is also consistent with Weber's (2002) view that much of our direct understanding of visual forms results from perception of “metamorphoses of geometry” or topological (isotopic) alterations of basic forms, a view consistent with evidence that topological invariants are the primitives to which our visual system responds most strongly (Chen 2005). However, it is also the case that compared to a non-informative background of randomly oriented Gabors, perception of a contour is associated with increased activity (Silverstein et al. 2009). Clarifying the extent to which these two forms of signal increase represent functioning of different circuits is an important task for future research. Until this is clarified, Clark's view appears to be most appropriate for understanding signaling of objects in the environment, as opposed to brain activity involved in creating representations of those objects. This is relevant for schizophrenia, as it is characterized by a breakdown in coordinating processes in perception and cognition (Phillips & Silverstein 2003; Silverstein & Keane 2011). A challenge for Clark's view is to account for these phenomena, which have been previously understood as reflecting a breakdown in Hebbian processing, and reduced self-organization at the local circuit level, involving reduced lateral (and re-entrant) excitation.
Clark notes that while perceptual anomalies alone will not typically lead to delusions, the perceptual and doxastic components should not be seen as independent. However, there are several syndromes (e.g., Charles Bonnet Syndrome, Dementia with Lewy Bodies, Parkinson's Disease Dementia) where visual hallucinations are prominent and delusions are typically absent (Santhouse et al. 2000). Moreover, it would appear to be difficult to explain the well-formed hallucinations characteristic of these syndromes as being due to prediction error, given their sometimes improbable content (e.g., very small people dressed in Victorian era attire), and apparent errors in size constancy (ffytche & Howard 1999; Geldmacher 2003) that argue against Bayes-optimal perception in these cases. There are also many cases of schizophrenia where delusions are present without hallucinations. Finally, while evidence of reduced binocular depth inversion illusions in schizophrenia (Keane et al., in press; Koethe et al. 2009) provides evidence, on the one hand, for a weakened influence of priors (or of the likelihood function) (Phillips 2012) on perception, this evidence also indicates more veridical perception of the environment. Therefore, these data suggest that, rather than prediction error signals being falsely generated and highly weighted (as Clark suggests), such signals appear not to be generated to a sufficient degree, resulting in a lack of top-down modulation, and bottom-up (but not error) signals being strengthened. Indeed, this is exactly what was demonstrated in recent studies using dynamic causal modeling of ERP and fMRI data from a hollow-mask perception task in people with schizophrenia (Dima et al. 2009; 2010). A developing impairment such as this would lead to subjective changes in the meaning of objects and the environment as a whole, and of the self – which, in turn, can spawn delusions (Mattusek 1987; Sass 1992; Uhlhaas & Mishara 2007), even though the delusional thoughts are unrelated to the likelihood functions and beliefs that existed prior to the onset of the delusion.
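The notion of a "weakened influence of priors" yielding more veridical perception can be illustrated with the textbook case of a Gaussian prior combined with a Gaussian likelihood, where the posterior mean is a precision-weighted average of the two. The sketch below is purely illustrative and is not drawn from Clark or from the cited studies: the function name, the numerical values, and the coding of a convex face as +1.0 and a concave mask as -1.0 are all assumptions chosen for the example.

```python
def posterior_mean(prior_mean, prior_precision, obs, obs_precision):
    """Posterior mean for a Gaussian prior combined with a Gaussian likelihood.

    Each source is weighted by its precision (inverse variance).
    """
    total_precision = prior_precision + obs_precision
    return (prior_precision * prior_mean + obs_precision * obs) / total_precision


# Hypothetical coding (an assumption for illustration only):
# +1.0 = "convex face", -1.0 = "concave mask".
PRIOR_EXPECTS_CONVEX = 1.0
SENSORY_SIGNALS_CONCAVE = -1.0

# Strong prior: the percept is pulled toward the expected convex face,
# i.e. the depth inversion illusion is experienced.
strong_prior = posterior_mean(PRIOR_EXPECTS_CONVEX, 4.0,
                              SENSORY_SIGNALS_CONCAVE, 1.0)

# Weakened prior: the percept follows the sensory evidence,
# i.e. the concave mask is perceived more veridically.
weak_prior = posterior_mean(PRIOR_EXPECTS_CONVEX, 0.25,
                            SENSORY_SIGNALS_CONCAVE, 1.0)

print(strong_prior, weak_prior)
```

With these assumed values, the strong-prior estimate falls on the "convex" side and the weak-prior estimate on the "concave" side. This captures only the weighting argument; it says nothing about whether the reduced top-down influence arises from the prior, the likelihood, or the circuitry that combines them.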
Finally, Clark's view of hallucinations is similar to many models of schizophrenia, in that it is based on computational considerations only. But, as noted, delusions often grow out of phenomenological changes and emotional reactions to these (see also Conrad 1958), and this cascade is typically ignored in computational models. It also must be noted that the delusions that patients develop are not about random events, but typically are framed in reference to the self, with appreciation of the statistical structure of the rest of the world being intact. Similarly, auditory hallucinations often involve negative comments about the self, and it has been suggested, due to the high prevalence of histories of childhood physical and sexual abuse in people with schizophrenia (Read et al. 2005), that voices are aspects of memory traces associated with the abuse experience that have been separated from other aspects of the memory trace due to hippocampal impairment secondary to chronic cortisol production (Read et al. 2001) (as opposed to being due to top-down expectancy driven processing). A purely computational theory of hallucinations and/or delusions is like a mathematical theory of music – it can explain aspects of it, but not why one piece of music creates a strong emotional response in one person yet not in another. Psychotic symptom formation must be understood within the context of personal vulnerability and emotional factors, and these are not well accounted for by a Bayesian view at present.