Gilead et al. propose an “ontology” of representation types, argue that this ontology captures “meaningful diversity in the representational substrates of the mind,” and criticize the architectural assumptions of predictive-coding models as “overly simplistic” (sect. 6, paras. 2 and 4, respectively). It is never entirely clear what the elements of this ontology are – a table would have helped – but the following all seem to be included: beliefs; desires; intentions; conditions of satisfaction; subjectively distinguishable objects; features and relations represented as modality-specific, multimodal, or categorical abstractions; episodes; “lemmas” defining words; semantic and temporal networks; hierarchies; “predicators” functioning as “mentalese” verbs; models; scripts; and simulations. The distinctions between these various entities are localized to Marr's computational level of analysis (sect. 1, para. 3); however, the critique of “sub-symbolic” architectures and the focus on a “layer of language-like mental representations” (sect. 5.2.3, para. 7) suggest an implementation-level analysis. This distinction is critical: few would deny that representations of different types, at different levels of abstraction, play different roles in cognition. Neuroimaging results demonstrating functional localization, for example, support functional but not architectural distinctions between types of representations.
An unstated assumption of this ontology appears to be that the structure of conscious experience is a reliable guide to the architecture of the neurocognitive system that implements this experience, including the structure of its representations. The “rich and intricate theoretical conceptualizations” that predictive-coding models are claimed to have ignored (Introduction, para. 6) are conceptualizations of the structure of a particular kind of experience, the experience of thinking. Hoffman (2018) and Hoffman et al. (2015) have argued on evolutionary grounds that perceptual experience is an “interface” onto the external world that supports the prediction of fitness consequences of actions but provides no reliable guide to the structure or dynamics – the architecture – of the external world. This argument can easily be inverted: conceptual experience is an interface that provides no reliable guide to the architecture of cognition. Just as humans have, in general, no need to know how computers work to use them effectively, humans have no need to know how their minds work to operate effectively in the world. A simplified folk “theory of mind” on the interface is good enough. Similar points have been made before, for example, by Chater (2018).
Relinquishing the assumption that the experience of cognition constrains the architecture of cognition is, we argue, the key to making significant progress in cognitive science. It enables asking: how is the experience of cognition produced as an output, and what inputs and inferential processes are needed to produce that output? Assuming the Church–Turing thesis, all computation is platform-independent: any collection of diverse representations can be generated, in principle, by any Turing-complete virtual machine. The central claim of artificial intelligence, often rendered just as “cognition is computation,” is actually that cognition is platform-independent. It remains far from obvious, however, how to implement cognition on any platform, including the human brain–body system. Nor is it obvious that understanding one implementation of cognition would provide useful hints toward understanding other implementations.
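The content of the platform-independence claim can be made explicit using the standard universality statement on which it rests; the following is a textbook formulation, not a result specific to any model discussed here. For any computable function \(f\) and any Turing-complete (universal) machine \(U\), there is a program \(p_f\) such that
\[
  U(p_f, x) = f(x) \quad \text{for every input } x .
\]
Whatever diverse representations a cognitive system generates, if generating them is a computation at all, then some program on any universal platform generates them; the open question is which program, and at what cost.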
The claim that cognition is scale-free is far stronger than platform independence: it is the claim that a single computational architecture works “all the way down” – describing every virtual machine at every useful level of analysis. Gilead et al. recognize this when they describe the theory of active inference, the dominant current scale-free proposal, as claiming that “the complexity of cognition can naturally arise from a canonical computation repeated across different layers of a single continuum of representational abstractness” (sect. 5.2.3, para. 3). The theory of active inference is scale-free because the free-energy principle on which it is based is scale-free (Friston 2013; Friston et al. 2015), with its underlying basis, the existence of Markov blankets, derivable from classical (Kuchling et al. 2019) and even quantum (Fields & Marcianò 2019) physics. We have proposed an alternative, category-theoretic, scale-free formulation in which inferential coherence is enforced by commutativity between within-scale and between-scale mappings (Fields & Glazebrook 2019); our proposal is in the spirit of Goguen's (1991) dictum, within computer science, that abstraction always corresponds to the construction of a category-theoretic “cocone” as a maximal representation of inferential coherence.
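As a minimal sketch of what such between-scale coherence amounts to (the notation here is generic category theory, not the specific construction of Fields & Glazebrook 2019): if \(f : A \to B\) is a within-scale map, \(\alpha_A : A \to A'\) and \(\alpha_B : B \to B'\) are the between-scale maps to the coarser scale, and \(f' : A' \to B'\) is the corresponding coarser-scale map, then coherence is the requirement that the resulting square commutes,
\[
  \alpha_B \circ f = f' \circ \alpha_A .
\]
A cocone over a diagram \(D\) is, similarly, an object \(C\) together with maps \(\mu_X : X \to C\), one for each object \(X\) of \(D\), such that \(\mu_B \circ f = \mu_A\) for every map \(f : A \to B\) in \(D\); on Goguen's reading, an abstraction just is such a \(C\), with the colimit, when it exists, as the universal (maximally coherent) choice.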
Scale-free models have the advantage of being rigorously testable at every experimentally accessible level of analysis, from basic physics and intracellular and cellular processes up to the whole-organism scale and beyond. At every level, they must specify explicitly what inputs are required and what outputs are produced; indeed, the role of the Markov blanket is to provide an explicit encoding of these inputs and outputs. This requirement for theoretical explicitness illuminates a key question that Gilead et al. appear to have missed: What is it about experiences of “mental travel,” whether in time or across social relations, that identifies them as such? What experientially distinguishes a memory from an imagined future? What distinguishes another's imagined thought from one's own? What, experientially, marks the distinctions between Gilead et al.'s “ontologically” distinct representations?
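The sense in which a Markov blanket encodes inputs and outputs can be stated precisely; the following is the standard conditional-independence formulation from the free-energy literature, not a new claim. Partitioning a system's states into internal states \(\mu\), external states \(\eta\), and blanket states \(b\) (sensory plus active states), the blanket screens internal states off from external states:
\[
  p(\mu, \eta \mid b) = p(\mu \mid b)\, p(\eta \mid b) .
\]
All influence of the world on internal dynamics, and of internal dynamics on the world, is therefore carried by blanket states; the questions above ask which blanket-crossing inputs mark an experience as memory, imagination, or another's thought.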
In scale-free models, such distinctions can only be made by scale-dependent inputs: the sources of the experienced “epistemic feelings” of reality, memory, and imagination that distinguish the functions of representations that may have the same “propositional” content. Hence, identifying these inputs is a crucially important theoretical and experimental task. Considerable progress has been made in understanding how inputs from the body and the external world are combined to locate the experienced self in the here and now (Craig 2010) and in describing these processes within a predictive-coding framework (Seth & Tsakiris 2018). The signals identifying memories, future projections, and thoughts of others are less well characterized, though it is clear that specific activities in rostral prefrontal cortex (Simons et al. 2017) and the insula-cingulate salience network (Uddin 2015) are involved. The disruption of these signals in pathology, and their potential for therapeutic modulation, for example with entheogens (Thomas et al. 2017), lend clinical urgency to their mechanistic understanding. Such understanding cannot be achieved if the distinctions these signals mark are simply taken as given.