
The Leabra architecture: Specialization without modularity

Published online by Cambridge University Press:  22 October 2010

Alexander A. Petrov
Affiliation:
Department of Psychology, Ohio State University, Columbus, OH 43210. apetrov@alexpetrov.com http://alexpetrov.com
David J. Jilk
Affiliation:
eCortex, Inc., Boulder, CO 80301. david.jilk@e-cortex.com http://www.e-cortex.com
Randall C. O'Reilly
Affiliation:
Department of Psychology and Neuroscience, University of Colorado, Boulder, CO 80309. Randy.OReilly@colorado.edu http://psych.colorado.edu/~oreilly

Abstract

The posterior cortex, hippocampus, and prefrontal cortex in the Leabra architecture are specialized in terms of various neural parameters, and thus in their predilections for learning and processing, but are domain-general in terms of cognitive functions such as face recognition. Also, these areas are not encapsulated and violate Fodorian criteria for modularity. Anderson's terminology obscures these important points, but we applaud his overall message.

Type
Open Peer Commentary
Copyright
Copyright © Cambridge University Press 2010

Anderson's target article adds to a growing literature (e.g., Mesulam 1990; Prinz 2006; Uttal 2001) that criticizes the recurring tendency to partition the brain into localized modules (e.g., Carruthers 2006; Tooby & Cosmides 1992). Ironically, Anderson's critique of modularity is steeped in modularist terms such as redeployment. We are sympathetic with the general thrust of Anderson's theory and find it very compatible with the Leabra tripartite architecture (O'Reilly 1998; O'Reilly & Munakata 2000). It seems that much of the controversy can be traced back to terminological confusion and false dichotomies. Our goal in this commentary is to dispel some of the confusion and clarify Leabra's position on modularity.

The target article is vague about the key term function. In his earlier work, Anderson follows Fodor (2000) in “the pragmatic definition of a (cognitive) function as whatever appears in one of the boxes in a psychologist's diagram of cognitive processing” (Anderson 2007c, p. 144). Although convenient for a meta-review of 1,469 fMRI experiments (Anderson 2007a; 2007c), this definition contributes little to terminological clarity. In particular, when we (Atallah et al. 2004, p. 253) wrote that “different brain areas clearly have some degree of specialized function,” we did not mean cognitive functions such as face recognition. What we meant is closest to what Anderson calls “cortical biases” or, following Bergeron (2007), “working.”

Specifically, the posterior cortex in Leabra specializes in slow interleaved learning that tends to develop overlapping distributed representations, which in turn promote similarity-based generalization. This computational capability can be used in a myriad of cognitive functions (O'Reilly & Munakata 2000). The hippocampus and the surrounding structures in the medial temporal lobe (MTL) specialize in rapid learning of sparse conjunctive representations that minimize interference (e.g., McClelland et al. 1995). The prefrontal cortex (PFC) specializes in sustained neural firing (e.g., Miller & Cohen 2001; O'Reilly 2006) and relies on dynamic gating from the basal ganglia (BG) to satisfy the conflicting demands of rapid updating of (relevant) information, on one hand, and robust maintenance in the face of new (and distracting) information, on the other (e.g., Atallah et al. 2004; O'Reilly & Frank 2006). Importantly, most (see Footnote 1) of this specialization arises from parametric variation of the same underlying substrate. The components of the Leabra architecture differ in their learning rates, the amount of lateral inhibition, and so on, but not in the nature of their processing units. Also, they are in constant, intensive interaction. Each high-level task engages all three components (O'Reilly et al. 1999; O'Reilly & Munakata 2000).
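The idea of specialization through parametric variation of one substrate can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the class `ComponentParams`, its field names, every numeric value, and the helper `k_winners` are hypothetical and are not taken from the Leabra software or from the cited papers. The sketch shows one unit type shared by all three components, with only the settings differing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ComponentParams:
    """Hypothetical parameter set; names and values are illustrative only."""
    learning_rate: float     # how quickly weights change with each experience
    active_fraction: float   # share of units left active after lateral inhibition
    sustained_firing: bool   # whether activity is actively maintained over time


# Same kind of processing unit everywhere; only the parameters differ.
# All numbers below are made up for illustration.
COMPONENTS = {
    "posterior_cortex": ComponentParams(0.01, 0.25, False),   # slow, overlapping
    "hippocampus": ComponentParams(0.5, 0.02, False),         # rapid, sparse
    "prefrontal_cortex": ComponentParams(0.01, 0.25, True),   # sustained firing
}


def k_winners(net_inputs, active_fraction):
    """Keep only the most excited units active.

    A crude stand-in for lateral inhibition: the top k units (k set by
    active_fraction) are switched on, the rest are switched off.
    """
    k = max(1, round(len(net_inputs) * active_fraction))
    threshold = sorted(net_inputs, reverse=True)[k - 1]
    return [1 if x >= threshold else 0 for x in net_inputs]
```

With these assumed settings, the same input pattern yields a sparse code under the hippocampus parameters and a more overlapping, distributed code under the posterior cortex parameters, even though the function applied to the units is identical.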

We now turn to the question of modularity. Here the terminology is relatively clear (e.g., Carruthers 2006; Fodor 1983; 2000; Prinz 2006; Samuels 2006). Fodor's (1983) foundational book identified nine criteria for modularity. We have space to discuss only domain specificity and encapsulation. These two are widely regarded as most central (Fodor 2000; Samuels 2006).

A system is domain-specific (as opposed to domain-general) when it only receives inputs concerning a certain subject matter. All three Leabra components are domain-general in this sense. Both MTL and PFC/BG receive convergent inputs from multiple and variegated brain areas. The posterior cortex is an interactive multitude of cortical areas whose specificity is a matter of degree and varies considerably.

The central claim of Anderson's massive redeployment hypothesis (MRH) is that most brain areas are much closer to the general than the specific end of the spectrum. This claim is hardly original, but it is worth repeating because the subtractive fMRI methodology tends to obscure it (Uttal 2001). fMRI is a wonderful tool, but it should be interpreted with care (Poldrack 2006). Any stimulus provokes a large response throughout the brain, and a typical fMRI study reports tiny differences (see Footnote 2) between conditions – typically less than 1% (Huettel et al. 2008). The importance of Anderson's (2007a; 2007c) meta-analyses is that, even if we grant the (generous) assumption that fMRI can reliably index specificity, one still finds widespread evidence for generality.

MRH also predicts a correlation between the degree of generality and phylogenetic age. We are skeptical of the use of the posterior-anterior axis as a proxy for age because it is confounded with many other factors. Also, the emphasis on age encourages terms such as reuse, redeployment, and recycling, which misleadingly suggest that each area was deployed for one primordial and specific function in the evolutionary past and was later redeployed for additional functions. Such inferences must be based on comparative data from multiple species. As the target article is confined to human fMRI, the situation is quite different. Given a fixed evolutionary endowment and relatively stable environment, each human child develops and/or learns many cognitive functions simultaneously. This seems to leave no room for redeployment but only for deployment for multiple uses.

Anderson's critique of modularity neglects one of its central features – information encapsulation. We wonder what predictions MRH makes about this important issue. A system is encapsulated when it exchanges (see Footnote 3) relatively little information with other systems. Again, this is a matter of degree, as our Figure 1 illustrates. The degree of encapsulation depends on factors such as the number of exposed (input/output) units relative to the total number of units in the cluster, and the density and strength of distal connections relative to local ones. Even when all units are exposed (as cluster D illustrates), the connections to and from each individual unit are still predominantly local because the units share the burden of distal communication. Long-range connections are a limited resource (Cherniak et al. 2004) but are critical for integrating the components into a coherent whole. The Leabra components are in constant, high-bandwidth interaction, and parallel constraint satisfaction among them is a fundamental implicit processing mechanism. Hence, we eschew the terms module and encapsulation in our theorizing. This is a source of creative tension in our (Jilk et al. 2008) collaboration to integrate Leabra with the ACT-R architecture, whose proponents make the opposite emphasis (J. R. Anderson 2007; J. R. Anderson et al. 2004). Much of this tension is defused by the realization that the modularist terminology forces a binary distinction on what is fundamentally a continuum.

Figure 1. Information encapsulation is a matter of degree. Four neuronal clusters are shown, of which A is the most and D the least encapsulated. Black circles depict exposed (input/output) units that make distal connections to other cluster(s); grey circles depict hidden units that make local connections only.
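The two factors named above (the share of exposed units, and the strength of distal relative to local connections) can be turned into simple graded measures. The following Python sketch is an illustration only: the function names, the cluster sizes, and the connection weights are all hypothetical and are not read off Figure 1. It merely shows how such measures place clusters on a continuum instead of sorting them into "modular" and "non-modular" bins.

```python
# Hypothetical clusters: unit counts and weights are invented for illustration
# and do not correspond to the actual contents of Figure 1.
CLUSTERS = {
    "A": {"exposed": 1, "total": 8, "local": [0.5] * 20, "distal": [0.5] * 1},
    "B": {"exposed": 2, "total": 8, "local": [0.5] * 20, "distal": [0.5] * 2},
    "C": {"exposed": 4, "total": 8, "local": [0.5] * 20, "distal": [0.5] * 4},
    "D": {"exposed": 8, "total": 8, "local": [0.5] * 20, "distal": [0.5] * 8},
}


def exposed_fraction(cluster):
    """Exposed (input/output) units relative to all units in the cluster."""
    return cluster["exposed"] / cluster["total"]


def distal_share(cluster):
    """Share of total absolute connection strength carried by distal links."""
    local = sum(abs(w) for w in cluster["local"])
    distal = sum(abs(w) for w in cluster["distal"])
    total = local + distal
    return distal / total if total else 0.0


def rank_by_encapsulation(clusters):
    """Order cluster names from most to least encapsulated.

    Lower exposed fraction means more encapsulated; distal share breaks ties.
    """
    return sorted(
        clusters,
        key=lambda name: (exposed_fraction(clusters[name]),
                          distal_share(clusters[name])),
    )
```

Under these assumed numbers, cluster D has every unit exposed, yet its distal share stays below one half, which mirrors the point that connectivity can remain predominantly local even in the least encapsulated case.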

Footnotes

1. There are exceptions, such as the use of a separate neurotransmitter (dopamine) in the basal ganglia.

2. Event-related designs do not escape this criticism because they too, via multiple regression, track contingent variation around a common mean.

3. Encapsulation on the input side is usually distinguished from inaccessibility on the output side. We discuss them jointly here because of space limitations. Also, the reciprocal connectivity and the task-driven learning in Leabra blur the input/output distinction.

References

Anderson, J. R. (2007) How can the human mind occur in the physical universe? Oxford University Press.
Anderson, J. R., Bothell, D., Byrne, M. D., Douglass, S., Lebiere, C. & Qin, Y. (2004) An integrated theory of mind. Psychological Review 111:1036–60.
Anderson, M. L. (2007a) Evolution of cognitive function via redeployment of brain areas. The Neuroscientist 13:13–21.
Anderson, M. L. (2007c) The massive redeployment hypothesis and the functional topography of the brain. Philosophical Psychology 21(2):143–74.
Atallah, H. E., Frank, M. J. & O'Reilly, R. C. (2004) Hippocampus, cortex, and basal ganglia: Insights from computational models of complementary learning systems. Neurobiology of Learning and Memory 82(3):253–67.
Bergeron, V. (2007) Anatomical and functional modularity in cognitive science: Shifting the focus. Philosophical Psychology 20(2):175–95.
Carruthers, P. (2006) The architecture of the mind: Massive modularity and the flexibility of thought. Clarendon Press/Oxford University Press.
Cherniak, C., Mokhtarzada, Z., Rodrigues-Esteban, R. & Changizi, K. (2004) Global optimization of cerebral cortex layout. Proceedings of the National Academy of Sciences USA 101:1081–86.
Fodor, J. (1983) The modularity of mind. MIT Press.
Fodor, J. (2000) The mind doesn't work that way. MIT Press.
Huettel, S. A., Song, A. W. & McCarthy, G. (2008) Functional magnetic resonance imaging. Sinauer.
Jilk, D. J., Lebiere, C., O'Reilly, R. C. & Anderson, J. R. (2008) SAL: An explicitly pluralistic cognitive architecture. Journal of Experimental and Theoretical Artificial Intelligence 20:197–218.
McClelland, J. L., McNaughton, B. L. & O'Reilly, R. C. (1995) Why there are complementary learning systems in the hippocampus and neocortex: Insights from the successes and failures of connectionist models of learning and memory. Psychological Review 102:419–57.
Mesulam, M.-M. (1990) Large-scale neurocognitive networks and distributed processing for attention, language and memory. Annals of Neurology 28:597–613.
Miller, E. K. & Cohen, J. D. (2001) An integrative theory of prefrontal cortex function. Annual Review of Neuroscience 24:167–202.
O'Reilly, R. C. (1998) Six principles for biologically based computational models of cortical cognition. Trends in Cognitive Sciences 2:455–62.
O'Reilly, R. C. (2006) Biologically based computational models of high-level cognition. Science 314(5796):91–94.
O'Reilly, R. C., Braver, T. S. & Cohen, J. D. (1999) A biologically based computational model of working memory. In: Models of working memory: Mechanisms of active maintenance and executive control, ed. Miyake, A. & Shah, P., pp. 375–411. Cambridge University Press.
O'Reilly, R. C. & Frank, M. J. (2006) Making working memory work: A computational model of learning in the prefrontal cortex and basal ganglia. Neural Computation 18:283–328.
O'Reilly, R. C. & Munakata, Y. (2000) Computational explorations in cognitive neuroscience: Understanding the mind by simulating the brain. MIT Press.
Poldrack, R. A. (2006) Can cognitive processes be inferred from neuroimaging data? Trends in Cognitive Sciences 10:59–63.
Prinz, J. (2006) Is the mind really modular? In: Contemporary debates in cognitive science, ed. Stainton, R. J., pp. 22–36. Blackwell.
Samuels, R. (2006) Is the human mind massively modular? In: Contemporary debates in cognitive science, ed. Stainton, R. J., pp. 37–56. Blackwell.
Tooby, J. & Cosmides, L. (1992) The psychological foundations of culture. In: The adapted mind: Evolutionary psychology and the generation of culture, ed. Barkow, J., Cosmides, L. & Tooby, J., pp. 19–136. Oxford University Press.
Uttal, W. R. (2001) The new phrenology: The limits of localizing cognitive processes in the brain. MIT Press.