
Visual word recognition models should also be constrained by knowledge about the visual system

Published online by Cambridge University Press:  29 August 2012

Pablo Gomez
Affiliation:
Psychology Department, DePaul University, Chicago, IL 60614. pgomez1@depaul.edu http://condor.depaul.edu/~pgomez1/WNPL
Sarah Silins
Affiliation:
Law School, Northwestern University, Chicago, IL 60611. ssilins@gmail.com

Abstract

Frost's article advocates for universal models of reading and critiques recent models that concentrate on what has been described as “cracking the orthographic code.” Although the challenge to develop models that can account for word recognition beyond Indo-European languages is welcome, we argue that reading models should also be constrained by general principles of visual processing and object recognition.

Type
Open Peer Commentary
Copyright
Copyright © Cambridge University Press 2012 

Any computational or mathematical model has to negotiate the tension between parsimony and a diverse and often fragmented empirical landscape. Frost's article correctly points out that the field is extremely Anglocentric, and that there is overwhelming evidence (particularly from studies of Semitic languages) indicating that current models of visual word recognition have rather limited descriptive adequacy beyond the data sets (obtained mostly from English readers) that are used as benchmarks.

We interpret Frost's call for a universal model of reading to mean that the same general architecture, with different parameter values, should account for reading behavior (or at least visual word recognition) across a variety of languages. In other words, a universal model of reading should assume that Hebrew readers and English readers engage in basically the same processes while identifying written words. This process might produce different outcomes depending on the linguistic context (e.g., the presence or absence of roots or prefixes). A key finding pointing towards a unified mechanism with different parameters is that Hebrew readers show flexibility in the encoding of letter position (just like English readers) when presented with Hebrew words that are non-root-derived. It is impossible for the readers to anticipate whether they are about to encounter a root-derived or non-root-derived word, so one could not argue for different strategies depending on the kind of word; instead, the same process produces different outcomes depending on the nature of the input.

We argue that the starting point of a “universal” theory of visual word recognition should be the visual system that is shared by readers in all languages. In fact, some of the current models that Frost is critical of assume that the process of letter-position coding shares principles with other forms of visual processing. Namely, both the overlap model (Gomez et al. 2008) and the Bayesian reader (Norris & Kinoshita, in press) claim that there is no “code to crack,” and that the transposed letter (TL) effects are a by-product of object-position uncertainty as described by general models of visual attention (e.g., Ashby et al. 1996; Logan 1996).
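As a rough illustration of how position uncertainty alone could produce TL effects, the sketch below treats each probe letter's position as a normal distribution centred on its slot and scores a probe against a target by the probability mass that matching letters place in the target's slots. This is a simplified toy in the spirit of the overlap model, not the published model: the function names, the single shared standard deviation `sd`, and the averaging rule are assumptions made for illustration only.

```python
import math


def normal_cdf(x, mean, sd):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


def position_overlap(probe, target, sd=0.75):
    """Toy similarity score between a probe string and a target string.

    Each probe letter at position `pos` is assumed (hypothetically) to be
    encoded as a normal distribution over positions with mean `pos` and
    standard deviation `sd`. For every target slot we take the largest
    probability mass that a matching probe letter places inside that slot
    (the interval slot - 0.5 to slot + 0.5), then average over slots.
    """
    total = 0.0
    for slot, letter in enumerate(target):
        best = 0.0
        for pos, probe_letter in enumerate(probe):
            if probe_letter == letter:
                mass = (normal_cdf(slot + 0.5, pos, sd)
                        - normal_cdf(slot - 0.5, pos, sd))
                best = max(best, mass)
        total += best
    return total / len(target)


if __name__ == "__main__":
    # Identity, transposed-letter, and substituted-letter probes for "judge".
    for probe in ("judge", "jugde", "jupte"):
        print(probe, round(position_overlap(probe, "judge"), 3))
```

Under these assumptions, the transposed probe `jugde` scores below the identity probe but above the substitution probe `jupte`, because the displaced letters still place some probability mass in their original slots. No letter-order "code" is involved; the ordering of the three scores follows from positional noise alone.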

The overlap model as currently formulated could not account for the data presented by Velan and Frost (2011), which show different outcomes depending on the type of word presented. So, how would a model that assumes that TL effects are merely a by-product of the visual system's organization account for the linguistic context effects mentioned by Frost? The target article raises important issues in its discussion of the differences between Hebrew and English. Whereas in English the presence of a particular letter is rather uninformative about other letters in a word, in Hebrew the probability of a letter being in a given position is highly predictive of the other letters present in the word. From a Bayesian point of view, one can think of this process as generating posterior functions from a prior and a likelihood (see Mamassian et al. 2002). In English, the priors are essentially uninformative because almost any letter can go into any letter position. In Hebrew, on the other hand, the priors may be quite informative, especially for the majority of Hebrew words that include a root. So, the mere identification of one of the letters in a root-derived Hebrew word constrains the other possible letters in the word, thereby reducing position uncertainty. Models could be modified in order to implement informative prior knowledge.
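The Bayesian argument can be made concrete with a small sketch. It combines the same noisy-position likelihood with two different priors over letter orders: a flat prior, standing in for the English-like case, and a prior concentrated on one order, standing in for a root-derived Hebrew-like case. Everything here is a hypothetical toy: the three-letter alphabet, the prior weights (0.9 versus the remainder), the Gaussian displacement likelihood, and all function names are assumptions, not parameters of any published model.

```python
import math
from itertools import permutations


def likelihood(percept, candidate, sd=1.0):
    """Unnormalised likelihood of the perceived order given a candidate order.

    Assumes each letter occurs exactly once and that each letter's
    displacement between candidate and percept is penalised by a
    Gaussian with standard deviation `sd`.
    """
    score = 1.0
    for pos, letter in enumerate(candidate):
        shift = percept.index(letter) - pos
        score *= math.exp(-0.5 * (shift / sd) ** 2)
    return score


def posterior(percept, prior, sd=1.0):
    """Posterior over candidate orders: prior times likelihood, normalised."""
    unnorm = {c: prior[c] * likelihood(percept, c, sd) for c in prior}
    z = sum(unnorm.values())
    return {c: v / z for c, v in unnorm.items()}


def entropy(dist):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(p * math.log(p) for p in dist.values() if p > 0.0)


candidates = ["".join(p) for p in permutations("abc")]

# Uninformative prior: every letter order is equally plausible.
flat_prior = {c: 1.0 / len(candidates) for c in candidates}

# Informative prior: one order (a hypothetical "root") is strongly expected.
root_prior = {c: 0.9 if c == "abc" else 0.1 / (len(candidates) - 1)
              for c in candidates}

post_flat = posterior("abc", flat_prior)
post_root = posterior("abc", root_prior)

if __name__ == "__main__":
    print("P(abc | percept), flat prior:", round(post_flat["abc"], 3))
    print("P(abc | percept), root prior:", round(post_root["abc"], 3))
    print("posterior entropy, flat vs. root:",
          round(entropy(post_flat), 3), round(entropy(post_root), 3))
```

With identical perceptual noise, the informative prior yields a posterior that is more sharply peaked on the expected order (lower entropy) than the flat prior does. In this toy setting, that is the sense in which informative prior knowledge reduces position uncertainty without any change to the underlying encoding process.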

In short, we believe that the target article by Frost is a timely piece that will, we hope, push researchers to expand the frontiers of the benchmark phenomena in visual word recognition. We believe that these extensions can be made in a principled manner, and that cross-linguistic work might be the next frontier for visual word recognition models.

References

Ashby, F. G., Prinzmetal, W., Ivry, R. & Maddox, W. T. (1996) A formal theory of feature binding in object perception. Psychological Review 103:165–92.
Gomez, P., Ratcliff, R. & Perea, M. (2008) The overlap model: A model of letter position coding. Psychological Review 115:577–601.
Logan, G. D. (1996) The CODE theory of visual attention: An integration of space-based and object-based attention. Psychological Review 103:603–49.
Mamassian, P., Landy, M. S. & Maloney, L. T. (2002) Bayesian modelling of visual perception. In: Probabilistic models of the brain: Perception and neural function, ed. Rao, R., Olshausen, B. & Lewicki, M., pp. 13–36. MIT Press.
Norris, D. & Kinoshita, S. (in press) Reading through a noisy channel: Why there's nothing special about the perception of orthography. Psychological Review.
Velan, H. & Frost, R. (2011) Words with and without internal structure: What determines the nature of orthographic and morphological processing? Cognition 118:141–56.