R1. Introduction
My target article is a critique of the recent paradigmatic shift in modeling visual word recognition, characterized by extensive preoccupation with noisy letter-position coding. The main theoretical claim driving this critique is that orthographic processing cannot be researched, explicated, or understood without considering the manner in which orthographic structure represents phonological, semantic, and morphological information in a given writing system. This is because any orthographic effect obtained in a given language, such as sensitivity to letter order, is an emergent product of the full linguistic environment of the reader, not just of the structure of the graphemic sequence. In a nutshell, I have argued that a theory of reading should be a theory of the interaction of the reader with his/her linguistic environment. This sets the criteria for a novel approach to studying and modeling visual word recognition: models should describe and explain the common cognitive principles involved in processing printed words across orthographies, taking into account the commonalities and differences between systems.
The article presents, then, a series of related claims that logically follow from one another. The various commentaries refer to all of these claims – the general theoretical agenda, the constraints on the evolution of writing systems, the nature of orthographic processing, the universality of letter-position flexibility, and the advantages of different modeling approaches. Naturally, a number of commentaries have expressed contrasting views. Other commentaries have suggested important fine-tuning of some of the theoretical claims. Quite a few commentaries have extended the scope of the debate further, bringing into the discussion additional perspectives. My response deals with all of these issues with the aim of fleshing out fine distinctions, so as to settle on a broader theoretical approach that incorporates the additional input offered by the various commentaries.
The rest of the response therefore comprises eight sections: Section R2 is devoted to the general theoretical agenda I advocate, and the reciprocal relations between a reading theory and its possible implementations. Section R3 discusses the constraints of neurobiology and perception on modeling visual word recognition. Section R4 expands on the concept of reading universals. Section R5 deals with the scope of cross-linguistic research. Section R6 outlines the merits of a developmental approach to orthographic processing. This is an important extension of the target article, and it traces directions for future research. Section R7 discusses the descriptive adequacy of current implementations. Section R8 provides important extensions of the present theoretical framework to include phonological processing, and Section R9 summarizes the discussion by outlining possible future directions.
R2. Top-down theoretical scope and bottom-up implementations
The claim that a theory of reading is a theory of the interaction of the reader with his or her linguistic environment sets the perimeter of possible sources of constraints for our models of reading. Grainger & Hannagan label the quest to find commonalities in reading across different writing systems through cross-linguistic research a “top-down approach to scientific theorizing,” which ignores “the details of implementation.” This criticism deserves a lengthy discussion, as it concerns the basic foundations of scientific research in the domain of reading. Grainger & Hannagan see researchers of visual word recognition as faced with a binary choice: either pursuing bottom-up implementations using few general principles, which eventually leads to a model that provides an adequate description of the data (presumably the “right” approach to science), or engaging in a top-down search for a good theory without bothering about the details (the “bad” approach to science). Our scientific investigation, however, is always a combination of both, because the choice between possible bottom-up implementations is not and cannot be independent of our top-down theorizing regarding what constraints are relevant for assessing these implementations, and what set of data should be modeled to begin with. Without a theoretical framework that determines the full scope of relevant constraints and the range of data to simulate, the search for adequate bottom-up implementations may miss critical phenomena with important explanatory power.[1]
The question then is not whether one can suggest common operations in orthographic processing across writing systems, but rather, what type of information would be relevant for finding them. The common principles according to which writing systems have evolved to represent orthographic information in all languages seem critical because they reveal the complexity of information that is conveyed by orthographic structure, aside from letter identity and letter position. Borrowing Perfetti's words, orthographic structure allows the language to be “seen” through the print, since writing systems are notational systems for the language – phonology and morphology included. This insight illuminates the nature of the information packed into print and, in turn, how the cognitive system picks it up. The significant advantage of this ecological approach is that it considers in parallel the information processing system and the environment on which it operates. This theoretical perspective sets the perimeter of possible relevant implementations, and suggests that a much broader data set should be considered in our modeling enterprise.
R2.1. Sources of constraints on implementations
The intimate interaction of theory and consequent implementation is well exemplified by several of the commentaries. Pitchford, van Heuven, Kelly, Zhang, & Ledgeway (Pitchford et al.), for example, argue that vision, development, bilingualism, and the statistical properties of letter distribution across languages are all relevant sources of constraints for implementation in modeling of visual word recognition. Goswami and Deacon very convincingly argue that data from reading acquisition across writing systems are imperative for understanding what information is picked up by readers from the orthography for the purpose of visual word recognition. McBride-Chang, Chen, Kasisopa, Burnham, Reilly, & Leppänen (McBride-Chang et al.) refer to the additional complexity related to the nature of word units across orthographies, and the inherent ambiguity regarding the definition of word boundaries. Friedmann & Gvion discuss the implications of cross-linguistic differences in the density of lexical space. Liversedge, Blythe, & Drieghe (Liversedge et al.) demonstrate how sentential context determines patterns of orthographic processing, such as sensitivity to letter position. Feldman & Moscoso del Prado Martín discuss the interaction of semantic and orthographic processing in different languages. Pelli, Chung, & Legge (Pelli et al.) show how letter-by-letter decoding, whole-word shape, and sentence context determine eye movements and reading speed.
In this context, the approach advocated by Grainger & Hannagan represents a notable exception. Grainger & Hannagan would probably not deny that all of the aforementioned are important aspects of reading research. Nevertheless, by considering only the front-end part of visual word recognition, they focus mainly on the architecture of the visual system of primates and the child's pre-existing visual object-recognition system. This approach is best demonstrated in a recent report by Grainger and colleagues showing that baboons can be trained to efficiently distinguish hundreds of words from nonwords composed of nonsense combinations of letters (Grainger et al. 2012). Since baboons do not have any linguistic representations but can nevertheless perform similarly to humans in a lexical decision task, Grainger et al. (2012) reach the conclusion that orthographic processing in humans and primates probably employs similar principles of visual object processing. The logic of this argument rests on the inference that if primates can be shown to do what humans do, the underlying cognitive processing of humans and primates must be similar. To reiterate, if primates lacking linguistic skills can do well in recognizing statistical dependencies of orthographic symbols relying solely on their object-recognition abilities, then orthographic processing in humans probably draws upon object recognition as well.
Aside from the logical fault underlying such an inference, the “environment” in this approach to reading is consequently restricted to the world of visual objects, rather than the characteristics of the linguistic environment. This determines to a large extent the range of constraints that are considered relevant for testing specific implementations. Grainger & Hannagan are thus inspired by bioinformatics, suggesting that “string kernels,” also used for protein function predictions, can be usefully applied to reading research (see Hannagan & Grainger, in press: “Protein analysis meets visual word recognition”). They argue that this approach provides a better fit to a set of established benchmark phenomena, but here is the snag: It is the theory that eventually determines the scope of actual “benchmark phenomena” that are considered relevant to validate a model, and it is this scope that traces the thin line between “a modest proposal” and a narrow one. Adopting Grainger & Hannagan's approach would inevitably lead to an impoverished theory of orthographic processing that does not consider the rich scope of statistical correlations that exist between various sublinguistic representations in a given language. Consequently, such a theory would indeed not differentiate between the performance of humans and the performance of primates, who lack linguistic knowledge. The surprising richness of information extracted from print by readers during orthographic processing is well described by Homer, Miller, & Donnelly (Homer et al.).
My critique of the “new age of orthographic processing” discusses in great detail the shortcomings of considering only front-end constraints when studying reading and visual word recognition, and of researching them only within one language – English. Not unexpectedly, some of the present commentaries focus on outlining the merits of the agenda of cracking the orthographic code in a uniform linguistic environment. Let me concede up front that any scientific investigation has merits. I therefore agree with Grainger & Hannagan that it is important to study how the cognitive system treats letters in a specific linguistic environment (in fact, I have been doing so myself in Hebrew for years). I agree with Pitchford et al. that the role of early visual processing in reading research has been largely overlooked. I agree with Whitney that this agenda has produced important insights regarding low-level processing, thereby describing the neurocircuitry involved in visual word recognition. The shift to explore the front end of word perception has no doubt contributed a wealth of data, outlined meticulously by Whitney. The question at hand, however, is whether front-end implementations of orthographic processing that do not stem from a comprehensive theory of the complex information conveyed by writing systems, and are not constrained by developmental and cross-linguistic evidence, present a viable approach for understanding reading. My answer to this is a decisive no.
R3. Neurobiology, perception, and modeling visual word recognition
Admittedly, even if it is established that cross-linguistic evidence is a main source of constraints for any universal model of reading, as I have argued, the question of neurobiological constraints still lingers. Thus, the fact that a theory of visual word recognition cannot do without a detailed analysis of the properties of writing systems does not imply that the theory should not be constrained by the properties of the visual system and the brain. Several commentaries have addressed this issue. Szwed, Vinckier, Cohen, & Dehaene (Szwed et al.) convincingly argue for a universal neurobiological architecture of reading acquisition. Their brief report provides helpful examples of the insights that neurobiological data can provide for understanding how the brain's neurocircuitry adapts to deal with different writing systems, suggesting that in the course of learning, the visual system internalizes orthographic units that are relevant to morphological and lexical knowledge. I wholeheartedly embrace this suggestion. A word of caution though: This research enterprise is contingent on working within a developmental perspective, as indeed suggested by Szwed et al. Observing correlations between a discovered reading behavior and some patterns of brain processing, then describing this behavior in terms of brain processing, and then using this description as an explanation would not advance us much in understanding reading. Insight is gained mainly by considering how the brain adapts to a writing system in the course of literacy acquisition.
R3.1. Linguistic modulation of perceptual processes
If both cross-linguistic evidence and neurobiological evidence are sources of constraints for a theory of reading, an important question concerns the extent of penetrability (or susceptibility) of primary visual processing to linguistic modulation. Returning to the question of letter transposition, several commentaries have addressed the question of what is universal and what is language-specific regarding letter coding. In other words, where does vision “end” and language “begin” in reading? This is certainly not a simple question. For example, Martelli, Burani, & Zoccolotti (Martelli et al.) remind us that crowding poses visual constraints on orthographic codes, suggesting how constraints of visual span interact with word-length and letter-position insensitivity. Similarly, Pelli et al. provide an insightful account of the complex interactions of reading speed with crowding, text size, and comprehension.
In this context, the proposal offered by Norris & Kinoshita and by Gomez & Silins deserves a serious discussion. Both Norris & Kinoshita and Gomez & Silins suggest that the primary perceptual processes involved in visual word recognition are universal and that, akin to visual object recognition, they are characterized by perceptual noise. By this view, the product of the primary visual analysis, in which letter position is ambiguous, is then shaped by the properties of the language, producing cross-linguistic differences such as transposed-letter (TL) priming effects. Similarly, Perea & Carreiras argue in a convincing commentary that perceptual uncertainty is characteristic of the cognitive system. This account, suggested by Norris & Kinoshita, Gomez & Silins, and Perea & Carreiras, as well as by Whitney, is probably true to some extent, and certainly hard to refute. I acknowledged it at the outset of the target article. Obviously, there must be a primary level of visual processing that is common to all incoming visual information: objects, words, or visual scenes. Similarly, there must be some level of noise regarding letter position, given the properties of the visual system. As Liversedge et al. rightly argue, the common nature of eye movements in reading, along with the physiological make-up of the retina, determines how information is delivered to the cognitive system.
Having acknowledged that, the suggestion offered by Norris & Kinoshita – according to which the “perceptual system” fully completes its task, and only then does the “linguistic system” come into play to produce differential effects of transposition – has the flavor of bottom-up feed-forward processing, which is not very probable. The idiosyncratic distributional properties of letters in a language result in perceptual learning – a means of facilitating fast and efficient recognition of visual configurations that are frequently encountered by the organism (e.g., Gilbert et al. 2001; Sigman & Gilbert 2000). As demonstrated for both nonverbal and verbal stimuli, the frequency and amount of retinal training determine the way the distal stimulus is processed. For example, Nazir et al. (2004) have demonstrated reading-related effects of retinal perceptual learning that were stimulus specific (e.g., whether the stimulus is a word or a nonword), as well as language specific (whether the script is Hebrew or English). In this study, we found that the legibility of target letters varied differentially with location on the retina for Hebrew and Roman scripts. Nazir et al. (2004) therefore concluded that reading habits affect the functional structure of early stages in the visual pathway. To some extent, this suggestion is echoed by Szwed et al., and also resonates with Laubrock & Hohenstein's review of how language modulates print processing already in the parafovea. Thus, the demarcation line beyond which “perceptual” processing ends and “linguistic” processing begins is hard to discern. The idea that the perceptual system feeds a uniform output to the linguistic system across orthographies is, therefore, not supported by the data. As Perea & Carreiras argue, the evidence regarding letter-position flexibility in many languages is uncontested, but so is the evidence regarding letter-position rigidity in other writing systems. Thus, from the perspective of a theory of reading, the interesting discussion concerns the way the linguistic environment shapes readers' flexibility or rigidity regarding letter order, as well as other characteristics of orthographic processing. This is the main thrust of the quest for a universal model of reading.
R3.2. The time course of linguistic effects
A critical empirical question then is how “early” during processing the characteristics of writing systems exert their influence on the perceptual processing of print. As Nazir et al. (2004) suggest, reading habits that are related to writing systems develop at early stages in the visual pathway. Szwed et al. refer to brain evidence from magnetoencephalography (MEG) experiments, showing that already 130 msec after word onset, distributional properties of letter combinations modulate responses (e.g., Simos et al. 2002; Solomyak & Marantz 2010). Similarly, in behavioral studies, recent results from our laboratory suggest that readers of Hebrew differentiate between letter transpositions occurring in words with or without a Semitic structure already at first fixation (Velan et al., under review). Thus, TL interference is found for root-derived words, but not for simple words, in the earliest measure of eye movements. Interestingly, for first-fixation latencies, what matters for Hebrew readers is whether a legal root is contained in the letter sequence, irrespective of whether the letter string is a word or a nonword. Thus, even if there is a phase of processing in which all printed input is treated alike, the inevitable conclusion is that the statistical properties of the linguistic environment of readers shape letter processing very early on, resulting in systematic cross-linguistic differences. This suggestion is well supported by Laubrock & Hohenstein, who demonstrate differential parafoveal preview benefit effects (Rayner 1975) in various European languages and in Chinese. All of this points to a needed shift in the agenda of reading research towards a developmental approach, focusing on how the information that readers pick up from their linguistic environment in general, and from their writing system in particular, shapes and determines visual analysis and orthographic processing characteristics as reading proficiency increases.
R4. The characteristics of reading universals
A major claim of the target article was that cross-linguistic empirical research should reveal common cognitive operations involved in processing printed information across writing systems. These I labeled reading universals, and the term elicited a variety of responses. Given the very strong opinions regarding Chomsky's theory of universal grammar (UG) (e.g., Grainger & Hannagan; see also Evans & Levinson 2009), the mere use of the word “universal” in the realm of psychology and language seems to involve significant risk, as well as possible misinterpretations. A preliminary discussion of the basic differences between “reading universals” and UG is, therefore, required.
Since writing systems are a code designed by humans to represent their language, reading universals, in contrast to the notion of UG (e.g., Chomsky 1965; 1995; 2006), are not innate or modular linguistic computational abilities that mirror the common structure of natural languages. Rather, they are general cognitive mechanisms designed to process the characteristic information provided by the code we call “orthography.” In this respect, both Levy and Behme overextend the concept of “reading universals,” attaching to it incorrect and unnecessary Chomskyan associations. Similarly, Coltheart & Crain draw a parallel between Chomsky's linguistic universals (e.g., recursivity, structure-dependence, etc.) and reading universals, asking whether there is something common to all writing systems in the same sense as the allegedly common internal structure of natural languages, and whether there is something common in processing them. Share draws identical parallels.
Reading universals are so labeled because they mirror the universality constraint, which requires models of reading to entertain high-level principles that simultaneously provide a systematic explanation for cross-linguistic similarities in processing printed words, on the one hand, and cross-linguistic differences, on the other. Thus, a good theory of reading should explain why readers of different writing systems consistently display similar behaviors in a given experimental setting, and also why they consistently display different behaviors in other experimental settings. This explanation should be based on a few high-level, basic, and general mechanisms that characterize the cognitive behavior of reading, given what writing systems are meant to convey. It is up to us scientists to reveal these mechanisms, and once we have revealed them, they should be part of our models.
This approach indeed suggests that there are common invariant cognitive operations involved in processing printed information across writing systems, which are not too general or trivial. Coltheart & Crain as well as Behme are right, however, in suggesting that this claim is not self-evident and requires convincing argumentation. The claim for “common operations” in reading rests then on two tiers. The first argues that there is something common to the type of information provided by writing systems and the way this information is conveyed in print. Writing systems with all of their variety, therefore, constitute an environment with specific characteristics. The second argues that human cognition is characterized by general procedures for picking up statistical information from the environment, and that processing printed information draws upon these general procedures.
R4.1. The evolution of writing systems
The discussion of the evolution of writing systems and the description of Chinese, Japanese, Finnish, English, and Hebrew set the grounds for the first tier. I agree with Behme that the evolution of writing systems should not be regarded as entirely deterministic in the sense that their final characteristics could not have been otherwise. Norris & Kinoshita provide arguments along the same lines, as do Beveridge & Bak and Share. Clearly, some arbitrary historical events may have tilted the evolution of a given writing system one way or the other. However, as a general argument, historical events and cultural influences could not have resulted in just any arbitrary change in writing systems, because the structure of the language constrains and determines the array and direction of possible changes. Our theory of reading should draw upon the logic of these constraints. Consider, for example, Serbo-Croatian: if, in the nineteenth century, Vuk Karadzic, a Serbian philologist and linguist, had not reformed the Serbian alphabet to be entirely phonetic, perhaps the writing system of Serbo-Croatian would not have been as transparent as it is today. However, the point to be made in this context is that the reform of the Serbian-Cyrillic writing system was initiated and made possible given the phonological and morphological characteristics of that language.
Seidenberg (2011) has labeled this state of affairs “grapholinguistic equilibrium,” but in the sense of a functional equilibrium of effort. By his view, languages with complex inflectional morphology move towards shallow orthographies because of constraints regarding the amount of complexity they can impose on their speakers. Whether or not this specific functional hypothesis is true, the trade-off between inflectional morphology and orthographic depth is but one example of the equilibria found in writing systems. The tendency of shallow orthographies to allow for extensive compounding in order to pack in more orthographic information is another form of equilibrium, related to the trade-off between the transparency of phonological computation and orthographic complexity. The tendency of deep orthographies, such as Hebrew, to reduce phonological and thereby orthographic information in order to make morphological (root) information more salient is yet another example of a trade-off. The “equilibrium” phenomenon, therefore, is much more complex than that noted by Seidenberg, and does not necessarily emerge from his suggested functionalist argumentation.
R4.2. The theoretical significance of optimality considerations
Several commentaries (Behme, Perfetti, Levy, Norris & Kinoshita, Seidenberg, and Share) discuss the claim that orthographies optimally represent the language, focusing on criteria of optimality, arguing that my claim for optimality is unwarranted for a variety of reasons. As explicated earlier, writing systems are an invention, a code, created to represent the spoken language and its morphological structure. The evolution of this code, like any invented code, is naturally shaped by efficiency constraints, as most forms of communication are. However, in contrast to the evolution of species, such shaping does not require thousands of years to develop, as Norris & Kinoshita seem to suggest. The introduction of vowel marks in Hebrew, and their subsequent natural omission, given changes in the linguistic environment of Hebrew speakers, is a typical example of this relatively fast process of natural evolution. Phonological transparency at the expense of morphological saliency was introduced into the Hebrew writing system when the language ceased to be spoken by any one Jewish community, given historical events; morphological saliency at the expense of phonological transparency naturally evolved when the Hebrew language became widely spoken again, and in a relatively short period of time. Inefficient communication forms tend to vanish, to be replaced by more efficient ones, even without the intervention of an enlightened monarch, as Perfetti suggests.
I have to agree, however, with Perfetti and Behme that a strong claim regarding optimality requires a definition of an optimization algorithm. I also have to agree that writing systems are not analogous to self-correcting networks, since historical events and cultural influences naturally come into play to shape their forms. Seidenberg makes a similar claim. In this context, the evidence provided by Hyönä & Bertram regarding the impact of compounding in Finnish is in line with the view that writing systems may be sub-optimal rather than optimal. Following the work of Bertram et al. (2011), Hyönä & Bertram make the case that hyphens introduced into three-constituent compounds at morphemic boundaries facilitate recognition, demonstrating that excessive packing of orthographic information incurs a cost, thereby casting doubt on the idea of the optimal efficiency of Finnish.
These are convincing arguments, and I agree that it is indeed difficult, if not impossible, to assess whether the current form of a writing system is “fully optimal,” or just “good enough.” However, for the purpose of the logic and theoretical stand advocated here, this is not a critical distinction. I would be happy to concede that writing systems evolve and adapt to provide a representation of phonology and morphology that is just good enough or sub-optimal rather than mathematically optimal, whatever mathematically optimal means in this context. The heart of the argument is that there are common principles that govern the direction and rules of this adaptation and evolution, and the main claim is that our theory of how the orthographic written code is processed must consider what exactly renders a writing system efficient for a specific linguistic environment. So, yes, the statement that “languages get the writing systems they deserve” (Halliday 1977)[2] still stands, even though one could provide an argument why a specific language perhaps deserves a writing system that is even better than the one it currently has.
R4.3. Common cognitive operations underlying reading universals
As I have outlined, the claim for common operations in reading rests also on the assertion that there are typical procedures for picking up information in the environment of printed languages. Some commentaries voiced skepticism regarding the possibility of converging on such common operations. Similar to Coltheart & Crain, who question the likelihood of outlining linguistic features that are common to all writing systems, Plaut argues that if there are common operations in processing all writing systems, they would be too general to be informative. Reiterating Plaut's well-articulated Wittgensteinian analogy on the concept of “game,” the expected commonalities in processing print across languages, according to Plaut, would be as instructive for understanding reading as the theoretical commonalities of all sporting games would be for understanding soccer. However, in contrast to Philosophical Investigations (Wittgenstein 1953), this is an empirical question, not a philosophical one. Unlike sporting games, writing systems have a well-defined common goal – to convey meaning – and they do so by common and relatively simple principles, which are tuned to human cognitive abilities. Note that the position of the new age of orthographic processing was that letter-position flexibility is a commonality in processing print across writing systems. This view has been challenged by empirical evidence. Similarly, future cross-linguistic research would have to assemble evidence for alternative commonalities.
Admittedly, this task is not trivial. For example, McBride-Chang et al. raise a well-argued concern regarding the plausibility of finding common processing principles across orthographies, given the inherent inconsistency in defining word boundaries across writing systems. Analyzing Chinese, Thai, and Finnish, they show that the definition of a “word” unit is far from unequivocal, and this ambiguity would make it impossible to offer universal parsing principles across writing systems. McBride-Chang et al. thus convincingly show that implemented solutions for producing in a model a behavior that fits the data for one language may not do the job in another language. Reading universals, however, are not common parsing routines. They are principles of efficiency in picking up semantic, morphological, and phonological information from the orthographic structure, whatever it is, given the statistical properties of the language.
In concatenative morphological systems such as those of European languages, for example, processing involves decomposing affixes, and the target of search is the base form (e.g., Rastle & Davis 2008; Rastle et al. 2004; Taft & Nillsen, in press). What drives this process is the saliency of affixes, given their high distributional frequency and their predetermined location at the beginning or the end of the word. In Semitic languages, the game is quite different. The target of search is a noncontiguous root morpheme, with its letter constituents distributed across the word without a predetermined location. Here, readers are tuned to the conditional probabilities of letters, which are determined by the distributional properties of word patterns. What is different, then, is the parsing procedure and the definition of units for lexical access (base forms vs. roots). However, what is common is the principle of picking up the specific statistical properties of the language from print, zooming in on those sublinguistic units which are best correlated with meaning (see the sketch below). By this view, both rapid affix-stripping in European languages and root extraction in Semitic languages reflect a reading universal – the priority of locating and extracting units of morphological information in the distal stimulus. McBride-Chang et al.'s analysis thus also suggests that structured models would have immense difficulty in satisfying the universality constraint, whereas learning models are better suited to the task.
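To make the shared principle concrete, here is a minimal sketch (in Python) of the two parsing routines just described. It is an illustration of mine, not an implementation from any of the cited studies, and the affix inventory, word list, and transliterated root are toy assumptions:

```python
# Illustrative sketch of two language-specific parsing routines that
# instantiate one putative reading universal: locating the sublinguistic unit
# best correlated with meaning. Affix set, words, and root are toy assumptions.

AFFIXES = {"re", "un", "er", "ing", "ed"}   # assumed high-frequency English affixes

def strip_affixes(word: str) -> str:
    """Concatenative route: peel salient affixes off the word's fixed edges."""
    for prefix in AFFIXES:
        if word.startswith(prefix) and len(word) - len(prefix) >= 3:
            word = word[len(prefix):]
    for suffix in AFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            word = word[:-len(suffix)]
    return word                              # candidate base form

def extract_root(word: str, roots: set[str]) -> str | None:
    """Semitic route: find a known root occurring as an ordered,
    possibly noncontiguous subsequence of the letter string."""
    for root in roots:
        letters = iter(word)
        if all(ch in letters for ch in root):   # subsequence test
            return root
    return None

print(strip_affixes("rebuilding"))       # -> 'build'
print(extract_root("ktovet", {"ktv"}))   # -> 'ktv' (toy Hebrew transliteration)
```

The two routines differ completely in their parsing mechanics, yet both implement the same search for the unit that carries the morphological information.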
The efficiency of writing systems is determined, on the one hand, by the nature of the information that has to be transmitted (the language's phonological space, its morphological structure, and the way it conveys meaning), and, on the other, by the characteristics of the cognitive system that has to pick this information up. Reading universals are then related to both. Thus, to answer Coltheart & Crain as well as Plaut, the claims – that the recovery of morphological information takes precedence in encoding orthographic structure; that letter processing is determined not just by letter position but mostly by the informational properties that individual letters carry; that orthographic coding simultaneously considers phonological, morphological, and semantic information; that the transitional probabilities of individual letters serve as critical cues for processing letter sequences; that eye-movement measures during reading such as length of fixation and landing position are modulated by such cues – are all potential reading universals, and when validated, they should be part of our theory of reading and the models it produces. Liversedge et al., for example, present compelling arguments regarding universal stylized patterns of saccades during reading that are cross-culturally uniform. Since these saccade patterns determine how orthographic information is delivered to the language-processing system, Liversedge et al. rightly suggest that the regularities of eye movements could be considered universal characteristics that should constrain a theory of reading (see also Pelli et al.). This analysis brings us yet again to the understanding that cross-linguistic research in reading is a main source of constraints on modeling visual word recognition. This claim is at the heart of the present approach.
R4.4. Reading universals and statistical learning
Considering the common cognitive operations for picking up the information packed into the orthography, the perspective I advocate then stands in sharp contrast to Chomsky's UG, because these cognitive operations are by no means modular abilities exclusive to the faculty of language. They reflect general learning mechanisms related to sensitivity to correlations in the environment, on the one hand, and the specific medium of writing systems – graphemes representing meaning and phonology – on the other. The claim that languages are characterized by idiosyncratic statistical regularities which encompass all of the word's dimensions (orthographic, phonological, and morphological structure) is hardly controversial. Similarly, it is well established that the cognitive system is a correlation-seeking device, and that adults, children, and even newborns can pick up subtle statistics from the environment (e.g., Evans et al. 2009; Gebhart et al. 2009; Gomez 2007). As convincingly argued by Winkler et al. (2009), predictive processing of information is a necessary feature of goal-directed behavior, whether language related or not, and thus brain representations of statistical regularities in the environment determine primary perceptual processes in the visual and auditory modalities. Hence, the appreciation that the processing of printed information is mainly governed by the statistical properties of writing systems is supported by studies from a variety of languages.
McBride-Chang et al. provide a nice example from Thai, where there are no spaces between words, and so eye movements to the optimal viewing position (OVP) are directed by the distributional properties of initial and final graphemes (e.g., Kasisopa et al. 2010). Additional arguments along this line are suggested by Szwed et al. and Pitchford et al., and in fact, the notion of perceptual learning discussed above is fully contingent on how the statistical properties of the environment train the perceptual system to process information efficiently. By this view, language is considered an example of a very rich environment characterized by complex correlations and distributional properties to which the cognitive system is tuned. This stand is not the one advocated by the Chomskyan approach. Our research should focus on understanding and mapping the statistical cues that determine orthographic processing in visual word recognition, such as flexibility or rigidity of letter position, as well as other benchmark effects of reading. These cues would enable us to explore and test hypotheses regarding the architecture of our models (see the sketch below for a simple example of such a cue).
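The sketch below estimates letter transitional probabilities, including the word-boundary transitions of the kind that guide landing position in Thai; the five-word corpus is a toy assumption of mine, standing in for the large corpora a real model would be trained on:

```python
# Estimating letter transitional probabilities, a distributional cue of the
# kind discussed above. The five-word corpus is a toy assumption.
from collections import Counter

corpus = ["build", "rebuild", "building", "builder", "guild"]

bigrams, contexts = Counter(), Counter()
for w in corpus:
    w = "^" + w + "$"                  # explicit word-boundary markers
    for a, b in zip(w, w[1:]):
        bigrams[a + b] += 1
        contexts[a] += 1

def transitional_prob(a: str, b: str) -> float:
    """P(next letter = b | current letter = a)."""
    return bigrams[a + b] / contexts[a] if contexts[a] else 0.0

print(transitional_prob("b", "u"))     # 1.0 within this toy corpus
print(transitional_prob("^", "b"))     # 0.6: how often a word begins with 'b'
```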
R5. The scope of cross-linguistic research
Insights regarding the common operations involved in reading can be reached only by observing systematic differences across languages. Observing these differences through empirical research leads to higher-level theoretical constructs which provide a unified explanation as to why language X brings about behavior A and language Y brings about behavior B. This is the essence of reading universals. Once this approach to reading research is adopted, it becomes evident that the progress in formulating a universal theory of reading would benefit from evidence from a wide variety of languages. Note that this stand does not mean that visual word recognition should become a branch of structural linguistics. Rather, in the present context, examining different writing systems would be considered a clever experimental manipulation employed to test hypotheses regarding what determines reading behavior in a given linguistic environment.
A good example is provided by Friedmann & Gvion. By comparing TL effects in Hebrew and Arabic, they point to an important interaction of morphological and orthographic structure. Hebrew and Arabic are both Semitic languages with very similar morphological systems. However, among other things, they differ in that Arabic has a different form for some letters in initial, middle, and final position, whereas Hebrew has only a few letters which are written differently when in final position.[3] Friedmann & Gvion elegantly demonstrate how letter-position errors in Arabic are constrained by this unique orthographic feature, whereby readers learn complex interactions of letter identity and shape that are dependent on position (Friedmann & Haddad-Hanna, in press a). Another example is provided by Kim, Lee, & Lee (Kim et al.), who review letter-transposition effects in Korean (e.g., Lee & Taft 2009; 2011). These studies took advantage of a special feature of Hangul, namely, that it spatially demarcates onset and coda positions for each consonant. By using this unique feature of Korean, Kim et al. convincingly argue that subsyllabic structure modulates letter-position coding, suggesting that modeling letter position requires a level of description that takes this constraint into account. In the same vein, Rao, Soni, & Chatterjee Singh (Rao et al.) provide evidence from the alphasyllabic Devanagari, showing how morphological complexity modulates orthographic processing.
Just as the Anglocentricity of reading research (Share 2008a) resulted in an overemphasis of the role of phonological awareness in reading, European “alphabetism,” as Share calls it, resulted in an overemphasis on letter-position flexibility. Beveridge & Bak provide, in this context, important statistics regarding the extremely biased ratio of research articles on disorders of written language describing Indo-European languages versus other languages. This has implications for understanding (or perhaps misunderstanding) not only reading, but also aphasia, alexia, and agraphia. As Beveridge & Bak point out, the manner by which phonology and morphology interact to determine orthographic structure becomes transparent only by considering a wide variety of languages, so that the possible contribution of culture to this evolution can be assessed. Share brings into the discussion examples of other, less researched languages.
This leads our discussion to the question of the range of data that should serve as the basis for our models. Models or theories of reading are constrained by benchmark effects. What makes an emergent effect “a benchmark effect” is its generalizability across experimental settings. Writing systems constitute such “experimental settings” no less than any clever within-language manipulation, since important variables such as phonological transparency, morphological saliency, and so forth, are systematically held constant within each system. Hence, data reported from different writing systems must be part of any hypothesized computational reading mechanism. Whether this approach will indeed result in a universal computational model of reading remains to be seen. Some commentaries expressed optimism, whereas others expressed pessimism. What seems to be uncontested is the merit of this approach for understanding the common cognitive or computational principles that govern reading behavior, as well as the inadequacy of modeling approaches which are based on one homogeneous linguistic system.
R6. A developmental approach to orthographic processing
A limitation of most current structured models of reading is that the benchmark effects they describe focus solely on the behavior of proficient readers. Hence, these models are end-state models: they are built to reproduce end-state behaviors. The disadvantage of this approach is that it considers where the reader is in terms of his/her behavior without considering how he/she got there. However, for our theory of reading to have sufficient explanatory adequacy – that is, to provide “why” answers – it must consider the data describing how behavior emerges and how it is learned. Insights can be gained mainly by focusing on the trajectory that connects a beginning state to an end-state. This trajectory provides us with critical data regarding what it is exactly that the reader learns to pick up from the orthography. This information should tell us something interesting about the mechanisms underlying orthographic processing. The “why” answers are hidden there. This is well explicated by Rueckl, who describes the merits of learning models. As Rueckl argues, a developmental learning perspective has the significant advantage of explaining the organization of the reading system rather than just stipulating it, as structured models do.
Goswami's review of developmental evidence regarding spelling acquisition in English provides illuminating examples supporting this approach. Goswami points to a large set of patterns of spelling errors by schoolchildren, demonstrating how the phonological space of English and its morphological structure are reflected in spelling errors and in the developmental trajectory of learning correct spelling. The many examples provided by Goswami demonstrate how developmental data on the print production of beginning readers lead to important insights regarding print processing in proficient readers, thereby demonstrating how the linguistic environment of English leads to an idiosyncratic language-specific strategy of orthographic processing. The same approach is echoed in Deacon's commentary, where she focuses on how reading experience shapes orthographic processing cross-linguistically, considering data from a variety of languages, such as English, Hebrew, Chinese, and Korean. This set of data brings Deacon to the same conclusion – a universal model of reading must involve a developmental perspective. A developmental approach is also the main message of Perea & Carreiras, who discuss a series of findings concerning brain plasticity as well as behavioral evidence, all demonstrating how letter-position flexibility develops with reading experience in European languages.
This has straightforward implications: To model the reader's end-state of orthographic processing, one should consider the information that has been picked up in the long process of literacy acquisition in a given linguistic environment. Each language presents to the reader a writing system that is characterized by a wide distribution of correlations. Some correlations determine the possible co-occurrences of letter sequences, which eventually result in establishing orthographic representations. Each writing system is also characterized by idiosyncratic correlations in the mapping of graphemes to phonemes, and these consistent correlations eventually result in the mapping of orthographic representations to phonological ones. In addition, writing systems are characterized by systematic correlations whereby letter clusters consistently convey features of semantic meaning that reflect morphological structure.
Ravid's commentary resonates very well with this triangular view. Like Goswami, Ravid reviews evidence from spelling rather than reading, and her spelling model is based on similar argumentation (see also Ravid 2012 for a detailed discussion). In languages where morphological variations often result in phonological variations, learning to spell cannot rely on a simple mapping of phonology to orthography, but has to draw on a triangular system in which phonological, morphological, and orthographic sublinguistic units are inter-correlated. In the process of learning to spell, what is acquired is a network of phono-morpho-orthographic statistical patterns, which are shaped by the idiosyncratic specificities of the language. This approach suggests that each language entails a differential tuning to statistical structure, given the language's idiosyncratic linguistic characteristics. By this view, native speakers who are proficient readers implicitly develop differential sensitivities to the statistical properties of their own language in the long process of literacy acquisition. Effects of letter transposition, as Perea & Carreiras demonstrate, indeed change with reading proficiency in European languages, but, just as well, they do not evolve in the same way in Semitic languages because of differences in how phonology, morphology, and orthography are interrelated.
All of these arguments lead to the suggestion that to model the end-state behavior of readers, one should have a clear theory of what has been learned by readers and how their linguistic environment has shaped their processing system to extract specific cues from the graphemic array. A model of orthographic processing, therefore, should be sensitive to the idiosyncratic developmental trajectory that characterizes readers in a given writing system, and, consequently, the model should be constrained by cross-linguistic developmental data.
R7. Descriptive adequacy of current implementations
As expected, some of the commentaries addressed my general critique of current models of visual word recognition, arguing for the descriptive adequacy of a given model or approach. Since, from the outset, the aim of the target article was not to offer an alternative implementation, but to discuss the general approach to modeling, the following discussion does not go into the architectural details of any specific model, but rather centers on its main working hypotheses and its descriptive adequacy.
Bowers presents a well-argued case for position invariance and for context-independent processing in letter identification. However, he correctly concedes that the challenge is indeed to develop a model in which positional uncertainty varies as a function of the linguistic environment. Note that, to some extent, Norris & Kinoshita's commentary has a similar flavor, arguing that primary perceptual processing is universally noisy, and that the processing demands of different languages then shape the noisy product to produce the cross-linguistic differences in letter-position flexibility. However, even if position-invariant letter identification is universal, the main constraint on any theory of reading is the combination of this invariance with language-specific processing demands. Thus, the architecture of any universal model of reading should be tuned to the linguistic factors that determine actual flexibility or rigidity regarding letter position, along with positional invariance.
Considering the SERIOL model and the open-bigram approach, the question then is not whether they can produce results for Hebrew root-derived words, as Whitney suggests. Open bigrams are perhaps well suited for Hebrew words because they encode the order of non-contiguous letters, and root letters are indeed non-contiguous. The critical question is whether the SERIOL model (Whitney 2001; Whitney & Cornelissen 2008) inherently produces differential flexibility and rigidity depending on the internal structure of words (Velan & Frost 2011). I agree with Bowers that the answer seems negative, given the nature of open bigrams. The solution that Whitney offers to overcome this problem and salvage her modeling approach is to insert inhibitory and excitatory connections of varying strength between bigrams and morphological units. This type of solution is rightly labeled by Rueckl as reverse engineering: the body of evidence regarding the processing of Hebrew root-derived words is identified, a lexical architecture and computational mechanism are then posited, they are evaluated in terms of their ability to generate the desired behavior, and finally they gain the status of theoretical explanations. Rueckl's commentary outlines very convincingly the dangers of this approach for understanding any complex phenomenon, and reading is no exception. His criticism, then, is right on target.
R7.1. Cracking the orthographic code
Both Bowers and Davis discuss the spatial coding model. All of the arguments provided by Davis and by Bowers regarding the need to solve the alignment problem are well taken. A theory of reading in alphabetic orthographies indeed has to furnish an adequate account of the commonality in processing letter strings that appear at different positions (“build” and “rebuild,” for example), given that the identification of letters cannot be bound to a specific position. My article, however, asserts that this is not the only phenomenon that has to be described and explained by the theory. The question is, then, whether a principled solution can be offered to account for data from different writing systems, and, if so, what are the blueprints for finding such a solution. On this issue, there seems to be a clear divergence between the approach I advocate here and the one suggested by Davis.
The main thrust of Davis's commentary is that for skilled readers, printed words are identified on the basis of orthographic information, and once words have been identified via their constituent letters, phonological and semantic information follows. This view of temporal modularity indeed leads to the conclusion that one has to first “crack the orthographic code,” as Davis suggests. Note that in the present context, temporal modularity (Andrews 2006) is not a pragmatic strategy for developing models (see Grainger & Hannagan). Rather, it reflects a theoretical stand regarding reading, and therefore merits careful scrutiny. What underlies Davis's approach is the assumption that orthographic processing is determined solely by the set of individual letters, which carry little linguistic information. This is perhaps the case for some languages such as English, but it is not a universal feature of orthographic systems. The main thrust of the present response article is that phonological, semantic, and morphological characteristics penetrate early orthographic processing to determine its outcome. Hence, in contrast to Davis's approach, semantic or phonological features are not the product of orthographic processing, but are componential factors that often determine its outcome. The distributional characteristics of individual Hebrew letters, for example, are correlated with the semantic meaning the letters carry, and therefore control on-line eye movements and early perceptual processes. Similarly, Kim et al. demonstrate how the linguistic characteristics of individual letters in Korean (the ambiguity in their assignment to onset, vowel, or coda slots) affect orthographic processing and consequently letter transposition. A universal model of reading, therefore, cannot assume that a similar orthographic code is cracked across writing systems and then serves as the basis for subsequent phonological and semantic activation.
Bowers suggests that the spatial coding scheme offered by Davis (2010) can in principle accommodate the range of TL effects across languages when parameters of position uncertainty are set to zero (the toy sketch below illustrates the pivotal role this parameter plays). However, again, setting the parameters of a model to a given value to accommodate desired results would inevitably lead us into the reverse-engineering trap described by Rueckl. The question at hand is whether a model of orthographic processing learns to simultaneously produce TL priming for European words, inhibition rather than facilitation for Hebrew-like words (e.g., Velan & Frost 2011), and then again TL priming for morphologically simple Hebrew words. Contra Davis, I am not confident that simple orthographic neighborhood density considerations would suffice. As Bowers notes, additional constraints need to be added to the spatial coding model to produce and simulate reading in Semitic languages, and only time will tell whether it will emerge as a viable universal model of reading. Similarly, once the benchmark effects used to assess the descriptive adequacy of a model include the differential sensitivity to letter position in different orthographies, given the internal structure of words, the promise of string kernel modeling, as suggested by Grainger & Hannagan, can be evaluated.
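Here is a deliberately simplified sketch of my own: a Gaussian position-weighting score, not the spatial coding model's actual equations, intended only as a caricature of how a single uncertainty parameter governs TL similarity:

```python
# Toy position-uncertainty similarity score (my own simplification, not the
# spatial coding model's equations): each prime letter contributes a Gaussian
# weight of its positional offset from the matching target letter.
import math

def pos_weight(i: int, j: int, sigma: float) -> float:
    if sigma == 0.0:
        return 1.0 if i == j else 0.0   # rigid slot coding
    return math.exp(-((i - j) ** 2) / (2 * sigma ** 2))

def match(prime: str, target: str, sigma: float) -> float:
    score = 0.0
    for i, p in enumerate(prime):
        weights = [pos_weight(i, j, sigma) for j, t in enumerate(target) if t == p]
        if weights:
            score += max(weights)       # best positional alignment per letter
    return score / max(len(prime), len(target))

print(match("jugde", "judge", sigma=1.0))  # TL prime stays similar (~0.84)
print(match("jugde", "judge", sigma=0.0))  # rigid coding: similarity drops (0.6)
```

With nonzero uncertainty the transposed-letter prime remains highly similar to its target, whereas a value of zero collapses the scheme into rigid slot coding; the question raised above is whether a model can learn the appropriate value from its linguistic environment rather than having it set by the modeler.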
R7.2. The promise of learning models
This steers our discussion toward the clear advantage of learning models in the search for a universal model of reading. I agree with Perea & Carreiras that hardwired structured models have the advantage of being simple models. However, whether they indeed advance us in understanding what must be learnt by the reader, as Davis suggests, is not at all evident. One could argue that it is actually the other way around. A hardwired model that does not stem from a comprehensive and general theory of reading is often structured to mimic the modeler's intuition about the source of the end-state behaviors of proficient readers. Thus, instead of telling us something about what readers actually learn, the model reveals the modeler's solution for computationally reproducing the reader's observed end-state behavior. When this solution is then presented as a behavioral explanation, we end up with the reverse-engineering pitfall of structured models described by Rueckl.
If the main source of constraints for our theory of reading is the learning trajectory of readers in various linguistic environments, then learning models obviously stand a much better chance of advancing our understanding of what is actually learnt by readers in a given writing system. Recent work by Baayen (under review) provides a good example. Using the framework of naïve discriminative learning (Baayen et al. 2011), Baayen compared the sensitivity to letter order and the costs of letter transposition in English versus biblical Hebrew, when strings of letters in the two languages (text taken from the book of Genesis, or a random selection of words from the phrase database of the British National Corpus) were aligned with their meanings. Baayen demonstrated that pairs of contiguous letters (which capture order information in naïve discriminative learning) carried a much greater functional load relative to single letters in Hebrew than in English, thereby confirming the greater sensitivity to letter order in Semitic languages. Moreover, the simulations revealed that the model captured the differential statistical properties of the two languages, resulting in much greater TL disruption in biblical Hebrew than in English.
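For readers unfamiliar with the framework, naïve discriminative learning estimates cue-to-meaning association weights with the Rescorla-Wagner learning rule, using single letters and contiguous letter pairs as cues. The sketch below is my own toy illustration of that mechanism, not Baayen's simulation; the six-word lexicon and its glosses are hypothetical. It shows why letter pairs acquire the discriminative load whenever single letters are ambiguous, as with anagrams.

```python
from collections import defaultdict

def cues(word: str):
    """Single letters plus contiguous letter pairs (the order-sensitive cues)."""
    return list(word) + [word[i:i + 2] for i in range(len(word) - 1)]

def train_rw(pairs, epochs=200, rate=0.01):
    """Rescorla-Wagner updates: each cue-outcome weight moves toward 1 for
    outcomes present on a learning trial and toward 0 for absent ones,
    in proportion to the prediction error."""
    weights = defaultdict(float)                 # (cue, outcome) -> weight
    outcomes = {meaning for _, meaning in pairs}
    for _ in range(epochs):
        for word, meaning in pairs:
            cs = cues(word)
            for o in outcomes:
                activation = sum(weights[(c, o)] for c in cs)
                target = 1.0 if o == meaning else 0.0
                delta = rate * (target - activation)
                for c in cs:
                    weights[(c, o)] += delta
    return weights

# Hypothetical toy lexicon: the anagrams "dog"/"god" share all single letters,
# so only the contiguous letter pairs can discriminate between their meanings.
pairs = [("ktb", "write"), ("ktbh", "writing"), ("mktb", "letter"),
         ("dog", "DOG"), ("god", "GOD"), ("dig", "DIG")]
w = train_rw(pairs)
print(sorted(((c, round(w[(c, "GOD")], 2)) for c in cues("god")),
             key=lambda item: -item[1]))        # pair cues outrank single letters
```

In such a miniature environment, the pair cues "go" and "od" end up carrying the weight for the meaning of "god," because the single letters are shared with "dog" and "dig"; this is the sense in which contiguous letter pairs acquire a greater functional load wherever anagrams are dense, as in Semitic orthographies.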
The results of recent preliminary computational work in our lab (Lerner & Frost, in preparation) are consistent with Baayen's results. We have shown that in a simple three-layer neural network, trained with the classical back-propagation algorithm to map the orthographic information of Hebrew and English words onto their meaning (as represented by COALS [correlated occurrence analogue to lexical semantics] vectors containing co-occurrence measures), TL words led to smaller activation of the output layer, where meaning is represented, than their corresponding real words; and this difference was far greater for Hebrew than for English. Thus, our results echo Baayen's findings with naïve discriminative learning. Unlike Baayen et al., we did not impose any a priori restrictions on the representation of serial order (i.e., no specific bigram representations were hardwired into the input), and our network could use order information in whatever way the algorithm required to accomplish learning. Our simple model therefore shows that the difference between the TL effects of Hebrew and English can emerge entirely from the different distributional statistics of the two orthographies. More relevant to the present discussion, these preliminary results demonstrate the promise of learning models in teaching us something important about how the linguistic environment shapes reading behavior.
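A schematic rendering of that setup may help. The sketch below follows the architecture just described (position-specific letter input, one hidden layer, a semantic output layer trained by back-propagation), but everything concrete in it is a stand-in: the four-letter toy vocabulary is hypothetical, and random vectors replace the COALS semantics. The measure of interest is how much a transposition lowers the match between the network's output and the word's semantic vector.

```python
import numpy as np

rng = np.random.default_rng(0)
LETTERS = "abcdefghijklmnopqrstuvwxyz"

def encode(word, length=4):
    """Position-specific one-hot letter slots; no bigram units are hardwired."""
    x = np.zeros(length * 26)
    for i, ch in enumerate(word):
        x[i * 26 + LETTERS.index(ch)] = 1.0
    return x

# Hypothetical toy vocabulary: an anagram-dense half and a sparse half
words = ["stop", "pots", "tops", "opts", "lamp", "fish", "herb", "junk"]
semantics = {w: rng.normal(size=10) for w in words}   # stand-ins for COALS vectors

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

W1 = rng.normal(0, 0.1, size=(40, 4 * 26))
W2 = rng.normal(0, 0.1, size=(10, 40))

for _ in range(2000):                     # plain back-propagation, squared error
    for w in words:
        x, t = encode(w), semantics[w]
        h = sigmoid(W1 @ x)
        y = W2 @ h                        # linear output layer
        err = y - t
        W2 -= 0.05 * np.outer(err, h)
        W1 -= 0.05 * np.outer((W2.T @ err) * h * (1 - h), x)

def semantic_match(probe, base):
    """Cosine match between the network's output and a word's semantic vector."""
    y = W2 @ sigmoid(W1 @ encode(probe))
    t = semantics[base]
    return float(y @ t / (np.linalg.norm(y) * np.linalg.norm(t)))

# TL disruption: the drop in semantic match after an interior transposition
for w in ("stop", "fish"):
    tl = w[0] + w[2] + w[1] + w[3]
    print(w, "->", tl, "TL cost:", round(semantic_match(w, w) - semantic_match(tl, w), 3))
```

Whether the anagram-dense half of such a toy vocabulary shows a larger TL cost than the sparse half depends on training details; the point of the sketch is only the method, namely measuring TL disruption at the semantic layer of a network whose input code leaves order information unstructured.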
R8. The universal model of reading and the Strong Phonological Theory (SPT)
Rastle raises an important point: the implications of the present theoretical approach for previous claims regarding the strong phonological theory (SPT) (Frost 1998). The driving argument of the SPT is that all human languages convey meaning through spoken words, and that the core of words' lexical representations is therefore phonological. On this view, the connection between spoken words and semantic meaning is the primary association formed in the process of language acquisition. The SPT accordingly claims that phonology is always implicated in visual word recognition and mediates the recovery of meaning from print. Note that in the context of reading universals, Perfetti (2011) has convincingly argued for a universal role of phonology in reading in any orthography. However, if writing systems aim to provide morphological information at the expense of phonological information, as I argue here, what then is the role of phonological representations in word recognition?
The theoretical construct that bridges the gap between the SPT and the present framework is the minimality constraint on lexical access assumed in the SPT (Frost 1998, p. 79), together with the impoverished and underspecified character of the phonological representations used for lexical access (pp. 80–81). According to the SPT, initial contact with the lexicon occurs through an interface of phonological access representations that are relatively impoverished or underspecified. This holds mainly for deep orthographies, in which morphological relatives often differ phonologically, as in the case of “heal” and “health.” Thus, according to the theory, the computation of phonology in deep orthographies such as English or Hebrew results in a coarse phonological representation in which vowel information is missing or underspecified. To reiterate, the precedence of morphology over phonology does not mean that morphological information is provided instead of phonological information, or that meaning is computed without any reference to phonology. Rather, morphological considerations dictate that the computed phonological information remains underspecified in the initial phase of lexical access. In a sense, what we have here is a morpho-phonological equilibrium.
R8.1. Morpho-phonological variations and phonological underspecification
Hebrew again can serve as a good example. What I have shown so far is that orthographic processing of letter sequences in Hebrew aims at extracting the letters that provide the highest diagnosticity in terms of meaning, that is, the letters belonging to the root. This was the basis for my claim that morphology, and therefore semantics, must be part of any universal model of reading, since morphology takes precedence over phonology in the evolution of writing systems. However, the core representation of roots in Hebrew is necessarily phonological, because native speakers acquire them through exposure to the spoken language. As more and more words with the same word pattern are perceived by the speaker, their repetitive phonological structure is acquired, and the salience of the three consonants of the root emerges. Speakers of Hebrew, therefore, have a phonological representation of root consonants onto which orthographic representations map. The three phonemes of the root are one side of the coin; the three corresponding consonant letters are the other. The tri-literal entity is in fact a tri-consonantal entity. This observation was confirmed long ago by Bentin and Frost (1987), who presented subjects with unpointed tri-literal consonantal strings (e.g., SFR) that could be read in more than one way by assigning different vowel configurations (e.g., sefer/safar). Bentin and Frost showed that lexical decision latencies for these heterophonic homographs were faster than latencies for any of the disambiguated pointed alternatives. These findings suggested that lexical access was based on the impoverished and underspecified representation shared by the different phonological alternatives (see also Frost 2003; Frost & Yogev 2001; Frost et al. 2003; Gronau & Frost 1997).
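The logic of access through a shared underspecified representation can be stated very compactly. In the toy sketch below (my own illustration, in Latin transliteration, with a hypothetical miniature lexicon), vowelized forms are indexed by their consonantal skeleton; an unpointed string such as SFR contacts a single access representation common to all its phonological alternatives, so no vowel disambiguation is required for lexical access.

```python
from collections import defaultdict

def skeleton(form: str) -> str:
    """Strip vowels, keeping only the consonantal access representation."""
    return "".join(ch for ch in form if ch not in "aeiou")

# Hypothetical toy lexicon of vowelized (pointed) forms, transliterated
lexicon = ["sefer", "safar", "gamal", "gomel"]

access = defaultdict(list)                # consonantal skeleton -> forms
for form in lexicon:
    access[skeleton(form)].append(form)

# The unpointed string contacts one underspecified entry shared by
# both of its phonological readings:
print(access["sfr"])                      # -> ['sefer', 'safar']
print(access["gml"])                      # -> ['gamal', 'gomel']
```

On this scheme, a heterophonic homograph is not a competition between two fully specified candidates but a single lookup in the underspecified code, which is one way to understand why it yields faster, not slower, lexical decisions.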
To summarize this point, the present theoretical framework is in line with the claim that phonological representations are the core lexical representations mediating word recognition. However, it significantly extends the original theory by incorporating morphology, with a predictable morphology–phonology trade-off. This trade-off determines a priori in which writing systems the mediating phonological representations will be fully specified and in which they will be underspecified. The main theoretical claims of the SPT are therefore maintained in the present framework, while the role of morphological structure, the intimate link between orthographic structure and the way phonological space represents meaning, and the treatment of orthographic structure as an equitable weighting of phonological and morphological information constitute important expansions of the original SPT.
R9. Summary and future directions
As expected, the present large number of commentaries brings with it a variety of opinions and fleshed-out disagreements, so that some fencing over theoretical stands is inevitable. Nevertheless, there is a surprising convergence of views on several key issues, which enables us to trace constructive directions for future reading research. Let me summarize these issues:
1. Overall, most commentaries agreed, in one way or another, with the main claim of the target article: that orthographic representations are the product (whether optimal or merely satisfactory) of the full linguistic environment of the reader, and that modeling orthographic processing requires considering the phonological space of the language and the way it conveys meaning through morphological structure.
2. There is wide consensus that cross-linguistic research should serve as a primary constraint for any theory or model of visual word recognition.
3. Quite a few commentaries suggested that an adequate theory of proficient reading has to be an acquisition theory, one that focuses on what readers pick up and learn from their linguistic environment. Modeling the end-state behavior of readers without considering the constraints of developmental data often remains incomplete.
4. A significant number of commentaries, whether explicitly or implicitly, referred to the theoretical importance of understanding the statistical properties embedded in a writing system for comprehending how it modulates eye movements or governs orthographic processing, whether in isolated word recognition or in sentence reading. These statistical relations go far beyond bigram or trigram frequency or orthographic neighborhood density: they concern the ortho-phono-morphological correlations of sublinguistic units.
These points of relative consensus should lead us to appreciate that any front-end implementation should be constrained primarily by what we know about the hidden cues packed into the orthography of a given writing system. As I have argued, the mapping and understanding of these cues are matters of empirical investigation, whether through the assembly of comparative brain evidence or of comparative developmental and behavioral data. Once the scope of these cues across writing systems is mapped and understood, a universal theory that focuses on the fundamental phenomena of reading can be formulated. This approach outlines a series of research questions that are by no means novel but perhaps gain greater salience in the current framework. Rueckl provides a series of important theoretical challenges for future reading research. In the following, I mention just two examples of research questions that resonate with these challenges, mainly for the sake of demonstration.
R9.1. Individual differences in statistical learning
Given the accumulating evidence tying the statistical properties of writing systems to processing strategies in visual word recognition, one challenge for reading research is to provide a comprehensive theory that directly links cognitive statistical-learning abilities with literacy acquisition. A main empirical question, then, concerns the dimensions underlying the human capacity to pick up correlations from the environment. Another concerns the predictive value of this capacity in determining the ease or difficulty of registering the subtle correlations that exist in a language between orthography, phonology, morphology, and meaning, thereby affecting reading performance. Thus, if individuals vary in their sensitivity to statistical information, these differences could have consequences for the speed of reading acquisition, the organization of the reading system, the ability to learn the statistical properties of another language, and the efficiency of processing orthographic information in a second language. Indeed, considerable work along these lines has already been conducted (e.g., Ahissar 2007; Banai & Ahissar 2009; Misyak & Christiansen 2012; Pacton et al. 2001). Expanding the scope of this research to include evidence from different writing systems could provide novel insight.
R9.2. Multilingualism and visual word recognition
Learning to read in more than one language requires extensive plasticity when contrasting structural properties of writing systems have to be assimilated. For example, Bialystok et al. (2005) have shown that the transfer of literacy skills is indeed easy when both languages have a similar writing system. However, when two languages present their readers with very different structural properties, the question at hand is how knowledge of the structural properties of one's native language, and the assimilation of its characteristic statistical regularities, hinders or facilitates learning the structural properties of a second language and its implicit statistical attributes.
To exemplify: Semitic languages are characterized by morphemic units that are noncontiguous, with roots and word patterns intertwined. Speakers and readers of Hebrew and Arabic must therefore develop an enhanced sensitivity to non-adjacent statistics. Subsequent exposure to European languages, however, presents these readers with a different form of statistical dependency, mainly adjacent dependencies (see the sketch following this paragraph). How does knowing the statistical properties of one's native language affect the assimilation of a different type of statistical regularity? Note that parallel questions have been raised from the perspective of the neural circuitry involved in language processing. For example, the work of Perfetti and colleagues (Liu et al. 2007; Perfetti et al. 2007; Tan et al. 2003) suggests two possible mechanisms for neuronal reorganization triggered by learning to read in a second language: assimilation, in the sense that the neural circuitry must pick up the new set of linguistic regularities characteristic of the new language, and accommodation, in the sense that the neural circuits involved in mapping orthography, phonology, and meaning must be modified to deal with the demands of reading in the new language, given its statistical structure. Thus, although the data presented so far clearly suggest that flexibility in orthographic processing characterizes the cognitive system, what requires further investigation are the rules that govern and constrain this flexibility, given exposure to multiple linguistic environments.
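The contrast between the two kinds of dependency is easy to make concrete. In the toy sketch below (my own illustration; the word patterns and transliterations are hypothetical), a Semitic-style root is interleaved with word patterns, so its letters recur at non-adjacent positions, whereas a concatenative English-style stem keeps its letters strictly adjacent across inflections.

```python
def interleave(root: str, pattern: str) -> str:
    """Slot root consonants into the 'C' positions of a word pattern
    (a toy rendering of Semitic nonconcatenative morphology)."""
    it = iter(root)
    return "".join(next(it) if ch == "C" else ch for ch in pattern)

def morpheme_gaps(word: str, morpheme: str):
    """Positional distances between successive letters of the morpheme."""
    pos = [word.index(ch) for ch in morpheme]
    return [b - a for a, b in zip(pos, pos[1:])]

root = "ktv"
patterns = ["CaCaC", "maCCeC", "hiCCiC", "CoCeC"]       # hypothetical patterns
semitic = [interleave(root, p) for p in patterns]        # katav, maktev, ...
concatenative = ["write", "writes", "writer", "writing"]

for w in semitic:
    print(w, morpheme_gaps(w, root))          # gaps > 1: non-adjacent dependencies
for w in concatenative:
    print(w, morpheme_gaps(w, "writ"))        # gaps == 1: adjacent dependencies
```

A reader tuned to the Semitic distribution must track dependencies at distance two or more, whereas the concatenative distribution rewards tracking immediate neighbors; this is precisely the contrast that a theory of cross-writing-system transfer has to explain.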
These two research questions exemplify potential directions that could lead toward a universal model of reading. As argued in several commentaries, the new age of orthographic processing has contributed important theoretical discussions of front-end computational solutions to reading research. These should now be harnessed to provide an adequate theory of the interaction of the reader with his or her linguistic environment. This approach is not only possible; it is also the only viable one for understanding reading.
ACKNOWLEDGMENTS
The preparation of this article was supported in part by the Israel Science Foundation (Grant 159/10 awarded to Ram Frost) and by the National Institute of Child Health and Human Development (Grant HD-01994 awarded to Haskins Laboratories, and Grant R01 HD 067364 awarded to Ken Pugh and Ram Frost). I am indebted to Asher Cohen and Jay Rueckl for their insightful comments in preparing this response.