We emphatically agree with Frost that linguistic diversity is a pre-requisite for the development of universally applicable models of reading. Indeed, we argue that linguistic diversity is a pre-requisite for any meaningful study of language, including research into the representation of language in the brain and its breakdown in brain lesions (aphasia). However, linguistically diverse data in visual word recognition are sadly lacking (Share 2008a). Similarly, in a recent review of aphasia research since the year 2000, we found a dramatic imbalance in the languages investigated (Beveridge & Bak 2011). A small number of closely related Western European languages accounted for the vast majority of aphasia literature: 62% of the articles investigated were in English, 89% in Germanic or Romance languages. In contrast, less than 8% of studies examined non–Indo-European languages.
Inspired by Frost's article, we revisited our database, focusing on writing systems and examining separately articles dealing with disorders of written language: alexia and agraphia. Languages using alphabetic writing were the focus of 94% of total studies and 89% of alexia/agraphia studies (Table 1).
Table 1. Number of studies in a major review of aphasia literature, by script type (proportions in parentheses)
Note: a = Arabic, Hebrew, Persian; b = Chinese; c = Japanese; d = Hindi, Bengali, Kannada.
We are currently extending the scope of this investigation by conducting a systematic review of language research in non-clinical populations. Here, too, we find a bias towards alphabetic scripts, albeit a less dramatic one: Languages using non-alphabetic scripts account for 113 of 750 studies (about 15%). The dominance of alphabetic scripts becomes more apparent when we examine the number of citations generated by each article: Of 181 studies with 10 or more citations, only 4 (about 2%) featured languages with non-alphabetic scripts (3 Hebrew, 1 Arabic). Moreover, within the non-alphabetic group, we encounter a very limited number of languages. At the time of writing, the two reviews encompassed 1,935 articles from 114 journals, yet between them they feature only nine languages with non-alphabetic scripts: Arabic, Bengali, Chinese, Hebrew, Hindi, Japanese, Kannada, Persian, and Thai. Of these, only four (Chinese, Hebrew, Japanese, and Thai) appear in the non-clinical review, with Chinese and Japanese accounting for 102 of 113 studies (about 90%).
Moreover, “non-alphabetic” scripts cannot be treated as a uniform group: They differ from one another at least as much as they differ from alphabetic ones. Their classification is complex and controversial; almost any term applied can be seen as inadequate (Daniels 1996b). The “logographic” Chinese script includes radicals coding for sounds. “Consonantal” scripts (abjad) are not entirely consonantal, since they include signs for long vowels. “Semi-syllabic” scripts of Ethiopia and India (abugida) are orthographically “alphabetic” (expressing vowels as well as consonants) but visually “syllabic” (characters arranged in the form of syllables). These semi-syllabic writing systems, used by hundreds of millions of people in Africa, South Asia, and South-East Asia, have been particularly neglected by researchers. Tellingly, they were not among the five example languages chosen by Frost.
Ironically, although we agree with Frost on the importance of cross-linguistic data, it is exactly on the basis of language comparison that we have to reject some of his examples. Far from being a uniquely English phenomenon, morphologically induced phonological alternations are among the most characteristic features of Indo-European languages. Accordingly, many languages developed a trade-off between phonological and morphological transparency. In German, for instance, the plural of Buch (book) is Bücher: the letter “ü” has a similar shape to the “u” of the singular, but the diacritic marks a different pronunciation. Similar phenomena can be observed in Polish (Bóg/Boga, nominative/genitive of “God”) or, expressed through an additional letter rather than a diacritic, in Italian (bianco/bianchi, masculine singular/plural of “white”). An example of related languages solving the same problem in different ways is provided by Welsh and Scottish Gaelic. Both languages are characterised by mutations (lenition) of initial sounds in certain environments (e.g., after possessives or prepositions). In Welsh, the spelling of the mutated word reflects its pronunciation and is, therefore, visually different from the non-mutated form (e.g., mawr/fawr, masculine/feminine of “big”). In Gaelic, however, the lenited form is expressed through the addition of “h,” making the resulting word morphologically, but not phonologically, transparent (mòr/mhòr, masculine/feminine of “big”; “mh” pronounced as “v”).
Hence, the conservatism of English spelling cannot be attributed to its unique morphology but rather to historical, social, and cultural factors (which also play a major role in other languages cited by Frost, such as Chinese and Japanese). Examples of political, religious, and ideological decisions determining the written form of a language can be found through centuries and across continents. A frequently cited case is the change of Turkish from Arabic to Latin script in 1928 (incidentally, the closely related Azeri language was written in Cyrillic in the Soviet Union and in Arabic script in Iran). Another example is the 19th-century change of Romanian from Cyrillic to Latin, motivated by nationalist ideology (Gheţie 1978). The new spelling was originally strongly etymologising, emphasising the language's Romance morphology (like Frost's English example), but progressive reforms led to today's nearly consistent surface phonemic system (like Frost's Finnish example).
In most cases, the relationship between phonology, morphology, culture, and orthography is not a one-way street. In India, most languages adopted a script including characters representing all sounds of Sanskrit. A notable exception is Tamil, in which voiced and voiceless stops (e.g., “g/k,” “b/p,” etc.) are written with the same character. This makes perfect sense within the rules of Tamil phonology, in which voiced and voiceless stops are allophones, with the pronunciation of each stop determined by its position within the word. Yet, the same phonological rule applied also to proto-Dravidian and, therefore, to other Dravidian languages such as Malayalam, Telugu, or Kannada (Steever 1998). However, unlike Tamil, these languages adopted the full Sanskrit inventory as well as many Sanskrit loanwords, in which sounds like “g” and “k” form distinct phonemes. The adoption of Sanskrit words encouraged a Sanskrit-based orthography, but equally, Sanskrit-based orthography facilitated Sanskrit borrowings. In contrast, both the orthographic and the lexical influence of Sanskrit are least pronounced in Tamil, which tends to emphasise its own cultural, political, and linguistic identity. We argue, therefore, that orthography is a product of a long and complex interaction of language structure with cultural environment and historical circumstance.
ACKNOWLEDGMENTS
We thank Bob Ladd and Martin Haspelmath for their helpful comments and advice.