Writing systems, even those that use the same letters, differ in how those letters combine. That is, the systems have different graphotactic patterns. For example, the letter combination ‹ff› may not appear at the beginnings of English words. In Welsh, in contrast, ‹ff› is found at the beginnings of a number of words, including ffa ‘beans.’ Bilinguals take advantage of the graphotactic differences between their languages to help determine the language to which a written word belongs (e.g., Jared, Cormier, Levy, & Wade-Woolley, Reference Jared, Cormier, Levy and Wade-Woolley2013; Vaid & Frenck-Mestre, Reference Vaid and Frenck-Mestre2002; van Kesteren, Dijkstra, & de Smedt, Reference van Kesteren, Dijkstra and de Smedt2012).
Graphotactic patterns differ across languages, but are there differences within a language? The experiments reported here test the hypothesis that English includes subsets of words that differ in some of their graphotactic patterns and that adults are sensitive to these differences. We do so by using a graphotactic choice task in which people see pairs of nonwords and are asked which one looks more like a word of their language. For example, English-speaking participants may be asked which of ffol or foll looks more wordlike. Children as young as 6 years of age perform above the level expected by random guessing with such pairs, and performance is virtually perfect by adulthood (Cassar & Treiman, Reference Cassar and Treiman1997). Such findings suggest that people possess not only lexical knowledge, or knowledge about the written forms of specific words, but also sublexical knowledge, or knowledge about orthographic patterns that cut across words. For example, readers of English know that words hardly ever begin with double consonants. People seem to pick up these patterns through implicit statistical learning, for the patterns are not usually explicitly taught (Chetail, Reference Chetail2017; Mano, Reference Mano2016; Samara & Caravolas, Reference Samara and Caravolas2014).
Experimental and modeling work on orthographic learning (e.g., Testolin, Stoianov, Sperduti, & Zorzi, Reference Testolin, Stoianov, Sperduti and Zorzi2016) has generally assumed that the vocabulary of a language is a monolithic system in terms of the graphotactic patterns that it follows. However, linguists have suggested that English is not a monolithic system. It includes what is often called a basic system together with several other systems that echo the spelling systems of other languages, and these systems do not always follow the same graphotactic patterns (Albrow, Reference Albrow1972; Carney, Reference Carney1994). Among the nonbasic systems of English, the foremost is what we call Latinate spelling (Albrow calls this System 2, contrasting it with the basic system that he calls System 1). Although both the basic and Latinate systems developed from ancient Latin spelling, Latinate spelling rolls back centuries of evolution to spell words as they would be spelled in Latin. For example, sell is spelled according to the basic system, but cell is spelled like the Latin word cella from which it was borrowed. Following Albrow, we place words that were borrowed from classical Greek, such as aphasia, in the Latinate system. Words that were formed recently from one or more Latin or Greek morphemes, such as phylogenetic, are also so classified. Latinate spelling is not usually applied to words that do not contain Latin or Greek morphemes (e.g., filling, giver), nor to Latin and Greek borrowings that have diverged too far from their classical pronunciations to be recognizable.
Because of their common origin, the basic and Latinate systems overlap substantially. However, several spelling patterns are used extensively in one system but rarely in the other. Carney (Reference Carney1994) has described these patterns as markers of the system to which a word belongs. For example, the following word onsets (the consonants preceding the first vowel) are mostly Latinate: ‹v› (verity), ‹x› (xanthine), ‹z› (zoology), ‹ph› (pharmacy), ‹mn› (mnemonic), ‹ps› (psychology), and ‹rh› (rhetoric). Other onsets are mostly basic, including ‹k› (king, skull), ‹w› (water, wrist, what, twin), ‹sh› (ship), and ‹y› (yeast). Some of these differences reflect phonological differences between the systems, while others arise when the same phoneme is spelled differently in different systems (e.g., ‹ph› for /f/ in Latinate words but ‹f› for /f/ in basic words). Latinate and basic words also differ in their morphological patterns. Because these differences are reflected in spelling, the spellings can serve as markers of system membership. For example, Latinate words are usually composed of several morphemes, and suffixes such as -al, -ic, -ity, -ive, and -ous attach very often to Latinate roots. These words tend not to accept the comparative and superlative suffixes -er and -est, and often the agentive suffix is spelled -or (actor). Conversely, words spelled on the basic pattern tend not to take stress-shifting suffixes such as -ic and tend to spell the agentive suffix as -er (runner). A final letter group such as ‹er›, therefore, can mark basic status, while final ‹ic› can mark Latinate status.
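The marker idea can be made concrete with a small sketch. The Python code below is an illustration only, not a procedure used in the study: it looks up a word's onset and final letters in marker lists copied from the examples in the preceding paragraph. The function names are hypothetical, the lists are far from complete, and onsets are matched exactly (so ‹sk› in skull is left unclassified even though it contains ‹k›).

```python
from typing import Optional, Tuple

# Marker lists taken from the examples in the text; deliberately incomplete.
LATINATE_ONSETS = {"v", "x", "z", "ph", "mn", "ps", "rh"}
BASIC_ONSETS = {"k", "w", "sh", "y"}
LATINATE_ENDINGS = ("al", "ic", "ity", "ive", "ous", "or")
BASIC_ENDINGS = ("er", "est")


def onset(word: str) -> str:
    """Return the consonant letters that precede the first vowel letter.

    Assumption: y counts as a consonant only in word-initial position.
    """
    word = word.lower()
    i = 0
    while i < len(word) and word[i] not in ("aeiou" if i == 0 else "aeiouy"):
        i += 1
    return word[:i]


def marker_status(word: str) -> Tuple[Optional[str], Optional[str]]:
    """Classify the onset and the ending as 'latinate', 'basic', or None."""
    word = word.lower()
    ons = onset(word)
    if ons in LATINATE_ONSETS:
        onset_status = "latinate"
    elif ons in BASIC_ONSETS:
        onset_status = "basic"
    else:
        onset_status = None

    if word.endswith(LATINATE_ENDINGS):
        ending_status = "latinate"
    elif word.endswith(BASIC_ENDINGS):
        ending_status = "basic"
    else:
        ending_status = None
    return onset_status, ending_status
```

Under these assumptions, marker_status("rhetoric") reports a Latinate onset and a Latinate ending, whereas marker_status("water") reports a basic onset and a basic ending. A string whose two labels disagree is a mismatch in the sense used in this article.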
The idea that English includes different subsystems is based on linguistic analyses (Albrow, Reference Albrow1972; Carney, Reference Carney1994) that are useful although not very detailed (Ryan, Reference Ryan2018). Little work has investigated whether people are sensitive to the sometimes subtle differences that have been identified in the linguistic studies (Kemp, Treiman, Blackley, Svoboda, & Kessler, Reference Kemp, Treiman, Blackley, Svoboda and Kessler2015; Treiman, Kessler, & Evans, Reference Treiman, Kessler and Evans2007). The results of Treiman, Kessler, and Evans (Reference Treiman, Kessler and Evans2007) can be interpreted to support the view that skilled readers are sensitive to differences between the Latinate and basic subsets of English. The participants in these experiments were asked to read aloud words and nonwords with initial ‹c› and ‹g› before ‹e› and ‹i›. Some items, such as gelid, ceph, and gilmous, included letter sequences that are common in words of Latin origin, such as ‹id›, ‹ph›, and ‹ous›. Other items, including cildoy and gemsbok, had graphotactic patterns that are typical of basic words. Participants were more likely to use the /s/ pronunciation of ‹c› and the /d͡ʒ/ pronunciation of ‹g› when pronouncing items with Latinate than basic spelling patterns. Conversely, pronunciations with /k/ and /ɡ/ were more common for items that were basic in appearance. These results may be interpreted to suggest that skilled readers use certain spelling patterns as cues that a word belongs to either the Latinate or the basic system of the vocabulary and that these cues influence the pronunciations of initial ‹c› and ‹g›.
Here we tested adults’ sensitivity to graphotactic differences between the Latinate and basic systems more directly, using a graphotactic choice task. In Experiments 1 and 2, we showed participants pairs of nonwords that differed only in their onsets and asked them which looked more like a word. For example, a pair might include an item with the Latinate onset ‹v› and an otherwise identical item with the basic onset ‹w›. If people are sensitive to graphotactic differences between the Latinate and basic systems, they should be more likely to choose the item with the Latinate onset when the items end with a Latinate sequence, such as ‹ic›, than when they end with a basic sequence. Such a result would suggest that people implicitly classify onsets and endings as Latinate or basic and that nonwords with a match (e.g., a basic onset with basic ending) appear more wordlike than those with a mismatch (e.g., a basic onset with Latinate ending).
In Experiments 2 and 3, we used the same strategy for medial consonants and endings. These experiments took advantage of the observation that double medial consonants, in addition to their role as indicators of phonology (e.g., dinner vs. diner), can serve as markers of subsystem membership (Berg, Reference Berg2016; Carney, Reference Carney1994; Evertz & Primus, Reference Evertz and Primus2013). Specifically, double consonants are not very common before ‹ic›, ‹id›, and ‹it›, which are associated with morphemes that often occur in Latinate words. Before the basic endings ‹er›, ‹est›, and ‹ing›, double consonants are rather common, especially when the ending is a morphological suffix and the stem ends with a consonant. If people use medial consonant doubling as a marker of basic status, then a nonword with a double medial consonant and a Latinate ending may appear less wordlike than a nonword with a double medial consonant and a basic ending.
EXPERIMENT 1
Participants saw pairs of nonwords such as phalid versus shalid and phalest versus shalest and were asked to select the more wordlike item from each pair. Both pairs present a choice between an onset that is associated with the Latinate system, ‹ph› in this example, and an onset that is more common in the basic system, here ‹sh›. We asked whether participants were more likely to choose the nonword with the Latinate onset as the more wordlike member of the pair when the ending was Latinate, as in phalid versus shalid, than when the ending was basic, as in phalest versus shalest. Such a result would support the hypothesis that people implicitly code onsets and endings for their status as Latinate or basic and that a letter string appears more wordlike if the status of the onset and ending match than if they do not.
Method
Participants
Thirty-one individuals (13 female) from the Washington University in St. Louis participant pool participated in exchange for course credit or pay. The participants’ mean age was 21.1 years. Almost all of the participants in this and the following experiments reported that English was their first language. The other participants had learned English in early childhood, and English was the first language they had learned to read and write. Given that the participant pool consisted primarily of students from a highly selective university, it is not surprising that participants’ mean standardized score on the reading subtest of the Wide Range Achievement Test (WRAT; Wilkinson & Robertson, Reference Wilkinson and Robertson2006), 118, was above national norms and that the variability was relatively small (SD=9.7, range 99–134).
Stimuli
We constructed 30 sets of experimental nonwords, each of which contained four nonwords. For example, one set included phalid, shalid, phalest, and shalest. All of the nonwords consisted of an onset followed by ‹a›, ‹e›, ‹i›, or ‹o›, a single consonant, and an ending that began with a vowel letter. Two items in each set had a Latinate onset, ‹ph› in the example, and the other two items had a basic onset of the same length, here ‹sh›. Across sets, there were 11 different Latinate onsets and 12 basic onsets. In line with the fact that Latinate words tend to be used less often in English than basic words, the Latinate onsets in the experiment were less frequent than the basic ones (p=.029 according to a two-tailed t test across the 30 sets for the difference in onset type counts using frequency data from Rubin, Reference Rubin1978; p=.013 for the difference in token counts). Two items in each set had a Latinate ending. We used ‹ic›, ‹id›, and ‹it› as the Latinate endings in this and the following experiments. The other items ended with a basic spelling sequence: ‹er›, ‹ing›, or ‹est›. The items in each set were arranged into two pairs, phalid versus shalid and phalest versus shalest in the example. When constructing experimental nonwords for this and the following experiments, we avoided as much as possible nonwords that could be transformed by a substitution of a single letter into a word that participants might know. We also attempted to avoid items that would sound like a word when pronounced using either the tense or the lax pronunciation of the first vowel. The experimental pairs are listed in the Appendix.
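The structure of one experimental set can be sketched in a few lines of Python. This is only an illustration of the design described above: the function name and argument names are hypothetical, and the sketch does not implement the additional screening for items that resemble real words.

```python
from typing import Dict, Tuple


def build_exp1_set(
    latinate_onset: str,
    basic_onset: str,
    middle: str,            # first vowel plus a single consonant, e.g. "al"
    latinate_ending: str,   # e.g. "ic", "id", or "it"
    basic_ending: str,      # e.g. "er", "ing", or "est"
) -> Dict[str, Tuple[str, str]]:
    """Return the two pairs of a set, keyed by ending type.

    Each pair is (item with Latinate onset, item with basic onset).
    """
    if len(latinate_onset) != len(basic_onset):
        raise ValueError("onsets in a set must have the same length")
    return {
        "latinate_ending": (
            latinate_onset + middle + latinate_ending,
            basic_onset + middle + latinate_ending,
        ),
        "basic_ending": (
            latinate_onset + middle + basic_ending,
            basic_onset + middle + basic_ending,
        ),
    }
```

Calling build_exp1_set("ph", "sh", "al", "id", "est") reproduces the example set: phalid versus shalid and phalest versus shalest.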
Sixty filler pairs were also constructed. Each filler pair included a nonword that followed the graphotactic constraints of English, such as foob, and a nonword of the same length that contained some or all of the same letters but that was illegal, such as ffob. The more wordlike item was printed on the left of the less wordlike item in half the filler pairs and on the right in the other half. Two additional pairs that were similar to the filler pairs were constructed as practice pairs. The more wordlike item was on the left in one practice pair and on the right in the other. The filler and practice pairs are listed in the Appendix.
The 60 experimental pairs were mixed with the 60 filler pairs for presentation to participants. Three different pseudorandom orders were prepared, and approximately one-third of the participants were assigned to each order. Within each order, no more than two experimental pairs or two filler pairs appeared in sequence. Experimental pairs from the same set were separated by at least 5 pairs. Whether the item with the Latinate onset was on the left or right of its mate was randomly determined for each experimental pair in each order. The items were printed on paper for presentation to participants.
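The two ordering constraints can be expressed as a simple check. The Python sketch below is not the procedure that was used to prepare the lists; it only verifies a candidate order. The trial representation and function name are hypothetical, and "separated by at least 5 pairs" is read here as at least five intervening pairs, which is an assumption.

```python
from typing import Dict, List, Optional, Tuple

# A trial is (kind, set_id): kind is "exp" or "fill"; set_id identifies the
# experimental set and is None for filler pairs.
Trial = Tuple[str, Optional[int]]


def order_is_valid(order: List[Trial], max_run: int = 2, min_gap: int = 5) -> bool:
    """Check the run-length and same-set spacing constraints for one order."""
    run = 0
    prev_kind = None
    last_pos: Dict[int, int] = {}
    for pos, (kind, set_id) in enumerate(order):
        # No more than max_run pairs of the same kind in sequence.
        run = run + 1 if kind == prev_kind else 1
        if run > max_run:
            return False
        prev_kind = kind
        # Pairs from the same set need at least min_gap pairs between them.
        if kind == "exp" and set_id is not None:
            if set_id in last_pos and pos - last_pos[set_id] - 1 < min_gap:
                return False
            last_pos[set_id] = pos
    return True
```

A candidate order produced by shuffling could be passed through order_is_valid and reshuffled or repaired until the check succeeds.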
Procedure
Participants were tested individually. They were told that they should circle the item in each pair that looked more like a “normal word” of English. The experimenter provided feedback about participants’ performance on the practice items and answered any questions that the participant had. The test pairs were then presented. Finally, participants were given the reading subtest of the WRAT (Wilkinson & Robertson, Reference Wilkinson and Robertson2006).
Results
The data and analysis scripts for this and the other experiments are available at https://osf.io/xmyjs/?view_only=7319236d50974ddd9b910b1edf992166. The mixed-model analyses that we report were carried out using the lme4 package (Bates, Mächler, Bolker, & Walker, Reference Bates, Mächler, Bolker and Walker2015) in R version 3.5.0 (R Core Team, 2018).
On the filler pairs, participants chose the graphotactically legal item as more wordlike over 99% of the time. Of most interest is participants’ performance on the experimental pairs. As Table 1 shows, participants were more likely to judge that the item with the Latinate onset was more wordlike than the item with the basic onset when the ending was Latinate than when the ending was basic. This impression was confirmed by a mixed-effect model analysis that was conducted with data at the trial level and that used a logit link function, given that the dependent variable was binary (1=Latinate onset choice, 0=basic onset choice). The analysis included data from 1,859 experimental trials; 1 trial on which a participant failed to give a response was not included. The fixed factor was ending type (Latinate vs. basic), and the random effects included random intercepts for participants and item sets. Random slopes for item sets as a function of ending type were also included because their inclusion significantly improved the model’s fit according to a likelihood ratio test. The effect of ending type was statistically significant (β=0.39, SE=0.17, p=.022).
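For readers who prefer to see the model written out, one way to express the analysis just described is given below. The notation is ours and is a sketch under stated assumptions: the coding of ending type as 0/1 and the correlation between the set-level intercept and slope are assumptions, not details reported above.

```latex
% y_{ps} = 1 if participant p chose the Latinate-onset item for set s, else 0
% x_{ps} = 1 for a Latinate ending, 0 for a basic ending (assumed coding)
\[
\operatorname{logit}\,\Pr(y_{ps} = 1)
  = \beta_0 + u_p + v_s + (\beta_1 + w_s)\,x_{ps},
\]
\[
u_p \sim \mathcal{N}(0, \sigma_u^2), \qquad
\begin{pmatrix} v_s \\ w_s \end{pmatrix}
  \sim \mathcal{N}\!\left(\mathbf{0}, \boldsymbol{\Sigma}\right).
\]
```

Here \(u_p\) is the random intercept for participant \(p\), \(v_s\) and \(w_s\) are the random intercept and the random slope for ending type for item set \(s\), and \(\beta_1\) is the fixed effect of ending type reported in the text.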
Discussion
When shown a nonword with a Latinate onset and a nonword with a basic onset and asked which looked more like a word of English, participants were more likely to pick the item with the Latinate onset when the ending was Latinate than when it was not. That is, the choice between a Latinate initial letter or letter group (e.g., ‹v› or ‹ph›) and a basic initial letter or letter group (e.g., ‹w› or ‹sh›) was influenced by the letters at the end of the item, a long-distance effect similar to that seen by Treiman, Kessler, and Evans (Reference Treiman, Kessler and Evans2007) for initial ‹c› and ‹g›. Participants were generally less likely to choose Latinate onsets than basic onsets, probably reflecting the lower frequency of the Latinate onsets and a dispreference for less common onsets as compared to more common ones. However, the higher rate of Latinate onset choices when the ending was Latinate as compared to basic supports our hypothesis that people implicitly classify onsets and endings as Latinate or basic. A letter string in which the onset and the ending are in the same category appears more wordlike than one with a mismatch.
EXPERIMENT 2
Experiment 2 went beyond Experiment 1 by including items with double medial consonants. Thus, in addition to pairs such as vomic versus womic and voming versus woming, there were pairs such as vommic versus wommic and vomming versus womming. The first goal of Experiment 2 was to replicate Experiment 1’s finding that items with Latinate onsets (e.g., ‹v›) were considered more wordlike when the ending was Latinate (e.g., ‹ic›) than when it was basic (e.g., ‹ing›). A new question was whether people considered items with Latinate onsets to be less wordlike when the middle consonant was doubled than when it was not. Such a difference would be expected based on the previously mentioned observation that words with the Latinate finals ‹ic›, ‹id›, and ‹it›, such as phonic, valid, and credit, do not usually have double medial consonants (Berg, Reference Berg2016; Carney, Reference Carney1994). In addition, participants in a previous study of spelling production tended to use single rather than double medial consonants when they ended their spellings with one of these Latinate sequences (Treiman & Boland, Reference Treiman and Boland2017). A medial double consonant may be a marker that an item belongs to the basic subset of the vocabulary rather than the Latinate subset, and this may discourage choices of Latinate onsets such as ‹v› and ‹ph›.
Another way in which Experiment 2 went beyond Experiment 1 was in the inclusion of a pronunciation task. In this task, which participants performed after the graphotactic task, they were asked to pronounce the pairs of items from the graphotactic task. Consider a participant who pronounced the first vowels differently in voming and woming. If the participant’s choice in the graphotactic task were influenced by pronunciation, we might find different results for this item than for items that were pronounced alike from the first vowel on. If the results in the graphotactic task do not vary with pronunciation, we can be more confident that participants focused on the appearance of the items in the graphotactic task, as they were instructed to do.
Method
Participants
We tested 30 people (22 female) from the same pool as Experiment 1. Their mean age was 20.1 years, and their mean standardized score on the reading subtest of the WRAT (Wilkinson & Robertson, Reference Wilkinson and Robertson2006) was 116 (SD=10.2, range 98–138).
Stimuli
We constructed 30 sets of experimental nonwords, each with eight items. The items in a set formed four pairs. The nonwords in a pair differed only in their onsets, one item having a Latinate onset and the other item having a basic onset with the same number of letters. The onsets were the same as in Experiment 1. As in Experiment 1, the Latinate onsets were less common in English than the basic onsets across the 30 sets of items (p=.014 according to a two-tailed t test for the difference in onset type counts using frequency data from Rubin, Reference Rubin1978; p=.011 for the difference in token counts). In two of the pairs in a set, both members had a Latinate ending. There was a single medial consonant in one pair and a double of that consonant in the other. In the other two pairs, both items had a basic ending. Again, one pair had a single medial consonant and the other had a double consonant. We used the same first-syllable vowels and the same endings as in Experiment 1, and we avoided medial consonants that rarely double, such as ‹v›. Table 2 shows a sample pair of each type and the Appendix provides a full list.
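The eight items in a set result from crossing onset type, ending type, and medial consonant type. The Python sketch below illustrates that crossing; the function and argument names are hypothetical, and the sketch does not reproduce the screening of items against real words.

```python
from itertools import product
from typing import Dict, Tuple


def build_exp2_set(
    latinate_onset: str,
    basic_onset: str,
    vowel: str,
    consonant: str,
    latinate_ending: str,
    basic_ending: str,
) -> Dict[Tuple[str, str], Tuple[str, str]]:
    """Return four pairs keyed by (ending type, medial consonant type).

    Each pair is (item with Latinate onset, item with basic onset).
    """
    if len(latinate_onset) != len(basic_onset):
        raise ValueError("onsets in a set must have the same number of letters")
    endings = [("latinate", latinate_ending), ("basic", basic_ending)]
    medials = [("single", consonant), ("double", consonant * 2)]
    pairs = {}
    for (ending_type, ending), (medial_type, medial) in product(endings, medials):
        body = vowel + medial + ending
        pairs[(ending_type, medial_type)] = (latinate_onset + body, basic_onset + body)
    return pairs
```

For example, build_exp2_set("v", "w", "o", "m", "ic", "ing") yields vomic versus womic, vommic versus wommic, voming versus woming, and vomming versus womming.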
The choice task included 120 filler pairs, 60 of which were the same as in Experiment 1. The other 60 fillers, which are listed in the Appendix, were similar in design but longer. The more wordlike item appeared on the left of the less wordlike item in half of the filler pairs and on the right in the other half. The practice pairs were the same as in Experiment 1.
The experimental pairs were divided into two blocks such that the pairs with single and double medial consonants from a given set were in different blocks. Each block included an equal number of pairs with single and double medial consonants. The filler pairs were randomly divided into two equal-sized blocks. For each block of the choice task, three different pseudorandom orders were prepared in the same way as in Experiment 1. One-third of the participants were assigned to each order, and the order of the blocks was balanced across participants.
The pronunciation task used the same experimental pairs as the choice task, with no filler pairs. For each participant, the order of the experimental pairs for the pronunciation task was the same as that in the graphotactic task.
Procedure
Participants were tested individually, beginning with the graphotactic task for one block of items. The items were presented on paper, as in Experiment 1, and the instructions for the graphotactic task were the same as in Experiment 1. After completing the graphotactic task for the first block of items, participants performed the pronunciation task for this block. In this task, they were asked to pronounce the items in each pair the way they thought they would be pronounced if they were “normal words” of English. An experimenter with phonetic training determined whether the pronunciations of the items in each pair rhymed (i.e., were alike from the first vowel on). Participants were then given the reading subtest of the WRAT (Wilkinson & Robertson, Reference Wilkinson and Robertson2006). After a break of about 5 min, participants did the graphotactic task and the pronunciation task for the second block of items.
Results
Participants chose the graphotactically legal item over 98% of the time on the filler pairs of the graphotactic task. Table 2 provides information about the proportion of experimental pairs in which participants selected the item with the Latinate onset as more wordlike than the item with the basic onset. We conducted a mixed-effect analysis with the fixed factors ending type (Latinate vs. basic) and medial consonant type (single vs. double). The model also included a factor for whether the participant produced rhyming pronunciations for the items in a pair, as they did for the large majority of pairs (94%). The model included random intercepts for participants and item sets and, because their inclusion significantly improved the fit, random slopes for item sets as a function of ending type. Data from 3,593 experimental trials were included in the analysis; 4 trials on which participants failed to respond and 3 trials on which they circled both items in a pair were not included. There was a significant effect of ending type (β=0.39, SE=0.11, p<.001). This occurred because, as in Experiment 1, participants were more likely to select the item with the Latinate onset as more wordlike than its mate when the ending was Latinate than when it was basic. The effect of medial consonant type was also significant (β=0.25, SE=0.08, p=.002). This effect arose because participants were more likely to select the item with the Latinate onset when there was a single consonant in the middle than when there was a double consonant. There was no significant effect of rhyming pronunciation. The addition of interactions between the variables did not significantly improve the fit of the model.
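Because the coefficients are on the logit scale, it may help to see how they relate to odds ratios and p values. The Python sketch below assumes that the reported p values are two-tailed Wald tests (coefficient divided by standard error, referred to a standard normal distribution); that is an assumption on our part. The function name is hypothetical, and because the inputs are the rounded values from the text, the outputs are approximate.

```python
import math
from typing import Dict


def wald_summary(beta: float, se: float) -> Dict[str, float]:
    """Wald z, two-tailed normal p value, and odds ratio for a logit coefficient."""
    z = beta / se
    # Two-tailed p under a standard normal: P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return {"z": z, "p": p, "odds_ratio": math.exp(beta)}


ending_type = wald_summary(0.39, 0.11)       # ending type, Experiment 2
medial_type = wald_summary(0.25, 0.08)       # medial consonant type, Experiment 2

for name, result in [("ending type", ending_type), ("medial type", medial_type)]:
    print(name, {key: round(value, 4) for key, value in result.items()})
```

Under this assumption, the ending-type coefficient corresponds to an odds ratio of about 1.48 with p below .001, and the medial-consonant coefficient to an odds ratio of about 1.28 with p of about .002, in line with the values reported above.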
Discussion
In Experiment 2, as in Experiment 1, participants’ choices of Latinate versus basic onsets were influenced by the spelling sequences at the ends of the items. For example, participants were more likely to pick an item with an initial ‹v› as looking more wordlike than an item with an initial ‹w› when the ending was the Latinate ‹ic› than when it was the basic ‹ing›. A new finding is that double medial consonants discouraged choices of Latinate onsets. This outcome is consistent with linguists’ observation that words of Latin origin, such as comic, often have single consonants in contexts where doubling would be likely to occur in basic words (Berg, Reference Berg2016; Carney, Reference Carney1994). Another new finding is that participants’ responses in the graphotactic task did not vary according to whether or not they produced rhyming pronunciations of the items in a pair. This result suggests that participants based their decisions on graphotactics, as instructed, rather than on pronunciation. Overall, the results of Experiment 2 suggest that skilled readers implicitly categorize certain onsets, medial elements, and endings as Latinate or basic and that nonwords in which elements in different positions match appear more wordlike than those with mismatches.
EXPERIMENT 3
Experiment 3 was designed to provide further evidence about whether people use medial consonant doubling as a marker of the system of the vocabulary to which an item belongs. Participants in this experiment saw pairs of nonwords, one with a single medial consonant and one with a double consonant, and judged which member of each pair looked more like a word of English. We asked whether participants were less likely to pick the item with the double medial consonant when the ending was Latinate, as with chabic versus chabbic, than when the ending was basic, as with chabest versus chabbest. We included a pronunciation task, as in Experiment 2, so that we could test whether the results in the graphotactic task differed as a function of whether participants produced the same pronunciations for the items in a pair.
Method
Participants
The participants were 32 individuals (19 female) from the same population as the previous experiments. The participants’ mean age was 19.6 years, and their mean standardized score on the WRAT reading subtest (Wilkinson & Robertson, Reference Wilkinson and Robertson2006) was 115 (SD=10.2, range 92–138).
Stimuli
We constructed 30 sets of four experimental nonwords each. The nonwords in each set formed two pairs. The items in a pair differed only in that one had a single medial consonant and the other had this same consonant repeated twice. One pair in each set ended with one of the Latinate spelling sequences used in the previous experiments and the other ended with one of the basic sequences. For example, one set of items included the pair chabic versus chabbic, with Latinate endings, and the pair chabest versus chabbest, with basic endings. We used the same first-syllable vowels as in the previous experiments, and we again avoided medial consonants that rarely double, such as ‹v›. The experimental pairs are listed in the Appendix. The filler and practice pairs for the graphotactic task were the same as in Experiment 1, and the items were organized into lists following the same procedures. As in the other experiments, the items were presented on paper to participants.
The pronunciation task used the same experimental pairs as the graphotactic task, omitting the filler pairs. Participants received the experimental pairs in the same order as in the graphotactic task, except for two participants who mistakenly received a different order.
Procedure
Participants completed the choice task and, after a break of about 5 min, the pronunciation task. The procedures for these tasks were the same as in the previous experiments. The reading subtest of the WRAT (Wilkinson & Robertson, Reference Wilkinson and Robertson2006) was given last.
Results
On the filler pairs of the graphotactic task, participants chose the legal nonword over 98% of the time. Table 3 shows the results for the experimental pairs. We conducted a mixed-model analysis with the fixed factor ending type (Latinate vs. basic) and whether the participant pronounced the items in the pair alike in the pronunciation task. Pronunciations were the same in 42% of cases; variations usually involved whether the first vowel was pronounced as tense (e.g., /e/ for ‹a›) or lax (e.g., /æ/ for ‹a›). The model had random intercepts for participants and item sets and, because their inclusion significantly improved the fit, random slopes for participants and item sets as a function of ending type. Data from 1,920 experimental trials were included. The effect of ending type was significant (β=1.18, SE=0.26, p<.001). There was no significant effect of pronunciation sameness, and adding the interaction between ending type and pronunciation sameness did not significantly improve the fit of the model. The lack of effects of pronunciation sameness is consistent with the idea that participants responded on the basis of graphotactic acceptability, as instructed, and were not influenced by pronunciation.
Discussion
The most important finding of Experiment 3 is that participants’ judgments about the graphotactic acceptability of single medial consonants as compared to double consonants were influenced by whether the ending spelling sequence was Latinate or basic. Participants picked single consonants only a quarter of the time when the ending was the basic ‹er›, ‹ing›, or ‹est›. Items with single medial consonants were significantly more likely to be chosen when the ending was Latinate than when it was basic.
The results add to our knowledge about the conditions under which people consider doubling of letters to be acceptable. Previous studies have shown that people are sensitive to the frequencies with which particular letters double and the positions within an item in which doubling occurs (Cassar & Treiman, Reference Cassar and Treiman1997; Pacton, Perruchet, Fayol, & Cleeremans, Reference Pacton, Perruchet, Fayol and Cleeremans2001). According to previous studies, people are also sensitive to the letter sequences that may precede double consonants. Word-final double consonants in English are rarely preceded by sequences of more than one vowel letter, for example, and the participants tested by Hayes, Treiman, and Kessler (Reference Hayes, Treiman and Kessler2006) showed a knowledge of this pattern when they judged items like vaiff as less wordlike than items like vaif. Double consonants in French are rarely preceded by single consonants, and the participants tested by Pacton, Sobaco, Fayol, and Treiman (Reference Pacton, Sobaco, Fayol and Treiman2013) showed a knowledge of this pattern when they judged items like aprrulir as less wordlike than those like apprulir. The present findings extend the earlier findings by showing that judgments about the acceptability of double consonants are also influenced by the letters that follow the consonants. Studies of readers’ sublexical knowledge have often focused on the frequencies of letters and digrams (e.g., Carrillo & Alegría, Reference Carrillo and Alegría2014; Chetail, Reference Chetail2017), but the context in which an element occurs also affects its graphotactic acceptability.
GENERAL DISCUSSION
People possess lexical knowledge, or knowledge about the spellings of specific words, as well as knowledge about sublexical patterns that extend beyond individual words. Previous discussions of sublexical patterns have implicitly assumed that they apply across the vocabulary of a language. However, linguists’ analyses of the English writing system (Albrow, Reference Albrow1972; Carney, Reference Carney1994) suggest that some patterns are more common in some sets of words than others. The present study focused on the two main subsystems of English identified by linguists, basic and Latinate. We asked whether adults are sensitive to the sometimes subtle differences between these two systems in their graphotactic patterns.
The results of Experiments 1 and 2 support the idea that skilled adult readers use certain onset and ending spellings as markers of Latinate versus basic status. A letter string with an onset and an ending that suggest the same system thus appears more wordlike than a letter string with a mismatch. The results of Experiments 2 and 3 suggest that single and double medial consonants are also used as markers. People’s judgments about whether an item with an initial ‹v› looks more wordlike than an item with an initial ‹w› or whether an item with a medial double consonant looks more wordlike than one without are thus influenced by letters in other positions of the items. The findings suggest that at least some of the markers of a word’s status that have been identified in linguistic studies of the English writing system are used by skilled readers. In addition, such effects can occur even when the letters involved are not adjacent to one another. Treiman, Kessler, and Evans (2007) found long-distance dependencies reflecting Latinate or basic status for items beginning with ‹c› and ‹g›, and the present results show that such effects are not restricted to these cases.
The differences within the English vocabulary complicate the task of learning about the English writing system in some ways. For example, people need to learn not only that ‹n› may double in the middles of words but also that it is more likely to double before some spelling sequences than before others. Conditional patterns of this sort are more difficult to learn than simpler patterns (e.g., Samara & Caravolas, 2014). However, learning about the subsystems of the English vocabulary offers benefits to users of the language. It may help them to pronounce written words, for example, as the pronunciation of a letter or letter group may differ according to whether the word in which it appears is Latinate or basic. Indeed, people seem to consider factors related to word origin in pronunciation and spelling tasks (Kemp et al., 2015; Treiman, Kessler, & Evans, 2007).
Graphotactic patterns are linked to phonological and morphological ones, meaning that some of the results reported here may reflect a constellation of cues, not only graphotactic ones. For example, people may be more likely to choose initial ‹v› over initial ‹w› for items that end with ‹ic› than for items that end with ‹ing› because they associate ‹v› with the final letter sequence ‹ic› or with a set that includes words with this and other finals. Alternatively, or in addition, people may associate certain phonological onsets with certain phonological endings. The results of Experiments 2 and 3 show that graphotactics can play a role beyond phonology, but additional research will be needed to further examine the influences of different cues.
People probably learn about the distinctions between Latinate and basic words in a variety of ways. For example, they may learn that certain spelling patterns, morphological structures, and pronunciations are more frequent among words that are characteristic of scholarly and scientific communication, among words that are learned at later ages, and among words that are considered appropriate in formal situations (Levin & Novak, 1991). The participants in our studies, who were highly skilled readers in a university setting, would have had many opportunities to learn about academic language and how it differs from less formal language. Future studies could test less skilled readers and younger students, who are less familiar with the Latinate vocabulary (e.g., Bar-Ilan & Berman, 2007) and who may show different results.
Although more research is needed, our results suggest that skilled users of English, even those who know no language other than English, are in some sense not monolingual. Just as people who are bilingual in English and Welsh know that written words in these two languages have some spelling differences, so skilled users of English know that their language includes sets of words that are graphotactically different in some ways.
APPENDIX
EXPERIMENT 1
Experimental pairs (item with Latinate onset shown first within each pair, sets of pairs separated by semicolons): dranid dwanid, dranest dwanest; drebid dwebid, dreber dweber; mnalic wralic, mnalest wralest; mnocit wrocit, mnocing wrocing; mnolit wrolit, mnolest wrolest; phalid shalid, phalest shalest; phomit shomit, phomer shomer; phradic spradic, phradest spradest; phrebit shrebit, phrebing shrebing; phremic spremic, phreming spreming; psemic swemic, psemest swemest; psomit swomit, psoming swoming; rhebit wrebit, rhebing wrebing; rhecid whecid, rhecer whecer; rhocid whocid, rhocer whocer; scevit skevit, scevest skevest; scodit skodit, scoder skoder; trebic twebic, trebing twebing; trevic twevic, treving tweving; trizit twizit, trizing twizing; vecic wecic, vecer wecer; vefic wefic, vefing wefing; vobid wobid, vobest wobest; xevit yevit, xever yever; xitid yitid, xiting yiting; xizid yizid, xizer yizer; zemid kemid, zemer kemer; zetic ketic, zeter keter; zevic kevic, zevest kevest; zibid kibid, zibest kibest.
Filler pairs (correct choice shown first within each pair): bleem bbllm, blex bxle, blit lbti, chead chzdh, cleep clllp, clent tnelc, clun nucl, darp dqrp, delf fdlf, dreet dtree, flant tnafl, foit ftti, glorb glbrb, gort gwtt, grem rgme, grike gkkke, jint jjjt, mern mcrn, moag mgao, nalp nlap, ploar ppplr, prap rpap, slarm srrrm, smeck mskce, spup suuu, stoff stttf, thaze hgzae, thoag gthth, treb ttwb, wilk iklw.
Practice pairs (correct choice shown first within each pair): tife oeeeb, barp bnpz.
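The experimental-pair listings in this appendix follow a regular layout: sets are separated by semicolons, pairs within a set by commas, and the two items of a pair by a space. As an illustration only, the following Python sketch shows one way such a listing might be read into a nested structure. The function name parse_experimental_pairs and the sample string are hypothetical and are not part of the materials or procedure used in the experiments.

```python
from __future__ import annotations


def parse_experimental_pairs(listing: str) -> list[list[tuple[str, str]]]:
    """Split a listing into sets; each set is a list of (first, second) pairs.

    Hypothetical helper: assumes semicolons separate sets, commas separate
    pairs, and a space separates the two items of a pair.
    """
    sets = []
    for set_text in listing.strip().rstrip(".").split(";"):
        pairs = []
        for pair_text in set_text.split(","):
            items = pair_text.split()
            if len(items) != 2:
                # A stray comma or semicolon leaves a fragment that is not a pair.
                raise ValueError(f"expected two items, got {pair_text!r}")
            pairs.append((items[0], items[1]))
        sets.append(pairs)
    return sets


# Short sample in the appendix layout (first two sets of Experiment 1).
sample = "dranid dwanid, dranest dwanest; drebid dwebid, dreber dweber."
parsed = parse_experimental_pairs(sample)
```

Because the sketch rejects any fragment that does not contain exactly two items, it could also serve as a rough check for punctuation slips in a transcribed listing.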
EXPERIMENT 2
Experimental pairs with single medial consonants (item with Latinate onset shown first within each pair, sets of pairs separated by semicolons): dranic dwanic, dranest dwanest; drebid dwebid, dreber dweber; mnalic wralic, mnalest wralest; mnolit wrolit, mnolest wrolest; phabid shabid, phabest shabest; phebid shebid, phebing shebing; phomit shomit, phomer shomer; phradic spradic, phradest spradest; phranit shranit, phraning shrining; phremic spremic, phreming spreming; phrobid shrobid, phrobing shrobing; psebit swebit, psebest swebest; psemic swemic, psemest swemest; rhemit whemit, rhemer whemer; rhinit wrinit, rhining wrining; rhonid whonid, rhoner whoner; rhonit wronit, rhoning wroning; scedit skedit, sceder skeder; scelit skelit, scelest skelest; tridid twidid, triding twiding; tromic twomic, troming twoming; vamic wamic, vamer wamer; vobid wobid, vobest wobest; vomic womic, voming woming; xanit yanit, xaner yaner; xilid yilid, xiler yiler; xitid yitid, xiting yiting; zemid kemid, zemer kemer; zetic ketic, zeter keter; zibid kibid, zibest kibest.
Experimental pairs with double medial consonants: same as experimental pairs with single medial consonants except that medial consonant is doubled.
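The derivation of the double-consonant items from the single-consonant items can be sketched in a few lines of Python. This is a minimal illustration, not the procedure actually used to build the stimuli: the function name double_medial_consonant is hypothetical, and the sketch assumes that every experimental item consists of a stem ending in a single consonant followed by one of the six endings that appear in the experimental items (ic, id, it, est, er, ing).

```python
# Endings used in the experimental items; the three-letter endings are
# listed first so that they are matched before the two-letter ones.
ENDINGS = ("est", "ing", "ic", "id", "it", "er")
VOWELS = "aeiou"


def double_medial_consonant(item: str) -> str:
    """Return the item with the consonant before its ending doubled.

    Hypothetical helper: assumes the item is a stem ending in a single
    consonant plus one of ENDINGS.
    """
    for ending in ENDINGS:
        if item.endswith(ending):
            stem = item[: -len(ending)]
            if not stem or stem[-1] in VOWELS:
                raise ValueError(f"no medial consonant found in {item!r}")
            return stem + stem[-1] + ending
    raise ValueError(f"{item!r} does not end in a known ending")


# Single-consonant items paired with their double-consonant counterparts.
doubled = {item: double_medial_consonant(item)
           for item in ("dranic", "dranest", "blipid", "breming")}
```

Applied to the single-consonant member of an Experiment 3 pair such as blipid or breming, the sketch yields the double-consonant member listed in that appendix (blippid, bremming).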
Filler pairs: same as those of Experiment 1 plus the following (correct choice shown first within each pair): billat biiaat, bistup ptbsui, bledost bledsds, blitogy yiolbgo, brognet brognmt, chiddup pdpiuch, colabic calobcc, crevan crvven, crolint tniocrl, cuplint cuplidh, dophian pdhianr, dromosite imoordtst, drossle drossss, frestap ersptfa, frossetno frosnptoo, glomite eolitgm, goffate tfaeofg, joclup plcuju, jultiply julpsctu, lacrip lacrnp, lartic icrlpd, lefave levlff, magater aatermg, melity empnly, mindoma aomnndi, nalpure erlpnau, nelmit mlteni, nemolist neionslt, neparp nepprp, norstane tnnoersa, numain amnuai, oramete oaemsrz, palrem rlmaep, peloter elpotrz, plarmite plramuue, plinote pliaooe, prudope udrppdr, pulapid ppulaai, quameth htueaqm, reltace reltvze, ridonette nttreeoir, saspone ssapnso, semilate aeemitss, shion nhsso, shoroge shorgvh, sillant silpnte, siphile iielphs, spocomo spocmcm, toogit oiottg, trostain ttaisnto, tumnost nttsmou, umplifine umupmfaee, vatlay vatyyy, vention venphls, vollap voplpl, vomant vomnmt, vonidous oinousdv, voqueve qveeeuv, wachope cphwaoe, yimello eiolylm.
Practice pairs: same as in Experiment 1.
EXPERIMENT 3
Experimental pairs (within each set, pair with Latinate ending shown before pair with basic ending, sets of pairs separated by semicolons): blipid blippid, bliper blipper; bremic bremmic, breming bremming; chabic chabbic, chabest chabbest; cladid claddid, cladest claddest; cletic clettic, cletest clettest; cramid crammid, cramest crammest; dafic daffic, dafing daffing; drefic dreffic, drefing dreffing; flibic flibbic, fliber flibber; flobic flobbic, flobest flobbest; frecid freccid, frecing freccing; frenid frennid, frening frenning; glecit gleccit, glecing gleccing; glesid glessid, gleser glesser; gosic gossic, goser gosser; mamit mammit, mamer mammer; nenit nennit, nener nenner; pomid pommid, pomest pommest; shremit shremmit, shreming shremming; shromit shrommit, shromest shrommest; stitid stittid, stitest stittest; stodit stoddit, stoding stodding; swodid swoddid, swoder swodder; thefit theffit, thefer theffer; thibic thibbic, thibing thibbing; thofit thoffit, thofer thoffer; thopid thoppid, thoper thopper; zepit zeppit, zeping zepping; zitic zittic, zitest zittest; zobid zobbid, zobest zobbest.
Filler pairs and practice pairs: same as Experiment 1.