Orthographic effects on L2 production and L2 proficiency in ESL learners with non-alphabetic and orthographically opaque L1

Wenxiyuan Deng; Kit Ying Chan; Ka Man Au Yeung

doi:10.1017/S014271642200039X

Published online by Cambridge University Press: 06 December 2022

Wenxiyuan Deng*: Department of Social and Behavioural Sciences, City University of Hong Kong, Kowloon, Hong Kong
Kit Ying Chan: Department of Social and Behavioural Sciences, City University of Hong Kong, Kowloon, Hong Kong
Ka Man Au Yeung: Department of Social and Behavioural Sciences, City University of Hong Kong, Kowloon, Hong Kong
*Corresponding author. Email: wenxideng2-c@my.cityu.edu.hk

Abstract

This study examined the role of first language (L1) transparency in intra-orthographic effects on second language (L2) pronunciation by studying L2 learners with a non-alphabetic and orthographically opaque L1 and an alphabetic L2. Relations between orthographic effects, phonological awareness, and L2 proficiency were examined. Fifty-four Cantonese-speaking English as a second language (ESL) learners participated in Experiment 1 with orthographic effect tasks (homophone and silent-letter read-aloud) and phonological awareness tasks. Thirty Cantonese-speaking and 30 Mandarin-speaking ESL learners participated in Experiment 2 with orthographic effect tasks and an L2 proficiency task. The L2 pronunciation of Cantonese and Mandarin participants was subject to intra-orthographic effects. Phonological awareness and L2 proficiency were associated with weaker orthographic effects on L2 pronunciation in Cantonese participants. Mandarin participants were not subject to stronger orthographic effects than Cantonese participants when L2 proficiency was controlled, implying that the alphabetic script shared between Pinyin and English did not interfere with L2 production. Overall, transferring the L1 reading strategy that relies on orthography to decode phonology to L2 reading did not seem to be the key mechanism behind intra-orthographic effects. L2 graphemes were likely to be decoded with incorrect L2 grapheme-to-phoneme correspondences, resulting in intra-orthographic effects.

Keywords

orthographic effects; L2 production; non-alphabetic; opaque; Pinyin

Type: Original Article
Information: Applied Psycholinguistics, Volume 43, Issue 6, November 2022, pp. 1329-1357

DOI: https://doi.org/10.1017/S014271642200039X
Copyright: © The Author(s), 2022. Published by Cambridge University Press

Orthographic effects on pronunciation refer to the influence of orthographic form on language learners’ pronunciation accuracy. Research has shown that orthography has both positive and negative effects on second language (L2) pronunciation (Bürki et al., 2019). Some studies found a facilitative effect of orthography, with L2 learners making fewer phoneme errors in their L2 production when given both orthographic and audio input than when given audio input alone (Erdener & Burnham, 2005; Rafat, 2015). Conversely, other studies found that exposure to orthography led to less target-like pronunciation (Bassetti, 2017; Bassetti & Atkinson, 2015). For example, a study of native English-speaking learners of German found that orthographic input often provides contradictory information regarding German final devoicing (Hayes-Harb et al., 2018).

How L2 orthography affects L2 phonology could depend on whether the L1 and L2 use the same script. Bassetti (2008) pointed out the importance of investigating L1 and L2 with different scripts and distinguished two types of orthographic effects: inter-orthographic and intra-orthographic effects. An inter-orthographic effect refers to L2 learners applying L1 grapheme-to-phoneme correspondence (GPC) rules to interpret L2 orthography that resembles the L1 orthography. An intra-orthographic effect refers to L2 learners recoding L2 graphemes with incorrect L2 GPCs. Therefore, an inter-orthographic effect can arise only when the L1 and L2 share similar scripts. Otherwise, only intra-orthographic effects are possible.

Inter-orthographic effects depend on shared scripts and transparent L1 orthographies

The study of inter-orthographic effects has largely centered on speakers whose L1 and L2 are both alphabetic but differ in orthographic depth (Katz & Frost, 1992). Orthographic depth refers to a transparent-to-opaque continuum along which the consistency of a writing system’s GPCs varies. A language is considered to have a transparent orthography (e.g., Italian and Spanish) if it has unambiguous and regular GPCs such that its orthography reliably represents pronunciation. In contrast, a language is considered to have an opaque orthography (e.g., Dutch and English) if its GPCs are inconsistent such that phonemic interpretations vary with context. For example, in English, the letter “a” is mapped to /æ/ in “apple”, /ɑː/ in “father”, /ə/ in “about”, and /eɪ/ in “base”.
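
As a rough illustration of GPC consistency, the toy sketch below stores the phonemes that a single grapheme can map to and treats a grapheme as consistent when it has exactly one mapping. The English entry uses the four pronunciations of “a” listed above; the Italian entry (“a” as /a/) and the names `gpc` and `is_consistent` are assumptions added for illustration, not a measure used in the study.

```python
# Toy grapheme-to-phoneme table (illustrative only).
# English phonemes for "a" come from the examples in the text;
# the Italian entry is an added assumption for contrast.
gpc = {
    ("English", "a"): {"æ", "ɑː", "ə", "eɪ"},
    ("Italian", "a"): {"a"},
}


def is_consistent(language: str, grapheme: str) -> bool:
    """A grapheme is treated as consistent if it maps to exactly one phoneme."""
    return len(gpc[(language, grapheme)]) == 1


for (language, grapheme), phonemes in gpc.items():
    label = "consistent" if is_consistent(language, grapheme) else "inconsistent"
    print(f"{language} <{grapheme}>: {len(phonemes)} mapping(s), {label}")
```

Real orthographic depth is a property of the whole writing system, so a single grapheme can only hint at where a language sits on the continuum.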

A previous study by Erdener and Burnham (2005) demonstrated that native users of transparent scripts might rely more on orthography during L2 processing than those whose native languages were less orthographically transparent. L2 learners with a transparent L1 script tend to be misled by L2 orthography when it does not congruently map onto the L2 phonology (i.e., incongruent GPCs), whereas L2 learners with an opaque L1 script may have a weaker connection between orthography and phonology. Moreover, congruence between the L1 and L2 GPCs could aid L2 phonological accuracy, whereas incongruence would inhibit L2 learning (Escudero et al., 2014).

Inter-orthographic effects were found in English as a second language (ESL) learners with Italian as an L1 (Bassetti, 2017; Bassetti & Atkinson, 2015). Both Italian and English are alphabetic. Italian is orthographically transparent, while English is orthographically opaque. Italian participants applied the Italian double-letter convention to interpret English graphemes, resulting in non-target-like L2 pronunciation (Bassetti, 2017; Bassetti & Atkinson, 2015). They tended to produce longer vowels and consonants for digraphs than for singletons in English, such that /i:/ was longer in “seen” than in “scene”. The shared alphabetic script allowed Italian GPCs to override L2 phonetic knowledge in English. Inter-orthographic effects thus depend on a script shared between the L1 and the L2 and on a transparent L1 orthography, such that L2 learners tend to decode L2 orthography using their L1 GPCs.

The role of L1 transparency in intra-orthographic effects

Few studies have explored intra-orthographic effects by examining speakers whose L1 and L2 have different writing systems. Sokolović-Perović et al. (2020) examined the influence of the number of letters on the duration of consonants and vowels in the English pronunciation of late Japanese–English sequential bilinguals. Even though Japanese has a non-alphabetic script, Japanese bilinguals produced a longer sound when they saw a double-letter English consonant. There is no correspondence between double letters and extended sounds in Japanese. In English, consonant lengthening (gemination) is not distinctive. For example, the double letters “nn” in “dinner” do not correspond to /n:/. Nevertheless, some multimorphemic words are geminated in English. For instance, “misspell” is pronounced as /ˌmɪsˈspel/. Intra-orthographic effects provide a possible explanation in this case: the Japanese participants developed an incorrect conception of English gemination during their L2 acquisition. This misconception was internal to English orthography, since neither Japanese nor English offered a source for such a correspondence.

Japanese contains two non-alphabetic writing systems, Kanji and Kana. Kanji is based on orthographically opaque characters, while Kana is based on highly transparent syllables. Native users of transparent alphabetic scripts tend to depend on orthographic information to access L2 phonology (Erdener & Burnham, 2005). Sokolović-Perović et al. (2020) argued that the transparent though non-alphabetic Kana would make participants rely more on English orthographic forms when producing English words. Unlike inter-orthographic effects, in which specific L1 orthography-phonology correspondences are transferred through shared scripts between the L1 and the L2, intra-orthographic effects might involve the transfer of a general L1 reading strategy that relies heavily on orthography to decode phonology. To test this mechanism, it is important to examine L2 learners with an orthographically opaque L1 such that they do not habitually form a strong connection between orthography and phonology during L1 processing.

Cantonese and Mandarin Chinese have non-alphabetic and opaque orthography

In this study, we investigated intra-orthographic effects in ESL learners with Chinese as an L1 because the Chinese script is opaque and non-alphabetic. Chinese is logographic, reflecting semantic designs rather than phonological structure, with its written units corresponding to ideographs (Perfetti et al., 1997). Chinese characters map onto syllabic morphemes as the speech units. Some characters are the simplest pictographs that cannot be further divided, such as 人 (/jan4/ in Cantonese Chinese, the fourth tone; “man”), 山 (/saan1/ in Cantonese Chinese, the first tone; “mountain”), and 木 (/muk6/ in Cantonese Chinese, the sixth tone; “wood”). Some characters are associative compounds of two or more pictographs or ideographs that refer to meaning. For example, 信 (/seon3/ in Cantonese Chinese, the third tone; “letter” or “believe”) is made up of the pictographs 人 and 言 (/jin4/ in Cantonese Chinese, the fourth tone; “language”); and 休 (/jau1/ in Cantonese Chinese, the first tone; “break”) contains the pictographs 人 and 木. Many Chinese characters are phono-semantic compounds made up of semantic radicals and phonetic-related components. For example, 睬 (/coi2/ in Cantonese Chinese, the second tone; “look”) is made up of a semantic radical 目 (“eye”) and a phonetic component 采 (/coi2/ in Cantonese Chinese, the second tone; “pick”).

In contrast to Japanese, Chinese characters are highly opaque as they do not possess clear segmental structures relating to phonology. The connection between Chinese orthography and phonology is considered substantially weak. For example, the pictograph 女 (/neoi5/ in Cantonese Chinese, the fifth tone; “female”) and the compound 安 (/on1/ in Cantonese Chinese, the first tone; “safe”) share the same component 女, but they have entirely different pronunciations. The pictograph 衣 (/ji1/ in Cantonese Chinese, the first tone; “clothes”) and the compound 醫 (/ji1/ in Cantonese Chinese, the first tone; “heal”) look completely different, yet they have the same pronunciation. Even for characters with a phonetic-related component, the phonetic component may provide only limited phonetic clues due to discrepant tones and consonants. For example, the character 份 (/fan6/ in Cantonese Chinese, the sixth tone; “part”) consists of a phonetic component 分 (/fan1/ in Cantonese Chinese, the first tone; “divide”), but 分 and 份 have different pronunciations due to tonal differences. The character 鍾 (/zung1/ in Cantonese Chinese, the first tone; “clock”) consists of a phonetic component 童 (/tung4/ in Cantonese Chinese, the fourth tone; “child”), but 童 and 鍾 have different pronunciations due to differences in both consonants and tones.

The current study included both Cantonese ESL learners from Hong Kong and Mandarin ESL learners from Mainland China as participants. Although they share logographic written forms that are considered highly opaque, Cantonese and Mandarin are considered two different Chinese dialects with distinctive phonemes, tones, grammars, and vocabularies (Snow, 2004). Cantonese is spoken mainly in the south of China (e.g., Hong Kong, Macau, and Guangzhou). Mandarin is the formal language in China and is spoken in Mainland China and Taiwan. Unlike Hong Kong people, who learn to read Cantonese by rote only, Mainland people learn to read Mandarin through Pinyin, a Romanized phonetic script that represents Mandarin phonemes. Pinyin is considered to have a highly transparent orthography with highly consistent letter-sound correspondences (Bassetti, 2007). English and Mandarin Pinyin share many letters, but these letters do not sound the same across the two languages.

L2 proficiency, phonological awareness, and orthographic effects

Bassetti et al. (2020) and Veivo and Järvikivi (2013) found a negative relationship between L2 proficiency and inter-orthographic effects. Escudero et al. (2014) found that L2 proficiency significantly predicted performance, but only in perceiving difficult pseudoword pairs. Some studies found that orthographic knowledge could lead to L2 misconception regardless of language proficiency. For example, Erdener and Burnham (2005) tested Turkish and English natives using languages unfamiliar to the participants, while Bassetti and Atkinson (2015) tested experienced Italian ESL learners; both studies reported inter-orthographic effects regardless of L2 proficiency. Studies that found a significant effect of proficiency argued that low-proficiency learners failed to link orthographic and phonological representations and that their inadequate phonological representations led to more pronunciation errors (Escudero et al., 2014; Veivo & Järvikivi, 2013). On the other hand, high-proficiency learners might have integrated orthographic and phonological representations with high phonological accuracy.

Similarly, Bassetti et al. (2020) demonstrated that more accurate L2 phonological representation, as reflected by higher phonological awareness in L2 learners, was linked to weaker inter-orthographic effects on L2 production. Phonological awareness refers to the awareness of and the ability to reflect on, analyze, and manipulate speech sounds (Leong et al., 2005; McBride-Chang, 1995). It helps children initially grasp letter-sound relationships in word reading (Treiman, 1991). For Italian ESL learners in Bassetti et al. (2020), greater awareness that consonant length is not contrastive in English was related to a smaller double-consonant-to-single-consonant duration ratio in their English production. That is, better performance on L2 phonological awareness predicted weaker inter-orthographic effects on L2 production. Higher L2 proficiency also predicted weaker inter-orthographic effects on L2 production in that study. More proficient Italian ESL learners probably had a better understanding that singleton and geminate consonants are not contrastive in English as they are in Italian; therefore, they were less likely to decode L2 orthography using their L1 GPCs.

These previous studies (e.g., Bassetti et al., 2020; Veivo & Järvikivi, 2013) investigated how inter-orthographic effects relate to L2 proficiency and phonological awareness in users of alphabetic native languages. How intra-orthographic effects relate to L2 proficiency and L2 phonological awareness has yet to be examined. It is possible that L2 proficiency is particularly important for intra-orthographic effects because they are internal to L2 orthography and involve no transfer of incongruent L1-to-L2 GPCs.

The present study

The current study extended the investigation of intra-orthographic effects to ESL learners with non-alphabetic and orthographically opaque Chinese as L1. Cantonese ESL learners from Hong Kong learn to read Chinese by rote such that they do not have an L1 reading strategy that places great reliance on orthography to decode phonology. By examining intra-orthographic effects in Cantonese ESL learners from Hong Kong in Experiment 1, this study tested whether the transfer of such an L1 reading strategy was the primary mechanism for intra-orthographic effects on L2 production. Experiment 1 also tested whether L2 phonological awareness and L2 proficiency predicted intra-orthographic effects on L2 production.

Experiment 2 compared orthographic effects experienced by Cantonese ESL learners from Hong Kong and Mandarin ESL learners from Mainland China. Having learned Chinese through transparent Pinyin scripts, Mandarin ESL learners develop an L1 reading strategy to access phonology through highly consistent orthography-phonology correspondences. As Pinyin shares a script with English, inter-orthographic effects are possible in addition to intra-orthographic effects. Experiment 2 aimed to compare the performance between Cantonese and Mandarin ESL learners to examine the impact of knowing Pinyin on orthographic effects.

Experiment 1

By studying experienced Cantonese ESL learners from Hong Kong, Experiment 1 examined intra-orthographic effects on L2 pronunciation in ESL learners whose L1 has an opaque orthography and a completely different writing system from the L2. These native Cantonese speakers from Hong Kong have Cantonese as their dominant language and typically begin to learn English in kindergarten. Although English is one of the official languages in Hong Kong, these Cantonese speakers rarely use English outside classrooms. Hong Kong ESL learners live in a non-immersive environment, and their exposure to native spoken English is far from sufficient (Wong et al., 2021). These Cantonese natives typically speak English with a Cantonese accent despite years of English-learning experience in classrooms throughout their primary, secondary, and tertiary education. Other researchers have considered them advanced or experienced ESL learners (Chan, 2019) or ESL students (Gan, 2012) rather than bilinguals. Therefore, our participants from Hong Kong were considered experienced ESL learners rather than early bilinguals. It is also noteworthy that English teaching in Hong Kong predominantly uses exercises emphasizing rote memorization, such as dictation and grammar practice (Poon, 2010). English-teaching classrooms in Hong Kong emphasize reading and writing skills rather than listening and speaking. Public admission examinations in Hong Kong also focus on reading, grammar, and composition rather than communication and speaking accuracy (Evans, 1996).

Previous studies have suggested that native users of opaque orthographies may adopt a “whole-picture” mental representation in lexical reading, while speakers of transparent languages depend on segmental orthographic representation (Erdener & Burnham, 2005; Lemhöfer et al., 2008). Without an L1 reading habit of forming a solid connection between orthography and phonology to transfer to L2 processing, Cantonese participants are expected to rely less on English orthographic forms when producing English words. Instead, their L1 reading strategy of using the “whole-picture” to represent lexical items might work well for English, as English has an opaque orthography with inconsistent GPCs. Therefore, the use of the L1 reading strategy would predict minimal orthographic effects, at least for the silent-letter stimuli, which involve irregular GPC rules. If intra-orthographic effects are found in the English production of Cantonese participants, it implies that the Cantonese participants are relying on the graphemes to decode the pronunciation to a certain extent, although such a reading strategy is not transferred from their L1. Such a finding would rule out the transfer of an L1 reading strategy that places great reliance on orthography to decode phonology as the key mechanism behind intra-orthographic effects. More importantly, the presence of intra-orthographic effects in Cantonese participants would also suggest that they are not completely transferring the “whole-picture” reading strategy from L1 to process English words.

Experiment 1 also investigated the role of L2 phonological awareness and L2 proficiency in predicting intra-orthographic effects on L2 production. Since the intra-orthographic effects do not involve the transfer of L1 orthography-phonology correspondences to decode L2 orthography, L2 phonological awareness and L2 proficiency should be the primary factors in determining the quality of L2 phonological representation, thereby predicting intra-orthographic effects.

Past literature implied that phonological awareness skills are not always fully developed in adults (Moran & Fitch, 2001; Spencer et al., 2008). ESL learners could show a wide range of English phonological awareness skills depending on their prior experience with alphabetic literacy in their L1s (Holm & Dodd, 1996; Read et al., 1986). In this study, the Cantonese ESL learners have an opaque L1 orthography and learn Cantonese phonology by rote memorization. Frequently engaging in syllable-level processing in L1, Cantonese-speaking young adults from Hong Kong were shown to have limited phonological awareness in English and increased difficulty in processing nonwords in English compared to native speakers from Australia, as well as ESL learners from Mainland China and Vietnam, who all have alphabetic literacy in their L1s (Holm & Dodd, 1996). These findings suggest that early literacy processing skills from L1 are transferred to ESL learning (Holm & Dodd, 1996). Hence, it is predicted that Cantonese ESL learners, especially those with low phonological awareness and proficiency in English, would show strong intra-orthographic effects on L2 production in the current study.

To measure the extent of orthographic influences on L2 pronunciation, the homophone and silent-letter read-aloud tasks from Bassetti and Atkinson (2015) were adopted. The homophone read-aloud task was used to examine L2 speakers’ production accuracy for English homophones and to test whether participants’ production would be biased by differences in orthography. For example, “aloud” and “allowed” are spelled differently but have the same pronunciation. Pronouncing the two words differently indicates an orthographic effect, in that participants infer the pronunciation from the spelling. Silent letters refer to letters lacking phonetic correspondences, such as “b” in “lamb” (/læm/) and “l” in “salmon” (/ˈsæmən/). A failure to omit silent letters in production results in an orthography-induced epenthesis (the insertion of a sound that has a grapheme but no phonetic counterpart), indicating orthographic effects on L2 pronunciation (Bassetti & Atkinson, 2015).

Italian ESL learners in Bassetti and Atkinson (2015) produced on average 40% of the stimuli as non-homophonic pairs in the homophone read-aloud task. In the silent-letter read-aloud task, 85% of stimuli were pronounced with added phonemes. The findings showed inter-orthographic effects, in that incongruence between L1 and L2 GPCs led to L2 production mistakes. For Cantonese participants, since there are no shared scripts between the L1 and the L2, transferring L1 orthography-phonology correspondences to L2 is not possible. High error rates of pronouncing homophonic pairs as non-homophonic and a high rate of orthography-induced epenthesis would not indicate inter-orthographic effects, but rather intra-orthographic effects that are internal to the L2.

Phonological awareness in English was measured through a phoneme deletion task and a pseudoword read-aloud task. High accuracy rates in these tasks reflect strong phonological awareness. Participants’ overall English proficiency level was indicated by their scores in the Hong Kong Diploma of Secondary Education (HKDSE) in English language, a standardized public exam for university admission in Hong Kong. HKDSE results are also widely accepted by more than 280 tertiary institutions worldwide.

Method

Participants

Fifty-four undergraduate students with Cantonese as an L1 and English as an L2 from City University of Hong Kong were recruited from the Basic Psychology participant pool and received course credit for their participation. Participants were aged between 18 and 23 years (M = 19.2, SD = 1.37)¹ and reported no history of language, hearing, or reading impairments. Participants’ onset of English learning ranged from age 1 to 5 years (M = 3.35, SD = 1.03), and they had been learning English for 12-20 years (M = 15.8, SD = 1.86). Table 1 shows the demographic and language background of the participants.

Table 1. Demographic and Language of Participants in Experiment 1

Note. M = mean. SD = standard deviation.

Materials

Materials used in Experiment 1 (as described below) can be accessed via https://osf.io/njd7z/.

Language history questionnaire

A language history questionnaire was used to collect participants’ demographic data regarding their gender, age, grades of English language in public examinations and other English-learning history, such as age of acquisition (AoA) and duration of living in English-speaking countries.

Homophone read-aloud task

Stimuli included 24 homophonic word pairs (Appendix A), half of which were adopted from Bassetti and Atkinson (2015) and the rest from an online homophone database (Aloisi, 2008).

Silent-letter read-aloud task

Stimuli included eight target words with silent letters (Appendix B) adopted from Bassetti and Atkinson (2015). Each target word contains one of the three silent letters “b”, “d”, or “l”. There were four words for “b” and two words each for “d” and “l”.

Phoneme deletion task

Thirty-two target words (Appendix C), including 16 congruent and 16 incongruent stimuli, were adopted from Tyler and Burnham (2006). A stimulus is congruent when deleting its first phoneme leaves a string of letters that matches the spelling of the correct phonological response. For instance, deletion of the first phoneme in a congruent stimulus, “bride” (/braɪd/), results in a phonological response, /raɪd/, which matches the spelling of “ride”. On the other hand, deletion of the first phoneme in an incongruent stimulus, “worth” (/wɜːθ/), results in a phonological response, /ɜːθ/, that does not match the remaining letter string “orth” (/ɜːθ/ ≠ “orth”). Instead, /ɜːθ/ is the pronunciation of “earth”. The stimuli were recorded by a female native speaker of English with a British accent. She read aloud each stimulus three times, and the recording with the most natural intonation and moderate speed was selected as the stimulus.
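
The congruence distinction can be sketched in a few lines of Python. This is a simplified illustration, not the procedure used by Tyler and Burnham: it assumes that the first phoneme of each stimulus is spelled with a single letter (true for “bride” and “worth”) and that the real word matching the correct spoken response is supplied by hand. The function name `is_congruent` is hypothetical.

```python
def is_congruent(stimulus: str, response_word: str) -> bool:
    """Return True when dropping the first letter of the stimulus leaves
    the spelling of the word that matches the correct spoken response.

    Assumption: the first phoneme is written with exactly one letter.
    """
    return stimulus[1:].lower() == response_word.lower()


# The two examples from the text: stimulus -> word matching the response.
examples = {"bride": "ride", "worth": "earth"}

labels = {
    stimulus: "congruent" if is_congruent(stimulus, response) else "incongruent"
    for stimulus, response in examples.items()
}
print(labels)
```

For incongruent items the leftover letters are a misleading cue, so a speaker who relies on spelling is more likely to give a wrong response.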

Pseudoword read-aloud task

Fifteen pseudowords sounding like real words without any semantic content (e.g., “burd” pronounced as “bird”; Appendix D) were adopted from Lukatela and Turvey (1991).

Procedure

Participants were seated individually in front of a Windows-running PC in the Social Science Laboratories at City University of Hong Kong. A Logitech H340 USB headset with a mounted microphone was used to present the audio stimuli and capture participants’ spoken responses. Instruction was provided verbally in Cantonese by a trained experimenter prior to each task. First, participants gave consent and filled in the language history questionnaire through Qualtrics (Qualtrics, 2005). Then participants completed all other tasks through Paradigm (Perception Research Systems, 2007) in a standardized order as follows: homophone and silent-letter read-aloud, phoneme deletion and pseudoword read-aloud.

For the homophone and silent-letter read-aloud tasks, all homophonic words and silent-letter words were presented in a randomized order. After receiving the instruction, participants were told to press SPACEBAR to begin whenever they were ready. For each trial, one stimulus word was shown visually at the center of the computer screen for participants to read aloud. Participants were given as much time as they needed and were instructed to press SPACEBAR after their verbal responses. The next stimulus was then presented after 5 s. There were 32 trials in total. No feedback was given in the experimental trials.

For the phoneme deletion task, participants went through a demonstration trial and two practice trials with auditory answers prior to the experimental trials. Participants were visually presented with a stimulus at the center of the computer screen and heard its auditory form simultaneously. They were required to pronounce it without the first phoneme. After each trial, participants pressed SPACEBAR and waited 5 s for the next trial. There were 32 trials in total. No feedback was given in the experimental trials.

The procedure for the pseudoword read-aloud task was the same as for the other read-aloud tasks, except that a demonstration trial and two practice trials with auditory answers were presented prior to the experimental trials. There were 15 trials in total. All spoken responses were captured and recorded by Paradigm. Participants were given breaks between the tasks and were debriefed upon completion of the whole study.

Results

Data of each task in Experiment 1 can be accessed via https://osf.io/njd7z/.

Two homophonic pairs, “caught, court” and “sauce, source”, which are considered nonhomophones by rhotic speakers, were excluded from analysis. Since the instruction did not specify which accent to adopt for the task, the two pairs were excluded to avoid potential effects specific to accents. Participants’ spoken responses from the phoneme deletion task and the read-aloud tasks were scored by two qualified Cantonese-speaking English teachers with prior English phonetics training. Rater 1 was a high school teacher with 7 years of teaching experience. Rater 2 was a primary school English teacher with 8 years of teaching experience. They were blind to the hypotheses of this study and were asked to listen to and rate participants’ pronunciation independently according to the scoring scheme. Table 2 presents the scoring scheme for each task. There was no restriction on the order of rating or the number of times they could listen to the recordings. The inter-rater reliability was high for all tasks. Cohen’s kappa was .91 for the silent-letter read-aloud task, .81 for the homophone read-aloud task,² and .95 for both the phoneme deletion and pseudoword read-aloud tasks. Participants’ grades in HKDSE in English Language were also coded for analysis (Table 2). Table 3 presents the descriptive statistics for all tasks.
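
For readers unfamiliar with Cohen’s kappa, the sketch below shows how the statistic corrects raw agreement between two raters for the agreement expected by chance. The ratings are invented for illustration (1 = error, 0 = correct) and are not the study’s data; the function name `cohens_kappa` is likewise an assumption.

```python
from collections import Counter


def cohens_kappa(rater1, rater2):
    """Cohen's kappa for two raters who scored the same items."""
    if len(rater1) != len(rater2) or not rater1:
        raise ValueError("Both raters must score the same non-empty item list.")
    n = len(rater1)
    # Proportion of items on which the raters agree.
    observed = sum(a == b for a, b in zip(rater1, rater2)) / n
    # Agreement expected by chance, from each rater's category frequencies.
    counts1, counts2 = Counter(rater1), Counter(rater2)
    categories = counts1.keys() | counts2.keys()
    expected = sum(counts1[c] * counts2[c] for c in categories) / n**2
    if expected == 1:
        return 1.0  # both raters used a single identical category
    return (observed - expected) / (1 - expected)


# Hypothetical scores for ten responses (not taken from the study).
rater1 = [1, 1, 0, 0, 1, 0, 1, 1, 0, 0]
rater2 = [1, 1, 0, 0, 1, 0, 1, 0, 0, 0]
kappa = cohens_kappa(rater1, rater2)
print(f"kappa = {kappa:.2f}")
```

In this invented example the raters agree on 9 of 10 items, chance agreement is .50, and kappa is therefore (.90 - .50) / (1 - .50) = .80.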

Table 2. Scoring Scheme for all Tasks and the Hong Kong Diploma of Secondary Education Examination (HKDSE) in English Language in Experiment 1

Note. To provide a finer discrimination of candidates’ ability at the top end, level 5** is awarded to the highest-achieving 10% (approximately) level 5 candidates and level 5* is awarded to the next highest-achieving 30% (approximately) level 5 candidates.

Table 3. Descriptive Statistics for Error Rates in Homophone and Silent-Letter Read-aloud Tasks, and Accuracy Rates in Phoneme Deletion and Pseudoword Read-aloud Tasks in Experiment 1

Homophone read-aloud task

Participants produced homophonic word pairs as nonhomophones at a mean error rate of 13.2% (SD = 15%). The word pair “seas, seize” had the highest error rate (29.6%); the word pairs “son, sun” and “thai, tie” had the lowest error rate (1.85%). No obvious error pattern was found.

Silent-letter read-aloud task

The mean error rate for Cantonese participants was 32.2% (SD = 19.2%). However, performance was inconsistent across words with the silent letter “b”: “climb” (61.1%) and “lamb” (61.1%) had the highest error rates, whereas “comb” had a low error rate of 22.2%. A Chi-Square test of independence indicated that the distribution of participants’ epentheses across the silent letter “b” stimuli, “climb”, “lamb”, and “comb”, was significantly different, χ² (2) = 21.8, p < .001.

For the silent letter “l”, “walk” yielded a lower error rate (7.4%) than its counterpart “salmon” (46.3%)³. A Chi-Square test of independence confirmed that the distribution of participants’ epentheses across the two stimuli was significantly different, χ² (1) = 20.8, p < .001. As AoA and exposure to L2 affect L2 learning (Indefrey, 2006), a possible explanation is that “walk” (M = 3.45) was acquired at an earlier age than “salmon” (M = 8) (Kuperman et al., 2012). Another possible reason is that participants were more familiar with “walk” than with “salmon”. Familiarity ratings on a 7-point scale from our pilot data for Experiment 2 supported this (M_walk = 6.96; M_salmon = 4.88).

For the silent letter “d”, “Wednesday” and “landscape” yielded an error rate of 18.5% and 22.2%, respectively. A Chi-Square test of independence showed no significant difference in the distribution of participants’ epentheses across these two stimuli, χ² (1) = 0.228, p = .633.
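The three tests above can be retraced with a short sketch. It rests on assumptions that are not stated in the text: the error counts are inferred from the reported percentages and N = 54 (e.g., 61.1% ≈ 33/54), each word contributes one row of error and non-error counts, and no continuity correction is applied. Under these assumptions the uncorrected Pearson statistic appears to match the reported values. The helper names are hypothetical.

```python
import math


def chi_square_independence(table):
    """Pearson chi-square (no continuity correction) for an r x c count table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(col_totals) - 1)
    return stat, df


def chi_square_p(stat, df):
    """Upper-tail p-value; this sketch uses closed forms for df = 1 and df = 2 only."""
    if df == 1:
        return math.erfc(math.sqrt(stat / 2))
    if df == 2:
        return math.exp(-stat / 2)
    raise ValueError("this sketch only covers df = 1 and df = 2")


N = 54  # participants in Experiment 1


def item_row(errors):
    """Counts of [errors, non-errors] for one word (inferred, not reported)."""
    return [errors, N - errors]


# Counts inferred from the reported percentages (assumption).
silent_b = [item_row(33), item_row(33), item_row(12)]  # climb, lamb, comb
silent_l = [item_row(4), item_row(25)]                 # walk, salmon
silent_d = [item_row(10), item_row(12)]                # Wednesday, landscape

stat_b, df_b = chi_square_independence(silent_b)
stat_l, df_l = chi_square_independence(silent_l)
stat_d, df_d = chi_square_independence(silent_d)
p_b, p_l, p_d = (chi_square_p(stat_b, df_b),
                 chi_square_p(stat_l, df_l),
                 chi_square_p(stat_d, df_d))
```

Treating each word's responses as independent rows follows the way the tests are reported; because the same participants read every word, a repeated-measures alternative could also be considered.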

Phoneme deletion and pseudoword read-aloud tasks

For the phoneme deletion task, participants obtained a mean accuracy rate of 60% (SD = 19.2%). The mean error rates for congruent stimuli (M = 39.2%, SD = 12.5%) and incongruent stimuli (M = 40.7%, SD = 12.7%) did not differ significantly, F(1, 53) = 0.68, p = .41. The mean accuracy rate for the pseudoword read-aloud task was 63.5% (SD = 3.28%). Overall, Cantonese participants displayed a moderate level of phonological awareness, which was consistent with previous findings of a relatively low level of phonological awareness in ESL learners from Hong Kong compared to those from Mainland China and Vietnam (Holm & Dodd, 1996).

Orthographic effects, phonological awareness, and L2 proficiency

The total phonological awareness (PA) score was the sum of raw scores from the phoneme deletion and pseudoword read-aloud tasks. A higher total PA score indicates a higher level of phonological awareness. Ten Pearson’s correlation tests among the homophone read-aloud task, silent-letter read-aloud task, phoneme deletion task, pseudoword read-aloud task, and HKDSE scores were conducted at a Bonferroni-adjusted alpha level of .005 (0.05/10). Table 4 shows the correlation matrix.
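The adjusted threshold follows from the number of pairwise tests among the five measures. A minimal sketch of that arithmetic is given below; the variable and function names are hypothetical, and the family-wise alpha of .05 is taken from the division shown in the text.

```python
from itertools import combinations

# The five measures entered into the pairwise correlations.
measures = [
    "homophone read-aloud",
    "silent-letter read-aloud",
    "phoneme deletion",
    "pseudoword read-aloud",
    "HKDSE",
]

# Every unordered pair of measures corresponds to one Pearson correlation test.
pairs = list(combinations(measures, 2))

family_alpha = 0.05
adjusted_alpha = family_alpha / len(pairs)  # Bonferroni: divide by the number of tests


def is_significant(p_value, alpha=adjusted_alpha):
    """True if a single test passes the Bonferroni-adjusted threshold."""
    return p_value < alpha
```

With five measures there are ten pairs, so each correlation is evaluated against .005 rather than .05.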

Table 4. Correlation Matrix Among Error Rates in the Orthographic Effect Tasks, Accuracy Rates in the Phonological Awareness (PA) Tasks, the Total PA Score, and the Hong Kong Diploma of Secondary Education Examination (HKDSE) Score in Experiment 1

Note. *p < .005 (Bonferroni-adjusted).

All significant correlations reported here were significant at the Bonferroni-adjusted level. Accuracy rates from the phoneme deletion and pseudoword read-aloud tasks were significantly correlated (r = .63, p < .001), indicating that these tasks showed sufficient validity in measuring phonological awareness while measuring different dimensions of it. Participants’ error rate in the homophone read-aloud task showed a significant negative correlation with the PA score (r = −.67, p < .001) and with the HKDSE score (r = −.51, p < .001). The PA score had a significant positive correlation with the HKDSE score (r = .53, p < .001). Silent-letter read-aloud performance showed no significant correlation with any other variable.

Discussion

The Cantonese ESL participants showed intra-orthographic effects on their L2 pronunciation. Because Cantonese and English use different scripts, these effects, unlike inter-orthographic effects, cannot involve interference from L1 orthography-to-phonology correspondences. Transferring the L1 reading strategy, which relies heavily on orthography to decode phonology, to L2 reading could possibly explain the intra-orthographic effects demonstrated by Japanese-English sequential bilinguals in Sokolović-Perović et al. (2020). However, it could not explain the orthographic effects demonstrated by our Cantonese ESL learners, as they do not use any transparent orthographies in their L1. The presence of orthographic effects suggested that the Cantonese participants relied on English spellings to decode pronunciation to a certain extent, even though such a reading strategy is not transferred from their L1. This also implies that the Cantonese participants were not always using the “whole-picture” reading strategy from the L1 to process English words in this experiment.

Although English is considered an opaque orthography with many irregular and inconsistent GPCs (Borleffs et al., 2017), its alphabetic nature still allows a good proportion (79.3%; cf. 90.4% for German) of phonological representations to be correctly retrieved from its orthography using the GPC rules (Ziegler et al., 2000). One possibility is that the Cantonese participants recognize this property of English and adopt a “hybrid” reading strategy that allows them to decode English words at several different grain sizes, including mapping from grapheme to phoneme, from letter pattern to rime or syllable, and at the whole-word level.

In this study, the English pronunciation of Cantonese ESL learners with higher phonological awareness and English proficiency was less influenced by orthographic forms. In line with the inter-orthographic effects demonstrated in Bassetti et al. (2020), intra-orthographic effects in this study were also negatively related to both L2 proficiency and phonological awareness. This implies that the accuracy and precision of L2 phonological representations, as reflected by the level of phonological awareness and L2 proficiency, are likely to be important predictors of both inter- and intra-orthographic effects.

According to the fuzzy lexicon hypothesis, the phonological representation of L2 words is not fully specified and lacks details such that some phonemes and phonemic sequences are underspecified (Cook et al., 2016). This is more likely at the early stage of L2 acquisition and for less-proficient L2 learners (Cook et al., 2016). Hence, fuzzy L2 phonological representation in L2 learners might force them to seek orthographic input as another source of information.

However, neither L2 proficiency nor phonological awareness correlated with performance in the silent-letter read-aloud task. Cantonese participants tended to pronounce the silent letter “b” in “climb” and “lamb”, but much less so in “comb”. Performance for “walk” and “salmon” was also inconsistent. These results suggested that Cantonese participants were not aware of the convention of silent letters in English.

Cantonese speakers have long been observed to consistently devoice final obstruents in English (Chan, 2006; Chan & Li, 2000; Edge, 1991), so the attempt to pronounce the silent letter “b” was unlikely to stem from contrastive differences between the Cantonese and English sound inventories or from the articulation of permissible final consonants. In English, “b” and “d” can occur in either word-initial or word-final position, whereas in Cantonese they can occur only in word-initial position. Our Cantonese participants did not consistently omit the letter “b” in word-final /mb/ or /bt/ clusters, which ran against their habit of omitting English obstruents in word-final position. This implies that the intra-orthographic effect is unlikely to be driven by L1 influence. Rather, it is probably internal to English, as silent-letter words in English trigger a regularity violation.

Experiment 2

Experiment 1 examined intra-orthographic effects in Cantonese ESL learners, whose L1 is orthographically opaque and non-alphabetic, so that their reading strategy does not rely on highly consistent orthography-phonology correspondences. The results ruled out the transfer of such an L1 reading strategy as a key mechanism underlying intra-orthographic effects. Instead, they suggested that intra-orthographic effects were predicted by how accurately L2 learners phonologically represent L2 words. Experiment 2 aimed to further examine the role of the L1 reading strategy in intra-orthographic effects by studying Mandarin Chinese ESL learners from Mainland China. Unlike Cantonese speakers in Hong Kong, who learn to read Cantonese by rote only, Mandarin ESL learners from Mainland China learn to read through Pinyin, a transparent Romanized phonetic script that represents the pronunciation of Chinese characters.

In Mainland China, learning Pinyin in primary school is a core component of the national curriculum and the first step in learning Chinese (Ministry of Education of the People’s Republic of China, 2011). The primary function of Pinyin is to link abstract Chinese characters to their pronunciations. Children can learn the pronunciation of novel words through their Pinyin. Pinyin is also a useful tool for Chinese characters that people are able to pronounce but unable to write. In addition, Pinyin is the main tool for typing Chinese characters on computers or smartphones using an English keyboard. Pinyin is critical for Chinese reading: a recent study has identified a reciprocal relationship between Pinyin skills and character recognition (Zhang et al., 2020).

The Pinyin system adopts 25 letters of the Roman alphabet that are also used in English (all but “v”) and adds “ü”. Many graphemes are shared between English and Mandarin Pinyin, but they do not sound the same across the two languages (see Appendix E for a display of Pinyin initials and finals and their examples, together with IPA-based symbols and approximate English pronunciation). Whereas English words are described in terms of consonants and vowels, the essential elements of Pinyin are initials and finals. Initials contain consonants; finals include (1) simple vowels (e.g., a, e, and ü) or (2) compound finals. Compound finals are composed of two or three vowels (e.g., ai, ei, and uei), or a vowel followed by a nasal consonant (e.g., an, ian, and ong). A Pinyin syllable is spelled with either an initial plus a final or a final alone, together with one of the four tones: a flat tone (–), a rising tone (/), a falling-rising tone (∨), or a falling tone (\).
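The initial-plus-final structure described above can be illustrated with a minimal sketch that splits a toneless Pinyin syllable into its two parts. The list of initials below is the commonly taught set of 21 and is an assumption of this sketch rather than a reproduction of Appendix E; the function name is hypothetical, and tone marks are ignored.

```python
# Commonly taught Pinyin initials (assumed here, not taken from Appendix E).
# Two-letter initials come first so that "zh" is matched before "z".
INITIALS = [
    "zh", "ch", "sh",
    "b", "p", "m", "f", "d", "t", "n", "l",
    "g", "k", "h", "j", "q", "x", "r", "z", "c", "s",
]


def split_syllable(syllable):
    """Split a toneless Pinyin syllable into (initial, final).

    A syllable with no initial is returned as ("", final).
    """
    s = syllable.lower()
    for initial in INITIALS:
        # Require at least one letter after the initial to serve as the final.
        if s.startswith(initial) and len(s) > len(initial):
            return initial, s[len(initial):]
    return "", s


examples = {s: split_syllable(s) for s in ["ma", "zhong", "an", "shuai"]}
```

For instance, “zhong” splits into the initial “zh” and the compound final “ong”, whereas “an” consists of a final alone. Syllables spelled with “y” or “w” are treated as having no initial in this sketch.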

As Lü (2017) pointed out, reading acquisition in a language crucially starts with discovering the basic unit that is embedded in each graphic symbol, followed by uncovering the mapping details between the graphic symbol and its sound. For Chinese, learning to read first requires individuals to realize that each Chinese character is monosyllabic and corresponds to a morpheme with little phonological consistency. Therefore, learning to read in Chinese requires a great deal of memorization (Bialystok et al., 2005). Learning to read words in English begins with realizing that letters represent sounds (Lü, 2017). The alphabetic system of English leads readers to rely heavily on phonics in learning to read words (Bialystok et al., 2005). In later literacy, English speakers gradually discover that GPCs are not always regular. This fundamental difference between learning Chinese and learning English might prompt Cantonese ESL learners to adopt different reading strategies for the two languages, as shown in Experiment 1.

However, Mandarin speakers rely on alphabetic Pinyin to pronounce Chinese characters. As the writing system of Pinyin is highly transparent due to consistent letter-sound correspondences (Bassetti, 2007), it is possible that Mandarin speakers use the reading strategy of Pinyin in English reading, leading to non-target-like English pronunciation. Pinyin does not contain silent letters. Common silent letters in English, such as “b”, “d”, and “h”, are all pronounced in Pinyin. Besides, each letter in Pinyin corresponds to only one sound, while one letter in English can be linked to different sounds. For example, “a” in Pinyin is similar to /ɑː/ only, but “a” in English is mapped to /ɑː/ in “father”, /ə/ in “about”, and /eɪ/ in “base”. As Chinese and English belong to entirely different writing systems, inter-orthographic effects that rely on grapheme-to-phoneme conversions between the two languages are unlikely. However, it is possible that the shared scripts of Pinyin and English exert extra inter-orthographic effects in Mandarin speakers due to incongruent GPCs between Pinyin and English.

It is likely that the daily practice of using transparent Pinyin scripts to read or type in Chinese encourages Mandarin speakers to also rely on orthography when learning English. Regardless of the incongruent GPCs between English and Pinyin, Pinyin literacy aids the development of phonological awareness in the early stages of learning both Mandarin and English (McBride-Chang et al., 2004; McDowell & Lorch, 2008). Numerous studies have found that Mainland participants outperformed Hong Kong participants in phonemic-related tasks in Chinese (Cheung & Chen, 2004; Cheung et al., 2001) and English (McDowell & Lorch, 2008). These findings suggest that Mandarin speakers may be more aware of the relationship between graphemes and phonemes than Cantonese-speaking ESL learners, thereby making them more dependent on orthography.

Experiment 2 investigated whether, when controlling for L2 proficiency, Mandarin-speaking participants would be susceptible to more orthographic effects in their English pronunciation than Cantonese-speaking participants due to possible inter-orthographic effects from Pinyin. Controlling for L2 proficiency is crucial because Hong Kong ESL learners have an earlier age of English acquisition and are exposed to English more in their daily life than Mandarin speakers from Mainland China, owing to differences in the English education system and English-speaking environment (Nunan, 2003). Also, consistent with Experiment 1, we hypothesized that L2 proficiency would predict orthographic effects on L2 production for both Cantonese and Mandarin participants. Since word familiarity influences orthographic effects (Veivo & Järvikivi, 2013), stimuli in Experiment 2 were better controlled for word familiarity.

Experiment 2 again used homophone and silent-letter read-aloud tasks to examine orthographic effects on English pronunciation in Chinese ESL learners from Hong Kong and Mainland China. Because HKDSE scores are not available for Mandarin participants from Mainland China, and to rule out any influence of the time gap between test-taking and the current experiment, Experiment 2 used the Lexical Test for Advanced Learners of English (LexTALE; Lemhöfer & Broersma, 2012), a valid and standardized test for measuring English proficiency in advanced L2 learners.