1. Introduction
Research on bilingual first language acquisition has shown that children differentiate between their two linguistic systems from an early age (e.g., Döpke, 1999; Genesee, 1989; Hulk & Müller, 2000). It is also generally accepted that the two linguistic systems of simultaneous bilinguals may interact, a phenomenon known as interdependence (Paradis & Genesee, 1996) or crosslinguistic influence (Hulk & Müller, 2000). In particular, Paradis and Genesee (1996) identify three possible outcomes of interdependence or crosslinguistic influence, namely transfer, delay, and acceleration. They define transfer as “the incorporation of a grammatical property into one language from the other” (1996, p. 3). Hence, transfer typically leads to ungrammatical utterances that depart from the typical path of monolingual acquisition, as the bilingual child produces non-adult structures that are syntactic calques of his/her other L1.
Delay, on the other hand, is the effect through which the rate of acquisition of specific properties or structures in one language decreases as a consequence of crosslinguistic influence from the other language (originally, Paradis and Genesee hypothesised that delay might affect the overall rate of acquisition in bilinguals [1996, p. 4], but this interpretation is now outdated). The third possible outcome of interdependence is acceleration. This refers to the possibility that a certain linguistic property may emerge in the speech of a bilingual earlier than it does in monolinguals or, as in recent extensions of the original definition, that interaction between the two languages may result in bilinguals attaining acquisition of a property more quickly than monolinguals and thus in “superior linguistic skills in bilinguals compared with monolinguals” (Fabiano-Smith & Goldstein, 2010, p. 162). The idea behind acceleration is that mastery of a particular structure in one of the two languages facilitates acquisition of the corresponding structure in the other language, thus enabling the bilingual child to outperform monolinguals in some linguistic domains.
2. Crosslinguistic interaction in bilingual first language acquisition
There is a considerable body of research examining the ways in which the grammars of bilingual children interact and investigating which particular linguistic areas are vulnerable to interdependence; transfer is perhaps the most studied of the three potential outcomes. However, the vast majority of these studies are within the domain of syntax. For example, Müller and colleagues have reported transfer in subordinate clauses in German–French bilinguals (Müller, 1998) and object-drop in Dutch–French, German–French and German–Italian bilinguals (Hulk & Müller, 2000; Müller & Hulk, 2001). Serratrice and colleagues have found transfer effects in the development of pronominals (Serratrice, Sorace & Paoli, 2004; Sorace, Serratrice, Filiaci & Baldo, 2009) and of anaphoric constructions (Serratrice, 2007) in English–Italian bilingual children. Delay has been observed in some areas of grammar such as the development of word form recognition in Welsh–English infants (Vihman, Lum, Thierry, Nakai & Keren-Portnoy, 2006), in the acquisition of object pronouns in French–English bilinguals (Pérez-Leroux, Pirvulescu & Roberge, 2009) and of copular constructions in Spanish–English bilinguals (Silva-Corvalán & Montanari, 2008). Acceleration seems to be much less common than either transfer or delay, though it has been reported on some occasions, notably in the acquisition of the determiner system in German–Italian and German–French children (Kupisch, 2005).
Although research on bilingual phonology is much less extensive, all three outcomes predicted by Paradis and Genesee (1996) have been attested. Paradis (2001) found transfer of stress patterns from French into English in French–English bilingual children, while Fabiano-Smith and Barlow (2010) reported bidirectional transfer across phonemic inventories in Spanish–English bilingual children: children produced Spanish-specific sounds when speaking English and English-specific sounds when speaking Spanish. Lleó and Rakow (2003) showed transfer in the assimilation of coda nasals in German–Spanish bilingual children. Kehoe (2002) reported delay in the acquisition of the German vowel system (particularly vowel-length distinctions) in German–Spanish bilinguals, while Goldstein and Washington (2001) found that Spanish–English 4-year-old bilinguals were considerably less accurate than their monolingual peers in the rendition of spirants, flaps and trills in Spanish. Similarly, Lleó (2002) found delay in the development of complex prosodic structures in German–Spanish bilinguals. Evidence of acceleration is rather meagre, however, and as far as we know it has only been reported once within phonology, in relation to coda consonants in Spanish–German bilinguals (Lleó, Kuchenbrandt, Kehoe & Trujillo, 2003). The Lleó et al. (2003) study is also one of very few that investigated phonological structure at the syllabic level, as most research on phonology in bilingual acquisition has focused either on prosody (i.e., intonation, stress and rhythm) or on segmental aspects, particularly segmental transfer (i.e., transfer across phonemic inventories).
Structural aspects of phonology in general, and consonant clusters in particular, have received relatively little attention, especially with regard to potential acceleration effects. As far as we are aware, the only two studies that have investigated consonant clusters in bilingual acquisition are those of Yavas and Barlow (2006) and Mayr, Jones and Mennen (2014). However, the former study restricted its focus to word-initial s + consonant sequences, while the latter involved two languages with almost identical cluster phonotactics, a factor that virtually excluded the possibility of observing any crosslinguistic influence in this domain. The current study contributes towards filling this research gap by investigating nonword repetition performance in bilingual children whose two languages differ systematically as to the types of clusters they allow. The purpose of the study is twofold. The first aim is to test for potential acceleration effects in cases where children simultaneously acquire two languages with different consonant cluster typologies; specifically, we wanted to investigate whether there is a bilingual advantage in the attainment of an advanced phonological feature, namely consonant clusters. The second aim is to test two competing views of phonological organisation that make conflicting predictions as to whether and where acceleration of cluster structures should occur. To these ends, the focus of the study will be on word-initial and word-medial onset clusters.
3. Consonant Clusters in English and Polish
It is well known that languages differ in their phonotactic requirements, which, among other things, pose limits on what consonants may cluster and in which position. Consonant clusters are typically categorised according to their sonority profile, which is in turn based on the sonority scale. The sonority scale classifies segments based on how sonorous they are, a property that depends on the degree of opening involved in their articulation (Clements, 1990; Kent, 1993; Selkirk, 1984), sometimes also characterised as “loudness” (Ladefoged, 1982). Although there is disagreement as to the exact contents of the sonority scale, a common representation of a 5-point sonority scale is given in figure 1 (following Berent, Steriade, Lennertz & Vaknin, 2007; Morelli, 2003):
As can be seen from figure 1, vowels (and glides) are the most sonorous segments, while plosives – which involve complete obstruction of the vocal tract – are the least sonorous. Following the sonority scale, clusters involving two consonants can have one of three profiles: rising sonority, as in an obstruent-liquid cluster (e.g., [pl]); falling sonority, as is the case for a fricative-plosive cluster (e.g., [st]); or they can constitute a sonority plateau, as in plosive-plosive clusters (e.g., [pt]).
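The three-way classification just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: since figure 1 is not reproduced here, the numeric scale (plosives < fricatives < nasals < liquids < glides/vowels) is assumed rather than taken from the figure, and the segment inventory and the function name `sonority_profile` are hypothetical.

```python
# Assumed 5-point sonority scale (figure 1 is not reproduced here):
# plosives < fricatives < nasals < liquids < glides/vowels.
SONORITY = {"plosive": 1, "fricative": 2, "nasal": 3, "liquid": 4, "glide": 5}

# Hypothetical, deliberately small segment inventory for illustration only.
SEGMENT_CLASS = {
    "p": "plosive", "t": "plosive", "k": "plosive",
    "b": "plosive", "d": "plosive",
    "s": "fricative", "f": "fricative", "v": "fricative", "z": "fricative",
    "m": "nasal", "n": "nasal",
    "l": "liquid", "r": "liquid",
}


def sonority_profile(c1, c2):
    """Classify a two-consonant cluster as rising, falling, or plateau."""
    s1 = SONORITY[SEGMENT_CLASS[c1]]
    s2 = SONORITY[SEGMENT_CLASS[c2]]
    if s2 > s1:
        return "rising"
    if s2 < s1:
        return "falling"
    return "plateau"


# The three example clusters from the text: [pl], [st], [pt].
for cluster in ("pl", "st", "pt"):
    print(cluster, sonority_profile(cluster[0], cluster[1]))
```

Under the assumed scale, [pl] comes out as rising, [st] as falling, and [pt] as a plateau, matching the examples in the text.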
Languages may therefore differ on two dimensions: namely, which sonority profile(s) they allow (rising, falling, or plateau) and in which position. English and Polish are examples of languages that show differences across these dimensions, with Polish allowing all three sonority profiles both word-initially (e.g., [pr]osić, “ask”; [vd]owa, “widow”; [pt]ak, “bird”) and word-medially (e.g., kro[pl]a, “drop”; pro[zb]a, “request”; klo[tk]a, “padlock”), while English allows all three profiles only word-medially (e.g., a[pl]y; po[st]er; se[kt]or; see footnote 1) and only two of the three profiles word-initially (e.g., [pl]an; [sk]ate). In other words, the clustering patterns of English are a proper subset of the clustering patterns we find in Polish. Given that the emergence of acceleration is conditional on the child having achieved a “more advanced level of [. . .] complexity in one language than in the other” (Paradis & Genesee, 1996, p. 3), the question arises as to what this subset relation means in terms of potential complexity levels across the two languages. Addressing this question will enable us to identify what forms of acceleration might be expected in the acquisition of word-level phonology in Polish–English bilinguals.
4. Frequency and complexity in determining consonant cluster types
Phonological theories can be broadly distinguished on the basis of their representational formats and subdivided into “structural” and “categorisation” perspectives (e.g., Abbot-Smith & Tomasello, 2006; Johnson, 2007; Rouder & Ratcliff, 2006). Among other things, these perspectives differ in how they view sound sequences within language. Structural perspectives view sound sequences as units within a higher hierarchical structure. Sequences can therefore vary in relation to how their units are hierarchically related to each other and how much (or how little) they are embedded within the hierarchical structure that confines them. As a result, sound sequences present potentially different levels of structural complexity (part of what Gierut, 2007, calls “ontological complexity”). Categorisation models, on the other hand, do not assume any structural levels and thus do not view sequences as more or less “complex”, but rather as more or less “strong” (Bybee, 1985) or “entrenched” (Abbot-Smith & Tomasello, 2006; Langacker, 1987), depending on how frequent (or infrequent) a sequence is in the input. Below we present each of these perspectives in some detail.
Structural complexity in consonant clusters
Structural perspectives view sounds as segments belonging to a structural unit, typically the syllable (though other types of structural abstractions have also been proposed, e.g., Lowenstamm, 1996). On this view, clusters can be of different types depending on whether the consonants that constitute them belong to the same syllable (tautosyllabicity) or to two adjacent yet separate syllables (heterosyllabicity).
In standard onset-rhyme theories, syllable membership is decided based on a principle known as the Sonority Sequencing Generalisation (Clements, 1990), according to which a well-formed syllable involves an increase in sonority towards the peak and a decrease towards the edges (e.g., Selkirk, 1984; Steriade, 1982). The two consonants in a word-initial or word-medial cluster are therefore taken to belong to the same syllable if they exhibit a rising sonority slope. These tautosyllabic clusters are straightforwardly represented as cases of onset branching, independently of whether they occur word-initially or word-medially (see footnote 2).
On the other hand, clusters of non-rising sonority such as stop-stop and fricative-stop clusters violate sonority sequencing, and are therefore treated as heterosyllabic. Moreover, they are treated differently depending on the position they occupy within a word. While they are typically assumed to be coda-onset sequences when appearing word-medially, word-initial instances are treated as somewhat special cases involving an adjunct or extrasyllabic segment (e.g., Rubach & Booij, 1990; Davis, 1990; Halle & Vergnaud, 1980; Kenstowicz, 1994; Rochoń, 2000; Steriade, 1982).
While sonority-based approaches have attracted some criticism (e.g., Ohala, 1990), they nevertheless remain the most common and possibly most researched accounts of syllabic structure (Gouskova, 2001, 2004; Steriade, 1982; Selkirk, 1984; Prince & Smolensky, 1993/2004; Smolensky, 2006), and we will therefore rely on a sonority-based perspective in our treatment of syllable structure within onset-rhyme theory.
The structural taxonomy represented in figures 2 and 3 neatly captures typological alternations whereby a language like Spanish (e.g., Harris, 1969) may allow branching onsets (i.e., the structure in figure 2) but disallow extrasyllabic consonants (i.e., the structure in figure 3a), while some other language, e.g., Korean (Sohn, 1986), may allow coda-onset clusters (i.e., the structure in figure 3b) but ban branching onsets (i.e., the structure in figure 2). In relation to English and Polish, figures 2 and 3 show that the two languages allow the very same levels of structural complexity, as both permit all possible structures: namely, onset branching (figure 2), adjunction (figure 3a), and coda-onset sequences (figure 3b). The fact that English does not allow sonority plateaus word-initially does not affect its complexity level, as onset-rhyme theories treat all clusters of non-rising sonority as cases of adjunction, regardless of whether they involve plateaus or falling slopes, thus putting English on a par with Polish in terms of structural complexity. Consequently, following an onset-rhyme view of CC clusters we may hypothesise that no acceleration should occur between Polish and English either word-initially or word-medially, as the two languages allow the same levels of structural complexity in both positions.
Categorisation models and consonant clusters
A radically different perspective on phonological organisation is presented by “categorisation models” (Mompeán-González, 2004) such as exemplar-based or usage-based phonology (e.g., Bybee, 2003; Pierrehumbert, 2003), which take the view that structural abstractions such as syllables are redundant. According to this view, linguistic knowledge involves memorising phonetic tokens of individual lexical items together with associated meanings and situational cues. It is from this information that phonological patterns may later emerge. Within a system of this type, advanced levels of complexity are equivalent to “strength” (Bybee, 1985) or “entrenchment” (Langacker, 1987) of forms, which is in turn directly proportional to frequency in the input (e.g., Abbot-Smith & Tomasello, 2006; Frisch, Large, Zawaydeh & Pisoni, 2001). The more frequent a certain form is, the quicker and more effective its categorisation and consequent acquisition will be. Consequently, according to categorisation models, achievement of more or less advanced levels of phonological ability is not due to different complexity levels inherent in consonant clusters, but to the frequency levels of possible consonantal combinations within the linguistic input. Therefore, before any predictions can be made in relation to potential acceleration phenomena it is first necessary to compare frequency levels for cluster types across the two languages at issue.
We analysed the 10,000 most frequent words in Polish and English using subtitle corpora, a method that provides a close match with the lexical choices of natural spoken language use (Meunier & Gouverneur, 2009; Taylor, 2004). The analysis revealed that the Polish system involves about twice as many s + obstruent clusters as English, both word-initially and word-medially, with ratios of 2.09 : 1 (627/300) and 2.01 : 1 (828/411) respectively. Obstruent-liquid clusters, on the other hand, are more common in English than in Polish (see footnote 3).
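The two ratios follow directly from the reported counts. The short sketch below simply reproduces that arithmetic; the helper name `frequency_ratio` is hypothetical, and only the four counts come from the text.

```python
def frequency_ratio(polish_count, english_count):
    """Return the Polish-to-English ratio, rounded to two decimals."""
    return round(polish_count / english_count, 2)


# Counts of s + obstruent clusters as reported in the text.
initial_ratio = frequency_ratio(627, 300)  # word-initial
medial_ratio = frequency_ratio(828, 411)   # word-medial

print(f"{initial_ratio} : 1, {medial_ratio} : 1")  # prints "2.09 : 1, 2.01 : 1"
```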
As s + obstruent clusters are twice as prevalent in Polish as in English, it follows that the representation of s + consonant sequences will be stronger (Bybee, 1985) or more entrenched (Abbot-Smith & Tomasello, 2006; Langacker, 1987) in the linguistic knowledge of Polish–English bilinguals than in that of their monolingual English-speaking peers. Therefore, if the two systems communicate at the level of phonological organisation, the Polish system of a Polish–English bilingual may offer a higher level of entrenchment with which to aid the development of the English system. Following categorisation views of phonology we may therefore hypothesise that Polish–English bilingual children would perform better than their monolingual English peers in the production of s + obstruent clusters in English both word-initially and word-medially, since these clusters are twice as frequent in the Polish input in both positions. Further, we may hypothesise that no acceleration should occur for obstruent-liquid clusters in either position, since these are actually more frequent in English than in Polish.
In the remainder of the paper we investigate these hypotheses together with the hypothesis arising from the onset-rhyme view (i.e., that no acceleration should occur in any position) by analysing the English nonword repetition performance of Polish–English bilingual children word-initially and word-medially and comparing it with that of monolingual English-speaking children.
5. Data collection
Method
Participants
Fifteen Polish–English bilingual children (9 female, 6 male) aged 7;1 to 8;11 (mean age 8;2, standard deviation seven months) were tested in this experiment. Fifteen monolingual English children (11 female, 4 male) of the same age range (mean age 8;3, standard deviation eight months) also participated in the experiment as a control group. A t-test confirmed that the two groups did not differ significantly in terms of age: t(28) = −.12, p = .905.
The age range of the participants was selected in consideration of two factors. First, /s/ + stop clusters reach a standard of 75% on average at around age 6;0 in the spontaneous speech of typically developing children (Smit, Hand, Freilinger, Bernthal & Bird, 1990), and Smit (1993) suggests that s + consonant clusters are still below a 90% performance level at ages 7–9. Second, our test involved relatively long (3 syllables; see footnote 4), unfamiliar items of low lexicality in the form of nonwords, thus adding further levels of difficulty compared to spontaneous speech (e.g., Gathercole, Willis, Emslie & Baddeley, 1991; Jones, Tamburelli, Watson, Gobet & Pine, 2010).
All participants were administered the expressive vocabulary, sentence structure and word structure tests of the Clinical Evaluation of Language Fundamentals-4 (CELF-4; Semel, Wiig & Secord, 2003). Participant selection was based on achieving expressive vocabulary scores within normal ranges and no more than two standard deviations below the mean (e.g., Oetting & Rice, 1993; Rinker, Kohls, Richter, Maas, Schulz & Schecker, 2007). Each child from the bilingual group was individually matched to a child from the monolingual group based on raw scores from the sentence structure and word structure components of the test. Performance scores for the two groups are given in table 2 below (see also appendix 2).
The two groups did not differ significantly in their performance on any of the three subtests: expressive vocabulary, t(28) = −.602, p = .552; sentence structure, t(28) = −.861, p = .396; word structure, t(28) = .229, p = .821.
Children were recruited and tested in schools within the Nottinghamshire and Derbyshire areas of the UK. All children were reported by school staff as exhibiting typical linguistic and cognitive development and no hearing difficulties or learning disabilities.
Design
Nonwords were manipulated for two repeated-measures independent variables: cluster position (word-initial or word-medial) and cluster type (obstruent-liquid vs. s + obstruent). Participant group (monolingual or bilingual) was also manipulated between subjects. The dependent variable was the accuracy of repetition for the target cluster in the nonword.
Materials
Children were tested through a nonword repetition task. Nonword repetition tasks are widely used as a measure of phonological ability and phonological memory capacity in both typical and atypical language development (e.g., Coady & Evans, 2008; Gathercole, 2006). The task involves instructing participants to repeat nonsense words that contain the structures to be investigated. For the current study, 36 trisyllabic nonwords were devised. As the aim of the study is to investigate whether knowledge of Polish affects performance in English, the nonwords were specifically developed so that they could be potential English words while being highly unlikely (or even impossible) Polish words. This was done by ensuring that the nonwords followed the phonotactics of English while violating Polish patterns at both the segmental and the prosodic level. At the segmental level, each nonword contained a schwa (in unstressed position) as well as one long vowel or oral diphthong (e.g., [ ː], [ɜː], [eɪ], [aʊ]), neither of which is a possible Polish phoneme (Gussmann, 2007). At the prosodic level, each nonword followed a strong-weak-strong stress pattern (primary stress-zero stress-secondary stress), a pattern that is not only typical of English phonology (especially in trisyllabic English nouns; see Burzio, 1994; Hammond, 1999) but also rare in Polish, a language in which stress is almost invariably penultimate (e.g., Jassem, 2003; see footnote 5). This, together with the fact that the experimenter addressed the children in English, ensured as far as possible that the children would carry out the task in a monolingual English mode (Grosjean, 1989; Soares & Grosjean, 1984) or at least towards the English end of the bilinguals’ continuum (e.g., Amrhein, 1999; Grosjean, 2001).
Each nonword contained one consonant cluster in either word-initial or word-medial position. The cluster was either an obstruent-liquid or an s + obstruent sequence. The clusters involved were /pl/, /fl/, /bl/ for the obstruent-liquid (OL) condition, and /st/, /sp/, /sk/ for the s + obstruent (sO) condition. To assess the production of each consonant cluster adequately, each cluster was repeated three times within each condition while the surrounding phonological context changed (i.e., the cluster stayed the same but the remainder of the nonword differed). To ensure as far as possible that any pattern of performance would be due to the actual cluster type rather than its frequency of occurrence within the English language, all clusters in the stimuli were matched for average biphone frequency (i.e., how frequent the consonantal sequence is in the language), as was the surrounding phonological context (i.e., frequency of the biphone composed of the second consonant from the target cluster together with the following vowel). For example, for the nonword /ˈplɪkəˌriːdʒ/ we calculated the frequency of the biphones /pl/, /lɪ/, /ɪk/ and so forth so as to include all component biphones, thus integrating all transitional probabilities within the calculations for each stimulus nonword. This method, based on work by Vitevitch and colleagues (e.g., Vitevitch & Luce, 1998; Vitevitch, Luce, Charles-Luce & Kemmerer, 1997), has been shown to be a good predictor of word-likeness ratings (Frisch, Large & Pisoni, 2000). Biphone frequencies were based on occurrence in the Children's Printed Word Database (Masterson, Stuart, Dixon & Lovejoy, 2010), a database of word frequencies for 5- to 9-year-olds (see also Tamburelli & Jones, 2013). The complete list of stimuli can be found in appendix 2.
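The decomposition of a nonword into its component biphones, and the averaging of their frequencies, can be sketched as follows. This is a simplified illustration rather than the procedure actually used: the function names are hypothetical, the frequency values in the toy table are invented placeholders (not figures from the Children's Printed Word Database), and a plain arithmetic mean is assumed.

```python
def biphones(phonemes):
    """Return all adjacent phoneme pairs in a transcribed item."""
    return list(zip(phonemes, phonemes[1:]))


def mean_biphone_frequency(phonemes, freq_table):
    """Average the frequencies of an item's biphones.

    Biphones missing from freq_table count as 0.0 (an assumption).
    """
    pairs = biphones(phonemes)
    return sum(freq_table.get(pair, 0.0) for pair in pairs) / len(pairs)


# The example nonword from the text, segmented into phonemes
# (stress marks omitted).
nonword = ["p", "l", "ɪ", "k", "ə", "r", "iː", "dʒ"]
print(biphones(nonword)[:3])  # the first three biphones: /pl/, /lɪ/, /ɪk/

# Invented placeholder frequencies, for illustration only.
toy_table = {("p", "l"): 0.5, ("l", "ɪ"): 0.25}
print(mean_biphone_frequency(["p", "l", "ɪ"], toy_table))
```

Matching stimuli would then amount to comparing such averages across conditions, though the exact matching criteria are not specified in the text.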
The 36 nonwords were recorded onto a Sony ICD-MX20 digital voice dictaphone by a researcher native to the Nottingham area, and subsequently converted into MP3 format using Sony Digital Voice Editor, v. 3.1. The nonwords were recorded in a randomised order, and each nonword was followed by 3 seconds of silence.
Procedure
The children were visited at their school following informed written consent from parents and were assessed on a one-to-one basis in a quiet room away from their classroom. Testing was carried out over two separate sessions in consecutive weeks. In order to maintain the child's attention, the nonword repetition test was divided across the two sessions in a counterbalanced manner. In addition to the nonword repetition task in each session, the test of Expressive Vocabulary from the CELF-4 was administered in session 1. The other core tests from the CELF-4, the test of Sentence Structure and the test of Word Structure, were administered in the second session. Children heard the stimuli through a Sony ICD-MX20 digital voice dictaphone with Creative TravelDock 900 Portable speakers, and spoke their responses into a second device of the same model.
Nonwords were transcribed in their phonemic form by one of the authors. Responses were marked as correct if the target cluster was repeated correctly.
A random sample of 20% of the responses was transcribed by a second researcher not associated with this project, and phoneme-by-phoneme inter-rater reliability was 91%. Disagreements between the two transcriptions were resolved through discussion.
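A phoneme-by-phoneme agreement percentage of this kind can be computed along the following lines. The text does not specify how transcriptions were aligned, so this sketch assumes a simple position-by-position comparison, with any length mismatch counted as disagreement; the function name and the toy transcriptions are hypothetical.

```python
def phoneme_agreement(transcriptions_a, transcriptions_b):
    """Percentage of phoneme positions on which two transcribers agree.

    Each argument is a list of responses; each response is a list of
    phonemes. Responses are compared position by position (an assumed
    alignment), and extra phonemes in the longer response count as
    disagreements.
    """
    matches = 0
    total = 0
    for a, b in zip(transcriptions_a, transcriptions_b):
        total += max(len(a), len(b))
        matches += sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / total


# Toy data: two responses, with one disagreement out of six phonemes.
rater_1 = [["s", "t", "ɪ"], ["p", "l", "ə"]]
rater_2 = [["s", "t", "ɪ"], ["p", "r", "ə"]]
print(round(phoneme_agreement(rater_1, rater_2), 1))
```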
Results
The percentages of correct responses per condition for each participant group are presented in figure 4.
A 2 (cluster position: initial or medial) X 2 (cluster type: obstruent-liquid or s + obstruent) X 2 (participant group: bilingual or monolingual) mixed-design ANOVA revealed no significant main effects of cluster position, cluster type or participant group: F(1,28) = 1.762, p = .195; F(1,28) = 1.677, p = .206; and F(1,28) = 1.482, p = .234 respectively. There was a significant interaction between cluster type and cluster position, F(1,28) = 69.723, p < .001, but no significant interaction between cluster type and participant group, F(1,28) = 0.02, p = .89, or between cluster position and participant group, F(1,28) = 3.66, p = .066. However, the three-way interaction between cluster position, cluster type, and participant group was significant: F(1,28) = 5.345, p = .028.
A by-items (F2) analysis showed the same pattern of effects. There were no effects of cluster position (F2(1,32) = .445, p = .51), cluster type (F2(1,32) = .552, p = .463) or participant group (F2(1,32) = 1.757, p = .194), a significant interaction between cluster type and cluster position (F2(1,32) = 5.492, p = .025), no interaction between cluster type and participant group (F2(1,32) = .055, p = .817) and no interaction between cluster position and participant group (F2(1,32) = 2.681, p = .111). Once again there was a significant cluster position X cluster type X participant group interaction (F2(1,32) = 4.432, p = .043).
Separate analyses were performed within each of the two cluster types. These revealed a significant effect of cluster position for obstruent-liquid clusters, such that more errors were made word-initially than word-medially: F(1,28) = 48.103, p < .001. A significant effect of cluster position was also apparent for s + obstruent clusters, such that more errors were made word-medially for this cluster type: F(1,28) = 22.724, p < .001. There was no effect of participant group for either cluster type, F(1,28) = 1.058, p = .312 for obstruent-liquid and F(1,28) = 0.71, p = .407 for s + obstruent, and no significant cluster position X participant group interaction for obstruent-liquid clusters: F(1,28) = .061, p = .806. However, a significant cluster position X participant group interaction was found for s + obstruent clusters: F(1,28) = 8.515, p = .007, indicating that bilingual children performed better than monolinguals in the word-initial s + obstruent condition but not in the word-medial s + obstruent condition.
Subsequent independent-samples t-tests were performed to explore this interaction, computed with Bonferroni correction (alpha level set at .025). These revealed a statistically significant difference in the word-initial s + obstruent condition, such that bilinguals performed better than monolinguals in this condition: t(29) = 2.613, p = .014. No difference was found between the two groups in the word-medial s + obstruent condition: t(29) = −1.081, p = .289. We also performed paired-samples two-tailed t-tests within each group for the s + obstruent conditions (with Bonferroni correction, alpha level set at .025). These revealed significantly more correct responses in the word-initial s + obstruent condition (compared to the corresponding word-medial condition) for the bilingual group, but not for the monolingual group: t(14) = −6.53, p < .001 and t(14) = −1.143, p = .272 respectively.
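The corrected alpha level of .025 is consistent with dividing a conventional family-wise alpha of .05 by the two comparisons in each set of tests; the .05 starting point is an assumption, as the text reports only the corrected value. A minimal sketch of that decision rule, with hypothetical function names, applied to the between-group p values reported above:

```python
def bonferroni_alpha(family_alpha, n_comparisons):
    """Per-comparison alpha under Bonferroni correction."""
    return family_alpha / n_comparisons


def is_significant(p_value, family_alpha=0.05, n_comparisons=2):
    """True if p_value falls below the Bonferroni-corrected alpha."""
    return p_value < bonferroni_alpha(family_alpha, n_comparisons)


# Between-group p values reported in the text for s + obstruent clusters.
print(is_significant(0.014))  # word-initial condition
print(is_significant(0.289))  # word-medial condition
```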
6. Discussion
This study was aimed at investigating an under-researched area of bilingual development: accuracy of consonant cluster production in word-initial and word-medial position. The central goal of the study was to determine whether bilingual Polish–English children are at an advantage compared to English monolinguals. Further, we wished to test predictions that arise from competing views that subscribe to either a structural or a categorisation perspective of phonological knowledge.
Our study provides evidence of acceleration in the production of consonant clusters. As far as we know, this is only the second time that crosslinguistic influence has been reported at the level of syllabic structure (cf. Lleó et al., 2003), and the first time it has been found to affect consonant clusters involving onset positions. Moreover, the present study also revealed that the bilingual advantage targets one specific type of cluster (s + obstruent) in one specific word position (word-initial). This pattern cannot be explained by a categorisation view, since s + obstruent clusters are twice as frequent in Polish as in English both word-initially and word-medially (cf. table 1), leading to the prediction that acceleration should have been found in both positions for this cluster type. This is not what we find. Our results are therefore in line with findings from a study on the acquisition of Polish morphology by Krajewski, Theakston, Lieven and Tomasello (2011), who reported that frequency was “not a decisive factor” (p. 830) in determining children's performance on their nonword repetition task. Similar conclusions have also been reached by Fabiano-Smith and Goldstein (2010) in relation to the phonological development of Spanish–English bilinguals.
However, onset-rhyme theory also fails to explain the results, as it leads to the hypothesis that no acceleration would take place, due to the fact that Polish and English are supposedly equivalent as far as structural syllabic complexity is concerned.
Additional assumptions are therefore needed for both the categorisation and the structural hypothesis if they are to be reconciled with the data above. One of these additional assumptions could be some form of “sonority markedness” (Berent et al., 2007), according to which speakers have tacit knowledge of which cluster types are more marked. While markedness does not influence hierarchical structure and thus structural complexity, the idea of markedness is itself framed in terms of some form of complexity, which Gierut (2007) terms “functional complexity”, in opposition to the “ontological complexity” of hierarchical structures. In particular, marked structures have been shown to be harder to acquire (e.g., Major, 1996; Major & Faudree, 1996; Mazurkewich, 1984), besides being dispreferred crosslinguistically (Blevins, 1995; Greenberg, 1978). Berent et al. (2007, p. 597) suggest that the following markedness relations hold between consonant cluster types:
Markedness hierarchy in consonant clusters (from Berent et al., 2007, p. 597)
a. Small sonority rises in the onset are more marked than large rises.
b. Sonority plateaus in the onset are more marked than rises.
c. Sonority falls in the onset are more marked than plateaus.
This hierarchy only applies to word-initial clusters, as word-medial clusters of non-rising sonority are treated as coda-onset sequences rather than as onset clusters, and are therefore all equally marked regardless of the sonority slope involved (see also figure 3b above). If, following the spirit of this proposed hierarchy, we assume that a large sonority fall in the onset is more marked than a small sonority fall (i.e., a more fine-grained version of c above), our data would be successfully captured. This is because Polish includes both small and large sonority falls (e.g., [sp] and [mʃ]), while English only includes small sonority falls (e.g., [sp]), which, according to the markedness hierarchy just discussed, makes the phonological structure of Polish more marked (and thus more functionally complex) than that of English. Importantly, the hierarchy only applies to onsets, and therefore the complexity relation does not hold word-medially. Acceleration is thus predicted to occur only word-initially, which is the pattern we observed.
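The ordering in (a) to (c), together with the finer split of sonority falls assumed here, can be sketched as a simple ranking. Everything in this sketch is hypothetical: sonority is passed in as abstract integer levels chosen by the caller, the threshold separating "small" from "large" distances is an arbitrary assumption, and no claim is made about the level any particular segment receives or about how Berent et al. (2007) operationalise the hierarchy.

```python
# Hypothetical threshold: a sonority distance of 2 or more counts as "large".
LARGE_DISTANCE = 2

# Onset profiles from least to most marked: (a)-(c) above, plus the assumed
# finer distinction between small and large sonority falls.
MARKEDNESS_ORDER = ["large rise", "small rise", "plateau", "small fall", "large fall"]


def profile(first, second):
    """Classify a two-consonant onset from the sonority levels of its members."""
    distance = second - first
    if distance == 0:
        return "plateau"
    size = "large" if abs(distance) >= LARGE_DISTANCE else "small"
    direction = "rise" if distance > 0 else "fall"
    return f"{size} {direction}"


def markedness_rank(first, second):
    """Return the position of an onset profile in the ordering (higher = more marked)."""
    return MARKEDNESS_ORDER.index(profile(first, second))


# Abstract sonority levels only; higher numbers mean more sonorous.
for levels in [(1, 3), (1, 2), (2, 2), (2, 1), (3, 1)]:
    print(levels, profile(*levels), markedness_rank(*levels))
```

On this sketch, a language whose onsets reach a higher rank than those of another language counts as functionally more complex for onsets, which is the asymmetry the argument above relies on.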
However, as the markedness hierarchy reported above expresses phenomenological preferences rather than a formal account of linguistic structure (Berent et al., 2007), the question remains as to how it could be integrated into the two accounts at issue. Integrating markedness considerations into onset-rhyme theory raises some non-trivial issues. In particular, sonority distances are not subsumable under any of the structural relations assumed by onset-rhyme theory, specifically because such relations are based on sonority sequencing (i.e., whether there is a rise or fall in sonority) rather than on sonority distance (i.e., how much of a rise or fall there is). Nevertheless, the addition of markedness considerations is unlikely to be problematic for structural perspectives in general, and attempts have been made to integrate markedness relations into structural theories of phonology (De Lacy, 2002; Prince & Smolensky, 1993/2004; Scheer, 2004), though, as far as we are aware, not within onset-rhyme theory itself. Categorisation models, on the other hand, may not easily lend themselves to this type of hierarchy, whose roots are in the structural tradition (e.g., Kean, 1975) and which has its basis in allegedly innate constraints (Berent et al., 2007; Wright, 2004).
Nevertheless, the categorisation view could account for the observed difference between word-initial and word-medial clusters if it were extended so as to include some form of sonority hierarchy, perhaps by formulating it in terms of acoustics (e.g., Gordon, Ghushchyan, McDonnell, Rosenblum & Shaw, 2012; Nakajima, Ueda, Fujimaru, Motomura & Ohsaka, 2012), together with some type of featural encoding that allows higher-level distinctions that go beyond the encoding of individual phonetic segments (see also Davidson, 2006 on this point).
A potential alternative to the views just discussed comes from another structural view of segmental relations and a competitor of onset-rhyme theory, namely CVCV theory (Scheer, 2004). Developed within a structural tradition, CVCV theory is similar to onset-rhyme theory in that it relies on the Sonority Sequencing Generalisation and it views consonant clusters as units within a larger structure. However, unlike onset-rhyme theory, CVCV bases its constituent structures on syntagmatic “licensing” relations rather than on hierarchy. While onset-rhyme theory takes all obstruent-obstruent clusters as structurally identical (i.e., as involving an extrasyllabic adjunct), CVCV theory makes a principled structural distinction between s + obstruent and other obstruent-obstruent clusters as well as between these and obstruent-liquid clusters. This structural distinction allows CVCV to capture the patterns observed in our findings without the need for additional, non-structural (and non-axiomatic) assumptions. In particular, in one implementation of CVCV theory based on data from monolingual acquisition, Sanoudaki (2010) suggests that the structural relations involved in the acquisition of word-initial s + obstruent clusters are a proper subset of the relations needed for the acquisition of other word-initial obstruent-obstruent clusters. All remaining clusters (i.e., all word-medial clusters as well as word-initial obstruent-liquid) do not intersect at the structural level, and are therefore structurally independent from each other as far as complexity relations are concerned. On this view, it is therefore expected that acceleration in Polish–English bilinguals would only affect word-initial s + obstruent clusters. This is because the only word-initial obstruent clusters found in English involve s + obstruent, while Polish also has other word-initial obstruent-obstruent clusters.
According to the structural relations developed within CVCV theory, this means that the word-initial clusters found in Polish are ontologically more complex than those available in English. It therefore follows that exposure to the Polish clusters would facilitate acquisition of the simpler (i.e., involving fewer structural relations) s + obstruent English clusters. Importantly, acceleration is predicted to be limited to word-initial s + obstruent clusters, as these are the only cluster types for which Polish possesses a more complex counterpart.
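The logic of this prediction can be illustrated with sets. The sketch below is not an implementation of CVCV theory: the relation labels (`r1` to `r5`) are placeholders with no theoretical content, the cluster inventories are limited to the types discussed in this section, and the rule "a cluster is accelerated when the other language has a cluster whose relations form a proper superset" is our reading of the argument above.

```python
# Placeholder labels for structural relations; these are NOT CVCV terminology.
# Only the set relationships matter: initial s + obstruent is a proper subset of
# initial obstruent-obstruent, and all other cluster types are disjoint.
RELATIONS = {
    ("initial", "s+obstruent"): {"r1"},
    ("initial", "obstruent-obstruent"): {"r1", "r2"},
    ("initial", "obstruent-liquid"): {"r3"},
    ("medial", "s+obstruent"): {"r4"},
    ("medial", "obstruent-liquid"): {"r5"},
}

# Simplified inventories: English lacks word-initial obstruent-obstruent
# clusters other than s + obstruent.
POLISH = set(RELATIONS)
ENGLISH = POLISH - {("initial", "obstruent-obstruent")}


def accelerated(target_language, other_language):
    """Clusters in the target language that have a structurally more complex
    counterpart (a proper superset of relations) in the other language."""
    result = set()
    for cluster in target_language:
        for other in other_language:
            if RELATIONS[cluster] < RELATIONS[other]:  # proper subset test
                result.add(cluster)
    return result


predicted = accelerated(ENGLISH, POLISH)
print(predicted)
```

With these placeholder sets, the only English cluster type that has a more complex Polish counterpart is word-initial s + obstruent, in line with the prediction stated above.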
In relation to our findings, the important difference between CVCV and onset-rhyme theory is that the latter does not distinguish structurally between s + obstruent and other obstruent-obstruent clusters, assuming them to be identical cases of adjunction. CVCV theory, on the other hand, takes the two categories as being instantiations of non-identical structural relations, thus potentially providing an independently motivated account for our findings.
7. Conclusions
This study provided evidence of acceleration in the production of consonant clusters in children simultaneously acquiring Polish and English as first languages. Our findings revealed that the bilingual advantage targets one specific type of cluster, namely s + obstruent, in one specific word position (word-initial). Obstruent-liquid clusters were unaffected, as were s + obstruent clusters in word-medial position. This pattern indicates that the interaction between sub-segmental information and the sonority hierarchy is an important aspect of phonological knowledge that is prone to being transferred across the two developing phonologies of simultaneous bilinguals. Neither the categorisation nor the structural view (in the form of onset-rhyme theory) could straightforwardly capture the findings. Nevertheless, it was suggested that structural views are at an advantage as they allow for necessary additional assumptions (i.e., the sonority hierarchy and encoding of sub-segmental features) which are rooted in the structural tradition but may not naturally fit a categorisation perspective. It is far from clear, however, how and whether some of these necessary additional assumptions (i.e., sonority distance) can be included in onset-rhyme theory in particular, as they are at odds with the axiomatic assumptions of onset-rhyme structure (though not with structural perspectives in general). It was then suggested that CVCV theory, a further theory also grounded in the structural tradition, independently provides the apparatus necessary in order to account for our findings without the need for further assumptions.
Importantly, our study shows how investigating the development of phonology in bilingual first language acquisition can inform phonological theory, as well as provide evidence for which specific phonological properties are prone to crosslinguistic influence. The current study is a first step in providing evidence that crosslinguistic influence is not limited to the segmental or phonemic level, thus lending explanatory power to theoretical accounts based on the representation of sub-segmental information and its interaction with overarching structural configurations.