
Neuroimaging of phonetic perception in bilinguals*

Published online by Cambridge University Press:  06 October 2015

NARLY GOLESTANI*
Affiliation: University of Geneva
*Address for correspondence: Brain and Language Lab, Department of Clinical Neuroscience, Campus Biotech, 9 Chemin des Mines, 1202 Genève, Switzerland. Narly.Golestani@unige.ch

Abstract

This review addresses the cortical basis of phonetic processing in bilinguals and of phonetic learning, with a focus on functional magnetic resonance imaging studies of phonetic perception. Although results vary across studies depending on stimulus characteristics, task demands, and participants’ previous experience with the non-native/second-language sounds, taken together the literature reveals involvement of overlapping brain regions during phonetic processing in the first and second language of bilinguals, with special involvement of regions of the dorsal audio-motor interface, including frontal and posterior cortices, during the processing of new or ‘difficult’ speech sounds. These findings converge with the brain imaging literature on language processing in bilinguals more generally, during semantic and syntactic processing of words and of connected speech. More brain imaging work can serve to better elucidate the precise mechanisms underlying phonetic encoding and its interaction with articulatory processes, in particular where multiple phonetic repertoires have been or are being acquired.

Review Article

Copyright © Cambridge University Press 2015

Introduction

When a new language is proficiently learned in early or late bilinguals, novel phonetic categories are established (MacKain, Best & Strange, 1981), although there are individual differences in how well these sounds are perceived and produced (Flege, Munro & MacKay, 1995; Pallier, Bosch & Sebastián-Gallés, 1997; Bosch, Costa & Sebastián-Gallés, 2000; Sebastián-Gallés, Rodriguez-Fornells, de Diego-Balaguer & Díaz, 2006). Laboratory training studies also show that adults can learn to hear and to produce foreign speech sounds, again with large individual differences (Golestani & Zatorre, 2009; Hattori & Iverson, 2009; Kartushina, Hervais-Adelman, Frauenfelder & Golestani, 2015). This review will address the cortical basis of phonetic processing in bilinguals and of phonetic learning, with a focus on functional magnetic resonance imaging (fMRI) studies of phonetic perception. An overview of the neural basis of phonetic processing per se will precede the review of the bilingual and phonetic learning literature.

Cortical bases of phonetic processing

Functional brain imaging studies using methods such as PET and fMRI in adults have examined the neural underpinnings of phonetic perception using words, speech syllables, and meaningless speech sounds, and using passive listening, phoneme monitoring, discrimination, identification, and rhyming tasks. Several existing papers offer well-established models of the neural underpinnings of language processing and learning more generally (Hickok & Poeppel, 2007; Rodriguez-Fornells, Cunillera, Mestres-Misse & de Diego-Balaguer, 2009; Price, 2012). With respect to phonetic processing specifically, these models highlight the role of the dorsal audio-motor interface, or dorsal stream, including auditory, frontal and parietal regions, in mapping sounds onto articulatory-based representations (Hickok & Poeppel, 2007; Rodriguez-Fornells et al., 2009). This network is especially relevant for phonological processing and working memory (Aboitiz, 2012), in contrast with the ventral stream, which is thought to be more implicated in lexical processing and in processing meaning, or semantics (Hickok & Poeppel, 2007; Rodriguez-Fornells et al., 2009).
Within the dorsal audio-motor network, the left pars opercularis, which lies in the posterior portion of Broca's area, and the adjacent left insula/frontal operculum (FO) of the left inferior frontal gyrus (LIFG) are involved even during purely receptive (i.e., perceptual) phonetic tasks when there are specific task demands such as phonetic segmentation and analysis (Démonet, Chollet, Ramsay, Cardebat, Nespoulous, Wise, Rascol & Frackowiak, 1992; Zatorre, Evans, Meyer & Gjedde, 1992; Fiez, Raichle, Miezin, Petersen, Tallal & Katz, 1995; Poldrack, Wagner, Prull, Desmond, Glover & Gabrieli, 1999; Burton, Small & Blumstein, 2000; Golestani & Zatorre, 2004). The left pars opercularis and the left supramarginal gyrus (SMG) are implicated in verbal working memory, or in the phonological loop, with the left pars opercularis and the adjacent left premotor area being involved in subvocal rehearsal, and the left SMG in phonological storage (Paulesu, Frith & Frackowiak, 1993; Smith, Jonides, Marshuetz & Koeppe, 1998; Nixon, Lazarova, Hodinott-Hill, Gough & Passingham, 2004; Koelsch, Schulze, Sammler, Fritz, Mueller & Gruber, 2009).
The implication of left motor cortex in addition to premotor regions during phonetic perception is thought to reflect subvocal articulatory demands (Pulvermuller, Huss, Kherif, Martin, Hauk & Shtyrov, 2006; Lee, Turkeltaub, Granger & Raizada, 2012; Rogers, Mottonen, Boyles & Watkins, 2014), in line with the motor theory of speech perception (Liberman & Mattingly, 1985).

The bilateral auditory cortex activations observed in the superior temporal gyrus (STG) during phonetic perception are typically localized to secondary auditory cortices anterior and posterior to Heschl's gyrus (HG), including the planum temporale (PT) (Binder, Rao, Hammeke, Yetkin, Jesmanowicz, Bandettini, Wong, Estkowski, Goldstein, Haughton & Hyde, 1994; Jancke, Shah, Posse, Grosse-Ryuken & Muller-Gartner, 1998; Binder, Frost, Hammeke, Bellgowan, Springer, Kaufman & Possing, 2000; Hickok & Poeppel, 2000; Kilian-Huetten, Valente, Vroomen & Formisano, 2011). However, these regions are also involved in processing complex sounds such as amplitude-modulated noise (Giraud, Lorenzi, Ashburner, Wable, Johnsrude, Frackowiak & Kleinschmidt, 2000), as well as in the analysis of spectral and temporal information more generally (Obleser, Eisner & Kotz, 2008; Santoro, Moerel, De Martino, Goebel, Ugurbil, Yacoub & Formisano, 2014), whereas earlier, primary auditory regions respond preferentially to simpler stimuli such as pure tones (Wessinger, VanMeter, Tian, Van Lare, Pekar & Rauschecker, 2001).
When the processing of complex auditory (i.e., non-phonetic) stimuli is controlled for, or when across-category phonetic conditions are compared to within-category ones, phonetic perception is localized to the more downstream left middle/anterior superior temporal sulcus (STS) (Liebenthal, Binder, Spitzer, Possing & Medler, 2005) and to the adjacent left middle temporal gyrus (Zhang, Xi, Xu, Shu, Wang & Li, 2011), respectively. This latter study, which investigated lexical tonal stimuli in native speakers of Chinese, together with other studies that examined the learning of lexical tone in non-native speakers of Chinese (Wong, Perrachione & Parrish, 2007), demonstrates convergence in terms of the left-lateralized neural underpinnings of lexical tone processing and of phonetic processing in non-tonal languages. Consistent with the hierarchical view that more downstream regions respond to phonetic information per se, it has been proposed that speech perception is robust due to the presence of multiple, complementary representations of the input, which operate both on acoustic-phonetic features and in articulatory-gestural domains (Scott & Johnsrude, 2003; Obleser, Leaver, VanMeter & Rauschecker, 2010). Bilateral temporal regions are involved in the processing of phonology, whereas higher levels of linguistic information in the speech signal (e.g., semantics, syntax) are processed in higher-level, left-lateralized frontal and parietal association cortices (Scott & Johnsrude, 2003; Peelle, 2012).
Interestingly, however, recent electrical recordings in humans (electrocorticography, or ECoG) during surgical planning have shown neural response patterns within the posterior STG (pSTG) which correspond to phonetic category boundaries (Chang, Rieger, Johnson, Berger, Barbaro & Knight, 2010), and to the speech sound features which map onto particular articulatory dimensions (Mesgarani, Cheung, Johnson & Chang, 2014). In other words, the pSTG does more than process spectro-temporal information in complex auditory input, and is likely also engaged in functional interaction with higher-level frontal and parietal regions that are involved in the categorical perception (CP) of speech sounds, an idea that is supported by recent developmental fMRI work on CP (Conant, Liebenthal, Desai & Binder, 2014), and by fMRI adaptation (Raizada & Poldrack, 2007) and pattern classification studies on CP (Lee et al., 2012). Similarly, the adjacent left temporo-parietal junction (area Spt) is thought to be involved in the interface, or mapping, between sensory and motor representations during speech processing (Hickok & Poeppel, 2007).
Finally, there is growing evidence for involvement of partially overlapping frontal (i.e., Broca's area) and posterior (i.e., Wernicke's area) brain regions, classically associated with speech production and perception respectively, during both phonological/speech perception and production (Paus, Perry, Zatorre, Worsley & Evans, 1996; Buchsbaum, Hickok & Humphries, 2001; Heim, Opitz, Muller & Friederici, 2003; Hickok & Poeppel, 2007; Meister, Wilson, Deblieck, Wu & Iacoboni, 2007; Price, Crinion & Macsweeney, 2011; Agnew, McGettigan, Banks & Scott, 2013), lending further support to the idea of interdependency of phonetic perception and production in the human brain.

Functional brain imaging studies on bilingual phonetic processing and on phonetic learning

Studies involving words

In an early PET study of late, proficient bilinguals, overlapping activations were observed in regions including the pars triangularis and the pars orbitalis of the LIFG in the first (L1) and second language (L2) during rhyme and synonym generation tasks, in which phonological and semantic cues guided word selection, respectively (Klein, Milner, Zatorre, Meyer & Evans, 1995). These frontal regions are more typically associated with semantic processing and memory (Binder, Frost, Hammeke, Rao & Cox, 1996; Dapretto & Bookheimer, 1999; Liebenthal, Desai, Ellingson, Ramachandran, Desai & Binder, 2010) than with phonetic processing, which is more typically localized to the left pars opercularis (Poldrack, Wagner, Prull, Desmond, Glover & Gabrieli, 1999). The involvement of semantic regions during phonologically guided word retrieval might be expected given that a word generation task was used, in which semantic and lexical processes are likely also at play, especially when new words are generated. The findings of this study were interpreted as reflecting shared neural representations during phonetic and also semantic processing in proficient bilinguals (Klein et al., 1995).

In a later longitudinal fMRI study on phonetic learning, minimal word pairs were used to test and to train Japanese individuals to hear the /r/-/l/ contrast. These participants had previously been extensively exposed to this contrast during 6 years of English-language instruction. After training, increased activation was found in regions including the bilateral superior temporal gyrus/sulcus (STG/STS), IFG, insula, SMG, premotor cortex, supplementary motor area and subcortical regions. It was proposed that these increases reflect the acquisition of auditory-articulatory mappings for the difficult /r/-/l/ contrast, in particular since this network was broader than that observed during perception of an easy phonetic contrast (/b/-/g/) (Callan, Tajima, Callan, Kubo, Masaki & Akahane-Yamada, 2003). Given that training was extensive and that it involved words, the functional plasticity results could in part have arisen from changes in semantic processing. It is interesting, however, that activation in primary and secondary auditory areas was also increased after training, reflecting functional plasticity in relatively low-level auditory regions (Callan et al., 2003). More generally, greater overall activation during perception of the difficult compared to the easy contrast is consistent with the idea of greater neural recruitment during effortful task performance, an explanation that has been offered for bilingual language processing more generally, in particular in the left IFG (Frith, Friston, Liddle & Frackowiak, 1991; Chee, Hon, Lee & Soon, 2001; Golestani & Zatorre, 2004; Golestani, Alario, Meriaux, Le Bihan, Dehaene & Pallier, 2006).
However, the above studies did not isolate phonetic processing per se, and as such the interpretation of the findings is limited.

Studies using isolated phonemes or syllables

Studies on bilingual phonetic processing and on phonetic learning that used isolated phonemes or syllables converge with the idea that phonetic processing in L1 and L2 generally overlaps, with greater neural recruitment during non-native, or effortful, phonetic processing. For example, in a magnetoencephalographic study on preattentive neural responses to stimulus change, English and Japanese listeners were tested during exposure to the /ra/ and /la/ syllables. The processing of non-native speech sounds in the Japanese group recruited greater neural resources and was associated with longer periods of brain activation in bilateral superior temporal and inferior parietal regions (Zhang, Kuhl, Imada, Kotani & Tohkura, 2005).

Other phonetic perception studies have required active task performance. In one such fMRI study, native (English) and non-native (Japanese) listeners identified syllables starting with /r/ and /l/ (Callan, Jones, Callan & Akahane-Yamada, 2004). The Japanese listeners had previously studied English for at least 6 years, and accordingly, they performed above chance on this task, but still more poorly than the English participants. In line with the above-described longitudinal study by the same group (Callan et al., 2003), brain imaging revealed greater activation in the non-native listeners in an articulatory-auditory network comprising Broca's area, the anterior insula, the anterior STS/STG, the PT, the temporo-parietal junction, the SMG and the cerebellum, once again consistent with greater neural recruitment during more effortful, non-native phonetic processing. There was also a weak, positive correlation between performance on the /r/-/l/ contrast and activation in the above-reported network in the non-native listeners (Callan et al., 2004). In other words, between groups, higher activation was associated with poorer performance (i.e., in the non-native compared to native listeners), but within the non-native (Japanese) group, the opposite was observed.

In line with the above-described study (Callan et al., 2004) and with the related longitudinal study by the same group (Callan et al., 2003), a second longitudinal study also found greater recruitment of auditory and articulatory brain regions after learning to hear a difficult non-native phonetic contrast (Golestani & Zatorre, 2004). In this latter study, listeners were trained to hear the difficult dental-retroflex contrast. After training, the pattern of brain activation came to resemble that observed during identification of a native contrast, with greater recruitment of the left IFG, the right insula/FO, the STG bilaterally and the left caudate nucleus (Golestani & Zatorre, 2004). There was also a positive relationship between behavioural improvement and post-training brain activation in the left angular gyrus, as well as a negative relationship between improvement and activation in the left insula/FO. This latter result suggests that greater success in phonetic learning is accompanied by more efficient neural processing in frontal speech regions implicated in phonetic processing, and conversely, that more effortful processing in the poorer learners is accompanied by greater recruitment of the left insula/FO (Golestani & Zatorre, 2004). The negative correlation with performance is in the opposite direction to that found in this and other brain regions by Callan and colleagues (2004).
One factor that could explain the discrepancy is that in Golestani and Zatorre (2004), participants were completely naïve to the contrast before training, and after 5 hours of training only about half of the participants performed above chance, whereas in the study by Callan and colleagues (2004), all the Japanese participants performed above chance even before scanning.

This raises the important question of the interaction between performance/effort and the degree of neural recruitment of relevant brain regions. Specifically, it is likely that some individuals can easily hear the contrast, that others can do so but with difficulty (i.e., with uncertainty and effort), and that yet others cannot hear it at all. In this latter subgroup, due to perceptual assimilation of non-native with native sounds, one can expect that participants eventually make less effort (i.e., they might give up on performing the task), and one can also expect greater neural adaptation in these individuals (Grill-Spector & Malach, 2001) due to the fact that they effectively hear the same sound across different trials. Such differences across individuals and also across studies (e.g., related to aptitudes, but also to previous exposure to the contrast of interest) might modulate the observed neural response in the brain regions involved, resulting in discrepancies across studies in terms of the direction of the training effects, and in terms of the direction of correlations between activation and performance.
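The adaptation logic invoked here can be made concrete with a toy computation. The sketch below is purely illustrative: the response values are hypothetical, and the index is a simplification rather than a measure used in any of the studies reviewed. The idea is that repeated presentation of what the listener's auditory system treats as the 'same' sound yields a reduced (adapted) response, so the response difference between acoustically different and acoustically identical pairs indexes whether the two sounds are neurally distinguished.

```python
# Hypothetical mean BOLD responses (arbitrary units) in an auditory
# region of interest, for pairs of syllables that are acoustically
# identical ('same' pairs) versus acoustically different.
def adaptation_index(resp_same, resp_different):
    """Normalized release from adaptation: near 0 suggests perceptual
    assimilation (different pairs adapted as much as identical ones);
    larger values suggest the two sounds are neurally distinguished."""
    return (resp_different - resp_same) / resp_same

# A listener who hears the contrast shows release from adaptation on
# 'different' pairs; a listener who assimilates both non-native sounds
# to one native category shows little or none (values are invented).
good_perceiver = adaptation_index(resp_same=1.0, resp_different=1.6)
poor_perceiver = adaptation_index(resp_same=1.0, resp_different=1.05)
print(round(good_perceiver, 2), round(poor_perceiver, 2))
```

On this toy index, the assimilating listener's near-zero value captures the prediction in the text: hearing effectively the same sound across trials produces sustained adaptation rather than release.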

Interestingly, an electroencephalography (EEG) study has uncovered an important finding in relation to individual differences in phonetic perception. Using a pre-attentive oddball paradigm with vowels, it was found that good and poor phonetic perceivers differed in their electrophysiological response indexing change detection (i.e., the mismatch negativity, or MMN response) not only to non-native but also to native phonetic contrasts (Díaz, Baus, Escera, Costa & Sebastián-Gallés, 2008). In other words, people who are particularly good or poor at non-native vowel perception also differ in their neural response to native vowel contrasts. This finding may arise from the partially shared neural resources underlying L1 and L2 phonetic processing (Golestani & Zatorre, 2004), and suggests that there exist individual differences even in how native speech sounds are perceived, at least in bilinguals. This could in part be due to the influence of learning a new phonetic inventory on characteristics of the native inventory (Chang, 2012; Kartushina, Hervais-Adelman, Frauenfelder & Golestani, unpublished manuscript). Possibly related to a relationship between L1 and L2 phonetic perception is recent behavioural evidence for a relationship between L1 and L2 phonetic production (Kartushina & Frauenfelder, 2014).
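For readers unfamiliar with the measure, the MMN in such oddball paradigms is simply the difference wave obtained by subtracting the averaged response to frequent 'standard' stimuli from the averaged response to rare 'deviant' stimuli. The following sketch simulates this with invented epochs; all amplitudes, latencies and trial counts are hypothetical and not taken from the study.

```python
import numpy as np

rng = np.random.default_rng(1)
n_std, n_dev, n_samples = 400, 100, 60  # epochs x time points (hypothetical)
t = np.linspace(0.0, 0.4, n_samples)    # 0-400 ms post-stimulus

# Simulated single-trial EEG epochs (microvolts): deviants carry an
# extra negative deflection around 150-250 ms, the MMN signature of
# pre-attentive change detection.
mmn_shape = -2.0 * np.exp(-((t - 0.2) ** 2) / (2 * 0.03 ** 2))
standards = rng.normal(0.0, 1.0, (n_std, n_samples))
deviants = rng.normal(0.0, 1.0, (n_dev, n_samples)) + mmn_shape

# The MMN is the deviant-minus-standard difference wave; its mean
# amplitude in the 150-250 ms window indexes discrimination of the
# contrast, and is attenuated in poorer perceivers.
difference_wave = deviants.mean(axis=0) - standards.mean(axis=0)
mmn_amplitude = difference_wave[(t > 0.15) & (t < 0.25)].mean()
print(round(mmn_amplitude, 2))
```

Averaging across many trials is what allows the small deflection to emerge from single-trial noise, which is why oddball designs use large numbers of standards and deviants.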

The studies reviewed thus far reported results of univariate analyses, and generally converge in showing greater recruitment of frontal and/or of posterior brain regions during the processing of new or of difficult speech sounds. Different, complementary results have been obtained using multi-voxel pattern analysis (MVPA, also known as ‘pattern classification’), which is better suited for differentiating neural representations within spatially overlapping brain regions. In one such study, English and Japanese listeners were tested on their perception of the /r/-/l/ distinction. It was found that the statistical separability of fMRI activation patterns in the right primary auditory cortex predicted subjects’ ability to tell the sounds apart, both across and within groups (Raizada, Tsao, Liu & Kuhl, 2010). This result is consistent with functional brain imaging (Binder et al., 1994; Jancke et al., 1998; Binder et al., 2000; Hickok & Poeppel, 2000; Kilian-Huetten et al., 2011) and with electrocorticography studies showing temporal cortex involvement during phonetic processing (Chang et al., 2010; Mesgarani et al., 2014), and demonstrates that further work is needed involving more fine-grained analyses of differences in neural recruitment within spatially overlapping brain regions. This opens the question of the contributions of top-down versus bottom-up influences on auditory cortex activation differences in relation to phonetic processing.
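The logic of pattern classification can be illustrated with a minimal, numpy-only sketch in the spirit of correlation-based MVPA. Everything below is hypothetical: the data are simulated, the region and effect sizes are invented, and published studies used more sophisticated classifiers and cross-validation schemes.

```python
import numpy as np

rng = np.random.default_rng(0)
n_trials, n_voxels = 80, 50

# Hypothetical data: each phonetic category (/r/, /l/) evokes its own
# distributed multi-voxel 'signature' in a region of interest, buried
# in trial-by-trial noise.
sig_r = rng.normal(0.0, 0.5, n_voxels)
sig_l = rng.normal(0.0, 0.5, n_voxels)
trials_r = sig_r + rng.normal(0.0, 1.0, (n_trials, n_voxels))
trials_l = sig_l + rng.normal(0.0, 1.0, (n_trials, n_voxels))

# Correlation-based classifier: estimate each category's mean pattern
# from half the trials, then classify each held-out trial by which
# mean pattern it correlates with more strongly.
half = n_trials // 2
mean_r = trials_r[:half].mean(axis=0)
mean_l = trials_l[:half].mean(axis=0)

def corr(a, b):
    return np.corrcoef(a, b)[0, 1]

hits = sum(corr(p, mean_r) > corr(p, mean_l) for p in trials_r[half:])
hits += sum(corr(p, mean_l) > corr(p, mean_r) for p in trials_l[half:])
accuracy = hits / n_trials

# Above-chance held-out accuracy indicates that the two categories are
# statistically separable within the region, even where a univariate
# contrast of mean activation levels would show complete overlap.
print(round(accuracy, 2))
```

This is why MVPA can reveal L1/L2 differences within regions that univariate analyses report as 'overlapping': the information lies in the spatial pattern across voxels, not in the overall response amplitude.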

A recent adaptation fMRI study partially addressed this question (Myers & Swan, 2012). Involvement of temporal and inferior frontal brain regions was shown in phonetic processing, and additionally, the bilateral middle frontal gyri were implicated specifically during the processing of a newly learned phonetic category. This suggests that top-down information about new categories may reshape perceptual sensitivities via attentional or executive mechanisms (Myers & Swan, 2012), and demonstrates that there is a complex interplay between low-level, perceptual aspects of the input and higher-level knowledge about phonetic categories, in particular when these are newly learned. Related to this are the results of a longitudinal training study with synthetic phonetic and non-speech but voice-like continua, which showed that the left posterior STS may play a role in the short-term representation of sound features relevant for learning new sound categories (Liebenthal et al., 2010). This provides evidence for a lower-level, temporal cortex mechanism that may mediate subsequent consolidation during the learning of novel speech sounds.

Conclusions and future reading

In conclusion, although a limited number of functional imaging studies have examined the neural underpinnings of bilingual phonetic processing per se, the results of these studies generally converge in showing overlapping brain regions during phonetic processing in the L1 and L2 of bilinguals, with greater recruitment of frontal and posterior brain regions during the processing of new or of ‘difficult’ non-native sounds. This converges with findings on bilingual language processing more generally, where it has been shown that at early stages of L2 learning there is relatively greater engagement of anterior and parietal portions of the language network, including Broca's area, as well as of higher-level executive and language control regions, and that, as increased proficiency is attained in the second language, the two languages recruit more overlapping brain networks (Abutalebi, Cappa & Perani, 2001; Stowe & Sabourin, 2005; Indefrey, 2006; Abutalebi, 2008; Sebastian, Laird & Kiran, 2011). Further, studies that examined phonetic perception and learning per se using univariate approaches and at the macroscopic level using fMRI suggest that largely overlapping regions of the auditory cortex are recruited when processing familiar versus novel speech sounds, or when processing different speech sounds of one language. More advanced image analysis methods (i.e., MVPA) and invasive approaches such as intracranial recordings, however, reveal differences in the neural response pattern within overlapping regions of auditory cortex in response to L1 versus L2 speech sounds, and also in relation to specific phonetic features such as place of articulation, and to cross- versus within-category differences (i.e., categorical perception).
These more fine-grained auditory cortex differences, which are likely modified during the acquisition of new speech sounds, are likely mediated (a) by regions including the left middle to posterior STS in the short-term representation of sound features defining new sound categories; (b) by increased involvement of the left temporo-parietal junction related to increased demands on sensori-motor mapping of the new sounds; and (c) by the additional involvement of frontal brain regions in the top-down reshaping of lower-level, perceptual phonetic encoding in the auditory cortex. These findings converge with the known roles of the respective components of the dorsal audio-motor stream in spectro-temporal analysis (bilateral dorsal STG), in phonological processing (bilateral middle to posterior STS), in the sensori-motor interface (left temporo-parietal junction) and in subvocal articulation (posterior LIFG) (Hickok & Poeppel, 2007). Outstanding questions remain regarding the precise mechanisms underlying differential encoding of L1 versus L2 (or foreign) speech sounds in primary and secondary auditory cortices, in particular in light of interactions of these bottom-up, auditory processes with top-down, frontal and temporo-parietal ones. These can be addressed using, among other approaches, ultra-high-resolution (i.e., 7 Tesla) functional mapping, advanced data analysis methods including MVPA and computational modelling, and invasive methods such as intracranial recordings.

Recommendations for further reading that relate to the neural bases of phonetic processing in bilingualism and to phonetic learning include developmental work on native and non-native speech sound processing in infants (Cheour, Ceponiene, Lehtokoski, Luuk, Allik, Alho & Naatanen, 1998; Rivera-Gaxiola, Silva-Pereyra & Kuhl, 2005; Minagawa-Kawai, Mori, Naoi & Kojima, 2007; Petitto, Berens, Kovelman, Dubins, Jasinska & Shalinsky, 2012; Ortiz-Mantilla, Hamalainen, Musacchia & Benasich, 2013; Fava, Hull & Bortfeld, 2014), on foreign-language syllable production in children (Hashizume, Taki, Sassa, Thyreau, Asano, Asano, Takeuchi, Nouchi, Kotozaki, Jeong, Sugiura & Kawashima, 2014), and on the neural bases of lexical tone processing in individuals whose first language was tonal but was subsequently forgotten (Pierce, Klein, Chen, Delcenserie & Genesee, 2014).
There is also a large electrophysiological (EEG and magnetoencephalography, or MEG) literature and some functional near-infrared spectroscopy (fNIRS) work on the cortical and subcortical bases of phonetic perception and learning (Alain, Reinke, McDonald, Chau, Tam, Pacurar & Graham, 2005; Zhang, Kuhl, Imada, Iverson, Pruitt, Stevens, Kawakatsu, Tohkura & Nemoto, 2009; Kumar, Hegde & Mayaleela, 2010; Xi, Zhang, Shu, Zhang & Li, 2010; Zhang et al., 2011; Chandrasekaran, Kraus & Wong, 2012; Brandmeyer, Farquhar, McQueen & Desain, 2013; Kaan, Wayland & Keil, 2013; Skoe, Chandrasekaran, Spitzer, Wong & Kraus, 2014; Zinszer, Chen, Wu, Shu & Li, 2015). Also, given the growing evidence for the importance of syllable-level speech processing (Morillon, Liegeois-Chauvel, Amer, Bener & Giraud, 2012; Edwards & Chang, 2013; Doelling, Arnal, Ghitza & Poeppel, 2014), studies on the neural basis of bilingual phonotactic processing are recommended (Dehaene-Lambertz, Dupoux & Gout, 2000; Jacquemot, Pallier, Le Bihan, Dehaene & Dupoux, 2003; Minagawa-Kawai, Cristia, Long, Vendelin, Hakuno, Dutat, Filippin, Cabrol & Dupoux, 2013), although only a limited number of studies have addressed this.

Also relevant to bilingual phonetic processing and learning is a body of work on the brain structural correlates of individual differences in phonetic processing, and in language processing more generally (see Golestani, 2014, for a recent review). This work includes studies on the brain structural correlates of phonetic perception (Golestani, Paus & Zatorre, 2002; Golestani, Molko, Dehaene, Le Bihan & Pallier, 2007; Wong, Warrier, Penhune, Roy, Sadehh, Parrish & Zatorre, 2008; Lebel & Beaulieu, 2009; Wong, Chandrasekaran, Garibaldi & Wong, 2011; Sebastián-Gallés, Soriano-Mas, Baus, Díaz, Ressel, Pallier, Costa & Pujol, 2012; Burgaleta, Baus, Díaz & Sebastián-Gallés, 2014) and production (Golestani & Pallier, 2007), on foreign speech imitation (Reiterer, Hu, Erb, Rota, Nardo, Grodd, Winkler & Ackermann, 2011), on bilingualism (Mechelli, Crinion, Noppeney, O’Doherty, Ashburner, Frackowiak & Price, 2004; Ressel, Pallier, Ventura-Campos, Díaz, Roessler, Avila & Sebastián-Gallés, 2012; Klein, Mok, Chen & Watkins, 2014), and on expertise in phonetics (Golestani, Price & Scott, 2011).

Footnotes

*

This work was supported by the Swiss National Science Foundation (PP00P3_133701). I would like to thank Christophe Pallier and an anonymous reviewer, who provided helpful comments on this review.

References

Aboitiz, F. (2012). Gestures, vocalizations, and memory in language origins. Frontiers in Evolutionary Neuroscience, 4, 2.
Abutalebi, J. (2008). Neural aspects of second language representation and language control. Acta Psychologica, 128, 466–478.
Abutalebi, J., Cappa, S. F., & Perani, D. (2001). The bilingual brain as revealed by functional neuroimaging. Bilingualism: Language and Cognition, 4, 179–190.
Agnew, Z. K., McGettigan, C., Banks, B., & Scott, S. K. (2013). Articulatory movements modulate auditory responses to speech. Neuroimage, 73, 191–199.
Alain, C., Reinke, K., McDonald, K. L., Chau, W., Tam, F., Pacurar, A., & Graham, S. (2005). Left thalamo-cortical network implicated in successful speech separation and identification. Neuroimage, 26, 592–599.
Binder, J. R., Frost, J. A., Hammeke, T. A., Bellgowan, P. S. F., Springer, J. A., Kaufman, J. N., & Possing, E. T. (2000). Human temporal lobe activation by speech and nonspeech sounds. Cerebral Cortex, 10, 512–528.
Binder, J. R., Frost, J. A., Hammeke, T. A., Rao, S. M., & Cox, R. W. (1996). Function of the left planum temporale in auditory and linguistic processing. Brain, 119, 1239–1247.
Binder, J. R., Rao, S. M., Hammeke, T. A., Yetkin, F. Z., Jesmanowicz, A., Bandettini, P. A., Wong, E. C., Estkowski, L. D., Goldstein, M. D., Haughton, V. M., & Hyde, J. S. (1994). Functional magnetic resonance imaging of human auditory cortex. Annals of Neurology, 35, 662–672.
Bosch, L., Costa, A., & Sebastián-Gallés, N. (2000). First and second language vowel perception in early bilinguals. European Journal of Cognitive Psychology, 12, 189–221.
Brandmeyer, A., Farquhar, J. D. R., McQueen, J. M., & Desain, P. W. M. (2013). Decoding speech perception by native and non-native speakers using single-trial electrophysiological data. PLoS One, 8, e68261.
Buchsbaum, B. R., Hickok, G., & Humphries, C. (2001). Role of left posterior superior temporal gyrus in phonological processing for speech perception and production. Cognitive Science, 25, 663–678.
Burgaleta, M., Baus, C., Díaz, B., & Sebastián-Gallés, N. (2014). Brain structure is related to speech perception abilities in bilinguals. Brain Structure & Function, 219, 1405–1416.
Burton, M. W., Small, S. L., & Blumstein, S. E. (2000). The role of segmentation in phonological processing: an fMRI investigation. Journal of Cognitive Neuroscience, 12, 679–690.
Callan, D. E., Jones, J. A., Callan, A. M., & Akahane-Yamada, R. (2004). Phonetic perceptual identification by native- and second-language speakers differentially activates brain regions involved with acoustic phonetic processing and those involved with articulatory-auditory/orosensory internal models. Neuroimage, 22, 1182–1194.
Callan, D. E., Tajima, K., Callan, A. M., Kubo, R., Masaki, S., & Akahane-Yamada, R. (2003). Learning-induced neural plasticity associated with improved identification performance after training of a difficult second-language phonetic contrast. Neuroimage, 19, 113–124.
Chandrasekaran, B., Kraus, N., & Wong, P. C. M. (2012). Human inferior colliculus activity relates to individual differences in spoken language learning. Journal of Neurophysiology, 107, 1325–1336.
Chang, C. B. (2012). Rapid and multifaceted effects of second-language learning on first-language speech production. Journal of Phonetics, 40, 249–268.
Chang, E. F., Rieger, J. W., Johnson, K., Berger, M. S., Barbaro, N. M., & Knight, R. T. (2010). Categorical speech representation in human superior temporal gyrus. Nature Neuroscience, 13, 1428–1432.
Chee, M. W. L., Hon, N., Lee, H. L., & Soon, C. S. (2001). Relative language proficiency modulates BOLD signal change when bilinguals perform semantic judgments. Neuroimage, 13, 1155–1163.
Cheour, M., Ceponiene, R., Lehtokoski, A., Luuk, A., Allik, J., Alho, K., & Naatanen, R. (1998). Development of language-specific phoneme representations in the infant brain. Nature Neuroscience, 1, 351–353.
Conant, L. L., Liebenthal, E., Desai, A., & Binder, J. R. (2014). fMRI of phonemic perception and its relationship to reading development in elementary- to middle-school-age children. Neuroimage, 89, 192–202.
Dapretto, M., & Bookheimer, S. Y. (1999). Form and content: Dissociating syntax and semantics in sentence comprehension. Neuron, 24, 427–432.
Dehaene-Lambertz, G., Dupoux, E., & Gout, A. (2000). Electrophysiological correlates of phonological processing: a cross-linguistic study. Journal of Cognitive Neuroscience, 12, 635–647.
Démonet, J. F., Chollet, F., Ramsay, S., Cardebat, D., Nespoulous, J. L., Wise, R., Rascol, A., & Frackowiak, R. (1992). The anatomy of phonological and semantic processing in normal subjects. Brain, 115, 1753–1768.
Díaz, B., Baus, C., Escera, C., Costa, A., & Sebastián-Gallés, N. (2008). Brain potentials to native phoneme discrimination reveal the origin of individual differences in learning the sounds of a second language. Proceedings of the National Academy of Sciences of the United States of America, 105, 16083–16088.
Doelling, K. B., Arnal, L. H., Ghitza, O., & Poeppel, D. (2014). Acoustic landmarks drive delta-theta oscillations to enable speech comprehension by facilitating perceptual parsing. Neuroimage, 85, 761–768.
Edwards, E., & Chang, E. F. (2013). Syllabic (approximately 2–5 Hz) and fluctuation (approximately 1–10 Hz) ranges in speech and auditory processing. Hearing Research, 305, 113–134.
Fava, E., Hull, R., & Bortfeld, H. (2014). Dissociating cortical activity during processing of native and non-native audiovisual speech from early to late infancy. Brain Sciences, 4, 471–487.
Fiez, J. A., Raichle, M. E., Miezin, F. M., Petersen, S. E., Tallal, P., & Katz, W. F. (1995). PET studies of auditory and phonological processing: Effects of stimulus characteristics and task demands. Journal of Cognitive Neuroscience, 7, 357–375.
Flege, J. E., Munro, M. J., & MacKay, I. R. (1995). Factors affecting strength of perceived foreign accent in a second language. The Journal of the Acoustical Society of America, 97, 3125–3134.
Frith, C. D., Friston, K. J., Liddle, P. F., & Frackowiak, R. S. J. (1991). A PET study of word finding. Neuropsychologia, 29, 1137–1148.
Giraud, A. L., Lorenzi, C., Ashburner, J., Wable, J., Johnsrude, I., Frackowiak, R., & Kleinschmidt, A. (2000). Representation of the temporal envelope of sounds in the human brain. Journal of Neurophysiology, 84, 1588–1598.
Golestani, N. (2014). Brain structural correlates of individual differences at low- to high-levels of the language processing hierarchy: A review of new approaches to imaging research. International Journal of Bilingualism, 18, 6–34.
Golestani, N., Alario, F. X., Meriaux, S., Le Bihan, D., Dehaene, S., & Pallier, C. (2006). Syntax production in bilinguals. Neuropsychologia, 44, 1029–1040.
Golestani, N., Molko, N., Dehaene, S., Le Bihan, D., & Pallier, C. (2007). Brain structure predicts the learning of foreign speech sounds. Cerebral Cortex, 17, 575–582.
Golestani, N., & Pallier, C. (2007). Anatomical correlates of foreign speech sound production. Cerebral Cortex, 17, 929–934.
Golestani, N., Paus, T., & Zatorre, R. J. (2002). Anatomical correlates of learning novel speech sounds. Neuron, 35, 997–1010.
Golestani, N., Price, C. J., & Scott, S. K. (2011). Born with an ear for dialects? Structural plasticity in the expert phonetician brain. Journal of Neuroscience, 31, 4213–4220.
Golestani, N., & Zatorre, R. J. (2004). Learning new sounds of speech: reallocation of neural substrates. Neuroimage, 21, 494–506.
Golestani, N., & Zatorre, R. J. (2009). Individual differences in the acquisition of second language phonology. Brain and Language, 109, 55–67.
Grill-Spector, K., & Malach, R. (2001). fMR-adaptation: a tool for studying the functional properties of human cortical neurons. Acta Psychologica, 107, 293–321.
Hashizume, H., Taki, Y., Sassa, Y., Thyreau, B., Asano, M., Asano, K., Takeuchi, H., Nouchi, R., Kotozaki, Y., Jeong, H., Sugiura, M., & Kawashima, R. (2014). Developmental changes in brain activation involved in the production of novel speech sounds in children. Human Brain Mapping, 35, 4079–4089.
Hattori, K., & Iverson, P. (2009). English /r/–/l/ category assimilation by Japanese adults: Individual differences and the link to identification accuracy. Journal of the Acoustical Society of America, 125, 469–479.
Heim, S., Opitz, B., Muller, K., & Friederici, A. D. (2003). Phonological processing during language production: fMRI evidence for a shared production-comprehension network. Cognitive Brain Research, 16, 285–296.
Hickok, G., & Poeppel, D. (2000). Towards a functional neuroanatomy of speech perception. Trends in Cognitive Sciences, 4, 131–138.
Hickok, G., & Poeppel, D. (2007). The cortical organization of speech processing. Nature Reviews Neuroscience, 8, 393–402.
Indefrey, P. (2006). A meta-analysis of hemodynamic studies on first and second language processing: Which suggested differences can we trust and what do they mean? Language Learning, 56, 279–304.
Jacquemot, C., Pallier, C., Le Bihan, D., Dehaene, S., & Dupoux, E. (2003). Phonological grammar shapes the auditory cortex: A functional magnetic resonance imaging study. Journal of Neuroscience, 23, 9541–9546.
Jancke, L., Shah, N. J., Posse, S., Grosse-Ryuken, M., & Muller-Gartner, H. W. (1998). Intensity coding of auditory stimuli: an fMRI study. Neuropsychologia, 36, 875–883.
Kaan, E., Wayland, R., & Keil, A. (2013). Changes in oscillatory brain networks after lexical tone training. Brain Sciences, 3, 757–780.
Kartushina, N., & Frauenfelder, U. H. (2014). On the effects of L2 perception and of individual differences in L1 production on L2 pronunciation. Frontiers in Psychology, 5. http://doi.org/10.3389/fpsyg.2014.01246
Kartushina, N., Hervais-Adelman, A., Frauenfelder, U., & Golestani, N. (2015). The effect of production training with visual feedback on the perception and production of foreign speech sounds. Journal of the Acoustical Society of America, 138, 817–832.
Kartushina, N., Hervais-Adelman, A., Frauenfelder, U., & Golestani, N. (unpublished manuscript). Mutual influences between native and non-native vowels in production: evidence from short-term articulatory feedback training. Journal of Phonetics.
Kilian-Huetten, N., Valente, G., Vroomen, J., & Formisano, E. (2011). Auditory cortex encodes the perceptual interpretation of ambiguous sound. Journal of Neuroscience, 31, 1715–1720.
Klein, D., Milner, B., Zatorre, R. J., Meyer, E., & Evans, A. C. (1995). The neural substrates underlying word generation - a bilingual functional-imaging study. Proceedings of the National Academy of Sciences of the United States of America, 92, 2899–2903.
Klein, D., Mok, K., Chen, J. K., & Watkins, K. E. (2014). Age of language learning shapes brain structure: a cortical thickness study of bilingual and monolingual individuals. Brain and Language, 131, 20–24.
Koelsch, S., Schulze, K., Sammler, D., Fritz, T., Mueller, K., & Gruber, O. (2009). Functional architecture of verbal and tonal working memory: An fMRI study. Human Brain Mapping, 30, 859–873.
Kumar, A. U., Hegde, M., & Mayaleela (2010). Perceptual learning of non-native speech contrast and functioning of the olivocochlear bundle. International Journal of Audiology, 49, 488–496.
Lebel, C., & Beaulieu, C. (2009). Lateralization of the arcuate fasciculus from childhood to adulthood and its relation to cognitive abilities in children. Human Brain Mapping, 30, 3563–3573.
Lee, Y.-S., Turkeltaub, P., Granger, R., & Raizada, R. D. S. (2012). Categorical speech processing in Broca's area: An fMRI study using multivariate pattern-based analysis. Journal of Neuroscience, 32, 3942–3948.
Liberman, A. M., & Mattingly, I. G. (1985). The motor theory of speech perception revised. Cognition, 21, 1–36.
Liebenthal, E., Binder, J. R., Spitzer, S. M., Possing, E. T., & Medler, D. A. (2005). Neural substrates of phonemic perception. Cerebral Cortex, 15, 1621–1631.
Liebenthal, E., Desai, R., Ellingson, M. M., Ramachandran, B., Desai, A., & Binder, J. R. (2010). Specialization along the left superior temporal sulcus for auditory categorization. Cerebral Cortex, 20, 2958–2970.
MacKain, K. S., Best, C. T., & Strange, W. (1981). Categorical perception of English /r/ and /l/ by Japanese bilinguals. Applied Psycholinguistics, 2, 369–390.
Mechelli, A., Crinion, J. T., Noppeney, U., O’Doherty, J., Ashburner, J., Frackowiak, R. S., & Price, C. J. (2004). Neurolinguistics: structural plasticity in the bilingual brain. Nature, 431 (7010), 757.
Meister, I. G., Wilson, S. M., Deblieck, C., Wu, A. D., & Iacoboni, M. (2007). The essential role of premotor cortex in speech perception. Current Biology, 17, 1692–1696.
Mesgarani, N., Cheung, C., Johnson, K., & Chang, E. F. (2014). Phonetic feature encoding in human superior temporal gyrus. Science, 343, 1006–1010.
Minagawa-Kawai, Y., Cristia, A., Long, B., Vendelin, I., Hakuno, Y., Dutat, M., Filippin, L., Cabrol, D., & Dupoux, E. (2013). Insights on NIRS sensitivity from a cross-linguistic study on the emergence of phonological grammar. Frontiers in Psychology, 4, 170.
Minagawa-Kawai, Y., Mori, K., Naoi, N., & Kojima, S. (2007). Neural attunement processes in infants during the acquisition of a language-specific phonemic contrast. Journal of Neuroscience, 27, 315–321.
Morillon, B., Liegeois-Chauvel, C., Arnal, L. H., Benar, C.-G., & Giraud, A.-L. (2012). Asymmetric function of theta and gamma activity in syllable processing: an intra-cortical study. Frontiers in Psychology, 3.
Myers, E. B., & Swan, K. (2012). Effects of category learning on neural sensitivity to non-native phonetic categories. Journal of Cognitive Neuroscience, 24, 1695–1708.
Nixon, P., Lazarova, J., Hodinott-Hill, I., Gough, P., & Passingham, R. (2004). The inferior frontal gyrus and phonological processing: An investigation using rTMS. Journal of Cognitive Neuroscience, 16, 289–300.
Obleser, J., Eisner, F., & Kotz, S. A. (2008). Bilateral speech comprehension reflects differential sensitivity to spectral and temporal features. Journal of Neuroscience, 28, 8116–8123.
Obleser, J., Leaver, A. M., Vanmeter, J., & Rauschecker, J. P. (2010). Segregation of vowels and consonants in human auditory cortex: evidence for distributed hierarchical organization. Frontiers in Psychology, 1, 232.
Ortiz-Mantilla, S., Hamalainen, J. A., Musacchia, G., & Benasich, A. A. (2013). Enhancement of gamma oscillations indicates preferential processing of native over foreign phonemic contrasts in infants. Journal of Neuroscience, 33, 18746–18754.
Pallier, C., Bosch, L., & Sebastián-Gallés, N. (1997). A limit on behavioral plasticity in speech perception. Cognition, 64, B9–B17.
Paulesu, E., Frith, C. D., & Frackowiak, R. S. (1993). The neural correlates of the verbal component of working memory. Nature, 362, 342–345.
Paus, T., Perry, D. W., Zatorre, R. J., Worsley, K. J., & Evans, A. C. (1996). Modulation of cerebral blood flow in the human auditory cortex during speech: role of motor-to-sensory discharges. The European Journal of Neuroscience, 8, 2236–2246.
Peelle, J. E. (2012). The hemispheric lateralization of speech processing depends on what “speech” is: a hierarchical perspective. Frontiers in Human Neuroscience, 6.
Petitto, L. A., Berens, M. S., Kovelman, I., Dubins, M. H., Jasinska, K., & Shalinsky, M. (2012). The “Perceptual Wedge Hypothesis” as the basis for bilingual babies’ phonetic processing advantage: New insights from fNIRS brain imaging. Brain and Language, 121, 130–143.
Pierce, L. J., Klein, D., Chen, J. K., Delcenserie, A., & Genesee, F. (2014). Mapping the unconscious maintenance of a lost first language. Proceedings of the National Academy of Sciences of the United States of America, 111, 17314–17319.
Poldrack, R. A., Wagner, A. D., Prull, M. W., Desmond, J. E., Glover, G. H., & Gabrieli, J. D. E. (1999). Functional specialization for semantic and phonological processing in the left inferior prefrontal cortex. Neuroimage, 10, 15–35.
Price, C. J. (2012). A review and synthesis of the first 20 years of PET and fMRI studies of heard speech, spoken language and reading. Neuroimage, 62, 816–847.
Price, C. J., Crinion, J. T., & Macsweeney, M. (2011). A generative model of speech production in Broca's and Wernicke's areas. Frontiers in Psychology, 2, 237.
Pulvermuller, F., Huss, M., Kherif, F., Martin, F., Hauk, O., & Shtyrov, Y. (2006). Motor cortex maps articulatory features of speech sounds. Proceedings of the National Academy of Sciences of the United States of America, 103, 7865–7870.
Raizada, R. D. S., & Poldrack, R. A. (2007). Selective amplification of stimulus differences during categorical processing of speech. Neuron, 56, 726–740.
Raizada, R. D. S., Tsao, F. M., Liu, H. M., & Kuhl, P. K. (2010). Quantifying the adequacy of neural representations for a cross-language phonetic discrimination task: Prediction of individual differences. Cerebral Cortex, 20, 1–12.
Reiterer, S. M., Hu, X., Erb, M., Rota, G., Nardo, D., Grodd, W., Winkler, S., & Ackermann, H. (2011). Individual differences in audio-vocal speech imitation aptitude in late bilinguals: functional neuro-imaging and brain morphology. Frontiers in Psychology, 2, 271.
Ressel, V., Pallier, C., Ventura-Campos, N., Díaz, B., Roessler, A., Avila, C., & Sebastián-Gallés, N. (2012). An effect of bilingualism on the auditory cortex. Journal of Neuroscience, 32, 16597–16601.
Rivera-Gaxiola, M., Silva-Pereyra, J., & Kuhl, P. K. (2005). Brain potentials to native and non-native speech contrasts in 7- and 11-month-old American infants. Developmental Science, 8, 162–172.
Rodriguez-Fornells, A., Cunillera, T., Mestres-Misse, A., & de Diego-Balaguer, R. (2009). Neurophysiological mechanisms involved in language learning in adults. Philosophical Transactions of the Royal Society B: Biological Sciences, 364, 3711–3735.
Rogers, J. C., Mottonen, R., Boyles, R., & Watkins, K. E. (2014). Discrimination of speech and non-speech sounds following theta-burst stimulation of the motor cortex. Frontiers in Psychology, 5, 754.
Santoro, R., Moerel, M., De Martino, F., Goebel, R., Ugurbil, K., Yacoub, E., & Formisano, E. (2014). Encoding of natural sounds at multiple spectral and temporal resolutions in the human auditory cortex. PLoS Computational Biology, 10, e1003412.
Scott, S. K., & Johnsrude, I. S. (2003). The neuroanatomical and functional organization of speech perception. Trends in Neurosciences, 26, 100–107.
Sebastián-Gallés, N., Rodriguez-Fornells, A., de Diego-Balaguer, R., & Díaz, B. (2006). First- and second-language phonological representations in the mental lexicon. Journal of Cognitive Neuroscience, 18, 1277–1291.
Sebastián-Gallés, N., Soriano-Mas, C., Baus, C., Díaz, B., Ressel, V., Pallier, C., Costa, A., & Pujol, J. (2012). Neuroanatomical markers of individual differences in native and non-native vowel perception. Journal of Neurolinguistics, 25, 150–162.
Sebastian, R., Laird, A. R., & Kiran, S. (2011). Meta-analysis of the neural representation of first language and second language. Applied Psycholinguistics, 32, 799–819.
Skoe, E., Chandrasekaran, B., Spitzer, E. R., Wong, P. C. M., & Kraus, N. (2014). Human brainstem plasticity: The interaction of stimulus probability and auditory learning. Neurobiology of Learning and Memory, 109, 82–93.
Smith, E. E., Jonides, J., Marshuetz, C., & Koeppe, R. A. (1998). Components of verbal working memory: Evidence from neuroimaging. Proceedings of the National Academy of Sciences of the United States of America, 95, 876–882.
Stowe, L. A., & Sabourin, L. (2005). Imaging the processing of a second language: Effects of maturation and proficiency on the neural processes involved. International Review of Applied Linguistics in Language Teaching, 43, 329–353.
Wessinger, C. M., VanMeter, J., Tian, B., Van Lare, J., Pekar, J., & Rauschecker, J. P. (2001). Hierarchical organization of the human auditory cortex revealed by functional magnetic resonance imaging. Journal of Cognitive Neuroscience, 13, 1–7.
Wong, F. C. K., Chandrasekaran, B., Garibaldi, K., & Wong, P. C. M. (2011). White matter anisotropy in the ventral language pathway predicts sound-to-word learning success. Journal of Neuroscience, 31, 8780–8785.
Wong, P. C. M., Perrachione, T. K., & Parrish, T. B. (2007). Neural characteristics of successful and less successful speech and word learning in adults. Human Brain Mapping, 28, 995–1006.
Wong, P. C. M., Warrier, C. M., Penhune, V. B., Roy, A. K., Sadehh, A., Parrish, T. B., & Zatorre, R. J. (2008). Volume of left Heschl's gyrus and linguistic pitch learning. Cerebral Cortex, 18, 828–836.
Xi, J., Zhang, L., Shu, H., Zhang, Y., & Li, P. (2010). Categorical perception of lexical tones in Chinese revealed by mismatch negativity. Neuroscience, 170, 223–231.
Zatorre, R. J., Evans, A. C., Meyer, E., & Gjedde, A. (1992). Lateralization of phonetic and pitch discrimination in speech processing. Science, 256, 846–849.
Zhang, L., Xi, J., Xu, G., Shu, H., Wang, X., & Li, P. (2011). Cortical dynamics of acoustic and phonological processing in speech perception. PLoS One, 6.
Zhang, Y., Kuhl, P. K., Imada, T., Iverson, P., Pruitt, J., Stevens, E. B., Kawakatsu, M., Tohkura, Y., & Nemoto, I. (2009). Neural signatures of phonetic learning in adulthood: A magnetoencephalography study. Neuroimage, 46, 226–240.
Zhang, Y., Kuhl, P. K., Imada, T., Kotani, M., & Tohkura, Y. (2005). Effects of language experience: Neural commitment to language-specific auditory patterns. Neuroimage, 26, 703–720.
Zinszer, B. D., Chen, P., Wu, H., Shu, H., & Li, P. (2015). Second language experience modulates neural specialization for first language lexical tones. Journal of Neurolinguistics, 33, 50–66.