1. Introduction
In recent decades the neurolinguistic literature has increasingly focused on the capacity of the human brain to acquire and manage more than one language concurrently. Today, neuroimaging techniques (fMRI and PET), electroencephalography (especially event-related potentials, ERP) and sophisticated cognitive tests provide a wealth of data on bilingualism, against which modern theories can be tested.
Several open questions exist on how bilinguals acquire their lexicon, store and retrieve it, and control language selection by minimizing interference. Some of these questions are related to the second language (L2) proficiency level (which is generally poorer than that of native speakers) and the context in which the learner acquires the language (i.e., the degree of recent exposure to a given language).
A first problem when building a neurolinguistic model is whether the lexical and semantic information is memorized separately for the two languages, or shared in a single store. There is widespread consensus in the current literature on the need to separate the semantic and the lexical levels. Most influential theories (de Groot, Reference de Groot1992; Francis, Reference Francis1999; Kroll & Stewart, Reference Kroll and Stewart1994; Potter, So, Von Eckardt & Feldman, Reference Potter, So, Von Eckardt and Feldman1984; see French & Jacquet, Reference French and Jacquet2004, for a review) assume a single semantic (conceptual) store which is common across languages and clearly separated from lexical aspects. This idea comes from results on semantic priming, showing that a semantic distractor can activate lexical nodes or lemmas in both languages (Caramazza & Brones, Reference Caramazza and Brones1980; Chen & Ng, Reference Chen and Ng1989; Keatley, Spinks & de Gelder, Reference Keatley, Spinks and de Gelder1994; Kirsner, Smith, Lockhart, King & Jain, Reference Kirsner, Smith, Lockhart, King and Jain1984).
A more debated question is whether the lexical aspects of the second language are processed by the same neural structure as those of the first language (L1), or whether the second language exploits a different lexical neural system. Despite long-standing controversy on the topic, the latest theories converge on the idea that the two languages use a single lexical store. Justification for this assumption can be found in several excellent recent review papers (Abutalebi, Reference Abutalebi2008; Abutalebi & Green, Reference Abutalebi and Green2007; French & Jacquet, Reference French and Jacquet2004).
The previous aspects (in particular, the presence of a single lexical store for L1 and L2) raise two additional crucial problems:
(i) How is the second language progressively acquired and how does its proficiency affect the neural organization of L2?
(ii) How can the brain manage two languages simultaneously (for instance when shifting from one language to another) by minimizing interference effects?
The first problem has been tackled by several cognitive and psycholinguistic studies suggesting that organization of the second language changes during the learning period. It has been proposed that in the early stage of language acquisition, when proficiency is low, L2 can access the semantic meaning of words only by being parasitic on L1 words. Conversely, in later stages of learning, L2 words can directly access their semantics without the participation of L1 (Chen & Leung, Reference Chen and Leung1989; Kroll et al., Reference Kroll and Stewart1994; Potter et al., Reference Potter, So, Von Eckardt and Feldman1984). Potter et al. (Reference Potter, So, Von Eckardt and Feldman1984) proposed that bilinguals with low L2 proficiency may realize a direct lexical link from a word in L2 to the corresponding word in L1 (a theory named “word association model”). Conversely, bilinguals with high proficiency mainly connect lexical words in L2 directly with their semantic concepts (“conceptual mediation model”). Kroll and Stewart's influential paper (Kroll & Stewart, Reference Kroll and Stewart1994) proposed a more flexible model (“revised hierarchical model”), according to which translation from L2 to L1 is realized via both concept mediation and direct lexical links. Moreover, the strengths of direct lexical links between L2 and L1 in this model are asymmetrical.
Important information is also provided by recent neuroimaging studies. Some of these suggest that bilinguals with low L2 proficiency engage additional brain activity (mostly in prefrontal areas) compared to L1, whereas activation is similar in the two languages when L2 proficiency becomes comparable to L1 (see Abutalebi, Reference Abutalebi2008; Abutalebi & Green, Reference Abutalebi and Green2007 for recent exhaustive reviews). These results support Green's convergence hypothesis (Green, Reference Green, van Hout, Hulk, Kuiken and Towell2003), according to which differences in neural organization between L1 and L2 disappear as proficiency increases.
Although the previous conceptual models provide important suggestions on how the lexical-semantic system can be organized, an understanding of the bilingual brain must also address the mechanisms controlling this network. Indeed, neuroimaging studies show that the main differences in brain activation between tasks in L1 and in a low-proficiency L2 are actually found outside the classical language areas (for instance, in areas involved in problem solving such as the prefrontal cortex and the anterior cingulate cortex), suggesting that these areas are mainly engaged in language control rather than in lexical processing per se. Language control in bilinguals is probably realized through a competition between L1 and L2. Green (Reference Green1998), in a conceptual model named the inhibitory control (IC) model, suggested that this competition is resolved by inhibiting all non-target competitors at the lemma level. Although questioned by some experiments (e.g., Finkbeiner, Almeida, Janssen & Caramazza, Reference Finkbeiner, Almeida, Janssen and Caramazza2006), the idea that inhibition plays an essential role in language selection is supported by many recent behavioral and neuroimaging studies (Abutalebi & Green, Reference Abutalebi and Green2007; Christoffels, Firk & Schiller, Reference Christoffels, Firk and Schiller2007; Kroll, Bobb & Wodniecka, Reference Kroll, Bobb and Wodniecka2006).
The previous elements provide a rich conceptual framework for the study of the lexical-semantic aspects of bilingualism that can be further tested against upcoming neuroimaging and behavioral data. An increasing role in this field is being played by computational connectionist models. Brain-inspired mathematical models making use of distributed neural networks can offer important benefits for cognitive neuroscience in general, and for neurolinguistic studies in particular. Such models force conceptual theories to be stated in rigorous terms; they can emulate brain development through realistic learning rules; they can check the feasibility of existing theories against available data; and they can generate testable predictions that drive the design of future experiments. Finally, discrepancies between the model and real data can indicate aspects that need to be modified or removed in future theories.
Indeed, several connectionist models of bilingualism based on distributed neural networks have been developed in the past thirty years, as summarized in some review papers (French & Jacquet, Reference French and Jacquet2004; Thomas & van Heuven, Reference Thomas, van Heuven, Kroll and de Groot2005) (a more detailed analysis of some of these models is postponed to the "Discussion" section below).
Most of the existing models, however, are aimed at investigating the phonological aspects of bilingualism. Some simulate how words in two languages can be segregated into clusters based on phonetic differences (Li & Farkas, Reference Li, Farkas and Altarriba2002) or on the statistics of word association (French, Reference French1998); others emulate interference effects between phonologically similar words (Dijkstra & van Heuven, Reference Dijkstra, van Heuven, Grainger and Jacobs1998, Reference Dijkstra and van Heuven2002). The model by Li and Farkas does also consider semantic aspects, including two distinct self-organizing maps, one for phonological information and the other for semantic information. Two more recent models developed by Miikkulainen and Kiran (Reference Miikkulainen and Kiran2009) and Zhao and Li (Reference Zhao and Li2010) also lay emphasis on the semantic aspects of bilingualism (i.e., on how connections develop between the conceptual store and the lexical one). A main limitation of these models, however, is that they do not provide a clear explanation of how a competition between L1 and L2 develops during acquisition of the second language.
In recent years, we developed an original model (Cuppini, Magosso & Ursino, Reference Cuppini, Magosso and Ursino2009; Ursino, Magosso & Cuppini, Reference Ursino, Magosso and Cuppini2009) to explore several important issues of semantic memory, emphasizing the possible topological organization of the neural units involved, their reciprocal connections and synapse learning mechanisms. The model assumes that objects are represented via different multimodal features, encoded through a distributed representation among different cortical areas: each area is devoted to a specific feature. Features are topologically organized and linked together by implementing two high-level rules: similarity and previous knowledge. Furthermore, the model can retrieve multiple objects simultaneously through the synchronized activity of neural oscillators in the gamma-band (Bertrand & Tallon-Baudry, Reference Bertrand and Tallon-Baudry2000). Finally, lexical aspects are represented in a separate cortical area, and linked with the object semantics via a Hebbian mechanism.
Previous publications (Cuppini et al., Reference Cuppini, Magosso and Ursino2009; Cuppini, Magosso & Ursino, Reference Cuppini, Magosso, Ursino and Perusich2010) used the model to study the lexical-semantic aspects of a single language. Naturally, the same theoretical structure, with some extensions and with the inclusion of additional mechanisms, can also be used to explore some aspects of bilingualism. To this end, the monolingual model should be broadened by considering the following additional aspects: how words in the second language are acquired by being parasitic on a previously existing language; how words in the two languages can compete via inhibitory mechanisms; how the second language representation changes with proficiency; how inhibitory mechanisms can be used to switch from L1 to L2.
In a recent paper (Ursino, Cuppini & Magosso, Reference Ursino, Cuppini and Magosso2010) we presented a preliminary simple model of bilingualism, which assumes an inhibitory competition between L1 and L2. In the following, this will be named the Basal Model. Since this model exhibits some evident drawbacks (especially at low L2 proficiency), the present paper implements a second model (named the Extended Model) assuming the existence of additional direct links between L1 and L2 units in the Lexical Layer, which can be excitatory or inhibitory. The performance of the two models is compared, with emphasis on the second model.
Firstly, the model is described in qualitative terms. Simulation results are then presented to show how the model responds to L2 word recognition tasks, L1 word recognition tasks, or word production tasks at different proficiency levels. An example of language switching is also provided in the case of a high-proficiency L2.
After a training phase, the model can be used to provide possible answers to the following questions: How are words in the two languages acquired? Is L2 acquired by exploiting existing L1 words or rather through a direct link to semantic concepts? What competition may develop between words in the two languages having the same semantics? Does this competition vary with the proficiency level? Can this competition provide indications to explain semantic interferences among words? Do model results support the “convergence hypothesis” and can they help to explain neuroimaging results (at least approximately)? Is it possible to distinguish between direct (i.e., automatic) competitive mechanisms and top–down inhibitory strategies, as hypothesized by some authors (Rodriguez-Fornells, De Diego Balaguer & Münte, Reference Rodriguez-Fornells, De Diego Balaguer and Münte2006)?
2. Method
The model is based on the idea that the semantic and lexical aspects of languages are stored in two distinct areas (de Groot, Reference de Groot1992; Francis, Reference Francis1999; French & Jacquet, Reference French and Jacquet2004; Kroll et al., Reference Kroll and Stewart1994; Potter et al., Reference Potter, So, Von Eckardt and Feldman1984). Hence, the model consists of two main networks: the first (named "Feature Network") is devoted to object representation, realized as a collection of sensory-motor features; the second (named "Lexical Network") is devoted to the representation of word forms or lemmas, as they derive from an upstream processing of phonemes or letters, regardless of which language they belong to. The two networks communicate via direct trained synapses. The Lexical Layer also receives a signal from a "Decision Network", which recognizes whether a correct object is present in the Feature Areas and prevents a word from being evoked by a misleading representation. Finally, a "Competition Area", a network of inhibitory interneurons, is interposed among the lexical units. This area is explicitly devoted to implementing a winner-takes-all (WTA) dynamics between words which may be co-active in different languages. WTA means that the neural unit with the stronger activation inhibits the others, so that only a single unit is active in steady-state conditions (i.e., when the transient response after receiving the input has died out).
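As a structural reference for the description that follows, the main components can be collected in a single container. The sketch below is in Python; the names are ours, the arrays are left unset, and the numerical values anticipate figures given later in this section.

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class BilingualModelSkeleton:
    """Structural sketch of the model components described in this section.
    Names are illustrative and array contents are left unset."""
    n_feature_areas: int = 4            # F Feature Areas of gamma-band oscillators
    lexical_shape: tuple = (20, 20)     # word-form units in the Lexical Network
    WF: np.ndarray = None               # trained synapses, Feature Areas -> Lexical Layer
    WL: np.ndarray = None               # trained synapses, Lexical Layer -> Feature Areas
    W_comp: np.ndarray = None           # Competition Area (interneurons) -> Lexical Layer
    L_direct: np.ndarray = None         # direct lexical links (Extended Model only)
    decision_gate: float = 1.0          # inhibition from the Decision Network (1 = closed)
```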
This work considers two variants of the model. In a first version (hereinafter named the Basal Model), units in the Lexical Layer are not directly connected to other units in the same area, but connections occur only indirectly via the semantic and competitive networks (Figure 1A). In the second version of the model (the Extended Model), units in the Lexical Layer can also be linked via direct synapses that can be excitatory or inhibitory and are the result of Hebbian learning (Figure 1B). In other words, the Extended Model includes an additional synaptic mechanism in the Lexical Layer with respect to the Basal Model.
The Basal Model is first described in qualitative terms, then the Extended Model is presented emphasizing its differences with respect to the Basal Model. Equations and parameter numerical values are given in Supplementary Materials online accompanying this paper on the journal's webpage accessible via http://journals.cambridge.org/BIL.
A schematic description of the two model variants is presented in Figures 1A and 1B, before and after learning a word in L2.
2.1 The Basal Model
The Feature Network
As described in previous papers (Cuppini et al., Reference Cuppini, Magosso, Ursino and Perusich2010, Reference Cuppini, Magosso and Ursino2009; Ursino et al., Reference Ursino, Magosso and Cuppini2009), this network is composed of F distinct cortical areas (in the following examples, F = 4), each devoted to the representation of a specific attribute or feature of the object (for instance, one may represent colours, another shapes or actions). We assume that each feature has been extracted by a pre-processing stage in the neocortex, which elaborates sensory-motor information. In the present work, only schematic objects are used (i.e., these simulations are "proofs of concept").
Each unit in the Feature Areas consists of an oscillator (see also Ursino, La Cara & Sarti, Reference Ursino, Cara and Sarti2003): this means that the unit is silent if it does not receive enough excitation, but oscillates in the γ-frequency band (30–50 Hz) if excited by a sufficient input. This oscillator is realized via the feedback connection of two neural groups (one excitatory and one inhibitory) (Wilson & Cowan, Reference Wilson and Cowan1972), an arrangement that mimics that encountered in the cortical column. Examples of oscillations can be found in subsequent figures of this work (Figures 2, 3 and 5–12).
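The oscillatory behaviour of a single feature unit can be illustrated with a minimal Wilson–Cowan-type sketch in Python. The coupling weights and time constants below are illustrative placeholders (the actual values are given in the Supplementary Materials) and would need tuning to place the limit cycle in the 30–50 Hz band.

```python
import numpy as np

def sigm(v, gain=20.0, thresh=0.5):
    """Static sigmoidal activation, bounded between 0 and 1."""
    return 1.0 / (1.0 + np.exp(-gain * (v - thresh)))

def oscillator_step(x, y, inp, dt=1e-4,
                    tau_x=0.003, tau_y=0.006,       # time constants (s), illustrative
                    w_xx=2.0, w_xy=2.5, w_yx=2.5):  # coupling weights, illustrative
    """One Euler step of an excitatory (x) / inhibitory (y) pair in feedback.
    With no input the unit stays silent; with a sufficiently strong input it
    settles onto a limit cycle whose frequency, for suitable parameters,
    falls in the gamma band."""
    dx = (-x + sigm(w_xx * x - w_xy * y + inp)) / tau_x
    dy = (-y + sigm(w_yx * x)) / tau_y
    return x + dt * dx, y + dt * dy

# 200 ms of simulation, with the external input switched on after 50 ms.
x = y = 0.0
trace = []
for step in range(2000):
    x, y = oscillator_step(x, y, inp=1.0 if step >= 500 else 0.0)
    trace.append(x)
```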
The oscillators are arranged in a bi-dimensional lattice. This structure more closely resembles the organization of the cerebral cortex than a one-dimensional arrangement would. In particular, a two-dimensional map is more suitable for representing the columnar organization of the cortex, where features may vary both within a column and from one column to another (Tanaka, Reference Tanaka2003). Furthermore, a two-dimensional map encodes a richer and more flexible description of similarity, in which a feature has several neighbours.
The Feature Network implements two main cognitive principles (similarity and previous knowledge). First, all features in the same area are topologically organized, i.e., spatially nearby oscillators code for similar features (a condition commonly encountered in the cortex; Rolls & Treves, Reference Rolls and Treves1998). Accordingly, oscillators in the same area are connected via lateral excitatory and inhibitory synapses with a classical "Mexican hat" arrangement. This means that two proximal units coding for similar features tend to excite each other, whereas units coding for dissimilar features inhibit each other. As a consequence, presentation of a given feature to the network activates a "bubble" of units located around the unit encoding that feature. Second, neural oscillators belonging to different areas can be connected via excitatory synapses after training, on the basis of frequent previous co-occurrence. These synapses are initially set at zero, but may assume a positive value through a learning phase.
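For illustration, the "Mexican hat" lateral connectivity within one Feature Area can be written as a difference of Gaussians over the bi-dimensional lattice; all parameter values below are illustrative.

```python
import numpy as np

def mexican_hat_weights(n=20, sigma_ex=1.0, sigma_in=3.0, k_ex=1.0, k_in=0.5):
    """Lateral weights within one Feature Area arranged on an n x n lattice:
    a difference of Gaussians, so that nearby units (similar features) excite
    each other while more distant units (dissimilar features) inhibit each
    other."""
    coords = np.array([(i, j) for i in range(n) for j in range(n)], dtype=float)
    d2 = ((coords[:, None, :] - coords[None, :, :]) ** 2).sum(-1)
    W = k_ex * np.exp(-d2 / (2 * sigma_ex**2)) - k_in * np.exp(-d2 / (2 * sigma_in**2))
    np.fill_diagonal(W, 0.0)    # no self-connection
    return W
```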
Training of an object in the semantic network is achieved by presenting all its features simultaneously and reinforcing the excitatory synapses among the different features via a time-dependent Hebbian rule used in a previous work (Ursino et al., Reference Ursino, Magosso and Cuppini2009). This rule assumes that the synapses are reinforced on the basis of the co-occurrence of the present activity in the post-synaptic neuron and the average activity of the pre-synaptic neuron in the previous 10 ms (Markram, Lubke, Frotscher & Sakmann, Reference Markram, Lubke, Frotscher and Sakmann1997). At the end of this phase, enough information is stored inside the network to allow object recognition even in the presence of incomplete or moderately altered properties. Moreover, several objects can be retrieved simultaneously, oscillating in time division within the gamma band. Details are described in Ursino et al. (Reference Ursino, Magosso and Cuppini2009).
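The flavour of this training rule is captured by the following sketch, in which the weight change is driven by the product of the present post-synaptic activity and the pre-synaptic activity averaged over the preceding 10 ms; the saturation factor anticipates the mechanism described below for the lexico-semantic links. All values are illustrative.

```python
import numpy as np

def hebbian_update(W, post_act, pre_history, dt=1e-4, beta=0.01, w_max=1.0):
    """Time-dependent Hebbian potentiation (sketch): the change is proportional
    to the present post-synaptic activity and to the pre-synaptic activity
    averaged over the preceding 10 ms.  The factor (w_max - W) makes learning
    slow down as a synapse approaches saturation."""
    n = int(round(0.010 / dt))                     # samples spanning the last 10 ms
    pre_mean = np.asarray(pre_history)[-n:].mean(axis=0)
    return W + beta * (w_max - W) * np.outer(post_act, pre_mean)
```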
The Lexical Network
This network is devoted to the representation of lexical aspects. Each unit represents a specific "word form" or "lemma". The present model codes a word form by a single unit at a given position in the Lexical Network (similarly, a feature is coded by a single unit in the corresponding Feature Area). At present there is no relationship between the position of a word form in the Lexical Area and its phonetic representation; hence the position is chosen arbitrarily. When a neuron in the Lexical Area receives enough excitation to jump from the silent to the active state, we say that the corresponding word has been recognized. Excitation, in turn, can arrive from an external input (in which case we assume that the subject is listening to the word or reading it) or from the semantic network (in which case the subject is perceiving the object and evokes the correct word). This paper describes external inputs as an array of 20 × 20 scalar values (one input per lexical unit). We assigned the value 0 to the input when the word is not heard (or read) and the value +1 when the subject is listening to the word (or reading it). This value is sufficient to drive the lexical unit from the silent to the active state. A pre-processing network linking the Lexical Network with realistic external inputs could be incorporated in future work, taking inspiration from previous models (see e.g., Dijkstra & van Heuven, Reference Dijkstra, van Heuven, Grainger and Jacobs1998, Reference Dijkstra and van Heuven2002; French, Reference French1998; Li & Farkas, Reference Li, Farkas and Altarriba2002; Thomas & van Heuven, Reference Thomas, van Heuven, Kroll and de Groot2005; Zhao & Li, Reference Zhao and Li2007).
The activity of each unit in the Lexical Area is computed from its input by means of a filter which cuts off frequencies above 100–150 Hz (to simulate the temporal response of neurons to a sudden input) and a sigmoidal characteristic ranging between 0 and 1 (0 means that the unit is silent, and 1 conventionally means maximal activation).
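A minimal sketch of this input–output behaviour, together with the external input coding described above, is the following; the gain, threshold, time constant and word position are illustrative.

```python
import numpy as np

def lexical_unit_step(state, u, dt=1e-4, tau=1.3e-3, gain=30.0, thresh=0.4):
    """One Euler step of a lexical unit: first-order low-pass filtering of the
    total input u (cutoff ~ 1/(2*pi*tau), roughly in the 100-150 Hz range for
    the value chosen here), followed by a steep sigmoid bounded between 0
    (silent) and 1 (maximal activation)."""
    state = state + dt * (u - state) / tau
    activity = 1.0 / (1.0 + np.exp(-gain * (state - thresh)))
    return state, activity

# External input coding: a 20 x 20 grid with a single word being heard/read.
lex_input = np.zeros((20, 20))
lex_input[12, 3] = 1.0      # hypothetical position of the word in the Lexical Network
```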
For the sake of simplicity, the present study does not consider a topological organization for this network (and so we cannot simulate interference between phonologically similar words). This aspect can be the subject of future extensions.
Lexico-semantic links
The conceptual and the lexical levels are reciprocally connected by the formation of long-range excitatory synapses between co-active units in the two networks. These connections are created through a second learning phase, during which a “word” and the corresponding conceptual representation are simultaneously given to the model.
Synapses from the Feature Areas to the Lexical Layer (WF ij,hk in Figure 1) and from lexical units to the feature layer (WL ij,hk) are initially set at zero and increased with a Hebbian rule, based on the co-activation of the pre-synaptic and post-synaptic units. Moreover, we assumed that inter-area synapses cannot exceed a maximum saturation value. This is realized by assuming that the learning factors are progressively reduced to zero as synapses approach saturation. After training, a "word" and its specific attributes are combined into an integrated lexical-semantic representation of the object that can be activated indifferently by language or sensory-motor inputs.
The decision network
We assumed that, during an object naming task, a word in the Lexical Network can be activated from information in the semantic network only if all features of the object are recovered and correctly segmented from those of other objects (i.e., we assume that generalization from incomplete or noisy inputs occurs entirely in the semantic net). In the case of incorrect object recognition or wrong segmentation, the corresponding word must not be evoked. To deal with this problem, we used a "decision network", developed in a previous paper (Ursino et al., Reference Ursino, Magosso and Cuppini2009), which implements a top-down strategy. This network receives inputs from the Feature Areas and verifies that there is one and only one "activation bubble" in each area. To this end, the network computes the total activity in each area at a given instant and checks whether this activity lies between a minimum and a maximum threshold. If activity is too low, no activation bubble is detected in that area, i.e., the object lacks some features. If activity is too high, too many activation bubbles are evoked simultaneously in the area, i.e., the objects have not been correctly segmented. Furthermore, the previous condition must hold over a certain time interval to ensure the continuity of object perception.
Finally, we assume that the “decision network” sends sufficient inhibition to all units of the Lexical Layer to keep them silent as long as no object is recognized in the Feature Areas. This inhibition is then withdrawn as soon as a correct object is present, and the Lexical Layer can be activated by the properties in the Feature Network.
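The gating performed by the decision network can be sketched as follows; the thresholds and hold time are illustrative placeholders, and the inhibition is represented by a single scalar.

```python
import numpy as np

def decision_step(area_activities, t_held, dt=1e-4,
                  th_min=2.0, th_max=8.0, hold_time=0.02):
    """Decision-network sketch: recognition requires that each Feature Area
    contains exactly one activation bubble, detected as a total activity lying
    between two thresholds, and that this condition persists for `hold_time`
    seconds.  While recognition fails, a strong inhibition (here simply 1.0)
    is sent to every lexical unit; it is withdrawn once the condition has held
    long enough."""
    ok = all(th_min < float(np.sum(a)) < th_max for a in area_activities)
    t_held = t_held + dt if ok else 0.0
    lexical_inhibition = 0.0 if t_held >= hold_time else 1.0
    return t_held, lexical_inhibition
```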
The competitive mechanism
The previous model works satisfactorily for monolingual subjects. In bilinguals, the Lexical Layer can store words belonging to different languages but referring to the same concept. (A case of two words with exactly the same semantic meaning might also occur within one language, albeit rarely. We suggest that it might be treated as in bilingualism.)
To ensure that only one word at a time is activated by the conceptual representation, units in the Lexical Layer interact via a competitive mechanism. The role of this mechanism is to solve possible conflicts when two or more words refer to the same concept and tend to be simultaneously active. The two words compete with one another to emerge, but usually only one can win the competition.
To this end, we implemented a layer of inhibitory interneurons with the same number of units as the Lexical Layer. They are described in the same way as the lexical units (i.e., by means of a low-pass filter and a sigmoidal relationship). Each element in this "inhibitory Competition Area" receives an excitatory synapse from just one element in the Lexical Layer (say, its "master") and tries to inhibit all other units in the Lexical Layer competing with the master.
The synapses from the Competition Area to the Lexical Layer are the result of a training phase (see the sub-section “Training a bilingual” below). Finally, the Competition Area receives an additional input from top-down influences. This may be used to switch from one word to another, causing the inhibition of the non-target word.
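A sketch of the resulting competition dynamics is given below; the interneurons share the low-pass plus sigmoid description of the lexical units, and the weights W_comp are those learned during bilingual training. Names and parameter values are illustrative.

```python
import numpy as np

def competition_step(lex_act, comp_state, W_comp, topdown=0.0,
                     dt=1e-4, tau=1.3e-3, gain=30.0, thresh=0.4):
    """Competition Area sketch: interneuron i is driven one-to-one by its
    'master' lexical unit i (plus an optional top-down switching input) and,
    through the trained weights W_comp[i, j], inhibits the competing lexical
    units j."""
    comp_state = comp_state + dt * (lex_act + topdown - comp_state) / tau
    comp_act = 1.0 / (1.0 + np.exp(-gain * (comp_state - thresh)))
    inhibition_on_lexical = comp_act @ W_comp    # to be subtracted from the lexical input
    return comp_state, comp_act, inhibition_on_lexical
```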
Training a bilingual
The third training phase consists of learning L2 words, assuming that L1 words have already been learned with high proficiency (i.e., synapses between L1 words and the corresponding conceptual representations are close to saturation). To simulate L2 learning, a unit coding for a word in the second language is activated by its external input together with its translation in L1 (i.e., two simultaneous excitatory inputs are provided to the Lexical Network). The L1 word activates the object's conceptual representation in the semantic store by means of its connections with the Feature Areas. Hence, the following units are co-active in the model: the L1 word and the L2 word in the Lexical Layer; the corresponding inhibitory units in the Competition Area; and the oscillators describing the object semantics in the Feature Areas.
Two kinds of synaptic changes occur, considering units that are simultaneously active and applying the learning rules:
(i) Excitatory synapses linking the Feature Areas and the L2 word (i.e., synapses WF ij,hk and WL ij,hk) are created using the same Hebbian mechanism implemented in language L1. The effect of these synapses is that after prolonged learning the L2 word can evoke its conceptual representation per se, without the participation of L1. Moreover, the conceptual representation may evoke the L2 word.
(ii) Inhibitory synapses are created from the interneurons in the Competition Area to the units in the Lexical Layer. At the beginning of training these synapses are set at zero. After bilingual training, an interneuron sends inhibition to the other words (generally in the other language) which were co-active with its master. This implements a competition mechanism between L1 and L2 words with the same semantics.
At the end of the L2-training, a unit in the Lexical Layer receives its conceptual input from the Feature Networks, and an inhibitory input mediated by the Competition Area from the other words referring to the same object representation.
A simple schema of L2 learning in the Basal Model is depicted in Figure 1A.
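As a purely illustrative distillation of this training step, the two kinds of updates can be applied to hand-set steady-state activities; the sizes, word positions and values below are hypothetical, and the network dynamics are replaced by fixed activity patterns.

```python
import numpy as np

n_lex, n_feat = 400, 1600                    # hypothetical sizes (flattened grids)
l1_idx, l2_idx = 107, 243                    # hypothetical positions of the two words

# Co-active units at the end of one L2-learning trial: both words are presented,
# the L1 word has evoked the object, so the feature bubbles, both lexical units
# and both interneurons are simultaneously active.
lex_act = np.zeros(n_lex);  lex_act[[l1_idx, l2_idx]] = 1.0
comp_act = lex_act.copy()                    # each interneuron mirrors its master
feat_act = np.zeros(n_feat); feat_act[:4] = 1.0   # stand-in for four feature bubbles

beta, beta_comp, w_max = 0.01, 0.001, 1.0
WF = np.zeros((n_lex, n_feat))               # Feature Areas -> Lexical Layer
W_comp = np.zeros((n_lex, n_lex))            # Competition Area -> Lexical Layer

# (i) Hebbian growth of the lexico-semantic synapses of the co-active words
WF += beta * (w_max - WF) * np.outer(lex_act, feat_act)
# (ii) conflict-driven growth of the inhibitory control synapses (no self-inhibition)
dW = beta_comp * (w_max - W_comp) * np.outer(comp_act, lex_act)
np.fill_diagonal(dW, 0.0)
W_comp += dW
```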
2.2 The Extended Model: Differences with respect to the Basal Model
The Extended Model differs from the Basal Model due to the presence of direct connections among units in the Lexical Layer. All other aspects (including the learning rules) are identical.
In line with previous conceptual models (de Groot, Reference de Groot1992; Kroll et al., Reference Kroll and Stewart1994; Potter et al., Reference Potter, So, Von Eckardt and Feldman1984), we assume that direct L1–L2 lexical links are also learned with experience. Furthermore, since these links may be asymmetrical (Kroll et al., Reference Kroll and Stewart1994), we allow these connections to be either excitatory or inhibitory.
These connections are initially set at zero and are subject to a learning mechanism during the learning of languages L1 and L2, when units in the Lexical Layer are active. The difference compared with the previous formulation of the Hebbian rule is that we now allow both reinforcement (positive change) and weakening (negative change) of these synapses, whereas only reinforcement was considered for the other synapses in the network. In particular, we assumed that the weight of the connection between two lexical units changes whenever the pre-synaptic neuron is active; the sign of the change (positive or negative) depends on whether the activity of the post-synaptic neuron is above or below a given threshold. We must distinguish two different phases during learning of these synapses:
(i) During learning of language L1, just one lexical element is active, together with the corresponding object representation. As a consequence, at the end of the L1 learning, inhibitory synapses are formed within the Lexical Layer: these emerge from the lexical units in L1 and target all other units in the Lexical Layer. The presence of these inhibitory synapses explains why, during the early phase of L2 training, synapses between L1 and L2 words are asymmetrical.
(ii) During L2 learning, two units are co-active within the Lexical Layer: the L1 word and the new L2 word. Hence, the Hebbian rule predicts that excitatory synapses between these two words are reinforced. As a consequence, a positive synapse sprouts from the L2 word to the L1 word, which progressively increases to saturation at high L2 proficiency. Furthermore, the link from the L1 word to the L2 word becomes less inhibitory and, at high L2 proficiency, may even become positive (due to a prevalence of excitation over inhibition). At the same time, inhibitory synapses are formed between L2 words and all other (non-active) words in the Lexical Network.
In general, we can say that at low proficiency the L1 word directly inhibits the L2 word, whereas the L2 word sends excitation to the L1 word. At high proficiency, the synapse from the L1 word to the L2 word is converted into a moderate excitation, while the excitatory link from the L2 word to the L1 word reaches saturation. The pattern of direct synapses is asymmetrical, as predicted by the "revised hierarchical model" (Kroll et al., Reference Kroll and Stewart1994). Only at very high proficiency do the synapses tend to become symmetrical, as predicted by the "convergence hypothesis" (Green, Reference Green, van Hout, Hulk, Kuiken and Towell2003).
Of course, besides these direct links in the Lexical Layer, the Extended Model maintains a competition between L1 and L2 words, realized through the Competition Area and progressively reinforced with proficiency. We assumed that the competition mechanism develops more slowly than the direct synaptic links, but exhibits a stronger asymptotic behaviour.
A simplified schema of learning in the Extended Model is illustrated in Figure 1B.
2.3 Some considerations on the training rules adopted
The present work used different rules to train the connections among units. These choices were adopted in part following classical knowledge on synaptic plasticity (long-term potentiation, LTP, and long-term depression, LTD) and in part following recent ideas on the role of the anterior cingulate cortex in "conflict detection" (see Botvinick, Braver, Barch, Carter & Cohen, Reference Botvinick, Braver, Barch, Carter and Cohen2001), the latter especially for the competition mechanism. These rules require ad hoc validation and may represent testable aspects in future studies.
Hence, it is interesting to justify these differences in training rules and discuss their implications.
Synapses within the Lexical Area
The training rule for these synapses (see Eq. 24 in the supplementary material) was written assuming that the connections between two units in the Lexical Network make use of both excitatory and inhibitory synapses (hence the total connection can change its sign) and that synapses are subject to long-term potentiation (LTP) and long-term depression (LTD) depending on the values of the pre-synaptic and post-synaptic activities. To realize LTP and LTD within the same rule, neural computation textbooks (see Dayan & Abbott, Reference Dayan and Abbott2001; Trappenberg, Reference Trappenberg2002) suggest comparing neuron activity with a threshold (this rule is often named the "covariance Hebb rule" (Dayan & Abbott, Reference Dayan and Abbott2001) when the threshold is close to the neuron's average activity). Accordingly, we assumed that excitation is reinforced (and inhibition reduced) when both the pre-synaptic activity and the post-synaptic activity are above threshold (LTP). Conversely, excitation is weakened (and inhibition reinforced) when the pre-synaptic activity is above a threshold and the post-synaptic activity is below a threshold (homosynaptic LTD). This rule may result in a different sign (positive or negative) of the overall synaptic connection between the two lexical units, depending on whether excitation or inhibition prevails. It is worth noting that we did not assume heterosynaptic LTD (i.e., depression when the pre-synaptic activity is off and the post-synaptic activity is on) since inclusion of this mechanism would lead to symmetrical synapses (as in traditional autoassociative nets, like the Hopfield model; Hopfield, Reference Hopfield1984), whereas we wish to obtain an asymmetrical pattern of synapses as in the revised hierarchical model (Kroll et al., Reference Kroll and Stewart1994).
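A compact sketch of this rule is the following (cf. Eq. 24 of the Supplementary Materials; the threshold, rate and bounds shown here are illustrative):

```python
import numpy as np

def lexical_link_update(L, pre_act, post_act, beta=0.01,
                        theta_post=0.5, w_min=-1.0, w_max=1.0):
    """Covariance-like rule for the direct links within the Lexical Layer.
    A change occurs only when the pre-synaptic unit is active: the link is
    potentiated if the post-synaptic unit is above threshold (LTP) and
    weakened, i.e., made more inhibitory, if it is below threshold
    (homosynaptic LTD).  No change occurs when the pre-synaptic unit is silent
    (no heterosynaptic LTD), so the learned pattern can remain asymmetrical.
    L[i, j] is the net weight from lexical unit j to lexical unit i."""
    dL = beta * np.outer(post_act - theta_post, pre_act)   # sign set by the post-synaptic side
    return np.clip(L + dL, w_min, w_max)
```

With this formulation, L1-only training (pre-synaptic L1 unit active, post-synaptic L2 unit silent) drives the L1-to-L2 link negative while leaving the L2-to-L1 link at zero, reproducing the asymmetry described in the previous section.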
Training of synapses between the lexical and the semantic networks
In order to train these synapses, we made use only of LTP because during our training periods a word and its semantic representation are always active together. Hence, only potentiation of excitatory synapses can occur.
Training connections from the control area to the lexical units
These connections are inhibitory, but in the model they follow a different rule: they are reinforced (i.e., become more inhibitory) when both the pre-synaptic interneuron in the control area and the post-synaptic neuron in the Lexical Area are active. Depression was not incorporated. How can this different choice be justified?
Our basic idea is that connections from the Competition Layer to the Lexical Layer reflect a mechanism for conflict resolution, which may involve different brain areas and is not merely a synaptic change. Some authors (see, among others, Botvinick et al., Reference Botvinick, Braver, Barch, Carter and Cohen2001), starting from data on the anterior cingulate cortex (ACC), assumed that the ACC is specialized in conflict detection and produces a cognitive control signal. In this model, conflict is computed as the product of activity in pairs of co-active units (Botvinick et al., Reference Botvinick, Braver, Barch, Carter and Cohen2001; Yeung, Botvinick & Cohen, Reference Yeung, Botvinick and Cohen2004). When two units are simultaneously active, conflict is detected and a control signal is reinforced. Our rule basically follows the same strategy, implementing a sort of "conflict detection model". Moreover, our model assumes that the control signal is inhibitory, as often postulated in the literature on bilingualism (Green, Reference Green1998; Kroll et al., Reference Kroll, Bobb and Wodniecka2006).
In conclusion, our model does not detect conflict when there is only one active word in the Lexical Area during training; hence there is no reinforcement of the control signal. Conversely, the occurrence of a conflict (the presence of both L1 and L2 words during training) leads to the development of an inhibitory control signal; since this control is inhibitory, it implements a sort of winner-takes-all dynamics between the two words.
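In terms of the rules above, this conflict-driven reinforcement can be sketched as follows; the indexing assumes a one-to-one correspondence between interneurons and lexical units, and the learning rates are those reported in the next section.

```python
import numpy as np

def competition_weight_update(W_comp, comp_act, lex_act, beta=0.001, w_max=1.0):
    """Conflict-driven reinforcement of the control weights from the
    Competition Area to the Lexical Layer (sketch).  The weight from
    interneuron i to lexical unit j grows in proportion to the product of
    their activities (the conflict signal) and is never depressed; the
    diagonal is kept at zero because an interneuron does not inhibit its own
    master word.  The slow rate (beta = 0.001, versus 0.01 for the other
    synapses) implements the assumption that this control mechanism matures
    later than the direct lexical links."""
    dW = beta * (w_max - W_comp) * np.outer(comp_act, lex_act)
    np.fill_diagonal(dW, 0.0)
    return W_comp + dW
```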
2.4 Parameter assignments
An important aspect of neurocomputational models is parameter assignment. Several parameters in our model do not have a clear neurophysiological counterpart (for instance, individual units in the model do not represent single neurons, but groups of co-active neurons coding for the same information). Hence, several parameters must be given “a posteriori” on the basis of the behavior obtained. These parameter choices represent assumptions requiring further validation on the basis of future neurophysiological or behavioral data.
Training rates
The main assumption is that the synapses implementing the control mechanism from the competition to the lexical area are trained more slowly (β = 0.001) than those within the lexical area (β = 0.01) and those linking semantic and lexical units (β = 0.01). The effects of this choice will be evident in Figure 4 below. It is worth noting that this difference in learning rates cannot easily be demonstrated in vivo; hence it represents a fundamental "working hypothesis" of this study, which finds a justification only "a posteriori", on the basis of the obtained results. Indeed, this hypothesis leads to a pattern of synapses in agreement with several experimental findings. In particular:
(i) Studies on cross-language priming provide straightforward evidence that low-proficiency bilinguals present a clear asymmetric pattern of priming effects, suggesting that cross-lingual lexical connections actually develop at very early stages of L2 acquisition (Dimitropoulou, Duñabeitia & Carreiras, Reference Dimitropoulou, Duñabeitia and Carreiras2011; Keatley, Spinks & de Gelder, Reference Keatley, Spinks and de Gelder1994; Kroll et al., Reference Kroll and Stewart1994). The rationale is that, at the beginning of training, a word in L2 must activate the corresponding L1 word to access its semantics. Only after the creation of its own semantic links should L2 inhibit L1.
(ii) Studies testing bilinguals at the highest levels of L2 proficiency show symmetric effects across the two translation directions (Duñabeitia, Perea & Carreiras, Reference Duñabeitia, Perea and Carreiras2010).
(iii) The competition signal in the model does not represent a simple synapse, but a more sophisticated control strategy (probably involving the ACC) for resolving conflict conditions. This conflict-resolution system requires a longer training phase.
Saturation values for synapses
Synapses within the semantic network and in the decision network agree with those described in a previous paper (Ursino et al., Reference Ursino, Magosso and Cuppini2009). The synapses from the Lexical Layer to the semantic layer were made strong enough for an active word to evoke all its features (i.e., activity of a word in the Lexical Layer causes all its features to oscillate in the gamma range). The synapse from the decision unit to the lexical units was set so that, when the decision unit is in the off state, all lexical units are inhibited. The synapses from the Feature Network to the lexical units were made strong enough that, when all features of an object are simultaneously active (and so the decision network is in the on state), the associated lexical unit receives enough excitation to jump from the silent to the active state. The sigmoidal relationships of the lexical units and the competitive units are sharp, so that their passage from the off to the on state occurs quite abruptly.
Time constants for neuron dynamics are of the same order (10 ms) as those typical of neuron membranes.
In conclusion, we made use of just two fundamental assumptions: that the competition mechanism develops more slowly than the others, and that synapses (at the end of training, i.e., at high proficiency) are strong enough to allow a word to activate its semantic representation and vice versa.
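For reference, the quantitative assumptions stated in this section can be collected in a single configuration sketch; the β values and the time constant come from the text, while quantities that are specified only qualitatively are left as placeholders.

```python
# Illustrative collection of the parameter assumptions stated in this section;
# entries set to None are specified only qualitatively in the text.
MODEL_PARAMS = {
    "beta_competition": 0.001,      # control synapses (Competition Area -> Lexical Layer)
    "beta_lexical_links": 0.01,     # direct links within the Lexical Layer
    "beta_lexico_semantic": 0.01,   # Feature <-> Lexical synapses
    "tau_neuron_s": 0.010,          # neuron time constants, order of magnitude (seconds)
    "sigmoid_sharpness": None,      # "sharp" sigmoids for lexical and competition units
    "synapse_saturation": None,     # maximum value reached by trained synapses
}
```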
3. Results
This section presents the results of different simulations to illustrate the differences between the two models and highlight how they can mimic the mechanisms involved in L2 learning.
Although the network can store and retrieve different objects and their associated words, for the sake of simplicity we will use just one exemplary object and the corresponding words in the two languages. Simulations were performed assuming that L1 had been completely learned. Three different stages will be shown:
(i) the beginning of training, when the connections between L2 and the Feature Areas are still weak;
(ii) an intermediate learning stage (intermediate training);
(iii) the stage after a long training period, when proficiency is high (advanced training).
For each training phase we present two simulation results characterized by different inputs to the model: in the first, the network is stimulated with all features of an object (word production task); in the second, the L2 word is used as input to the model (word recognition task). Moreover, in the case of highly trained synapses (last stage of training), the model is tested in two additional conditions: L1 word recognition, to reveal differences in the Lexical Layer with respect to L2 word recognition, and word production paired with an additional inhibitory top-down input, to switch the preferred language from L1 to L2.
3.1 Basal Model
First we tested how the Basal Model modifies synapses and responds to different inputs during the different phases of the learning process. Since its behavior is less interesting than the behavior of the Extended Model, only a few simulation results are displayed.
Beginning of training (stage i)
In response to the object features, the model correctly recalls the corresponding L1 word. Conversely, when it is stimulated with the L2 word, it is not able to recall the object features. This means that the synapses linking L2 to the Feature Areas are still too weak to evoke any semantics about the word. Since no link exists between L2 and L1 in this model, the L2 word fails to produce any effect.
Intermediate training (stage ii)
The responses of the model at this training stage are reported in Figure 2. The left panels show the case of word production; the right panels show the response to presentation of the L2 word. The object presentation in the Feature Areas evokes the corresponding L1 word, whereas no activity is elicited in the element describing the L2 word (Figure 2A). Conversely, when the L2 word is used as input, the model shows activation not only of the L2 word in the Lexical Layer and of the object representation in the Feature Areas, but also moderate activation of the L1 word in the Lexical Layer (Figure 2B). The reason for this behavior is that the synapses between the Feature Areas and the L1 word are strong, whereas the competitive mechanism is still immature and subject to training. Hence, in this phase the L2 word cannot completely win the competition with its L1 counterpart.
Advanced training (stage iii)
Figure 3 shows the results of the same two simulations performed in case of higher proficiency. After presentation of the object in the Feature Areas, L1 is activated in the Lexical Layer while L2 shows just a mild activation. This signifies that, despite improved learning of L2, the subject still commonly uses L1 as a default language and can almost completely inhibit L2. Conversely, during presentation of the L2 word, L1 is inhibited. L2 is well learned, and its direct stimulation produces an activity in the Feature Areas comparable to that produced by L1 stimulation, without the participation of the L1 word. The competitive mechanism is strong enough, and only one word is active at a time.
3.2 Extended Model
In order to clarify the differences between the Extended Model and the Basal Model, and to help understand subsequent simulation results, Figure 4 shows the pattern of synapses at different steps during training in the Extended Model. The upper left panel of this figure shows the synapses targeting the L2 word from the L1 word. The bottom left panel shows the synapses targeting the L1 word from the L2 word. Finally, the right panel shows the synapses entering the L2 word from the Feature Areas (we can see four peaks associated with the four properties of the object).
The inhibitory synapse coming from the competition layer in Figure 4 has been drawn with a positive value although its effect is inhibitory (hence, it has a negative effect on the target unit). This choice was adopted to allow a direct comparison between the excitation coming from the lexical layer and inhibition coming from the competition layer: when the two curves cross, inhibition and excitation are equal.
The synapses linking an L1 word to the L2 word start with a negative value since L1 was frequently active in past experience without the presence of the L2 word. This resulted in homosynaptic depression and in the creation of an inhibitory link. Conversely, the synapse from L2 to L1 is initially at zero, since L2 was never active before (our training rule does not contemplate heterosynaptic depression). Finally, the inhibitory connections from the Competition Area to both L1 and L2 are also initially at zero, since these “competition signals” are reinforced only in the presence of a conflict (they follow a different learning strategy, see above) and no conflict between L1 and L2 was revealed before.
As training of the second language progresses, LTP causes the creation of an excitatory link between L1 and L2 (dashed line in the left panels), while the control mechanisms trigger a reinforcement of the competition signal (continuous lines). Simultaneously, the connections between the semantic net and the L2 word are also reinforced (right panel).
The effect of these synapses at different proficiency levels can be summarized as follows. At the beginning of training, the interaction between the L1 and L2 words occurs mainly via the direct link (dashed lines in the left panels), whereas competition via the inhibitory layer (continuous lines) is still negligible. Moreover, the L2 word exhibits only a weak connection with the Feature Areas. In this condition, the L2 word excites the L1 word via a direct connection created by Hebbian learning (this is the main difference compared with the Basal Model, which lacks this link). This means that L2 is parasitic on L1 to gain access to its semantics. As training progresses, competition through the inhibitory layer becomes more important than the direct links between the two words, while the L2 word develops a strong connection with its semantic representation in the Feature Areas. In this situation, the L2 word tends to inhibit L1, accessing its semantics directly. However, in the case of activation of both words from an external cue (word production), L1 still wins the competition thanks to a difference in the direct synapses. Only at very high training levels do the two language representations become equivalent in terms of synaptic strength (convergence hypothesis). Simulations are illustrated in Figure 5.
Beginning of training (stage i)
Figure 5 shows the results during a word production task. The two panels on the left display the activity elicited in the Feature Areas (upper panel) and in the Lexical Layer (bottom panel). The external inputs directly activate the object representation in the Feature Areas; in turn, the Feature Areas activate the L1 word through the strong synapses from the Feature to the Lexical Layer. No L2 word activity is evoked. In order to gain a deeper understanding of these simulations, the right panels show some expanded snapshots of the activity in the Lexical Layer (upper panel), the different inputs converging on the L1 word (second panel), the different inputs converging on the L2 word (third panel), and the activity in the Feature Areas (bottom panel). It can be seen that the overall input from the Feature Areas to L1 is strong, whereas the inputs from the Lexical Layer and from the Competition Area to L1 are negligible. On the other hand, L2 receives a weak stimulus from the Feature Areas (since these synapses are still immature), whereas it receives strong inhibition both from the L1 word in the Lexical Layer and from the Competition Area. The result is that only L1 is active in the Lexical Layer.
Figure 6 displays the activities elicited in the network after presentation of the L2 word. As shown in the bottom panel on the left, both words are simultaneously strongly active in the Lexical Layer. The reasons are explained in the right-hand column. As shown in the second panel on the right, the L1 word receives a strong excitatory stimulation (dashed line) directly from the L2 word in the Lexical Layer; this excitatory contribution overcomes the inhibitory contribution coming from the Competition Area (dashed-dot line). Thanks to its net excitatory input, the L1 word pops out and triggers activation of the object representation in the Feature Areas (see the left upper panel). Feature activation, in turn, reinforces activation of the L1 word (solid line in the right second panel). Finally, L1 activity induces a competition in the Lexical Layer, i.e., inhibition from L1 to L2 (dashed line in the third panel on the right) reducing the activity of the L2 word.
Two aspects of the previous simulations deserve comment. First, the model activates both the L1 and L2 words in the Lexical Layer (and so it also activates the corresponding inhibitory interneurons in the Competition Area). Hence, the global activity in these areas is double that in a monolingual task. Second, the model can recall the object representation in the Feature Areas (i.e., the semantics of the word) only thanks to the activation of the L1 word. In order to verify this assumption, we repeated the same simulation with the Extended Model after eliminating all synapses from L2 to L1: in this condition, the model is unable to elicit any activity in the Feature Areas (note that this is the same result obtained with the Basal Model at stage (i)). This result is consistent with the hypothesis (Kroll et al., Reference Kroll and Stewart1994) that, during the initial learning phase, L2 exploits the L1 lexical representation to make direct connections to semantics.
Intermediate training (stage ii)
In this stage the synapses between the L2 word and the corresponding object representation, and the synapses coming from the Competition Area to the Lexical Layer are stronger than in the previous case (see Figure 4). Moreover, the excitatory synapse from the word in L2 to the word in L1 has almost reached its saturation, and is overcome by the inhibitory synapse from the Competition Area. Hence, L1 is no longer excited by L2.
Figure 7 shows the results obtained during the word production task. The behavior is almost the same as that shown at the beginning of training (Figure 5): activity is present in the Feature Areas together with L1 word activity in the Lexical Layer; the L2 word is almost completely inhibited. However, contrary to the situation depicted at the beginning of training, in this condition the L2 word receives strong stimulation from the Feature Areas, but also strong inhibitory input from the Competition Area (right third panel). This pattern reflects a greater maturation of synapses. The final result is that the L2 word is not stimulated enough to be active.
Figure 8 shows the response to presentation of the L2 word. The L2 word is able to recall the object representation in the Feature Areas per se, i.e., without any need to exploit the L1 lexicon (this can be demonstrated by observing how activity in the Feature Areas emerges in the bottom panel of the right-hand column at 16 ms, i.e., before the emergence of L1 activity in the upper panel of the right-hand column at about 17 ms). The L1 word also emerges in the Lexical Layer, since it is indirectly activated by the semantic representation in the Feature Areas. The reason is that inhibition from the Competition Area is not strong enough to resolve the competition between the two language representations. As can be seen in the second right-hand panel in Figure 8, the Competition Area sends inhibition to the L1 word (dashed-dot line), but this is less than the sum of the excitatory contributions coming from the L2 word (dashed line) and from the Feature Areas (solid line). As a consequence, the L1 word is active, but at a lower level than the L2 word. This signifies a possible interference of L1 with L2.
Advanced training (stage iii)
Lastly, we tested the model after protracted training, when the synapses had almost reached their maximum value (see Figure 4). Since this case is particularly interesting as an example of a high-proficiency bilingual, a more complete simulation set is presented.
Figure 9 shows the results during a word production task. We can note that the L1 word is retrieved in the Lexical Layer, but a small amount of activity is also elicited in the L2 word. As shown in the right-hand panels of Figure 9, both words receive a strong input from the stimulated features (solid lines), but owing to the still higher proficiency of L1 with respect to L2, the L1 word is activated faster and can drive the Competition Area to send strong inhibition to the L2 word (dashed-dot line). It is worth noting that at this stage of training the effect of the direct synapses between the two units in the Lexical Layer is less important for both the L1 and the L2 word (dashed lines).
When the network is stimulated by the L2 input (Figure 10), only this word is activated in the Lexical Layer and is able to recall the corresponding object representation in the Feature Areas. As can be seen in the right-hand panels of this figure, the L2 word is now able to elicit an inhibitory input from the Competition Area to the L1 word; L2 wins the competition and prevents the concomitant activation of the L1 word. In this training condition, the overall activity elicited by the L2 word is comparable to that obtained by presentation of the word in language L1 (the latter is displayed in Figure 11). So we can say that the network shows the behavior of a high-proficiency bilingual, who can evoke the correct semantics from both the L1 and the L2 word, with almost no interference from the other language. However, in response to object presentation, the subject still prefers L1. In fact, in this phase, the direct synapses between L1 and L2 are still asymmetrical. The final schema reflects that hypothesized by Kroll et al. (Reference Kroll and Stewart1994).
Figure 12 reports the results obtained during a word production task in which the L2 interneuron in the Competition Area also receives an external top-down input, whose task is to switch from L1 to L2. This additional input forces the interneuron to send a stronger inhibitory input to the non-target word, in this case the L1 word. As can be seen in the figure, the network exhibits activity of the L2 word together with the correct object representation. Looking at the right-hand panels, it is worth noting that the inhibition to the L1 word differs from zero even when the L2 word is in its off-period, since it is driven by the external stimulus.
Finally, an important aspect of bilingualism concerns the behavior of patients with lesions. Some preliminary results are shown in the supplementary material.
4. Discussion
Research on multilingualism has made impressive advances in the past decade, owing to the concomitant presence of sophisticated neuropsychological and behavioral studies and the advent of new neuroimaging and electrophysiological techniques. A novel field, "the neuroscience of multilingualism" (Abutalebi, Tettamanti & Perani, Reference Abutalebi, Rosa, Tettamanti, Green and Cappa2009), is assuming an increasing role in neuroscience today, not only because of the enormous number of people worldwide who manage more than one language, but also because many neurocognitive problems faced in multilingualism may have general validity in neuroscience. Within this emerging field, computational models based on connectionist neural networks may play a significant role in helping the conceptualization of knowledge, by summarizing existing data into a coherent theoretical framework and testing the reliability of current hypotheses in rigorous quantitative terms.
In this work we implemented two models. The first, named a posteriori the "Basal Model" (Ursino et al., Reference Ursino, Cuppini and Magosso2010), does not include any direct links between words in the lexical area. It is worth noting that this model structure resembles the "conceptual mediation model" proposed by Potter et al. (Reference Potter, So, Von Eckardt and Feldman1984). This model, however, has some important drawbacks, which are especially evident during the initial phase of L2 acquisition. In particular, the network passes abruptly from a condition in which L2 is unable to evoke any concept to a condition in which L2 competes with L1.
Conversely, several results (Chen & Leung, Reference Chen and Leung1989; Dufour & Kroll, Reference Dufour and Kroll1995; Kroll et al., Reference Kroll and Stewart1994) suggest that at low proficiency levels, when L2 is unable to evoke its conceptual representation per se, the L2 lexical items are processed primarily through association with their semantic equivalents in L1.
In order to overcome the previous limitation of the Basal Model, we introduced additional learning mechanisms in the "Extended Model", allowing the creation of direct links (both excitatory and inhibitory) between items in the Lexical Layer. With this further mechanism the present model resembles the "revised hierarchical model" proposed by Kroll and colleagues (Kroll et al., Reference Kroll and Stewart1994). These authors suggested that at low proficiency the direct links between the L2 word and its L1 translation must be asymmetrical. This asymmetry naturally emerges in our model by assuming the presence of both excitatory and inhibitory synapses, learned on the basis of the previous correlation (or anti-correlation) between words.
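As a rough illustration of how such signed links might arise from correlation-based learning, the snippet below applies a Hebbian update when the two translation-equivalent words are co-active and an anti-Hebbian update when one is active and the other silent. The function name (update_links), the threshold, the learning rates and the trial statistics are purely illustrative assumptions, not the learning equations of the Extended Model.

```python
# Illustrative sketch (not the authors' exact rule) of how asymmetric direct
# lexical links can emerge from Hebbian / anti-Hebbian learning. During early
# L2 practice the L2 word is active together with the L1 word it parasitizes,
# so the L2 -> L1 link becomes excitatory; the L1 word is usually active while
# L2 is silent, so the L1 -> L2 link becomes inhibitory.
import numpy as np

def update_links(W, pre, post, lr_exc=0.05, lr_inh=0.05, theta=0.5):
    """W[i, j]: signed link from word j (presynaptic) to word i (postsynaptic).
    pre, post: activities in [0, 1] of the two lexical units on one trial.
    Potentiation when both exceed the threshold theta; depression towards
    negative (inhibitory) values when only the presynaptic word is active."""
    for i in range(2):           # postsynaptic word
        for j in range(2):       # presynaptic word
            if i == j:
                continue
            if pre[j] > theta and post[i] > theta:
                W[i, j] += lr_exc * pre[j] * post[i]          # Hebbian
            elif pre[j] > theta and post[i] <= theta:
                W[i, j] -= lr_inh * pre[j] * (1.0 - post[i])  # anti-Hebbian
    return np.clip(W, -1.0, 1.0)

W = np.zeros((2, 2))             # index 0 = L1 word, 1 = L2 word
# Early L2 training: most trials use L1 alone, a few trials co-activate both.
trials = [((1.0, 0.1), (1.0, 0.1))] * 8 + [((0.9, 0.9), (0.9, 0.9))] * 2
for pre, post in trials:
    W = update_links(W, pre, post)
print("L2 -> L1 link:", round(W[0, 1], 3))   # positive (excitatory)
print("L1 -> L2 link:", round(W[1, 0], 3))   # negative (inhibitory)
```

Under these assumed trial statistics, the L2-to-L1 link ends up weakly excitatory and the L1-to-L2 link clearly inhibitory, reproducing the asymmetry discussed above.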
In the following, we discuss testable predictions of the Extended Model, compare it with existing models and neuroimaging data, highlight the original aspects of this work and point out lines for future investigations.
4.1 Testable predictions
The Extended Model makes several testable predictions, which may be the targets of future studies. These are briefly summarized as follows:
(i) Inhibition plays a fundamental role at high L2 proficiency, whereas it is less important during the early phase of L2 learning.
(ii) The model predicts greater neural activation when a low-proficiency L2 is used, compared with activation caused by high-proficiency L1, i.e., more units are active when using a weak language.
(iii) Neural activation decreases with proficiency, so the use of high-proficiency L2 causes neural activation similar to that caused by the use of high-proficiency L1.
(iv) The use of moderate-proficiency L2 during object-naming tasks requires the involvement of a "control centre", which inhibits L1.
(v) The model predicts strong interference of L1 on L2 during word recognition tasks performed at moderate L2 proficiency.
(vi) If both languages have high proficiency, control mechanisms must constantly operate to solve the conflict.
(vii) The interactions between high-proficiency L1 and high-proficiency L2 may be modulated by exposure to the environment, assuming a permanent modification of synapses, so that the language used most frequently in recent times tends to inhibit the other.
The following discussion will clarify some of these predictions.
4.2 Model behavior at different proficiency levels
Looking at Figures 5–12, we can distinguish three main levels of proficiency, which can roughly be defined as: beginning, weak, and strong.
During the beginning period, the model is in the situation depicted in the left-hand panel of Figure 1B: L1 inhibits L2 via a direct connection, whereas L2 excites L1. The L2 word must be parasitic on L1 to access its conceptual representation.
During the period of weak proficiency, the synapses linking L2 and the Feature Areas are more developed (Figure 1B, middle panel), hence an L2 item can directly access its conceptual meaning. However, the competitive mechanism is immature, while the direct synapses in the Lexical Layer are still strongly asymmetrical (L2 excites L1, but L1 still inhibits L2, although with less intensity). The result is strong interference of L1 on L2 during L2 word recognition (Figure 8). In particular, as illustrated in Figure 8, the use of an L2 word evokes a significant activation of the L1 word, thus also exciting the corresponding inhibitory interneuron. As a consequence (see testable prediction (ii) above), the model predicts greater activation in the overall lexical network than that caused by a high-proficiency L1 word. Moreover, in the case of object presentation, L1 completely dominates the Lexical Layer.
Finally, in the case of high proficiency, the competitive mechanism becomes very strong (it resembles winner-takes-all dynamics). However, L1 remains stronger than L2 (the reciprocal synapses are asymmetrical and the subject is not a perfect bilingual). The result is that there is almost no interference between L1 and L2 during a word recognition task (regardless of which word is given as input). In particular, the use of a high-proficiency L2 word does not evoke any appreciable activity of the L1 word (see Figure 10). Hence, the model predicts that the use of the L2 word induces a global network activation similar to that induced by the L1 word (testable prediction (iii) above). Conversely, during an object recognition task L1 dominates: naming in L2 during an object recognition task requires strong external inhibition directed to L1.
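Purely for illustration, the three stages discussed above can be summarized as three configurations of the same set of connections. The numerical values in the sketch below are invented assumptions, not parameters taken from the simulations; only the signs and the qualitative ordering matter.

```python
# Schematic parameterization (illustrative values only) of the three
# proficiency stages. Positive = excitatory link, negative = inhibitory link;
# 'compet' is the gain of the interneuron-mediated competition.
stages = {
    "beginning": {"L2->features": 0.0, "L2->L1": +0.6, "L1->L2": -0.6, "compet": 0.2},
    "weak":      {"L2->features": 0.5, "L2->L1": +0.4, "L1->L2": -0.3, "compet": 0.5},
    "strong":    {"L2->features": 0.9, "L2->L1": +0.1, "L1->L2": -0.1, "compet": 1.5},
}
# In the 'beginning' stage an L2 word reaches its meaning only through the
# excitatory L2->L1 link; in the 'weak' stage it reaches the Feature Areas
# directly but L1 still interferes (Figure 8); in the 'strong' stage the high
# competition gain suppresses the non-target word (Figure 10).
for stage, p in stages.items():
    print(f"{stage:9s} L2->features={p['L2->features']:+.1f}  "
          f"L2->L1={p['L2->L1']:+.1f}  L1->L2={p['L1->L2']:+.1f}  "
          f"competition gain={p['compet']:.1f}")
```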
Were L2 acquisition to continue, the model would reach a completely symmetrical condition ("perfect bilingual") in accordance with Green's (Reference Green, van Hout, Hulk, Kuiken and Towell2003) convergence hypothesis (see Figure 4). The final model would resemble the "mixed model" described by de Groot (Reference de Groot1992). In this perfectly symmetrical condition we would expect a high level of interference during word production, with the need to maintain a permanent active inhibition of one language to favor the other.
Since inhibition plays an important role in our model, it deserves further comment. In a high-proficiency subject a lexical item can be inhibited by the competitive network, which works to avoid interference. In normal conditions, when one language is stronger than the other, this network works autonomously, without the need for any external input (we normally speak L1). However, in particular cases an external input can force inhibition of one word, thus allowing a switch to its alternative. This mechanism may be exploited in future work to simulate a language switching task or language translation.
Two alternative models have been proposed in the literature to describe how the correct word is selected in bilinguals. The "language specific selection model" (Costa, Miozzo & Caramazza, Reference Costa, Miozzo and Caramazza1999) assumes that only one language is accessed at a time. Conversely, non-specific language models (Green, Reference Green1998) assume simultaneous access to words in both languages, with candidates across languages actively competing. The present model essentially assumes non-selective access and, as in Green (Reference Green1998), presupposes a clear distinction between a single integrated lexical-semantic system and the control procedures operating on it. In line with recent proposals, the model implicitly assumes that words can be at different levels of activation and that, in order to use one word, its activation must exceed that of its translation equivalent in the Lexical Layer (Grosjean, Reference Grosjean1988; Paradis, Reference Paradis1984). Within this framework, our model assumes two control strategies to select words: a bottom–up strategy, internal to the model, which automatically selects the stronger word via a competitive mechanism, and a top–down strategy, which requires external inputs.
4.3 Comparison with existing models
Several computational models of bilingualism have been proposed in past years, stressing the importance of a connectionist approach in multilingual neuroscience research (see Thomas & van Heuven, Reference Thomas, van Heuven, Kroll and de Groot2005, for a recent review). Most of these models, however, have a different purpose, emphasizing the possibility of clustering the two languages automatically on the basis of phonological cues (Li & Farkas, Reference Li, Farkas and Altarriba2002) or of the statistics of word association (French, Reference French1998). Thomas (Reference Thomas, Bullinaria, Glasspool and Houghton1997) built a model that transforms the activation pattern of a word's orthography into its meaning, based on a distributed semantic feature set. The model includes a group of hidden units, as in classic feed-forward networks. The main result is that activity in the hidden units allows L1 and L2 words to be grouped into separate clusters. Dijkstra and van Heuven (Reference Dijkstra, van Heuven, Grainger and Jacobs1998) extended a connectionist model by McClelland and Rumelhart (the IA model), which simulates orthographic processes in visual word recognition (McClelland & Rumelhart, Reference McClelland and Rumelhart1981), to the bilingual domain (BIA model). The aim of the network was to recognize whether a word belongs to one language or the other on the basis of its orthographic aspects, and to study the effect of word neighbors in both languages. A more recent version of the model (BIA+) also includes phonological and semantic representations (Dijkstra & van Heuven, Reference Dijkstra and van Heuven2002). This model, however, does not consider how a bilingual structure develops over time and during learning. A similar model (named BIMOLA) was developed by Grosjean (Reference Grosjean2008). It uses auditory features, phonemes and words, and differs from BIA mainly in a more marked separation between L1 and L2 words.
Models which describe lexical development and include some competition mechanisms have also been developed in past years. Zhao and Li (Reference Zhao and Li2007), by extending a previous monolingual model by Li, Farkas and MacWhinney (Reference Li, Farkas and MacWhinney2004), investigated how the structure of a bilingual lexicon can emerge. The model assumes distinct semantic and phonological representations connected via Hebbian learning, and develops topographically organized maps for both representations using self-organizing algorithms. In this model, the competition between words is a function of their position in the phonological map which, in turn, depends on the way lexical distributions are packaged. Regier (Reference Regier2005) describes a model for word learning which uses emergent symbols together with a competition mechanism. Like the present model, Regier's model includes a bidirectional associative memory from word forms to meanings and vice versa; these connections, however, are mediated by two hidden layers. Competition in the model is captured through a normalization rule, which computes the probability of word (or meaning) production.
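To clarify the contrast drawn here, a normalization rule of this kind can be sketched as a Luce-choice (softmax) transformation of word activations into production probabilities. This is only a generic illustration of the idea, not Regier's exact formulation; the function name and the temperature parameter below are arbitrary assumptions.

```python
# Generic illustration of a normalization-based competition rule, as opposed
# to the winner-takes-all dynamics of the present model: graded activations
# are turned into production probabilities without explicit inhibitory units.
import numpy as np

def production_probabilities(word_activations, temperature=0.25):
    a = np.asarray(word_activations, dtype=float)
    expd = np.exp(a / temperature)
    return expd / expd.sum()

# e.g. an L1 word slightly more active than its L2 translation equivalent:
print(production_probabilities([0.8, 0.6]).round(3))   # approx. [0.69, 0.31]
```

In a rule of this kind all candidates retain graded probabilities, whereas in the present model the interneuron-mediated competition drives the non-target word towards silence.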
Another model resembling the present one was developed by Miikkulainen (Reference Miikkulainen1993, Reference Miikkulainen1997) for monolinguals and recently extended to the bilingual case (Miikkulainen & Kiran, Reference Miikkulainen and Kiran2009). The model consists of two self-organizing maps, one for lexical symbols and the other for word meanings, with associative connections between them based on Hebbian learning. The model modulates relative language proficiency through the exposure to each specific language. An important result, somewhat similar to ours, is that an asymmetry emerged between the L1 and L2 maps, so that lexical activation in the non-dominant language results in activation of the corresponding representation in the dominant language. Miikkulainen's model includes an orthographic map (absent in our model), but it does not include competitive mechanisms between L1 and L2 words, which are an important aspect of our model.
Briefly, although the previous models share some aspects with the present model (two stores for words and meanings, associative links, the need to select a winner in the network), the competition mechanisms necessary to discriminate between L1 and L2 are substantially different. In particular, our model is the first to explicitly incorporate inhibitory mechanisms between L1 and L2 words having the same semantics, to simulate the maturation of the different L2 vs. L1 interactions (from dependence to winner-takes-all competition), and to train these mechanisms via physiological (Hebbian) rules. It can also easily incorporate top–down mechanisms for language selection by modulating the initial winner-takes-all (WTA) competition bias (to favor one word over the other).
4.4 Comparison with neuroimaging data
An important aspect to be considered (in present and future model versions) is the relationship between model results and neuroimaging data. fMRI and PET data are becoming essential for any theory of bilingualism (Abutalebi, Reference Abutalebi2008; Abutalebi & Green, Reference Abutalebi and Green2007; Perani & Abutalebi, Reference Perani and Abutalebi2005). The present model is still too simple to attempt a precise relationship with neuroimaging; however, a few tentative considerations can be made. A coarse comparison between our model and neuroimaging data can be performed by considering the amount of activity evoked in different areas at various proficiency levels. The model simulations suggest that, at a very low proficiency level, recognition of an L2 word causes greater neural activation in the Lexical Layer and in the Competition Area than recognition of an L1 word. Let us consider the situation presented in Figure 6. Here one can observe two zones of the Lexical Layer which are simultaneously active during L2 word recognition, corresponding to activation of the L2 word and of the L1 word; of course, the inhibitory interneurons are also active in the same zones (since they receive their input directly from the lexical units). This means that greater activation is recruited when the low-proficiency subject is trying to use an L2 word. At higher proficiency levels, L2 can be used without an evident activation of L1, thus reducing the overall activation in the Lexical Layer. This result is supported by neuroimaging data (Abutalebi, Reference Abutalebi2008), although it is difficult to push this parallelism beyond a very qualitative level. Studies investigating the lexical-semantic domain show that the use of low-proficiency L2 words entails additional brain activity compared with L1 words or with monolingual subjects: the increased activity is especially observed in the left inferior frontal gyrus and in prefrontal areas (Briellmann, Saling, Connell, Waites, Abbott & Jackson, Reference Briellmann, Saling, Connell, Waites, Abbott and Jackson2004; De Bleser, Dupont, Postler, Bormans, Speelman & Mortelmans, Reference De Bleser, Dupont, Postler, Bormans, Speelman and Mortelmans2004). It is likely that some of these differences are caused by the activation of control mechanisms. Let us consider a word production task: if L2 has lower proficiency than L1, the correct L2 word can be produced only by inhibiting L1 via further control mechanisms (which, in the present version of the model, are simply simulated by an external input).
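As a rough illustration of the kind of comparison meant here, the total activity summed over lexical units and interneurons can serve as a crude proxy for regional activation. The function name and the activity values below are assumptions of this sketch, not quantities reported in the simulations.

```python
# Illustrative proxy (an assumption of this sketch) for the 'amount of
# activation' compared with neuroimaging data: total activity summed over
# word units and interneurons during a trial.
import numpy as np

def total_activation(word_activity, interneuron_activity):
    """word_activity, interneuron_activity: arrays of shape (time, n_units)."""
    return float(np.sum(word_activity) + np.sum(interneuron_activity))

# Low-proficiency L2 recognition (both words and both interneurons partially
# active) versus high-proficiency L1 recognition (one word, one interneuron):
low_prof  = total_activation(np.full((100, 2), 0.5), np.full((100, 2), 0.4))
high_prof = total_activation(np.hstack([np.full((100, 1), 0.9),
                                        np.zeros((100, 1))]),
                             np.hstack([np.full((100, 1), 0.7),
                                        np.zeros((100, 1))]))
print(low_prof > high_prof)   # True: more overall activity for weak L2
```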
Our model assumes a simple inhibition mechanism, reinforced by conflict detection, to implement the competition between L1 and L2 words. Future versions of the model should focus on an explicit representation of these external control strategies, which may provide additional comparison with neuroimaging activation maps. In particular, a more complex control strategy may be implemented in future work, also exploiting information obtained from connectivity estimation techniques. Indeed, methods to estimate functional connectivity from fMRI data (such as dynamic causal modeling; Friston, Harrison & Penny, Reference Friston, Harrison and Penny2003) are currently available and allow network structure to be investigated in individual patients. These estimates may be compared with the predictions of the model or may provide important suggestions to improve model structure (see Abutalebi, Rosa, Tettamanti, Green & Cappa, Reference Abutalebi, Rosa, Tettamanti, Green and Cappa2009, for a recent interesting example on the control network).
An attractive aspect of this model, briefly tested in Figure A1 of the online Supplementary Material, is that the network may malfunction in the case of lesions (simulated, for instance, by reducing the number of active neurons and hence the synaptic strength). In this condition, one can expect interference of one language with the other, depending on which of the two words was subject to the greater damage (either in its semantic input or in its control mechanism). Examples of such interference are shown in Abutalebi, Miozzo and Cappa (Reference Abutalebi, Miozzo and Cappa2000). The role of connections between the control and language networks during language recovery was stressed in a recent work by Abutalebi et al. (Abutalebi, Rosa, Tettamanti, Green & Cappa, Reference Abutalebi, Rosa, Tettamanti, Green and Cappa2009). Some recent studies have analyzed bilingual deficits using a similar computational approach. In particular, Miikkulainen (Reference Miikkulainen1997) used his model, based on self-organizing maps, to simulate various kinds of damage to the lexical system, resulting in dyslexic and aphasic impairments. Recently, a preliminary report appeared that extends this study to a bilingual language system (Grasemann, Sandberg, Kiran & Miikkulainen, Reference Grasemann, Sandberg, Kiran and Miikkulainen2010).
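A lesion of the kind mentioned here can be sketched as an operation on a trained weight matrix before re-running the simulations. The function name, the fraction of silenced units and the hypothetical feature-to-word weight matrix below are illustrative assumptions, consistent only with the general idea stated above (fewer active neurons, hence lower synaptic strength).

```python
# Minimal sketch of applying a lesion to a trained weight matrix by silencing
# a random fraction of presynaptic units (illustrative, not the authors'
# lesion protocol beyond what is stated in the text).
import numpy as np

rng = np.random.default_rng(0)

def lesion(weights, fraction_silenced=0.3):
    """Silence a random fraction of presynaptic units by zeroing their
    outgoing synapses, which also reduces the overall synaptic strength."""
    W = np.array(weights, dtype=float, copy=True)
    n_pre = W.shape[1]
    silenced = rng.choice(n_pre, size=int(fraction_silenced * n_pre), replace=False)
    W[:, silenced] = 0.0
    return W

# Hypothetical feature -> L2-word synapses before and after the lesion:
W_feat_to_L2 = rng.uniform(0.0, 1.0, size=(1, 20))
print("intact total strength  :", W_feat_to_L2.sum().round(2))
print("lesioned total strength:", lesion(W_feat_to_L2).sum().round(2))
```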
5. Conclusions: Model limitations and future research
We conclude this discussion by commenting on two main model limitations, which should become the target of future studies.
A first significant limitation is that the model does not use realistic inputs. This does not mean that the model is unrealistic, but rather that it focuses on an internal processing stage and needs other pre-processing networks to compute appropriate inputs and to link it with the external world.
In particular, the lexical input represents a stimulus to a unit in the Lexical Layer coding for a given word form. It may derive from a pre-processing network processing phonemes or orthographic symbols. Several previous models could be used as a pre-processing stage of the present lexical network to implement a relationship between phonology/orthography and word forms, so that the model can be used with real inputs in future work. Examples of such networks can be found in Hopfield and Brody (Reference Hopfield and Brody2001), Li et al. (Reference Li, Farkas and MacWhinney2004), and Zhao and Li (Reference Zhao and Li2007).
The inputs to the semantic network are more complex: they represent the principal features of objects. In this case too, the inputs should be extracted from a pre-processing network. Several examples of data sets using a feature representation of objects can be found in the literature (see, among others, Miikkulainen & Kiran, Reference Miikkulainen and Kiran2009; Vigliocco, Vinson, Lewis & Garrett, Reference Vigliocco, Vinson, Lewis and Garrett2004), and these could be used to provide realistic inputs to the semantic network in future work.
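As a toy illustration, an input to the semantic network could be a binary vector marking which features an object possesses. The feature names and the helper function below are invented for the example and are not taken from the cited feature norms.

```python
# Toy illustration of a feature-based input to the semantic network
# (feature names are hypothetical, chosen only for readability).
FEATURES = ["is_animal", "has_four_legs", "barks",
            "is_edible", "is_round", "grows_on_trees"]

def feature_vector(active_features):
    """Return a binary vector over FEATURES for one object."""
    return [1.0 if f in active_features else 0.0 for f in FEATURES]

dog_input   = feature_vector({"is_animal", "has_four_legs", "barks"})
apple_input = feature_vector({"is_edible", "is_round", "grows_on_trees"})
print(dog_input)    # [1.0, 1.0, 1.0, 0.0, 0.0, 0.0]
print(apple_input)  # [0.0, 0.0, 0.0, 1.0, 1.0, 1.0]
```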
Linking the present model with pre-processing networks (such as those mentioned above) may allow the model to be used to simulate results of psychological tests, and to check its main hypotheses against real data.
A further limitation is the absence of top–down control strategies (although a bottom–up competitive mechanism is operative to solve conflicts in simpler cases). The results in Figure 12 stress the need for a second, top–down control strategy, which can be considered an external input to the model, inhibiting one language to favor the other. This control system may be recruited during paradigms such as language switching, language translation or language selection. In our simulations this external input may become necessary when the two languages have a similar proficiency level (Figure 12) or when the subject is forced to use a low-proficiency language despite interference from the high-proficiency one (Figure 6). A classic point of view is that these conflicts are solved by a dynamic inhibitory input to the non-target language, which may originate from various brain areas classically related to cognitive control, such as the caudate nucleus, prefrontal cortex and anterior cingulate cortex (Abutalebi, Reference Abutalebi2008). An interesting question is whether this top–down control system is specifically dedicated to language, or represents a more general structure devoted to conflict resolution independently of its explicit domain. A further important question is how this control system discriminates words in one language from those in the other without the need for explicit tags or explicit language nodes (as exploited in most previous models). The inclusion of top–down control strategies should be the main challenge of future model versions.