1. Introduction
A theory of code-switching should, ideally, not be a theory of code-switching. It should instead be a spelling out of the implications of a more general theory as it applies to the phenomena of code-switching. Given a cognitive perspective, the ideal theory will show how an account of code-switching can be derived from the nature of representation and processing. As social factors are clearly important, they must also be accommodated – in terms of how they are realized in and shaped by cognitive representation and processing. The foundation of a theory of code-switching should thus be a general model of psychological functioning that has relatively clear accounts of representation and processing and which readily accommodates social factors. The Modular Online Growth and Use of Language (MOGUL) framework (Sharwood Smith & Truscott, 2014) fits these requirements and so will be taken as the basis for explicating the mechanisms that underlie code-switching and related phenomena.
Approaches to code-switching have tended to use a given theoretical perspective and techniques from within one or other area of social, anthropological, psychological, or neurological research. Since the 1970s, the code-switching literature has debated numerous issues surrounding the mixing of languages and what linguistic principles might determine switches and switch points (Auer, 1984, 1995; Blom & Gumperz, 1972; Heller, 1988; MacSwan, 1999; Myers-Scotton, 1993; Poplack, 1980; Sankoff & Poplack, 1981). Attempts to explain the motivation behind code-switching have adopted sociolinguistic and conversational analytic perspectives (Auer, 1984, 1995; Wei, 1998). Neurolinguistic studies have sought to explain how bilinguals manage their different languages (e.g., Costa, Santesteban & Ivanova, 2006; Green, 1998; Green & Wei, 2014). A combined use of different research domains has certainly figured in the literature, for example the interplay between sociolinguistic and psycholinguistic factors (Kroll, van Hell, Tokowicz & Greed, 2011; MacSwan, 1999; Myers-Scotton, 1993; Pfaff, 1979; Poplack, 1980). Broader cross-disciplinary publications have been in the form of overviews of, or selections from, specialized domains rather than applications of an overarching framework (Bullock & Toribio, 2009; Gardner-Chloros, 2009; Milroy & Muysken, 1995).
The time is ripe for broader-based theoretical explanations dealing simultaneously with both linguistic issues and the wider context. One advantage of a framework encompassing all kinds of cognition is that it can, at the same time, deal with representation and real-time processes. For example, code-switching findings in both areas can be integrated along with the ongoing general debate about the ways multilinguals control the different language systems at their disposal (see, for example, Abutalebi & Green, 2008; Bialystok & Craik, 2010; Costa et al., 2006; Dijkstra & van Heuven, 2002).
We will begin with a summary of the framework, emphasizing aspects having special significance for the study of code-switching. We then look specifically at the nature of representation and processing as they are realized in the framework. This discussion then serves as the basis for our approach to code-switching, developed in the final sections. The goal, throughout, is to establish a framework within which code-switching can be better understood and studied. This last statement is important because the immediate usefulness of the framework does not lie in producing a set of empirical predictions. Rather, specific questions raised by empirical findings concerning memory and other processing mechanisms, questions which are often bypassed, can now be addressed directly. The framework is first and foremost about enhancing explanation rather than generating predictions and so, with regard to the present discussion, it promises more than ‘just another theory of code-switching’.
2. A sketch of MOGUL architecture, processing, and development
MOGUL architecture has been discussed in detail in a range of publications since its first airing in 2004, in which its theoretical allegiances and commitments have been set out along with particular applications to aspects of language acquisition and cognition (e.g., Sharwood Smith & Truscott, 2014; Truscott, 2015; Truscott & Sharwood Smith, 2004). This section will therefore attempt to provide as brief a sketch as possible of its principal features, all of which will be crucial for discussing code-switching.
Although originally conceived to cope with language-related issues, the framework has evolved as a model of the mind, detailing its basic components and defining how they interact. It is a psychologically plausible model to the extent that it reflects current thinking across relevant areas of cognitive science. It is a coherent model in that it has a clearly defined architecture. Finally, it should be a useful model to the extent that it can simultaneously address issues in a range of different research areas and stimulate the development of better explanations and hypotheses in each of them.
One salient feature of the MOGUL framework is that, although it is often presented as a processing approach, it actually unifies accounts of representation and on-line processing under one umbrella. Basically, it aims to provide an explanation of (a) how the mind handles the different tasks that it is continually confronted with and (b) how the mind changes in response to life experience. The special focus of MOGUL is still on language and its role within the mind as a whole. Although parts of the mind are, by hypothesis, specific to language alone, the framework makes it clear that language in its broadest sense involves not only these dedicated areas but many other parts of the mind as well, the implication being that any narrow explanation of how language and mind operate in general will always be hampered by its narrowness. In this light, code-switching explanations, the focus of the present discussion, should not only be part of a general account of language processing and development but, like all other linguistic areas, should be integrated in explicit ways into the bigger picture of the mind as a whole.
MOGUL architecture is modular. The mind is conceptualized as a set of ‘expert systems’, different from, but developed in relationship with, those systems that characterize the workings of the physical brain. There are many versions of modularity in the cognitive science literature but MOGUL architecture relates most closely to the parallel architecture proposed by Ray Jackendoff for the language faculty (Jackendoff, 1987, 1997, 2002). Each expert system can be thought of as a module that has unique properties that distinguish it from all the others and a particular set of tasks to fulfil. Its tasks are not covered by any of the others, examples including the visual and auditory systems, the somatosensory and motor systems, the conceptual system and those systems specific to language to which we will return shortly. These modules, or rather their stores, are repositories of particular structural elements which are combined in various ways according to the code of that module into more or less complex representations in response to life experience. Each module is equipped at birth with a species-specific starter set that has evolved over time to optimize the organism's passage through the initial period of its existence. In this way the development of vision, for instance, and hearing as well as linguistic development have a uniquely human character from the very start.
Despite the special nature of each expert system these modules all have a basic design in common, namely they consist of a store plus a processor that operates on those structures in its store that are currently activated. These items that are at a given moment sufficiently activated to participate in on-line processing constitute the store's current working memory. Operations can involve combining items, in lawful ways, with other items currently in an activated state in the same store producing a more complex set of representations of the same type, or it can involve co-activation with items of different types, i.e., structures in other working memories. This view of working memory accords more with Cowan's (1993, 2005) embedded process model than with Baddeley's (2007, 2012; Baddeley & Hitch, 1974).
The operation that causes the activation of items in one store to co-activate other items in other stores is not carried out by a store's processor but by the interface between them. The functions of an interface between two modules include associating particular structures that are currently active in adjacent modules. For example, the currently activated visual representation of a face might be matched with a currently activated conceptual structure identifying the given person. The initial matching of these two separate types of structure will cause the interface to assign them a common identifying ‘tag’ called an index. This means that, other things being equal, the next time one is activated, the interface will activate the other. One structure may have many indices: the syntactic structure N(oun) for example will be coindexed with many different conceptual structures (meanings); many structures will share the same index. In this way associations are formed across modules involving not only simple chains of two but whole networks of structures as different modules are recruited to manage a particular complex experience. One thing is worth noting here: there is no real communication between modules other than co-indexing and co-activation. Syntactic structures are not converted or ‘translated’ into conceptual structures, for example: no information passes between them.
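The co-indexing mechanism just described can be made concrete with a small sketch. It is purely illustrative and not part of the framework's formal apparatus: the class name `Interface`, the method names, and the representation of structures as (module, label) pairs are all assumptions introduced here. The sketch shows only that one structure may carry many indices, that several structures may share an index, and that the interface passes on nothing but the fact of co-activation.

```python
from collections import defaultdict
from itertools import count


class Interface:
    """Hypothetical interface linking structures in two adjacent stores."""

    def __init__(self):
        self._next_index = count(1)
        self.indices = defaultdict(set)   # structure -> set of indices

    def coindex(self, a, b):
        """Assign a common index to two structures matched while co-active."""
        idx = next(self._next_index)
        self.indices[a].add(idx)
        self.indices[b].add(idx)
        return idx

    def coactivated_by(self, structure):
        """Structures sharing at least one index with `structure`.

        Only identities are returned: no content is converted or translated.
        """
        own = self.indices.get(structure, set())
        return {other for other, idxs in self.indices.items()
                if other != structure and idxs & own}


# Structures are (module, label) pairs; the labels are illustrative only.
ss_cs = Interface()
ss_cs.coindex(("SS", "N"), ("CS", "DOG"))
ss_cs.coindex(("SS", "N"), ("CS", "WALK"))
ss_cs.coindex(("SS", "V"), ("CS", "WALK"))

linked_to_noun = ss_cs.coactivated_by(("SS", "N"))
linked_to_walk = ss_cs.coactivated_by(("CS", "WALK"))
```

Here the syntactic structure N ends up coindexed with two conceptual structures, and the conceptual structure WALK with two syntactic ones, which is how networks rather than simple pairs can arise.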
One module will be especially important for the discussion of code-switching: affective structures (AfS). This is a crucial module connected with most if not all the other modules in the mind. It provides the basis for explaining simple and complex emotions that are at work below and above the level of awareness (Öhman, Flykt & Lundquist, 2000). However, the most essential function of the affective module is to assign value. This it does by associating value structures in AfS with structures in other modules via its many interfaces. These AfS structures have the effect of assigning a more or less positive value or more or less negative value to the representations they are co-indexed with and may be set or reset at any given moment. We will consider in more detail the important implications of this for code-switching below but clearly the shifting from one linguistic system to another will have something to do with changes in the current value associated with particular linguistic structures that are of potential use for a current communicative task. Note however that valuation is a very general phenomenon, applying not only to code-switching but to all aspects of language and indeed all aspects of cognition.
MOGUL is often called a processing approach because of the crucial role that degrees of activation play in accounting both for the growth (the G in MOGUL) of new representations and for their relative accessibility in on-line processing. The metaphor we favor for conceptualizing how activation operates is one of vertical location. Items (structures/representations) that may be activated are located within a given store. Particular theories will define the nature of those items and the principles they obey but, at any given moment, a representation, be it a single structural feature or a complex configuration of features, is viewed as existing at a specific ‘height’ within a memory store. Strongly activated elements will ‘rise’ into working memory which is conceived of as the upper layer of the store. Where they rise from and where they ‘sink’ to afterwards is described as a given resting level of activation. Their proximity, when ‘resting’, to the upper layer is highly significant since the nearer they are the more ‘accessible’ they are. This gives them an advantage over potential competitors in the same store in participating in a processing chain. Resting levels are certainly not fixed. They can vary all the time depending on the use of a structure. An account of code-switching, for example, will certainly focus on shifting resting levels and consequently shifting degrees of accessibility to working memory and competitive advantage over rival structures in the on-line building of a representational chain.
It is important to keep in mind that, in the particular modular architecture under discussion, working memory is not one resource shared by all expert systems: it is a feature of particular stores. The term ‘in working memory’ can only be understood as shorthand for either in a particular WM or, used more loosely, in a group of associated WM's that are currently participating in the building of a processing chain.
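The vertical-location metaphor and the store-specific nature of working memory can be illustrated with a toy model. This is a sketch under stated assumptions, not a claim about how the framework quantifies activation: the threshold `WM_THRESHOLD`, the additive treatment of momentary stimulation, and the names `Store`, `structure_A` and `structure_B` are all hypothetical.

```python
from dataclasses import dataclass, field

WM_THRESHOLD = 1.0   # assumed 'height' of the working-memory layer


@dataclass
class Store:
    """One module's store; each store has its own working memory."""
    name: str
    resting: dict = field(default_factory=dict)   # item -> resting level

    def current_levels(self, stimulation):
        """Resting level plus the boost an item is receiving right now."""
        return {item: level + stimulation.get(item, 0.0)
                for item, level in self.resting.items()}

    def working_memory(self, stimulation):
        """Items that have 'risen' into the upper layer of this store."""
        levels = self.current_levels(stimulation)
        return {item for item, level in levels.items()
                if level >= WM_THRESHOLD}

    def winner(self, stimulation):
        """Most active item; with equal boosts the higher resting level wins."""
        levels = self.current_levels(stimulation)
        return max(levels, key=levels.get)


ss = Store("SS", resting={"structure_A": 0.75, "structure_B": 0.25})

# Equal stimulation: the item resting nearer the top has the advantage.
equal_boost = {"structure_A": 0.5, "structure_B": 0.5}
in_wm = ss.working_memory(equal_boost)
favoured = ss.winner(equal_boost)

# Strong enough stimulation lets the less accessible rival win instead.
upset = ss.winner({"structure_B": 1.25})
```

The design point is that accessibility is a property of an item's resting level within a single store, so a phrase such as 'in working memory' must always be resolved to one store or to a group of associated stores.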
Representational growth within a module follows acquisition by processing theory (APT) which is a central principle explaining development in the framework (Truscott & Sharwood Smith, 2004). When a new structure is established during processing, purely as part of the system's efforts to construct legitimate representations for its current input, this new structure will linger in the store. The more it is used, the higher its resting level of activation becomes. The reverse is also true. Lack of use leads to a decline. In language this ‘forgetting’ process is called attrition. Note that the accessibility of items in memory is directly affected not by the frequency of events in the environment, but solely by activation within the module in question. Thus a language learner's frequent exposure to some linguistic structure will have no impact on development unless it actually gets processed by some relevant module(s). What can impact on development, therefore, is the frequency of internal input to a given module and not the frequency of relevant events in the environment, the external input. This explains, for example, why frequency of linguistic (external) input is not a reliable predictor of the order in which structures are acquired (Brown, 1973; Gass & Mackey, 2002).
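The distinction between external and internal input can be sketched as a simple update rule. The linear `GAIN` and `DECAY` constants, the function names, and the labels `frequent_form` and `rare_form` are assumptions made for illustration; APT itself specifies no such numbers. The sketch shows a form that is frequent in the environment but never processed leaving no trace, while a form processed once lingers and then slowly declines.

```python
GAIN = 0.25     # assumed rise in resting level per processing event
DECAY = 0.0625  # assumed fall per step in which the item is not used


def update_resting_levels(resting, processed):
    """One step: items the module actually processed rise, the rest sink."""
    updated = {}
    for item, level in resting.items():
        if item in processed:
            updated[item] = level + GAIN
        else:
            updated[item] = max(0.0, level - DECAY)
    # A structure built for the first time lingers in the store.
    for item in processed:
        updated.setdefault(item, GAIN)
    return updated


def run(external_input, is_processed, resting=None):
    """Feed external events; only items that get processed count as input."""
    resting = dict(resting or {})
    for event in external_input:
        internal_input = {item for item in event if is_processed(item)}
        resting = update_resting_levels(resting, internal_input)
    return resting


# 'frequent_form' occurs in every event but is never processed;
# 'rare_form' occurs once and is processed.
events = [
    {"frequent_form"},
    {"frequent_form", "rare_form"},
    {"frequent_form"},
    {"frequent_form"},
]
result = run(events, is_processed=lambda item: item == "rare_form")
```

After these four events the store contains only `rare_form`, at a level that has begun to fall again through lack of use, which mirrors attrition on a very small scale.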
In attempts to make sense of perceived language, there will be automatic processes at work trying to match, in the case of speech, generic auditory structures with structures in the first of two modules forming the core language system (Footnote 1), i.e., phonological structures (PS's). These will be matched with syntactic structures (SS's). Then, going outside the core language system again, matches will be attempted with conceptual structures (CS's). Matching is always a two-way process with each collaborating module making one or usually more than one candidate available in its working memory to participate in the current chain. The system as a whole is always working on a best-fit basis. In other words, a priority is to maintain coherence across the various modules involved.
In sum, growth can either be the creation of new structural combinations and new coindexing or it can be a change in the activation history of structures already in place but now becoming more or less accessible. In the case of long-term lack of activation, development will actually involve the gradual decrease in the resting levels of structures. Note that representations in this framework cannot just be those that appear in performance: such readily available structures will be a subset of all the representations, others having various lesser degrees of availability, i.e., resting levels of activation so low that they do not at the moment see the light of day. This has obvious methodological implications. Only very subtle techniques can establish their presence or absence (see, for example, Osterhout, Poliakov, Inoue, McLaughlin, Valentine, Pitkanen, Frenck-Mestre & Hirschensohn, 2008).
3. Representation and its development in the individual
The MOGUL framework offers an account of representation, one that can serve as a foundation for an account of code-switching. We will first consider representation in the core language system and then representation that is not specifically linguistic but is used for language, focusing on contextual and goal representations in CS and value representations in AfS. We will conclude the section by looking at bilingual representation.
3.1 Linguistic representations
The PS and SS modules are specific to language. Other modules intimately involved in language-related representations and processing are ones whose primary or original function is not linguistic: these have equivalents in many other species. For example, there is the auditory system, which processes generic sound; the motor system, which controls those parts of the body involved in producing speech, writing, sign language and Braille; and the conceptual system, where abstract meaning is stored and processed. As described above, the language-specific systems constitute the core language system.
What the two modules in the core language system do is to process and store those two types of structures (PS and SS) that are specific to human language; they will try and associate, for example, the auditory representation (AS) of a perceived sound such as one created when someone utters the word walk, with a PS. This PS will include features like syllable, V(owel)s and C(onsonant)s and will be matched up with a SS that includes N(oun) or V(erb) plus syntactic features like number and person. The PS⇔SS chain thus created via the interface between phonology and syntax modules will trigger the activation of conceptual structures, CS's, which will comprise all the associated meanings for that individual of the word walk in the linguistic and situational context in which it is uttered.
Compare this linguistically processed sound with what happens when an excited dog has heard the word walk uttered and has been able to attach meaning to it, as its excitement suggests. The dog will have matched the sound of walk, or more specifically its auditory representation (AS), directly to a meaning (CS). Not having any phonological or syntactic properties associated with them, the dog's auditory representations can never be combined in complex ways with other AS. Only the core language system can provide the combinatorial possibilities and rich, expressive power of human language. Without the core language system, the resulting ‘dog’ CS will be correspondingly simple as compared to the complex conceptual structures that can populate the human conceptual system (see Sharwood Smith, unpublished manuscript; Sharwood Smith & Truscott, 2014). The two processing chains, one shared by dogs and humans and the other specific to humans, can be illustrated thus:
(1) a. AS ⇔ CS (shared by dogs and humans)
b. AS ⇔ PS ⇔ SS ⇔ CS (specific to humans)
3.2 Contextual representations
The MOGUL framework is concerned primarily with the INTERNAL CONTEXT that is created from experience with the outside world. Naturally, the wider and the more immediate situational context existing outside the language user plays a vital role. At the same time, the focus here must be on how the mind actually represents it. Many contextual factors can be identified that constitute the language user's internal context. One crucial set involves the interlocutor and the language user's understanding of and attitude toward that person: level of familiarity, status, character, attitudes, what languages or dialects he/she understands. The character of an utterance directed to a particular listener will be strongly influenced by such internal factors, as will the interpretation established for whatever that person says. In short, existing representations of the interlocutor are active and influence activation of all other representations that are involved in language use. When active, they thus help to form the internal context of language use. The content and character of preceding utterances, by both parties, are also part of this context, as is the physical setting, to name only some of the variables. Each is reflected in the current activation levels of representations in CS (and beyond), which form the context within which language use occurs.
3.3 Goal representations
Another key type of conceptual structure is the goal representation. In the MOGUL framework goals are CS representations (Footnote 2) that guide thought and action. Their roots are in the basic survival needs of any organism, such as eating when the body requires it. The first and most fundamental goal representations thus serve the function of encouraging the satisfaction of basic needs. These are innately present in CS. For social creatures like humans, goal representations that serve social functions are also crucial, and these also are innately present. Likely examples are affiliation, power, and face. From these basic goals a great assortment of more specific and more sophisticated goals develop through experience. This development is to be explained in the same way as any other development: APT. When innate goals and other representations are active in CS, the processor seeks to combine them to form new, composite representations. Goal representations also become connected to representations outside CS, notably the value representations in AfS, also through processing experience.
3.4 Value representations
Above, we briefly described AfS, emphasizing the value representations it contains. For present purposes, their connections with representations in other modules, particularly CS, are their essential feature. Figures 1a and 1b provide a relatively simple (and simplified) example of these connections and the way that value interacts with context.
Both figures display an almost identical situation. An individual is confronted with, and perceives, a biscuit. This immediately activates various different interfaced structures including a visual one – the appropriate VS representing a biscuit – plus an olfactory one – the appropriate OfS representing the smell sensation evoked – and a gustatory one – the appropriate GS representing a biscuit's taste. Spreading activation triggers structural networks across various working memories. The activated networks displayed in these figures also contain three more structures: 1) an activated affective structure which in Figure 1a associates a strong positive value with the above three perceptual structures (VS, OfS and GS); and 2) two activated and combined conceptual structures in the conceptual store. These two CS's have been combined into a complex CS since in this particular situation the biscuit in question is associated with Shop 1. For this individual both Shop 1 and the biscuit in view are highly valued. In variations of this situation, we can imagine that motor structures (not displayed) will also be activated causing movements associated with taking hold of the biscuit and conveying it to the mouth.
In Figure 1b, you are asked to imagine that the individual concerned has learned that the biscuit is actually from Shop 2 which has a very negative association (hence the AfS with strong negative value). Spreading activation will negatively influence other associated structures in the network and they will therefore not trigger motor structures associated with taking and eating the biscuit. Rather the opposite: avoidance behavior will replace attraction. From this simple example, it may be concluded that complex conceptual and affective structures that are developed as a result of previous life experience with the outside world can cause shifts in behavior. Sometimes all it needs is a perceived outside event to make one conceptual structure relatively less valued and an alternative one relatively more valued. This results in a rise in the activation level of one and a fall in the other (Footnote 3). The strong positive value of the revised conceptual structure will influence activation levels of all associated structures which then come to dominate where they were previously in abeyance.
The interaction of value with goal representations can also be brought out in an example of this sort. The person's initial goal might have been simply to enjoy eating a biscuit. If, on the other hand, the biscuit is to be shared with another person, the goal of pleasing that person will be active as well. If the additional person is known to prefer the products of Shop 2, the value of the biscuit should rise accordingly, the extent of the rise depending on the value attached to that person and to pleasing him/her. All this is easily explained within a modular framework and, as should soon become clear, biscuit preferences and code-switching have something in common.
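The biscuit scenario lends itself to a small worked sketch of how value, context and goals might interact. Everything numeric here is an assumption: the framework does not say that values combine additively, and the figures in `values` are arbitrary. The names `net_value` and `behaviour` are likewise hypothetical helpers introduced only for illustration.

```python
def net_value(active_components, values):
    """Sum the AfS values coindexed with each active CS component.

    Additive combination is an assumption made for this sketch.
    """
    return sum(values.get(component, 0.0) for component in active_components)


def behaviour(value):
    """Positive net value favours approach; otherwise avoidance."""
    return "approach" if value > 0 else "avoid"


# Arbitrary illustrative values attached to conceptual and goal structures.
values = {
    "BISCUIT": 1.0,
    "SHOP_1": 1.0,
    "SHOP_2": -3.0,
    "PLEASE_FRIEND": 2.5,   # goal of pleasing someone who prefers Shop 2
}

figure_1a = net_value({"BISCUIT", "SHOP_1"}, values)
figure_1b = net_value({"BISCUIT", "SHOP_2"}, values)
shared = net_value({"BISCUIT", "SHOP_2", "PLEASE_FRIEND"}, values)

outcomes = [behaviour(v) for v in (figure_1a, figure_1b, shared)]
```

With these particular numbers the biscuit is approached in the Figure 1a situation, avoided in the Figure 1b situation, and approached again once the goal of pleasing the other person is active. How far the value rises in the third case depends entirely on the value assigned to that goal.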
3.5 Representation in the multilingual mind
The notion of code-switching implies that linguistic systems must have a separate existence in the mind. The question is where. What part(s) of the modular system as a whole allows multilinguals to operate consistently in one ‘code’ and how can codes become mixed in performance? The current conceptualization, in the framework, of how PS and SS operate, treats the core language system as neutral territory: these two modules make absolutely no internal distinctions between structures belonging to the different systems that they handle. By ‘language systems’ (codes) we include accents, dialects, varieties and registers of the same language as well as different languages. PS and SS operate efficiently and blindly with any relevant input that appears at their interfaces. The following remarks and examples in this section will be confined to how different languages are dealt with but the principles are the same for linguistically separable varieties of a single language.
The absence of distinctions in PS and SS immediately poses a ‘Tower of Babel’ dilemma. If there are no ways, in a multilingual's mind, of telling whether, say, Spanish or Mandarin or Quechua is being handled by these modules, why is a given stretch of auditory input not randomly assigned, say, a Spanish PS and a Mandarin SS? How is confusion avoided? In Sharwood Smith and Truscott (2014), two possible options were considered (pp. 186–191). The first was a system of language tags, a concept that has already featured in the literature in various forms, often in connection with lexical processing (see, for example, Albert & Obler, 1978; Belazi, Rubin & Toribio, 1994; Costa, 2006; Green, 1986; Li, 1998; Poulisse & Bongaerts, 1994). Note, here, that ‘lexical items’ in Jackendoff architecture are not single atomic units. Rather they are a composite of separate types of representation that happen to be coindexed (Jackendoff, 2002, p. 130). This first option, then, the language tagging hypothesis, involved an extra indexing system, i.e., a system of language tags on all relevant representations so that chains of structure would be established by only associating in working memory those PS's and those SS's that are consistently tagged for one language. These tags would be assigned on first encounter with input that was identified as belonging to a particular language, the obvious way being via a CS with the meaning “Spanish”, for example. This would mean the core language system was no longer language-neutral. Also it would add a new tagging mechanism over and above the indexing system already described.
Apart from posing a potential problem for explaining how code-switching works, where consistency is disrupted, language tagging turns out to be an unnecessary complication (see also Mahootian & Santorini, 1996; MacSwan, 2013).
Our preferred alternative was and is to restrict language identification to CS, leaving the core language systems blind to the identity of the language they are working with. This approach, called conceptual triggering, letting context processing direct the choice of code, is illustrated in Figure 2a, where the shaded PS-SS-CS chain together represents a linguistic construction, a word, for example. The CS SPANISH has been combined with the shaded CS (semantic meaning) of the word. These two CS's thus constitute a single composite representation. Reminiscent perhaps of the ideas of Landry and Bourhis (1997), who proposed the concept of ‘linguistic landscape’ (the “visibility and salience of languages on public and commercial signs in a given territory or region”, p. 23), the sound representations (AS) and visible sign representations (VS) of spoken, written and signed language can be associated with language meanings, i.e., CS's identifying which language they belong to. The sounds of Spanish, for example, or rather their auditory representation at AS, come to be associated with a CS representing the Spanish language: this activated association is via a direct link across the AS/CS interface. Whenever Spanish sounds are processed, the relevant AS ⇔ CS chains will be strongly activated. Any PS ⇔ SS chain associated with that ‘Spanish’ AS ⇔ CS chain will accordingly be strongly activated as well. It will therefore outcompete any co-activated rival PS and SS candidates. In other words, the PS and SS systems do not need to ‘know’ which language they are handling in order to maintain consistency. They will be activated by activity in associated modules outside the core language system.
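Conceptual triggering can be sketched as a competition in which the only language information sits in CS. The sketch below is a hypothetical illustration: the class `Chain`, the constant `BOOST`, the additive scoring, and the lexical example (a sound form `pan` linked either to a Spanish 'bread' meaning or to an English 'cooking pan' meaning) are all assumptions chosen for clarity. Note that the `ps` and `ss` fields carry no language tag at all.

```python
from dataclasses import dataclass

BOOST = 0.5  # assumed activation passed on by each active, linked CS


@dataclass(frozen=True)
class Chain:
    """A PS-SS chain plus the composite CS it is coindexed with."""
    ps: str                # phonological structure, untagged for language
    ss: str                # syntactic structure, untagged for language
    cs: frozenset          # meaning plus context, possibly e.g. SPANISH
    resting: float = 0.5


def select(candidates, active_cs):
    """Return the candidate chain with the highest current activation."""
    def activation(chain):
        # Support comes only from currently active conceptual structures.
        return chain.resting + BOOST * len(chain.cs & active_cs)
    return max(candidates, key=activation)


candidates = [
    Chain("pan", "N", frozenset({"BREAD", "SPANISH"})),
    Chain("pan", "N", frozenset({"COOKING_PAN", "ENGLISH"})),
]

# Spanish sounds in the input have activated the CS SPANISH.
in_spanish_context = select(candidates, {"SPANISH"})
# The same untagged PS and SS resolve differently in an English context.
in_english_context = select(candidates, {"ENGLISH"})
```

Because selection depends only on which conceptual structures are currently active, consistency within a language falls out of ordinary activation spreading, and a shift in the active context is enough to change which chain wins.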
Speakers of a given language typically have the concept of that language. A Spanish speaker, say, will have a CS representation of the concept SPANISH. Whenever the identity of the language being used registers in CS – the clearest case being when the person is consciously aware of using that particular language – this representation is active. So, as a normal part of processing, the conceptual processor will combine it with the representation that is the meaning of the Spanish item currently being processed. Following APT, this composite representation will linger. The PS-CS chains that we know as Spanish are thus established as part of an extended network of coindexed representations across different modules and including CS SPANISH. The situation depicted in Figure 2a is therefore an example of the way processing and acquisition in the MOGUL framework operate.
Contextual differences between the languages of a bilingual are real and important. Their existence is nicely expressed in Grosjean's (2010, p. 29) Complementarity Principle:
(2) Bilinguals usually acquire and use their languages for different purposes, in different domains of life, with different people. Different aspects of life normally require different languages.
In MOGUL terms, representations in the different languages have different contextual and goal representations associated with them. This is in fact a consequence of the framework. As described above in relation to language identifier (CS) representations like SPANISH, any representation that is active in CS during linguistic processing is likely to be combined with representations of the meaning of the linguistic items that are currently being used (are active), creating a composite representation which then lingers in the store. The primary participants in this process are contextual representations, embodying such information as the identity of the interlocutor and salient features of him/her, along with the setting, the degree of formality, and any previous linguistic context. Any such representations that are active during the processing of a word, for example, will thus become part of the meaning of that word; i.e., they will be contained in the CS representation that is coindexed with the PS and SS of the word. The same story applies to any goal representations that are active during the linguistic processing. These could include general goals like affiliation or simply communication and also much more specialized goals of expressing very specific ideas for very specific purposes. Thus, the meaning of a word or other linguistic unit is not just the traditional semantic representation associated with it but also the contexts of its use and the goals for which it is used – the pragmatics.
Crucially, the contextual representations and the way they combine with semantic representations are likely to differ for the different languages that are present in the mind of an individual bilingual speaker, as expressed in the Complementarity Principle. Thus, contextual differences, including the variable presence and influence of general language representations, distinguish a bilingual's languages at CS, with no need for explicit markings in PS or SS. This more complete understanding of language identification is shown in Figure 2b. The CS that is the basic meaning of the linguistic item (shaded) is combined with a number of contextual representations, possibly including a general language marker (SPANISH), and goal representations. We will consider the implications of conceptual triggering, effectively a further development of the hypothesis presented by Sharwood Smith and Truscott (2014), for production and for code-switching in subsequent sections.
4. Production
We suggested at the outset that understanding representation and processing is the key to understanding code-switching. One refinement must be added. Switching by its standard definition occurs in production, not comprehension. It can be triggered and shaped by input, but the act of switching is a matter of production. So some discussion specifically of production is necessary. The central issue is the factors that influence which linguistic representations are used in a given utterance. These include factors within the linguistic modules, PS and SS, and influences from CS and AfS on activity in the linguistic modules.
One key factor influencing production is the resting activation level of the various candidate representations. The availability of a representation for use by its processor is determined, again, by its current activation level, which is determined in part by its resting level, serving as the starting point for any rise in current level. Thus, all else being equal, a competitor with a high resting level will triumph over one with a lower level. The limit on this influence is that all else is rarely equal. In a given situation a representation with a relatively low resting activation level may receive strong stimulation, based in large part on conceptual and affective influences, while its high-resting level competitor does not, in which case the former is likely to appear in production. The other factors within the linguistic modules are the processors’ in-built principles and the way representations have been combined in their stores.
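The competition described in this paragraph can be made concrete with a minimal sketch. It is an illustration only, not part of the MOGUL framework: the additive combination of resting level and stimulation, the numerical scale, and the names `Representation` and `winner` are all assumptions introduced here.

```python
from dataclasses import dataclass


@dataclass
class Representation:
    """A candidate representation in a store (hypothetical toy model)."""
    label: str
    resting: float            # resting activation level, arbitrary scale
    stimulation: float = 0.0  # conceptual/affective input in this situation

    @property
    def current(self) -> float:
        # Assumption: the resting level is the starting point, and
        # stimulation simply adds to it.
        return self.resting + self.stimulation


def winner(candidates):
    """Return the candidate with the highest current activation level."""
    return max(candidates, key=lambda r: r.current)


high = Representation("high_resting_item", resting=0.6)
low = Representation("low_resting_item", resting=0.2)

# All else being equal, the higher resting level wins.
print(winner([high, low]).label)  # high_resting_item

# Strong stimulation of the low-resting item reverses the outcome.
low.stimulation = 0.7
print(winner([high, low]).label)  # low_resting_item
```

The second call shows the limit on the influence of resting levels: once all else is no longer equal, the item with the lower resting level can still appear in production.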
Contextual CS representations are no less important in determining which linguistic representations are used in a given utterance. Every utterance is produced in a context and so is shaped by various contextual factors, including for instance the place, the time, the people who are present, and the content and character of any preceding dialogue. These factors are reflected in the system's current state, notably in the activation levels of its representations. Together they establish an internal context for the utterance that is to be produced, internal primarily meaning CS. This internal context is literally part of the CS representations that make up the message to be expressed in production. In other words, representations that express these contextual factors are embedded in the CS representations that constitute the message to be expressed and which initiate utterances. A message-to-be-expressed is necessarily written in CS in terms of currently active CS representations, including representations of internal context.
Thought and action in general are guided by goals, and language production is one kind of action. When a particular goal representation is active, it activates all the representations associated with it, including those in other modules that are coindexed with it and those which are in the same store and therefore contain it, are contained in it, or overlap with it. This additional activation includes elements of the message representation in CS and coindexed PS and SS representations. Active goals thus make relevant syntactic and phonological representations more likely to appear in production. The active goals can of course vary considerably, from simply expressing a linguistic message to the broad social goals of affiliation, power, and face and more specific goals such as sounding friendly, or grateful, or conciliatory, or threatening. Goals inevitably interact with contextual representations. Any unexpected response from the interlocutor could alter the immediate goal, stimulating goal representations that were previously inactive. This change could in turn activate a different set of linguistic representations. Any other sort of change in the situation could have a similar effect.
Value representations in the affective system influence production in two ways: (a) through direct connections to the CS representations that potentially participate in the utterance; and (b) through their connections with the goal representations that help determine which of these representations participate. The value assigned to linguistically-related CS representations is simply their connections to value representations. These connections indirectly constitute the value assigned to the PS and SS representations that potentially participate in the production of an utterance, as they are coindexed with the valued CS's. These valuations are important because highly valued representations are most likely to be used in production. The production of an utterance about abortion, for example, might involve competition between anti-abortion and pro-life or between pro-abortion and pro-choice. The winner is almost certain to be determined by the value that the terms have for the person, i.e., by the connections between value representations and the PS-SS-CS chains that constitute the words. The other way that value representations influence production is through their association with goal representations, which allows them to act as amplifiers, giving greater impact to active goals. Greater value associated with a goal representation means higher activation for that representation, implying greater influence on the activation of additional representations, namely those that contribute to achievement of the goal, including relevant linguistic items. If the goal of impressing the listener is of great importance (is highly valued), for example, this means the index connecting this goal representation to value has a very high activation level. Activation of the goal will therefore strongly activate value, the continuing activity of which will enhance and maintain the activation of the goal representation and therefore of the representations it is directly or indirectly connected to, including the relevant linguistic representations.
5. Implications for code-switching: the forest
MOGUL representation and processing automatically allow, in principle, for representations from either language to enter into production. There is only one PS store and one SS store, each containing representations of both languages. Thus, representations from one freely compete for inclusion in utterances dominated by the other. Understanding code-switching thus means explaining how this competition plays out, particularly the factors influencing which representations ultimately participate in an utterance.
Poplack (1981, p. 182) identified two major questions for accounts of code-switching: “why it occurs” and “where it occurs”. Researchers have typically focused either on the “where” or on the “why”. It is preferable, we suggest, to develop theories that bring the two together. The MOGUL framework offers such a unification by addressing both questions within a single account of representation and processing. Conceptual structure contains the meanings of all linguistic items, connected to their forms in the linguistic modules, and also the representations of goals and contexts that constitute the motives for switches. The influence of affect, particularly its most fundamental feature, value, is accommodated through connections between these representations and those in affective structures. The “where”, on the other hand, is about what is done with these influences in the linguistic modules, the possibilities in any given situation depending on the interaction of linguistic constraints inherent in the syntax processor and the current state of the SS store.
6. Implications for code-switching: the trees
For a more detailed consideration of these matters, a good starting point is the minimalist idea of a lexical array, used extensively by MacSwan (1999, 2000, 2013). MOGUL is not inherently a minimalist approach but minimalism, and in some respects MacSwan's use of it, offers a useful instantiation of part of the framework. The array is the set of items in long-term memory (LTM) that are to be used in the derivation of a sentence, which in the bilingual case includes items from both languages. The MOGUL approach offers a significantly revised view of the array, placing it within a processing account. This means, first, treating membership in the array as the availability of a representation to its processor, where a representation's availability is its current activation level. As activation level is a continuous variable, the implication is that membership in the array is not a yes-no matter but rather a matter of degree. Activation level constantly shifts, so the array is never entirely fixed; items can enter or drop out at any point. It is thus an abstraction, picking out the most highly active representations at a given moment. Another difference with standard minimalist ideas is that a lexical item is a chain of coindexed representations, as described above, and so it may be best to think of a series of interacting arrays, in CS, SS, and PS.
In the early stages of the production of the following sentence,
(3) I have an appointment with a client this evening.
the array consists of any and all linguistic representations that are currently active, including the CS-SS-PS chains for each of the words that ultimately appear in the sentence, along with representations that each level requires to make legitimate overall representations, notably the functional categories of SS and any CS and PS representations coindexed with them. But many additional representations will also be active. The array is likely to include, for instance, not only appointment but also date (i.e., the CS-SS-PS chains that are these words), along with various other items that are related – phonologically, syntactically, or semantically – to those that ultimately appear in the sentence. The APPOINTMENT CS wins the competition for inclusion because its current activation level is higher than DATE and any other competitors, based on factors we will consider shortly. Importantly, in the bilingual case the array includes items from both languages. All the L2 counterparts of the English items listed are present.
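The idea of the array as an abstraction over continuously shifting activation levels can be sketched as follows. This is a toy illustration under stated assumptions: the activation values are invented, the fixed cut-off `size` is a simplification (the text implies no sharp boundary), and `array_snapshot` is a hypothetical helper, not a MOGUL construct.

```python
def array_snapshot(activation, size):
    """Pick out the `size` most active representations at this moment.

    `activation` maps a label to its current activation level. The result
    is only a snapshot: it changes whenever activation levels change.
    """
    ranked = sorted(activation.items(), key=lambda kv: kv[1], reverse=True)
    return [label for label, _ in ranked[:size]]


# Invented activation levels for items from both languages.
activation = {"appointment": 0.9, "yuehui": 0.7, "date": 0.6, "meeting": 0.3}

first = array_snapshot(activation, 2)
print(first)   # ['appointment', 'yuehui']

# Activation shifts during processing, so membership shifts with it.
activation["date"] = 0.95
second = array_snapshot(activation, 2)
print(second)  # ['date', 'appointment']
```

Nothing in the sketch marks an item as belonging to one language or the other, which mirrors the point that items from both languages compete in a single store.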
As a processing approach, MOGUL offers an explanation of how these items become available. The immediate source is the message representation – the conceptual representation in CS constituting the idea that is to be expressed. The relation between this abstract meaning and the meanings of the linguistic items that could serve in the array plays a crucial role in determining which linguistic items are available and to what degree, i.e., how active they are. These individual meanings are CS representations. The issue then is to what extent the message representation includes each. In the above example, if appointment is a perfect expression of the relevant portion of the intended message, this means, by definition, that its CS representation is literally part of the initial message representation. In this case, it should be included in the sentence unless alternatives are strongly supported by other factors. On the other hand, if the CS representation of appointment differs in some specific features from the message representation, and another word, such as date, is a better fit, the latter is likely to take its place, subject again to additional factors.
If the word that wins this CS competition belongs to the bilingual's other language, this will be a case of code-switching. In the simplest case, the switch could occur simply because the meaning (CS representation) of the selected word is a direct or nearly direct expression of the intended message while its counterpart is not, as in the following example of a Mandarin speaker switching to English.
(4) Wo jin[tian] wanshang gen kehu you yige appointment. [Footnote 4]
I today evening with client have an appointment
The CS of Mandarin yuehui (“appointment”) was presumably also activated and so its coindexed PS-SS could have been used, producing a purely monolingual sentence. But the CS that is the intended message includes the concepts of formal and business-related and does not include the more informal variety of meeting. The CS of yuehui does include the latter (yuehui is also used to refer to dating, in particular), with the implication that it overlaps with but is not included in the message representation. The appointment CS, on the other hand, is the relevant portion of the message representation and so its PS/SS representations are strongly activated and become part of the utterance. Note that from a system-internal perspective there is no switch here, just a direct expression of the message.
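The contrast between a CS that is contained in the message representation and one that merely overlaps with it can be sketched with sets of features. This is a deliberately crude illustration: treating CS representations as flat feature sets is an assumption made here, and the particular features and the function `fit` are invented for the example.

```python
def fit(item_cs, message_cs):
    """Classify how an item's CS relates to the message representation."""
    if item_cs <= message_cs:
        return "contained"      # the item's CS is literally part of the message
    if item_cs & message_cs:
        return "overlapping"    # shares some features, but adds others
    return "unrelated"


# Hypothetical feature sets for example (4).
message = {"meeting", "scheduled", "formal", "business", "client", "evening"}
appointment = {"meeting", "scheduled", "formal", "business"}
yuehui = {"meeting", "scheduled", "informal", "romantic"}

print(fit(appointment, message))  # contained
print(fit(yuehui, message))       # overlapping
```

On this toy picture, the English item is favored not because it is English but because its CS is the relevant portion of the message, whereas the CS of yuehui brings along features that the message does not include.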
But this relatively neat case leaves many questions open. First, it assumes that use of the other language representation does not introduce any problems at SS. Detailed analysis of the ways that such problems can occur requires the adoption of a specific theory of syntax and of its use in production, going beyond the MOGUL framework itself. For the sake of illustration we will assume the approach used in Sharwood Smith and Truscott (2014). In SS (the syntactic portion of LTM), functional category frames are the heart of representations constructed in comprehension and production. Selection of the particular items that constitute the frame is based on activation levels of coindexed representations at CS as well as their own resting levels. Linguistic representations activated by the presence of the message representation in CS are the potential fillers. Whenever a head, functional or lexical, becomes part of the frame, it activates, through standard spreading activation, the subcategorization frames in which it is included, thus bringing the frames along with it. These frames are then filled in ways that satisfy the subcategorization requirements of the heads. Representations that are not compatible with the active frame cannot legitimately be included in the overall SS representation under construction. [Footnote 5] Construction of the SS representation is thus an interaction between syntactic requirements at SS and the activation coming from CS.
In example (4), use of English appointment in a Mandarin sentence encountered no problems at SS. Cases in which problems do arise often involve nouns and their complements, as in the following example.
(5) *neige guowang of England
the king of England
The SS of guowang (king) is contained in larger SS representations, constituting the frames in which it participates, which are activated whenever the head is active. These larger representations do not include postnominal complements, so of England is not a competitor for inclusion in the phrase, even if its CS component is initially highly active. The only possibility for a switch here is in the prenominal position.
(6) neige England de guowang
the England king
In this case England would compete with Mandarin Yingguo (England), the winner determined by all the factors that influence their relative activation levels.
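The interaction between frame requirements at SS and activation coming from CS in examples (5) and (6) can be sketched as a two-step selection: candidates incompatible with the head's frames are excluded, and activation decides among the rest. The sketch is an assumption-laden toy: the frame inventory, the activation values, and the names `ALLOWED_POSITIONS` and `select_modifier` are invented for illustration and do not represent any specific syntactic theory.

```python
# Hypothetical frame inventory: positions licensed by each head's frames.
ALLOWED_POSITIONS = {"guowang": {"prenominal"}}


def select_modifier(head, candidates):
    """Choose the most active candidate that the head's frames allow."""
    allowed = ALLOWED_POSITIONS[head]
    legal = [c for c in candidates if c["position"] in allowed]
    if not legal:
        return None
    return max(legal, key=lambda c: c["activation"])["form"]


candidates = [
    # Highly active at CS, but not a competitor: no frame includes it.
    {"form": "of England", "position": "postnominal", "activation": 0.9},
    {"form": "England de", "position": "prenominal", "activation": 0.7},
    {"form": "Yingguo de", "position": "prenominal", "activation": 0.6},
]

print(select_modifier("guowang", candidates))  # England de
```

With these invented values the switch in (6) wins, but lowering the activation of the English item below that of Yingguo would yield the monolingual phrase, in line with the point that the winner depends on relative activation levels.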
These cases are about head-complement relations, the importance of which is widely recognized (e.g., Belazi et al., 1994; Di Sciullo, Muysken & Singh, 1986; Mahootian & Santorini, 1996; Myers-Scotton, 1993, 2006; Toribio, 2001). In developing a framework rather than a theory, we are not concerned with the linguistic principles underlying the word order. Possibilities range from a simple directionality parameter (which can itself be realized in a number of ways) to a more sophisticated account involving syntactic movement and constraints on it, many of which can be incorporated in the framework.
The discussion to this point abstracts away from the factors that established the message representation in the first place, namely goals, affect, and context. As described above, goals are representations in conceptual structures, strongly associated with the value representation in affective structures, while context consists primarily of all the representations in conceptual structures that are currently active – available to the conceptual processor – and therefore influence current processing. This is internal context. The external context is relevant exactly to the extent to which, and in the manner in which, it is represented in the internal context.
One possible goal of switching involves pride, particularly demonstration of knowledge of the second language. This appears to be a frequent motive for Mandarin-English switching in Taiwan, for example, where English ability carries prestige. Consider a case in which a government official, addressing a group of lower-level bureaucrats, includes the following utterance:
(7) Ni hui you yijong sense of achievement.
You will have a sense of achievement.
This example contrasts with the appointment case (4) in that the speaker's use of the English phrase in place of its Mandarin counterpart does not seem to make any contribution to expression of the message, which could have been expressed monolingually with no apparent loss of content, but instead asserts the speaker's knowledge of English. Here the internal context includes representations of the audience and the specific identity and character of its members. Activity of these representations spreads to the goal representation (displaying English knowledge), facilitated by strong connections between the latter and a value representation, in the form of a shared index with a high resting activation level.
This example refers, crucially, to “English”, so a word is required on what exactly this means within our analysis. One thing it can mean is that the CS representations of the individual words are coindexed with the value representation, as a result of past use of those words in conjunction with an active representation. The other, not inconsistent, possibility is that the CS representations of the words include an ENGLISH representation, which is strongly coindexed with value. This situation can arise from (a) the learner's positive valuation of English in general (probably explicit), and (b) a metalinguistic recognition (probably explicit) that each of the words is an English word.
Use of English in this example expresses the pride goal because of connections between English linguistic representations and positive value. The complementary case is when a language is negatively valued within the given context (compare the biscuit example above). A likely example comes from Taiwan's not too distant past, in which use of the Taiwanese language [Footnote 6] was strongly discouraged in formal settings and children were punished for speaking Taiwanese in school. Speakers with this background are likely to have negative value associated with the language's representations, particularly in formal contexts, discouraging any switches to it even in situations where it might contribute to expression of the message or serve other goals. Thus a switch to Taiwanese in a case like (7) is unlikely.
Many different goals are of course involved in linguistic production and can play a role in code-switching. Bhatt and Bolonyai's (2011) review included, as motives behind code-switching, factors of social affiliation, solidarity with interlocutors, power, and face. To these can be added more specific goals such as sounding friendly, grateful, conciliatory, or threatening. In each case the logic is the same as for (7): an active goal representation in CS influences the activation levels of linguistic representations in SS and PS, determining in part the occurrence or non-occurrence of a switch.
In that example we briefly noted the importance of context. Because the message representation in CS includes active contextual representations, the beginnings of an utterance necessarily have current contextual factors built in. This context helps to determine the current activation levels of CS representations, which in turn strongly influence the current levels of SS and PS representations. SS-PS representations that are coindexed with CS representations activated by the current context will for that reason have elevated activation levels as well. So they are likely to dominate the competition for inclusion in the representations being constructed at SS and PS, resulting in contextually appropriate production, most of the time.
An important part of this internal context is representations of the interlocutor, which are inevitably active during a conversation and typically include information or beliefs about the person's linguistic status, in the form of active conceptual representations. The latter inevitably influence activation of the speaker's linguistic representations. Returning to the appointment example, if they involve a monolingual assumption, they will have positive influences on the activation level of yuehui, discouraging the switch to appointment.
Another case of contextual influence is Bhatt and Bolonyai's (2011) example of a Hungarian-American speaking entirely in Hungarian but inserting the English term homeland security. The core meaning could presumably have been expressed without this switch, but no Hungarian word is likely to include the contextual representations that the English term has acquired in the American post-9/11 context. So the presence of these active contextual representations in the speaker's CS, as components of the intended meaning, greatly favors use of the English term, in the same way that English appointment was favored over Mandarin yuehui in (4).
7. Conclusion
We have presented a framework for understanding and studying code-switching, treating it entirely as a consequence of representation and processing as they are understood within this relatively explicit cognitive framework. This approach unifies the research on formal constraints (the “where” issue) with that on the social, communicative motives for switching (the “why” question). Research carried out within the framework in one of these areas will thus be connected to work in the other, as well as to related areas to which the framework applies.
Three points should be emphasized. First, we are presenting a framework for understanding switching rather than a specific theory. Theories developed within this framework will require more specific accounts of goal representations and contextual representations and their connections to value representations, as well as adoption of specific linguistic theories specifying the nature of phonological and syntactic processors and representations. Second, while we have focused on switching between languages, the same analysis applies to switches between dialects or between registers. Finally, this approach to code-switching was not created for the purpose of explaining code-switching: it is a spelling out of the implications of a general cognitive framework.