1. Introduction
One of the fundamental axioms of modern cognitive-functional linguistics is that “[word] meaning is highly context-sensitive, and thus mutable” (Evans, 2005, p. 71). When interpreting a particular utterance, language users must rely not only on the meaning encoded in linguistic forms, but also on what they infer from contextual information. Such notions were explicitly acknowledged in the early work of Grice (1957), with a distinction being made between signal meaning [Footnote 1] and contextual meaning (Evans & Green, 2006). Signal meaning refers to the senses stored in semantic memory, forming part of the user’s linguistic knowledge. Contextual meaning is constructed on-line and constitutes an extension of the original signal meaning through an individual’s inferential capacities (cf. Evans & Green, 2006; Hoefler, 2009, p. 6). Put simply: “[…] some meaning is encoded in linguistic forms and some is inferred” (Wedgwood, 2007, p. 652).
In this sense, context broadly refers to the set of premises used in interpreting an utterance, besides the information already specified in the signal meaning, and constitutes a psychological construct that comprises a subset of an individual’s assumptions about the world (Sperber & Wilson, 1986, pp. 15–16). [Footnote 2] Consider the word mole. Besides referring to a small burrowing animal, mole can also denote a form of espionage, a type of birthmark, and a unit in chemistry. Each of these senses is said to be stored in semantic memory, with their use and interpretation being governed by the very specific contexts in which they occur. Viewed in isolation, words such as mole might be construed as communicatively dysfunctional. Yet, in context, it is typically easy to distinguish one sense from another. Having specific knowledge of the context thus enables a hearer to change their expectations regarding the intended meaning of a given word. In other words, when the context is known and informative, it necessarily decreases uncertainty (Piantadosi, Tily, & Gibson, 2012).
As context is used as a resource to reduce uncertainty, it might alter our conception of how an optimal communication system should be structured (Zipf, 1949; Piantadosi, Tily, & Gibson, 2012). Levinson (2000, p. 29), for instance, argues that our cognitive abilities favour communication systems which are skewed in their design towards hearer inference over speaker effort. Meanwhile, Pinker and Bloom (1990) note that language exhibits design for communication because it allows for “minimising ambiguity in context” (p. 713, emphasis added). Evidence for the role of context is also apparent in the way we structure our utterances, with syntax being sensitive to the wider discourse and the immediate communicative needs of interlocutors (Chafe, 1976; Du Bois, 1987; Fery & Krifka, 2008). Furthermore, these immediate communicative needs can give rise to longer-term patterns: here, the way in which speakers pragmatically design utterances (invited inferences; Traugott & Konig, 1991), as well as how hearers interpret utterances (context-induced interpretation; Heine, Claudi, & Hünnemeyer, 1991), is posited to play a fundamental role in historical processes, such as grammaticalization (cf. Traugott & Trousdale, 2013).
There are a number of different kinds of context we could talk about in relation to a particular usage event (Evans & Green, 2006; Bach, 2012). Our present study is specifically focused on the situational context: the immediate communicative environment in which an utterance is situated (Evans & Green, 2006, p. 221) and how it influences the distinctions a speaker needs to convey. In an experimental setting, situational context can be manipulated by tailoring both the types of stimuli and the way in which they are organized. For example, in a study examining how adjectives were used in referring expressions, Sedivy (2005) discovered that speakers were more likely to use an adjective when one object shared a feature dimension with another object (e.g., a blue cup and green cup), but not when the object belonged to a different category (e.g., a cup and a teddy bear). Similarly, Ferreira, Slevc, and Rogers (2005) found that when speakers were faced with conceptual ambiguities, such as having to discriminate between two types of bat (the flying mammal), they would disambiguate on a relevant dimension (e.g., using the small bat in their utterance rather than just the bat when a large bat was also present in the context), whereas when speakers were presented with linguistic ambiguities (e.g., a baseball bat and an animal bat) they were less likely to engage in ambiguity avoidance.
If the situational context plays a fundamental role in how language is structured, then the general observation that some meaning is encoded and some is inferred leaves open the questions: (i) To what extent does the situational context influence the encoding of features in the linguistic system? (ii) How does the effect of the situational context work its way into the structure of language? To help answer these questions we investigate how situational context influences the emergence of linguistic systems. Using an artificial language paradigm, we experimentally simulate cultural transmission in a pair-based communication game set-up (cf. Scott-Phillips & Kirby, 2010; Galantucci, Garrod, & Roberts, 2012). Participants learn an artificial language which provides labels for a set of pictures, ‘meanings’ to be communicated. These stimuli vary on the dimension of shape, with each referent also having a unique, idiosyncratic element. After learning the language, participants play a series of communication games with their partner, taking turns to describe pictures for each other. We modified the situational context in which communication took place by manipulating whether the feature dimension of shape was relevant or not for a discrimination task: for example, some participants would encounter only situational contexts in which the objects to be discriminated during communication differed in shape, whereas others would be confronted with contexts in which the objects to be discriminated during communication were of the same shape.
Finally, these pairs of participants were arranged into transmission chains (Kirby, Cornish, & Smith, 2008; Scott-Phillips & Kirby, 2010; Theisen-White, Kirby, & Oberlander, 2011), such that the language produced during communication by the nth pair in a chain became the language that the (n+1)th pair attempted to learn. This method allows us to investigate how the artificial languages change and evolve as they are adapted to meet the participants’ communicative needs and/or as they are passed from individual to individual via learning. We predict that languages in different types of situational context will adapt to become optimally structured as follows:
• When the feature dimension of shape always differs between pairs of referents which are to be discriminated, we predict that the languages will evolve to only encode shape in the linguistic signal, and become underspecified on all other dimensions.
• When the feature dimension of shape is always shared between pairs of referents which are to be discriminated, we predict that a holistic system will emerge, in which each referent is associated with an idiosyncratic label that encodes that referent’s idiosyncratic feature.
• When the feature dimension of shape sometimes differs and is sometimes shared within pairs of referents, we predict that the languages will become systematically structured to encode both the shape (via a category marker) and idiosyncratic features (via an individuating element of the signal).
1.1. Iterated learning and communication games: a method for investigating the emergence and evolution of language
Language is not only a conveyer of cultural information, but is itself a socially learned and culturally transmitted system, with an individual’s linguistic knowledge being the result of observing and reconstructing the linguistic behaviour of others (Kirby & Hurford, 2002). This process can be explored experimentally using iterated learning: a cycle of continued production and induction where individual learners are exposed to a set of data, which they must then reproduce and pass on to the next generation of learners (Kirby et al., 2008).
Using this method, researchers have demonstrated that cultural transmission can account for the emergence of some design features in language, including arbitrariness (Theisen-White et al., 2011; Caldwell & Smith, 2012), regularity (Reali & Griffiths, 2009; Smith & Wonnacott, 2010), duality of patterning (Verhoef, 2012), and systematic compositional structure (Kirby et al., 2008; Theisen-White et al., 2011). Typically, a participant is trained on a target system (e.g., an artificial language) and then tested on their ability to reproduce what they have learned, with the test output being used as the training input for the next participant in a chain.
These studies show that cultural transmission can account for the emergence of structure in communication systems. In particular, communication systems adapt to constraints inherent in the learning process: domain-general limitations in our memory and processing capabilities (Christiansen & Chater, 2008) introduce a learnability pressure (Brighton, Kirby, & Smith, 2005), meaning that languages that are difficult to learn tend not to be accurately reproduced, and therefore change. Recent work in this paradigm shows that the incorporation of situational context can change the extent to which the evolving language encodes certain features of referents. Silvey, Kirby, and Smith (2014) show, using a transmission chain paradigm, that word meanings evolve to selectively preserve distinctions which are salient during word learning. Using a pseudo-communicative task, where participants needed to discriminate between a target meaning and a distractor meaning, the authors were able to manipulate which meaning dimensions (shape, colour, and motion) were relevant and irrelevant in conveying the intended meaning. If a meaning dimension was backgrounded, in that it was not relevant in distinguishing between the target and distractor, then the languages evolved not to encode this particular meaning dimension. Instead, the languages converged on underspecified systems based on the relevant feature dimensions for discriminating between meanings.
However, language is not merely a task of passively remembering and reproducing a set of form–meaning pairings. Language is also a process of joint action (Bratman, 1992; Clark, 1996; Croft, 2000): that is, language is fundamentally a social and interactional phenomenon, in which usage, communication, and coordination exert salient pressures on the system (see also Tomasello, 2008; Bybee, 2010). Experimental communication games have been used to investigate the emergence of combinatorial (Galantucci, Kroos, & Rhodes, 2010) and compositional (Selten & Warglien, 2007) structure, the emergence of arbitrary symbols from iconic signs (Garrod, Fay, Lee, Oberlander, & MacLeod, 2007), and how common ground influences the extent to which a communication system can become established in the first place (Scott-Phillips, Kirby, & Ritchie, 2009).
Converging evidence from iterated learning and communication games points to both learning and communication as powerful forces in shaping the structure of language (Fay & Ellison, 2013; Smith, Tamariz, & Kirby, 2013). With this in mind, the basic premise of the current experiment is to expand upon this work by: (a) adding a communicative element to the experimental setup of Silvey et al. (2014), and (b) manipulating the types of situational context.
1.2. The problem of linkage: language strategies and the emergence of language systems
Explaining how context works its way into the structure of language requires that we consider the problem of linkage (Kirby, 1999, 2012). Rather than there being a straightforward link between our individual cognitive machinery and the features we observe in language, we are instead faced with an additional dynamical system: socio-cultural transmission. Treating language as a complex adaptive system (Beckner et al., 2009; Cornish, Tamariz, & Kirby, 2009) solves this problem of linkage because we can consider how short-term language strategies (Evans & Green, 2006, p. 110) used in solving immediate communicative needs can give rise to language systems through long-term patterns of learning and use (Bleys & Steels, 2009; Steels, 2012; Beuls & Steels, 2013).
The language strategy a speaker selects to enable a listener to identify their intended meaning is dependent not only on the referential information available, but also on the context in which the utterance is situated. Take the relatively simple communicative situation in Figure 1: here, there are several language strategies that a language user could employ to convey the intended meaning. In context 1A, the intended meaning can easily be conveyed by using the label dog as opposed to cat. If, however, the situational context pairs the intended referent with another dog (as in context 1B), then it makes little sense to use the referential label of dog, as the listener is very unlikely to be able to distinguish between the two referents on the basis of that label. Instead, other strategies must be employed, such as providing a unique identifier that is more specialized (dalmatian) or creating a compound signal (Ay, Flack, & Krakauer, 2007) that has both specialized and generalized components (spotted dog).
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20160711030649-24559-mediumThumb-S1866980814000350_fig1g.jpg?pub-status=live)
Fig. 1. Language strategies and example contexts. The boxes (coloured green in the experiment) correspond to the intended referent. As we can see, dog is a viable strategy for conveying the intended meaning in context A, but we need to either use a more specific label (dalmatian) or provide additional referential information alongside the generalized form (spotted dog) to convey the intended meaning in context B.
The current experiment explores how these short-term strategies of achieving communicative success in a situational context influence the emergence of different types of language system. In particular, we focus on the evolution of three types of language system: underspecified, holistic, and systematic. Underspecification captures the observation that languages abstract across referents by encoding some feature dimensions and ignoring others (Silvey et al., 2014). Using the examples above, the word dog is underspecified with respect to whether or not its referent is spotted or brown. Conversely, the labels dalmatian, poodle, siamese, and tabby are holistic, in that they embody an arbitrary set of one-to-one mappings between signals and their meanings [Footnote 3]: holistic signals serve the purpose of individuation (Lyons, 1977). Finally, in a systematic mapping between forms and meanings, the signals share elements of form (unlike in a holistic mapping, where each signal is unrelated to the other signals) but are nonetheless one-to-one: systematic languages consist of compound signals (e.g., spotted dog), whereby part of the structure refers to a general-level category (e.g., dog) and part of the structure refers to an individuating component (e.g., spotted).
To test for the effect of situational context, we use a guessing game set-up (cf. Steels, 2003; Silvey et al., 2014): the task is to discriminate between pairings of a target object and a distractor object. In our case, possible referents are drawn from a set of images which vary in shape (see Figure 2). Manipulating these pairings gives us three experimental conditions based on: (a) whether the feature dimension of shape is relevant or not in discriminating between two referents, and (b) the extent to which stimuli pairings remain consistent over time with respect to the relevance of the feature dimension of shape.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20160711030649-79747-mediumThumb-S1866980814000350_fig2g.jpg?pub-status=live)
Fig. 2. All eight meanings for the image stimulus set used in this experiment. Note that each individual image comprises two components: a basic-level shape (star or blob) and a subordinate-level element (a unique idiosyncratic feature).
In the Shape-Different condition, pairings of target and distractor are constructed such that the feature dimension of shape is always relevant with respect to discrimination, i.e., target and distractor differ in shape. Since the two objects in such situational contexts have different shapes, they can be discriminated merely by referring to shape. We therefore predict that the languages in the Shape-Different condition will evolve to become underspecified, specifying shape but not differentiating between the objects within a given shape category: such an underspecified system is functionally adequate for achieving communicative success in this situational context, and is highly learnable (Kirby et al., 2008). Conversely, in the Shape-Same condition, target and distractor are always of the same shape, differing only on their idiosyncratic features. Consequently, the feature dimension of shape is always irrelevant in discrimination, and therefore does not need to be specified linguistically, with abstracting across referents of the same shape being communicatively dysfunctional for these situational contexts. We therefore predict that holistic systems will emerge in the Shape-Same condition, where each individual referent is associated with a unique label that maps onto its idiosyncratic feature. Lastly, for the Mixed condition we manipulated the predictability of situational contexts across trials: on some trials target and distractor share the same shape and on others they differ in shape. When encountering this mix of situational contexts, we hypothesize that languages will become systematically structured, encoding in the linguistic signal both the basic-level category of shape and the individuating information of the idiosyncratic feature.
Furthermore, we expect that the labels for the basic-level feature will become conventionalized earlier than those specifying the individuating information, with participants attempting to meet their immediate communicative needs on a piecemeal basis, through minimizing effort and maximizing communicative success. The quickest way to achieve this would be for participants to first align on conventional forms for the two shapes (as this minimizes effort and will ensure communicative success in contexts where shape is relevant in discrimination), followed by conventional forms for the eight idiosyncratic features (as these are needed to make distinctions in contexts where shape is irrelevant in discrimination).
1.3. Ecologically sensitive, learning bias, and historically contingent accounts
Our prediction that manipulations to the situational context will bias the probability of one linguistic system emerging over another is consistent with a broader class of predictions that we will term ecologically sensitive accounts. Under this perspective, languages adapt to the structure of their niche in an analogous manner to that of biological organisms: just as environmental niches constrain and guide the evolution of species, so too are socio-cultural niches salient constraints on the types of language that emerge (Lupyan & Dale, 2010). The ecologically sensitive account is consistent with a range of observations including: that social structure patterns with differences in language structure (Wray & Grace, 2007; Lupyan & Dale, 2010); that word frequency is a product of the range of individuals and topics (Altmann, Pierrehumbert, & Motter, 2011); that interactional constraints and conversational infrastructure lead to cultural convergence of linguistic form (Dingemanse, Torreira, & Enfield, 2013); that objects and events in the world guide word learning discrimination (Ramscar, Yarlett, Dye, Denny, & Thorpe, 2010); that word length patterns with the complexity of the meaning space (Lewis, Sugarman, & Frank, 2014); and that the structure of languages is shaped by the structure of meanings to be communicated (Perfors & Navarro, 2014).
These ecologically sensitive accounts can be contrasted with two other theoretical perspectives that make different predictions about the relationship between the situational context and the emergence of linguistic systems. The first of these is the learning bias approach. This makes the prediction that language structure is closely coupled to the prior expectations and biases of language learners (e.g. Griffiths & Kalish, 2007; Reali & Griffiths, 2009; Fedzechkina, Jaeger, & Newport, 2012; Culbertson, Smolensky, & Wilson, 2013; Culbertson & Adger, 2014). The learning bias approach can be further contrasted with what we term the historical contingency account, which holds that the types of system that emerge are primarily constrained by random historical events, subtly biasing the language in one direction or another. When compared with the ecologically sensitive and the learning bias accounts, a historical contingency prediction is that language structure is the result of lineage-specific outcomes (Lass, 1997), with “the current state of a linguistic system shaping and constraining future states” (Dunn, Greenhill, Levinson, & Gray, 2011, p. 79).
In their extreme incarnations, the learning bias and historical contingency accounts both predict that manipulating the situational context will have little effect on the types of system that emerge in our experiment. For a learning bias account we would predict considerable convergence across all experimental conditions: there will be a globally optimal solution in terms of a prior constraint (or set of constraints), with the languages then converging towards this prior. By contrast, the historical contingency account would predict a much higher degree of variation in the types of system that eventually emerge, with the states of these systems being better predicted by individual variation and lineages than by either contextual or prior cognitive constraints.
2. Method
2.1. Participants
Seventy-two undergraduate and graduate students at the University of Edinburgh (42 female, median age 22) were recruited via the SAGE careers database and randomly assigned to twelve diffusion chains. Each chain consisted of a pair of initial participants who learned a random language, and two pairs of successive participants who learned the previous pair of participants’ output language, making three generations in total. These chains were further subdivided into three experimental conditions (see §2.4).
2.2. Stimuli: images and target language
Participants were asked to learn and then produce an alien language, consisting of lower-case labels paired with images. The images were drawn from a set of eight possible pictures, which varied on the dimension of shape (4 blobs and 4 stars), with each individual image also having one unique, idiosyncratic subordinate element (see Figure 2).
The training language for the first participant pair in each chain was created as follows. From a set of vowels (a,e,i,o,u) and consonants (g,h,k,l,m,n,p,w) we randomly generated nine CV syllables which we then used to randomly generate a set of twenty-four 2–4 syllable words. These parameters ensured that there were three unique labels for every picture. Each chain was initialized with a different random language. The training language for later pairs of participants consisted of the language produced by the previous participant pair while communicating (see below).
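The generation procedure described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' script: the function name `make_initial_language` is hypothetical, and two details that the text does not specify are assumed here, namely that the nine syllables are distinct and that the twenty-four words are dealt out to the eight pictures at random.

```python
import random

VOWELS = "aeiou"
CONSONANTS = "ghklmnpw"


def make_initial_language(rng, n_syllables=9, n_words=24, labels_per_meaning=3):
    """Return a dict mapping a meaning index (0-7) to its list of labels."""
    # All 8 x 5 = 40 possible CV syllables; draw nine of them at random.
    # Assumption: the nine syllables are distinct.
    all_syllables = [c + v for c in CONSONANTS for v in VOWELS]
    syllables = rng.sample(all_syllables, n_syllables)

    # Build words of 2-4 syllables until there are n_words distinct ones.
    words = set()
    while len(words) < n_words:
        length = rng.randint(2, 4)
        words.add("".join(rng.choice(syllables) for _ in range(length)))

    # Assumption: labels are assigned to pictures at random, three per picture.
    words = sorted(words)  # fixed order first, so a seed reproduces the result
    rng.shuffle(words)
    n_meanings = n_words // labels_per_meaning
    return {
        m: words[m * labels_per_meaning:(m + 1) * labels_per_meaning]
        for m in range(n_meanings)
    }


# One chain's starting language; a different seed gives a different language.
language = make_initial_language(random.Random(0))
```

Seeding each chain separately mirrors the statement that each chain was initialized with a different random language.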
2.3. Procedure: training phase and communication phase
At the start of the experiment, participants were told they would first have to learn and then communicate using an alien language. Participants completed the experiment in separate booths on networked computers. The experiment consisted of two main phases: a training phase and a communication phase. Before each phase began, participants were given detailed information on what that phase would involve and were explicitly told not to use English or any other language they knew during the experiment. [Footnote 4] For the training phase, participants were trained separately, and it was only during the communication phase that they interacted (remotely, over the computer network).
2.3.1. Training phase
In each training trial, the participant was presented with a label and two images, one of which was the target and one a distractor. The participant was told that the alien wanted them to pick which of the two images corresponded to the label. Once the participant had selected an image (by clicking on it using the mouse) they were told whether their choice was correct or incorrect, shown the label and target image for 2 seconds, and then instructed to retype the label before proceeding to the next trial. Both targets and distractors were presented in a random order within the following constraints: (i) the pairing of target and distractor varied based on the experimental condition (see §2.4 below for more details on the conditions); (ii) within each training block, each of the eight meanings appeared three times as a target. The training phase of the experiment consisted of four such blocks, each of twenty-four trials; each block contained the same twenty-four training trials, with the order of these trials being randomly shuffled.
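The block structure of the training phase (the same twenty-four trials, reshuffled for each of four blocks) can be illustrated with a minimal sketch. The function name `training_schedule` and the representation of a trial as a `(label, target, distractor)` tuple are assumptions made for illustration.

```python
import random


def training_schedule(trials, rng, n_blocks=4):
    """Return n_blocks independently shuffled copies of the same trial list."""
    schedule = []
    for _ in range(n_blocks):
        block = list(trials)  # copy, so the caller's list is left untouched
        rng.shuffle(block)
        schedule.append(block)
    return schedule
```

With twenty-four trials per block, this gives 4 × 24 = 96 training trials per participant.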
2.3.2. Communication phase
During the communication phase of the experiment, participants took alternating turns as director and matcher:
• DIRECTOR: As directors, participants were presented with two images: a target and a distractor. Targets were highlighted with a green border. The director was prompted to type a label that would best communicate the target to the matcher. The label was then sent to the matcher’s computer.
• MATCHER: Participants were presented with the same two images as the director, with the label provided by the director appearing underneath. The matcher was then prompted to click on the image they thought corresponded to the label provided.
Following each trial, participants were given feedback as to whether or not the matcher had correctly identified the picture described by the director, followed by a display showing the image the director was referring to and the image the matcher selected. Target and distractor pairings were randomly generated within the constraints imposed by the experimental conditions (see §2.4 below), and communication trials were presented in random order. The communication phase consisted of two blocks; the length of each block varied depending on the experimental condition (see below).
2.4. Manipulating context: mixed, shape-same, and shape-different conditions
To test the role of context, a simple manipulation was made to the possible combinations of target and distractor images within a single trial during training and communication. This provides three experimental conditions. For the Shape-Same condition, participants only ever saw pairings of images that shared the same shape, but differed in their idiosyncratic element (see Figure 3A). In the Shape-Different condition, participants were exposed to pairings of images that differed in both their shape and idiosyncratic features (see Figure 3B). Participants in the Mixed condition encountered a mixture of image pairings: some image pairings shared the same shape but differed on their idiosyncratic features, whereas other image pairings differed on both their shape and idiosyncratic features (see Figure 3C).
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20160711030649-32015-mediumThumb-S1866980814000350_fig3g.jpg?pub-status=live)
Fig. 3. 3A is an example of a pairing in the Shape-Same condition: here, participants only ever observe pairings that share the same shape. 3B is an example of a pairing found in the Shape-Different condition; the two stimuli always differ in shape. 3C shows an example of the pairings used in the Mixed condition: here, we get a mixture of stimuli that in some contexts differ in shape and in other contexts share the same shape.
In the Mixed condition, one communication block contained fifty-six trials, with twenty-four trials consisting of pairs of images that shared the same basic-level category but differed on subordinate-level features (24 trials exhausting all such possible pairings), whereas the remaining thirty-two trials differed on both their basic-level category and subordinate-level features (again, 32 trials covering all such possible pairings). To ensure that Shape-Different and Shape-Same conditions were comparable to the Mixed condition in the number of trials, we doubled up the possible combinations of images in the other two conditions, i.e., the Shape-Different condition involved sixty-four trials (32 × 2) per communication block and the Shape-Same condition involved forty-eight trials (24 × 2) per communication block; participants underwent two such blocks of communication.
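The trial counts above follow from treating each context as an ordered (target, distractor) pair: every one of the eight meanings can be the target against three same-shape distractors (8 × 3 = 24) or four different-shape distractors (8 × 4 = 32). The sketch below enumerates these pairings. The `(shape, feature)` encoding of a meaning and the function name `contexts` are hypothetical choices for illustration.

```python
from itertools import permutations

# Hypothetical encoding: a meaning is (shape, index of its idiosyncratic feature).
MEANINGS = [(shape, i) for shape in ("blob", "star") for i in range(4)]


def contexts(condition):
    """Ordered (target, distractor) pairs making up one communication block."""
    pairs = list(permutations(MEANINGS, 2))            # 8 * 7 = 56 ordered pairs
    same = [(t, d) for t, d in pairs if t[0] == d[0]]  # 8 * 3 = 24
    diff = [(t, d) for t, d in pairs if t[0] != d[0]]  # 8 * 4 = 32
    if condition == "mixed":
        return same + diff   # 56 trials, every pairing once
    if condition == "shape-same":
        return same * 2      # 48 trials, every pairing twice
    if condition == "shape-different":
        return diff * 2      # 64 trials, every pairing twice
    raise ValueError(f"unknown condition: {condition}")
```

In the experiment the trials within a block were presented in random order, so a block built this way would still need to be shuffled before use.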
2.5. Iteration
The labels produced by a pair of participants in the second block of the communication phase, and their associated target and distractor images, were used to construct the training language for the next pair of participants: we simply randomly sampled from the communicative output of generation n to produce the training language for generation n+1 (see Figure 4).
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20160711030649-30718-mediumThumb-S1866980814000350_fig4g.jpg?pub-status=live)
Fig. 4. An example of the random selection process employed for a single meaning in the Shape-Same condition. Here, one target meaning is associated with six (possibly unique) signals during communicative testing. However, only three trials are required to construct a training block for the next generation: in order to generate this training block, we sample randomly from the appropriate contexts.
The random sampling process was constrained in the following ways. First, for all three conditions, we had a bottleneck on the number of signals that could be passed on to the next generation, i.e., for a single meaning we could only pass on three labels. As such, the number of signals transmitted from the final communication block of a generation stayed consistent between conditions, but the size of the sampling space differed slightly: Mixed (24/56 signals sampled), Shape-Different (24/64 signals sampled), Shape-Same (24/48 signals sampled). Second, in the Mixed condition, the random selection process was additionally constrained so that a given stimulus would appear in at least one shape-same context and one shape-different context, and that there were an equal number (12) of shape-same and shape-different contexts in total. This meant that, in the Mixed condition, individual stimuli might appear in different ratios of shape-same and shape-different contexts. By contrast, the Shape-Same condition contained all possible pairings of target and distractor in training, and the Shape-Different condition had a subset of all possible contexts (24 out of 32 possible stimuli pairs).
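The bottleneck can be illustrated with a minimal sketch. The function name, the dictionary layout, and the toy labels are hypothetical, and the sketch covers only the first constraint (three labels per meaning); it does not track contexts or the additional balancing applied in the Mixed condition.

```python
import random

def sample_training_language(output, labels_per_meaning=3, seed=None):
    """Sample a fixed number of labels per meaning from one generation's output.

    `output` maps each meaning to the list of labels produced for it in the
    final communication block (one label per trial in which it was the target).
    """
    rng = random.Random(seed)
    # Sampling is without replacement over trials, so a label produced on
    # several trials is proportionally more likely to be passed on.
    return {
        meaning: rng.sample(labels, labels_per_meaning)
        for meaning, labels in output.items()
    }

# Toy data (not experimental output): six trials per target, as in Shape-Same.
generation_n = {
    "blob_1": ["kapa", "kapa", "kapapa", "kapa", "gugu", "kapa"],
    "star_1": ["zara", "zara", "zara", "nakaku", "zara", "zara"],
}
training_n_plus_1 = sample_training_language(generation_n, seed=0)
print(training_n_plus_1)
```

Because sampling operates over trials rather than over unique labels, frequent variants tend to survive the bottleneck while rare ones are more easily lost.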
2.6. Dependent variables and hypotheses
2.6.1. Measuring communicative success
To measure communicative success we simply recorded the number of successful interactions, where the matcher clicked on the target image. Given the differing trial numbers, the maximum success score differs across conditions: Shape-Different (128 points for two blocks of 64 interactions), Mixed (112 points for two blocks of 56 interactions), and Shape-Same (96 points for two blocks of 48 interactions). These maximum scores are converted into proportions to allow visual comparison between the three conditions, but the statistical analyses are conducted on the binary dependent variable.
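The conversion from raw scores to proportions is simple arithmetic, sketched below. The function and constant names are hypothetical; the trial counts are those given above.

```python
BLOCKS = 2
TRIALS_PER_BLOCK = {"Shape-Different": 64, "Mixed": 56, "Shape-Same": 48}

def success_proportion(condition, n_successes):
    """Convert a raw success count into a proportion of the maximum score."""
    max_score = BLOCKS * TRIALS_PER_BLOCK[condition]
    if not 0 <= n_successes <= max_score:
        raise ValueError(f"score must lie between 0 and {max_score}")
    return n_successes / max_score

# Maximum scores per condition: 128, 112, and 96 points respectively.
max_scores = {c: BLOCKS * n for c, n in TRIALS_PER_BLOCK.items()}
print(max_scores)
print(success_proportion("Shape-Same", 48))
```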
2.6.2. Measuring language types: difference scores
In addition to conducting qualitative analyses of the languages that are produced during communication, we used the Normalized Levenshtein edit distanceFootnote 5 to provide objective measures for within-category difference and between-category difference. To compute within-category difference for a given block, all labels associated with objects of a given category were compared with one another (i.e., all labels for the 4 blob-shaped images are paired with one another and given a total Normalized Levenshtein edit distance, as were all labels for the 4 star-shaped images); the resulting pair of scores (a score for the blob-shaped category and a score for the star-shaped category) were then averaged to obtain a composite within-category difference score. Between-category difference was calculated for a given block by pairing all four labels for blobs with all four labels for star-shaped images at the same block and calculating average Normalized Levenshtein distance.
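The two difference scores can be sketched as follows. Several details are assumptions rather than the procedure used in the study: the edit distance is normalized by the length of the longer label, pairwise distances are averaged within each category, and each image is represented by a single label. All function names are hypothetical.

```python
from itertools import combinations, product

def levenshtein(a, b):
    """Minimum number of single-character insertions, deletions, substitutions."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,                       # deletion
                current[j - 1] + 1,                    # insertion
                previous[j - 1] + (char_a != char_b),  # substitution (free if equal)
            ))
        previous = current
    return previous[-1]

def normalized_levenshtein(a, b):
    """Edit distance divided by the longer label's length (assumed normalization)."""
    longest = max(len(a), len(b))
    return levenshtein(a, b) / longest if longest else 0.0

def mean(values):
    values = list(values)
    return sum(values) / len(values)

def within_category(categories):
    """Mean pairwise distance inside each category, averaged over categories."""
    return mean(
        mean(normalized_levenshtein(a, b) for a, b in combinations(labels, 2))
        for labels in categories.values()
    )

def between_category(labels_a, labels_b):
    """Mean distance over all cross-category label pairs."""
    return mean(normalized_levenshtein(a, b) for a, b in product(labels_a, labels_b))

# Toy, fully underspecified language: one label per shape category.
toy = {"blob": ["pugu"] * 4, "star": ["heha"] * 4}
print(within_category(toy), between_category(toy["blob"], toy["star"]))
```

For this toy language the within-category score is 0 and the between-category score is 1, which is the profile expected of a fully underspecified system.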
These two difference scores provide us with an objective measure of language type. In particular, holistic, systematic, and underspecified languages are discriminable on these scores, primarily the within-category difference scores. A holistic language only encodes the idiosyncratic feature of objects in the linguistic system – shape category distinctions are not encoded. As such, we should expect the within-category and between-category differences to be similar. As a systematic language encodes both the shape category and the idiosyncratic element, systematic languages should exhibit smaller within-category difference scores than between-category difference scores, and should also exhibit lower within-category difference scores than holistic languages. For an underspecified language, we expect that only shape category information will be encoded, leading to substantial differences in within-category and between-category difference scores, with within-category scores being close to 0.
2.6.3. Measuring uncertainty: conditional entropy
To further assist in quantifying the language types that emerge, we can calculate the degree of uncertainty in the system, which allows us to quantify the relationship between signals and their associated meaning. First, we need to operationalize two types of uncertainty about signal–meaning pairs. Signal uncertainty arises from one-to-many pairings of meanings-to-signals (as in cases of synonymy in natural language). Conversely, meaning uncertainty arises from one-to-many pairings of signals-to-meanings (as in cases of homonymy and polysemy in natural languagesFootnote 6). We predict that the languages in all three conditions will evolve over cultural transmission to lower their signal uncertainty: that is, as a system becomes more conventionalized, it is more likely to only have one signal for each meaning (cf. Reali & Griffiths, Reference Reali and Griffiths2009). The Mixed and Shape-Same conditions are predicted to evolve toward a one-to-one mapping between signals and meanings (i.e., we should see 8 signals for 8 meanings in these conditions), leading to low meaning uncertainty. However, the Shape-Different condition is predicted to show higher levels of meaning uncertainty: the prediction is that these chains should involve one-to-many signal–meaning pairs, as an underspecified system leads to the same label being associated with multiple objects which share the relevant feature (here, shape).
To quantify signal uncertainty and meaning uncertainty we measure two aspects of the conditional entropy of the system. This gives us a measure of predictability that we can apply to both meaning uncertainty and signal uncertainty. H(M|S) is the expected entropy (i.e., uncertainty) over meanings given a signal, and therefore captures meaning uncertainty,
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20151116105202725-0949:S1866980814000350_equ1.gif?pub-status=live)
where the rightmost sum is simply the entropy over meanings given a particular signal s∈S. P(m|s) is the probability that meaning m is the intended meaning given that signal s has been produced. This entropy is weighted by a distribution P(s) on signals. We can also reverse the position of signals and meanings in this equation to get the conditional entropy of H(S|M), i.e., a measure of signal uncertainty:
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20151116105202725-0949:S1866980814000350_equ2.gif?pub-status=live)
High H(M|S) means that a signal is highly uninformative about the intended meaning (due to the signal having multiple meanings), whilst a high H(S|M) means that a meaning is highly uninformative about the intended signal (due to the meaning having multiple signals).
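Both measures can be estimated from observed signal–meaning pairs, as in the sketch below. It assumes that probabilities are relative frequencies in the communicative output and that entropy is measured in bits (base-2 logarithm); neither assumption is confirmed by the text, and the function names and toy labels are hypothetical.

```python
from collections import Counter
from math import log2

def conditional_entropy(pairs):
    """H(Y|X) in bits, estimated from observed (x, y) pairs.

    Uses H(Y|X) = sum over (x, y) of P(x, y) * log2(1 / P(y|x)), which equals
    the P(x)-weighted sum of the entropies over y for each x.
    """
    pairs = list(pairs)
    total = len(pairs)
    x_counts = Counter(x for x, _ in pairs)
    return sum(
        (n_xy / total) * log2(x_counts[x] / n_xy)
        for (x, _), n_xy in Counter(pairs).items()
    )

def meaning_uncertainty(observations):
    """H(M|S) from (meaning, signal) observations."""
    return conditional_entropy((s, m) for m, s in observations)

def signal_uncertainty(observations):
    """H(S|M) from (meaning, signal) observations."""
    return conditional_entropy((m, s) for m, s in observations)

# Toy cases: two signals for one meaning versus one signal for two meanings.
synonymy = [("blob_1", "kapa"), ("blob_1", "gugu")]
homonymy = [("blob_1", "pugu"), ("blob_2", "pugu")]
print(signal_uncertainty(synonymy), meaning_uncertainty(synonymy))
print(signal_uncertainty(homonymy), meaning_uncertainty(homonymy))
```

In the synonymy case only H(S|M) is non-zero (1 bit), and in the homonymy case only H(M|S) is non-zero (1 bit), matching the two kinds of uncertainty distinguished above.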
While these measures capture relevant aspects of the structure of the evolving languages, they do not take context into account, and therefore do not capture the functional adequacy of the system for communication in context. To account for contextual meaning we incorporate one last measure, meaning uncertainty in context, H(M|S,C):
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20151116105202725-0949:S1866980814000350_equ3.gif?pub-status=live)
where the various sums are over signals and meanings given a context. This measure captures the (potential) communicative utility of a system: we predict that the degree of in-context meaning uncertainty will decrease in all three conditions (the languages will be functionally adequate for conveying the correct/intended meaning), whereas meaning uncertainty (disregarding context) will differ across conditions depending on the emerging linguistic systems, as discussed above. As such, we are able to compare these two measures to provide an accurate account of how these types of system are evolving over time, and whether or not they are adapting to their situational contexts.
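The contrast between the two meaning-uncertainty measures can be illustrated with a toy underspecified language. The sketch reads H(M|S,C) as ordinary conditional entropy with the (signal, context) pair as the conditioning variable, treats a context as the unordered pair of images on screen, and estimates probabilities by relative frequency; these are assumptions about the measure, not a reproduction of the equation above. The image names and labels are illustrative.

```python
from collections import Counter
from itertools import product
from math import log2

def conditional_entropy(pairs):
    """H(Y|X) in bits, estimated from observed (x, y) pairs."""
    pairs = list(pairs)
    x_counts = Counter(x for x, _ in pairs)
    return sum(
        (n / len(pairs)) * log2(x_counts[x] / n)
        for (x, _), n in Counter(pairs).items()
    )

# Toy underspecified language: one label per shape category.
label = {"blob_1": "pugu", "blob_2": "pugu", "star_1": "heha", "star_2": "heha"}

# Shape-Different contexts: every blob paired with every star, each image
# serving once as the target. An observation is (context, signal, meaning).
observations = []
for blob, star in product(["blob_1", "blob_2"], ["star_1", "star_2"]):
    context = frozenset({blob, star})
    for target in (blob, star):
        observations.append((context, label[target], target))

# Out of context: condition on the signal alone.
h_m_given_s = conditional_entropy((s, m) for _, s, m in observations)
# In context: condition on the signal together with the context.
h_m_given_s_c = conditional_entropy(((s, c), m) for c, s, m in observations)
print(h_m_given_s, h_m_given_s_c)
```

Here each label is ambiguous between two images out of context (1 bit), but identifies its target uniquely once the context is known (0 bits), which is the pattern predicted for the Shape-Different condition.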
2.6.4. Mixed effects model overview
We used R (R Core Team, 2013) and lme4 (Bates, Maechler, & Bolker, Reference Bates, Maechler and Bolker2012) to perform several separate linear mixed effects analyses based on the dependent variables of (a) communicative success, (b) within-category difference scores, (c) between-category difference scores, (d) H(S|M), (e) H(M|S), and (f) H(M|S,C). For our independent variables, we entered condition (Mixed, Shape-Same, and Shape-Different), generation and block as fixed effects with interactions. As random effects, we had random intercepts for chain and participant, as well as chain and participant random slopes for generation and block. Each of these models used the Mixed condition as a baseline category. Visual inspection of residual plots did not reveal any noticeable deviations from assumptions of normality or homoscedasticity. P-values were obtained using an MCMC sampling method (pvals.fnc) provided by the languageR package (Baayen, Reference Baayen2008).
2.6.5. Hypotheses
Here we recap and summarize our various hypotheses.
HYPOTHESIS ONE: Participants will increase their communicative success over successive blocks and generations.
HYPOTHESIS TWO: Languages in the Mixed condition will consistently evolve towards systematic category-marking systems.
HYPOTHESIS THREE: Languages in the Shape-Same condition will consistently evolve towards holistic systems.
HYPOTHESIS FOUR: Languages in the Shape-Different condition will consistently evolve towards underspecified systems.
HYPOTHESIS FIVE: The degree of signal uncertainty will decrease across all three conditions over successive blocks and generations.
HYPOTHESIS SIX: The Shape-Different condition is predicted to show higher levels of meaning uncertainty than the Mixed and Shape-Same conditions.
HYPOTHESIS SEVEN: The degree of meaning uncertainty in context will decrease across all three conditions.
3. Results
3.1. Qualitative results: languages
This section will provide an overview of a representative selection of languages observed in each of these three conditions (please refer to the supplementary material for the full set of languages). We contrast the initial starting language participants were trained on with very early systems at the start (generation 1, block 1 of communicative interaction) and at the end (generation 1, interaction block 2) of a single generation, as well as systems in the final generation of the chain (generation 3, interaction block 2).
Figure 5 shows an example from chain 1, from the Mixed condition. In generation 1, the labels for each individual referent tend to show some individuation: for instance, muwumuwu is only ever associated with one particular blob. However, even at this early stage, we start to see evidence that the labels are patterning systematically according to shape. For instance, the initial syllable mu is consistently associated with blob-shaped referents, and the template h*pa is associated with star-shaped referents. There is also some underspecification: hapa, for instance, is used with all four stars (albeit at different frequencies). Word lengths also appear to differ systematically between shapes (although this strategy is not repeated in other chains). At the end of the first generation (block 2) a few clear patterns emerge. First, the degree of heterogeneity has decreased in terms of the number of unique words and the number of unique syllables. Second, there is a higher degree of conventionality for each individual referent, as evident in some labels only ever appearing with one referent (e.g., muhumu and hepa). Lastly, there is less underspecification across star-shaped referents – hapa is now only associated with two stars. The language of the third generation extends these patterns of increased conventionality: each individual referent has a unique label that distinguishes it from other referents. Furthermore, these labels show systematic relations with one another: three of the blob-shaped images are distinguished from one another through varying the length of (partially) reduplicated syllables (muwu, muwumu, and muwumuwu). Meanwhile, all of the star-shaped images persist with the basic template of h*pa, and individual referents within this category differ only in the vowel of the first syllable. Finally, there is no underspecification by generation 3: as predicted, the language marks the basic-level category of shape as well as the individuating element.
This observation supports our hypothesis that systematic structure will emerge in the Mixed condition, with languages first converging on conventionalized forms for shape followed by the idiosyncratic features.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20160711030649-57773-mediumThumb-S1866980814000350_fig5g.jpg?pub-status=live)
Fig. 5. A table showing the initial training language and all of the signal–meaning pairs produced at generation 1 (communication block 1), generation 1 (communication block 2), and generation 3 (communication block 2) in chain 1 (Mixed condition). Each meaning appears with a collection of labels beneath it: this constitutes the combined output of a pair of participants in a particular generation.
In the first generation of the Shape-Same condition (Figure 6) we see some commonalities with the early stages of the Mixed condition: there are examples of conventionality (e.g., gigi and zara) as well as diversity (e.g., the wide range of labels for the blob with antennae and the star with dots) in the labels used for the individual referents. By the time we reach block 2 of the first generation there is almost a completely conventionalized system (in that the participants are aligned on a stable set of labels for each referent). Furthermore, unlike the Mixed condition, this conventionalized system tends to recycle holistic variants instead of introducing systematicity: while there are pockets of systematicity (e.g., kanaku and nakaku), these are circumscribed when compared to the Mixed condition. Interestingly, at the third generation, the napawe variant has been favoured over the nakaku variant, lending additional weight to the notion that the situational context is biasing the system against systematic structure. These observations provide support for our hypothesis that holistic languages evolve in the Shape-Same condition. However, we should note systematicity is tolerated to a certain extent, as is the case for the blob-shaped images (kapa and kapapa and gugu and gigi).
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20160711030649-31078-mediumThumb-S1866980814000350_fig6g.jpg?pub-status=live)
Fig. 6. A table showing the initial training language and all of the signal–meaning pairs produced at generation 1 (communication block 1), generation 1 (communication block 2) and generation 3 (communication block 2) in chain 6 (Shape-Same condition).
For the Shape-Different condition (Figure 7), we see that there is a high level of heterogeneity in both the labels used between and within the referents. There is, however, some clustering of syllable types (e.g., no, go, ni, etc.) and combinatorial patterns (e.g., pugo, gogo, puma) according to the basic-level category of shape. Interestingly, this diversity persists in the first generation (block 2), with less conventionality than that found in the Mixed and Shape-Same conditions. Still, there is an increase in conventional patterns, with forms becoming more predictable over time in both the number of syllables and the way in which they are arranged (e.g., me and he tend to disproportionately occur in the initial syllable position). The most noticeable difference between generation 1 and generation 3 is the collapse towards underspecification: we see high-frequency forms for all blob-shaped referents (e.g., pugu) and all star-shaped referents (e.g., heha). In addition to this loss of variation at the word level, variation also decreases at the syllable level (e.g., there are only four syllables for blob-shaped images: pu, po, gu, and go). The emergence of underspecified languages supports our hypothesis that languages in the Shape-Different condition will evolve to abstract across the meaning dimension of shape.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20160711030649-78855-mediumThumb-S1866980814000350_fig7g.jpg?pub-status=live)
Fig. 7. A table showing the initial training language and all of the signal–meaning pairs produced at generation 1 (communication block 1), generation 1 (communication block 2), and generation 3 (communication block 2) in chain 12 (Shape-Different condition). Highlighted labels show underspecification.
It is important to note that all three conditions started off with a language that consists of randomly generated pairings of labels and meanings. Although the individual pairings differ between conditions, they do share an important structural characteristic: all initial languages have high levels of synonymy (three labels for each meaning). A consistent pattern shared across all three conditions is a shift from this system with many-to-one signal–meaning mappings to systems where we observe one-to-one and one-to-many mappings.
3.2. Communicative success
Communicative success scores tended to follow a similar trajectory in all three conditions (see Figure 8). Over successive blocks we observe a clear increase in the overall communicative success rate, leading to near-perfect communication by the end of generation 3. Analysis of the logistic mixed-effects model revealed a significant main effect of Generation (β = 1.13, SE = 0.19, z = 6.646, p < .001) and Block (β = 1.03, SE = 0.30, z = 3.399, p < .001), but no effect of Condition and no other significant interactions (p > .074).
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20160711030649-17795-mediumThumb-S1866980814000350_fig8g.jpg?pub-status=live)
Fig. 8. Average communicative success scores by generation (1–3), communication block, and condition. The vertical dotted lines represent the start of the next generation. Error bars represent the 95% confidence intervals.
These results show that, in all conditions, the languages are becoming increasingly effective at achieving communicative success through (a) repeated interactions between individual participant pairs and (b) across successive generations of participant pairs.
3.3. Difference scores
Table 1 shows the idealized and observed (in the second block of generation 3) values for the within- and between-category difference measures. Figure 9 shows how these measures evolve over time. Our hypothesis was that languages in the Mixed condition should evolve systematic category-marking and should therefore produce a within-category difference score of around 0.5 (characteristic of a system in which signals tend to be composed of a general category-marker and an individuating element) and a between-category difference score of 1 (distinctive labels used across categories). For the Shape-Same condition, we predicted the emergence of holistic languages, where each object is associated with a unique and distinctive label: this is characterized by high within- and between-category differences. As can be seen from Table 1, these predictions were borne out. For the Shape-Different condition, we predicted the emergence of systems that were underspecified, using a single label for all objects sharing a shape, which would correspond to 0 within-category difference and a high between-category difference: as can be seen from the table, while this prediction was partially supported (within-category difference is lower than between-category difference), the within-category difference in this condition remains high – this is due to the slower conventionalization seen in this condition, as highlighted in the qualitative analysis above (see also measures of signal uncertainty below).
Table 1. The idealized (left-hand columns) and observed (right-hand columns) scores for within-category differences and between-category differences. Numbers in parentheses indicate the bootstrapped standard deviation.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20160711030649-86619-mediumThumb-S1866980814000350_tab1.jpg?pub-status=live)
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20160711030649-83888-mediumThumb-S1866980814000350_fig9g.jpg?pub-status=live)
Fig. 9. Between-category (solid lines) and within-category (dotted lines) difference scores (measured by the average Normalized Levenshtein edit distance) over successive communication blocks for the Mixed (diamond/blue lines), Shape-Same (triangle/green lines), and Shape-Different conditions (square/red lines). Generation 0 gives values for the initial random language. Error bars indicate 95% confidence intervals.
Analysis of the mixed-effects model for Within-Category difference showed a significant effect of Generation (β = –0.07, SE = 0.02, t(84) = –2.823, p < .001), and a significant main effect of Shape-Same condition (β = 0.20, SE = 0.04, t(84) = 5.043, p < .001). There was one significant interaction for Shape-Same condition × Generation (β = 0.07, SE = 0.03, t(84) = 2.266, p = .017). All other main effects and associated interactions were non-significant (p > .061). These results partially support our predictions: Within-Category difference remains high in the Shape-Same condition, reflecting the development of labels which individuate within categories, and decreases in the other conditions; however, the Within-Category differences remain surprisingly high in the Shape-Different condition, where we predicted the emergence of a fully underspecified system, associated with a Within-Category difference of 0.
Analysis of the model for Between-Category difference showed that only the main effect of Generation was significant (β = 0.04, SE = 0.02, t(84) = 2.46, p < .001), supporting the contention that Between-Category labels become increasingly distinct from one another over generations. All other main effects and associated interactions were non-significant (p > .139).
3.4. Conditional entropy
3.4.1. Signal uncertainty H(S|M)
For the conditional entropy of signals given meanings, H(S|M), we observe a general decrease across all three conditions (see Figure 10). However, the decline in entropy for the Shape-Different condition appears to be less pronounced than that of the Mixed and Shape-Same conditions: as discussed above, within-category variation persists unexpectedly in this condition. For H(S|M) the mixed-effects model contained significant results for the main effects of Generation (β = –0.61, SE = 0.13, t(72) = –4.561, p < .001), Block (β = –0.38, SE = 0.09, t(72) = –4.366, p < .03) and Shape-Different condition (β = 0.62, SE = 0.31, t(72) = 2.011, p < .009). There were no other significant main effects or interactions (p > .259).
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20160711030649-07227-mediumThumb-S1866980814000350_fig10g.jpg?pub-status=live)
Fig. 10. Degree of signal uncertainty, measured as H(S|M), against Generation and Block. Higher entropy scores indicate a higher degree of signal uncertainty. The error bars indicate the 95% confidence intervals.
3.4.2. Meaning uncertainty H(M|S)
Figure 11 plots the conditional entropy of meanings given signals, H(M|S), against the number of blocks. As predicted, there is a clear difference between the conditions, with the Shape-Different condition showing a general increase in entropy in contrast to the Mixed and Shape-Same conditions, corresponding to the development of underspecified labels. For H(M|S) the mixed-effects model contained significant results for the main effect of the Shape-Different condition (β = 0.41, SE = 0.10, t(72) = 4.053, p < .001). There was also a significant Shape-Different condition × Block interaction (β = 0.32, SE = 0.09, t(72) = 3.424, p < .001). There were no other significant main effects or interactions (p > .265).
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20160711030649-98918-mediumThumb-S1866980814000350_fig11g.jpg?pub-status=live)
Fig. 11. Degree of meaning uncertainty, measured as H(M|S).
3.4.3. Meaning uncertainty of signals in context H(M|S,C)
The conditional entropy of meanings given signals in context, H(M|S,C), is shown in Figure 12. In all three conditions we observe a decrease in entropy over time, with each of the conditions showing strikingly similar trajectories of change: as indicated by the communicative accuracy scores, the languages in all conditions evolve towards allowing optimal communication in context. For H(M|S,C) the mixed-effects model contained significant results for the main effects of Generation (β = –0.08, SE = 0.03, t(72) = –3.300, p < .001) and Block (β = –0.07, SE = 0.01, t(72) = –5.927, p < .001). There were no other significant main effects or interactions (p > .078).
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20160711030649-08292-mediumThumb-S1866980814000350_fig12g.jpg?pub-status=live)
Fig. 12. Meaning uncertainty of signals in context, measured as H(M|S,C).
4. Discussion
Our findings support the general hypothesis that language structure adapts to the situational contexts in which it is learned and used. As we outlined in the ‘Introduction’, some meaning is encoded and some meaning is inferred, with interactional short-term strategies of conveying the intended meaning feeding back into long-term, system-wide changes. In our experiment, languages gradually evolved to encode information relevant to the task of achieving communicative success in context, with different language systems evolving in each experimental condition. In the Shape-Same condition, where the dimension of shape was always the same for stimuli pairings, holistic systems of communication emerged, whilst in the Shape-Different condition, where the dimension of shape was always different for stimuli pairings, the system generalized and became underspecified (although unexpectedly variable: see discussion below). For the Mixed condition, which featured both Shape-Same and Shape-Different contexts, the systems that emerged were systematically structured: that is, both shape category and individual identity were encoded in the linguistic signal. These divergent systems arise given a very simple meaning space, through slight manipulations to the situational context.
Despite these inherent differences between the languages that emerged, all of the conditions showed: (a) an increased level of communicative success and (b) a reduction in in-context meaning uncertainty, H(M|S,C). This observation suggests each condition produces languages that are functionally adequate for the task of achieving communicative success in context. The fact that different systems evolve for conveying the same set of meanings is important for how we view the role of context. Our explanation rests on the premise that languages are adapting to their niche, which in this case comprises the situational context, to become optimally structured.
Underspecified systems emerge in the Shape-Different condition because “when context is informative, any good communication system will leave out information already in the context” (Piantadosi et al., Reference Piantadosi, Tily and Gibson2012, p. 284). This lends weight to studies showing that participants are making use of pragmatic reasoning to convey information at the least cost, given common knowledge and the task at hand (Frank & Goodman, Reference Frank and Goodman2012). These underspecified systems could be construed as being highly ambiguous when taken out of their communicative context. However, when we take into account the context in which the signals were used (as measured by the H(M|S,C)) then the apparent ambiguity is not counter-functional: that is, the system is perfectly adequate for achieving communicative success. When examined out of context, adapted communication systems can give the appearance of ambiguity, as Miller (1951, pp. 111–112) noted: “Why do people tolerate such ambiguity? The answer is that they do not. There is nothing ambiguous about ‘take’ as it is used in everyday speech. The ambiguity appears only when we, quite arbitrarily, call isolated words the unit of meaning.”
While the amount of synonymy (as measured by H(S|M)) decreased over time across all conditions, the Shape-Different condition appeared to tolerate a higher level of synonymy than the other two conditions. One possible explanation is the way in which participants viewed the task. An initially diverse input could be construed as priming the participants to reproduce a diverse output. If the labels are easy enough to learn and reproduce, and they achieve the goal of successfully allowing the matcher to choose the correct image, then this variation may be tolerated for longer. This also partly explains why the Shape-Different condition deviates from its predicted within-category difference score: labels are not conventionally associated with any one particular meaning within a category. For instance, as discussed in the qualitative analysis (see Figure 7), pugu and pogo (which are quite distinct, with a Normalized Levenshtein edit distance of 0.5) are not conventionally associated with any particular blob; instead, they pattern synonymously, with the two labels being optional forms for any blob-shaped image. This reflects a limitation of the difference measurement to distinguish between systematic languages and this kind of synonymy. However, these languages do have distinct profiles, as evidenced by the various entropy measurements.
It is also worth noting that not all chains in the Shape-Different condition converged on an underspecified system, with chain 11 evolving a holistic-like system. This mismatch with our predictions is perhaps due to the Shape-Different condition having more optionality provided by the situational context: that is, any of the three hypothesized systems (Underspecified, Holistic, Systematic) is expressively adequate for conveying the intended meaning, although these systems differ in their parsimony in terms of memory and learning demands. This increases the probability that we will see more variation in the types of system that evolve in the Shape-Different condition. Whereas underspecified and, to a lesser extent, systematic category-marking languages are communicatively sub-optimal in the Shape-Same condition, the Shape-Different condition does not share such restrictions. A similar story applies when comparing the Mixed and Shape-Different conditions: neither holistic nor systematic category-marking languages are disfavoured for either condition, but an underspecified system would be problematic in the Mixed condition (as 43% of the contexts have images that share the same shape). Chain 11 thus serves as an important reminder of lineage-specificity, and how the historical properties of a particular system can bias future states.
For the Shape-Same condition, the chains consistently converge on holistic systems: that is, each individual stimulus has a unique label, with these labels being relatively distinct from one another. The decrease in H(S|M) and H(M|S) shows that the system is converging towards a one-to-one mapping of forms and meanings, whereas the high within-category difference scores show these signals are highly distinct from one another, and indeed more distinctive than those found in the other two conditions. Our rationale for the emergence of holistic systems in the Shape-Same condition is similar to that of the Shape-Different condition: where the situational context is informative, information will be left out of the linguistic system. In this instance, the context was informative through virtue of having the pairs of stimuli always sharing the same shape. This explains why systematicity is minimized in the Shape-Same condition: the linguistic system does not need to conventionally encode shape into the signal because context makes it irrelevant in discriminating between meanings. Instead, these languages specialize and become holistic, allowing them to meet the participants’ communicative needs in context.
Even though the languages which emerge in the Shape-Same condition do reliably differ from those that evolve in the Mixed condition, through being more holistic, there is some evidence of systematicity in these chains. In chain 6, for instance, a language evolved in which two of the blob-shaped stimuli share similar labels (kapa and kapapa), as do the other two blob-shaped stimuli (gugu and gigi). These pockets of correlations between word forms suggest a certain degree of systematicity is tolerated – albeit not to the same extent as that found in the Mixed and Shape-Different conditions. One explanation for this finding is that the situational context and communication are not the only factors shaping the system, with learnability pressures also acting on the structure of language (Kirby et al., Reference Kirby, Cornish and Smith2008).
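As a rough illustration of how similarity between such labels might be quantified, the sketch below computes a length-normalized Levenshtein (edit) distance between pairs of labels. This is an assumed stand-in for the within-category difference score, whose exact definition is not given in this section; only the four labels are taken from chain 6.

```python
def levenshtein(a, b):
    """Minimum number of single-character insertions, deletions, or substitutions."""
    prev = list(range(len(b) + 1))              # distances from "" to prefixes of b
    for i, ca in enumerate(a, 1):
        curr = [i]                              # distance from a[:i] to ""
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,        # deletion
                            curr[j - 1] + 1,    # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]

def normalized_distance(a, b):
    """Edit distance divided by the length of the longer label (0 = identical)."""
    longest = max(len(a), len(b))
    return levenshtein(a, b) / longest if longest else 0.0

# Labels reported for the blob-shaped stimuli in chain 6.
within_pair_1 = normalized_distance("kapa", "kapapa")
within_pair_2 = normalized_distance("gugu", "gigi")
across_pairs = normalized_distance("kapa", "gugu")
```

On this assumed measure, the labels within each pair are closer to one another than labels drawn from different pairs, which is the kind of local form correlation described above.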
Only in the Mixed condition do we consistently observe the emergence of systematic category-marking languages. The first line of evidence is that the observed within-category difference score lines up with our expected score (see Figure 9): this suggests part of the label is specifying shape and the other part is specifying the individuating component. While, as noted above, a difference score of approximately 0.5 is not necessarily indicative of systematic language structure, the H(S|M) and H(M|S) scores show that, by generation 3, the languages in the Mixed condition have low conditional entropy, showing that the form–meaning pairs embody one-to-one mappings.
A holistic language would be just as successful at conveying the correct meaning as a systematic language in the Mixed condition. So why do we see the emergence of systematic instead of holistic languages? Part of the reason rests on how these languages evolve in the early stages of their emergence: participants quickly establish a conventionalized specification of shape, before arriving at conventionalized forms that encode the individuating elements. As a strategy, specifying shape information only requires participants to align on two signals, one that specifies star-shaped objects and one that specifies blob-shaped objects, which would allow them to communicate successfully on 57% of trials (those where discrimination only requires that shape information is conventionally encoded).
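The 57% and 43% figures can be reproduced with a short counting sketch. It assumes four star-shaped and four blob-shaped stimuli, and that every unordered pair of distinct stimuli is an equally likely context in the Mixed condition; both assumptions are made here for illustration and are not stated in this section.

```python
from itertools import combinations

# Assumed stimulus set: four items per shape, identified by (shape, index).
stimuli = [(shape, i) for shape in ("star", "blob") for i in range(4)]

# Every unordered pair of distinct stimuli is treated as one possible context.
pairs = list(combinations(stimuli, 2))

# Contexts in which both images share a shape cannot be resolved by shape alone.
same_shape = sum(1 for a, b in pairs if a[0] == b[0])
different_shape = len(pairs) - same_shape

pct_same = round(100 * same_shape / len(pairs))
pct_different = round(100 * different_shape / len(pairs))
print(f"same shape: {pct_same}%, different shape: {pct_different}%")
```

Under these assumptions, 12 of the 28 possible pairs share a shape and 16 do not, so a language that marks only shape would suffice in roughly 57% of contexts.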
We can view this strategy as a negotiated exploration of the specification space during interaction, giving rise to a two-stage process: (i) the conventionalization of category-marking for shape; (ii) the conventionalization of individuating elements. Supporting this contention of a two-stage process is the main effect of Generation for both the within-category difference scores and the conditional entropy H(S|M): even though the within-category difference scores suggest systematic category-marking emerges by the end of generation one, H(S|M) is much higher in this initial generation than it is at later generations. The decrease in H(S|M) reflects the conventionalization of individuating elements in the linguistic system – that is, there is less synonymy in later generations.
Another striking finding in the Mixed condition was the rate at which systematic category-marking emerged: within a single generation of participants. Part of the explanation could be that the manipulation of context places strong pressure on participants to converge quickly on conventional markers for shape. There are several reasons why the rapid evolution seen in this experiment might prove to be an exception, rather than a general tendency. First of all, there are only two possible dimensions that the language may encode: the basic-level category and the subordinate idiosyncratic component. Second, there are differences between the initial generation and successive generations (as mentioned above): namely, later generations show greater degrees of conventionalization in their label usage.
If languages are adapting to their contextual niche, then what are the implications for the learning bias and historical contingency accounts? Even though our results are broadly consistent with the ecologically sensitive account, there is also evidence consistent with the learning bias (e.g., pockets of systematicity in the Shape-Same condition and the overall reduction of synonymy across all conditions) and historical contingency (e.g., the emergence of a holistic language in chain 11 of the Shape-Different condition) accounts. It is likely that all these theoretical perspectives hold true to some extent, with the role of context being mediated by partially competing motivations of prior learning biases and historical contingency. Such notions reflect the converging evidence that languages, and the way in which they are organized, “are better explained as stable engineering solutions satisfying multiple design constraints, reflecting both cultural-historical factors and the constraints of human cognition” (Evans & Levinson, Reference Evans and Levinson2009, p. 429).
5. Conclusion
We set out to investigate the role of situational context in the emergence of different types of linguistic system that evolve through iterated learning. By manipulating the ways in which stimuli were paired with one another, we showed that situational context is an important factor in determining what is and is not encoded in the linguistic system. Our results offer a potential insight into how the situational context can bias the cultural evolution of language. The type and predictability of the situational contexts relate to how language users will employ certain communicative strategies for conveying the intended meaning, with the resulting language systems reflecting the contextual constraints in which they evolved.
One of the major findings in our experiment is that the types of linguistic system that evolve are highly predictable from the contextual constraints operating during communication. This interplay between short-term linguistic strategies for resolving communicative interactions, and their implications for language systems through long-term patterns of change, speaks to real-world processes such as grammaticalization. The types of change we observe in languages show predictable patterns, as evident in the unidirectionality hypothesis (cf. Hopper & Traugott, Reference Hopper and Traugott2003), but importantly these changes show how contextual constraints on the moment-to-moment communicative strategies deployed can have widespread ramifications for whole linguistic systems (Steels, Reference Steels and Steels2012). Natural languages are subject to a larger and more diverse range of contexts, with a key future question being the extent to which our experimental results are generalizable to patterns observed in natural language systems.
Supplementary Materials
For supplementary material for this article, please visit dx.doi.org/10.1017/langcog.2014.35