INTRODUCTION
A key empirical question in language acquisition research concerns the developmental trajectory children follow in mastering adultlike understanding of the pragmatic features of function words such as pronouns. Pronouns like he, she, and it replace nouns or noun phrases referring to the most available entities in the current perceptual and discourse context (Gundel, Hedberg & Zacharski, 1993). Their use and interpretation build on knowledge of morphosyntactic restrictions (e.g. case assignment, grammatical role), semantic distinctions (e.g. gender, number, animacy), and discourse–pragmatic factors, including information structure and order-of-mention (e.g. Arnold, 2010; Chiat, 1986; Sekerina, 2015). Despite the sizable literature on the acquisition of pronominal reference (e.g. Bergmann, Paulus & Fikkert, 2012; Budwig, 1999; Chiat, 1986; Chondrogianni, 2015; Karmiloff-Smith, 1981; Salazar Orvig & Morgenstern, 2015; Sekerina, 2015), we still have a limited understanding of when children become able to interpret personal pronouns anaphorically, as referring to highly accessible or previously mentioned (given) referents.
Experimental studies suggest that, during their third year, children become increasingly aware of how pronouns are used and what information they encode. One piece of evidence comes from a production experiment that explored young children's use of different linguistic expressions in response to a specific versus generic question (Campbell, Brooks & Tomasello, 2000). The study showed that English-speaking children aged 2;6 made more attempts at using pronouns in response to a specific question that contained a noun (i.e. What did noun do?) than in response to a generic question that did not contain a noun (i.e. What happened?). This suggests that even at the age of 2;6 children have some understanding that pronouns are used as substitutes for nouns from immediately preceding utterances. Building on Campbell et al. (2000), Wittek and Tomasello (2005) tested referential choices in specific and generic questions in young German-speaking children. The study showed that children at 2;6 were influenced by the immediately preceding context and used null arguments and pronouns when the referent was mentioned in the previous utterance. A similar pattern of response has been confirmed in other experimental studies showing that two-and-a-half-year-olds, but not younger children, select a referring expression depending on whether the referent was mentioned in the previous discourse or not (e.g. Matthews, Lieven, Theakston & Tomasello, 2006).
These findings from production experiments are complemented by similar evidence from comprehension studies showing that children aged 2;6 can interpret pronouns in an adultlike way. Song and Fisher (2005, 2007) exposed children to short videos with two characters in order to assess whether their interpretation of a pronoun is guided by discourse prominence. In each story, one of the characters was more prominent than the other because it was mentioned first, appeared as the subject of two sentences, and was also mentioned once as a pronoun. The study found that, when multiple cues are taken into consideration, children at 2;6 indeed interpreted the pronoun as referring to the more prominent character. A follow-up study confirmed that, even when discourse prominence was reduced to first mention and subject position, the children successfully mapped a pronoun subject to the target referent. Taken together, these results suggest that at two-and-a-half years of age children already display some awareness of the information status of anaphoric pronouns.
It is much less clear, however, how much knowledge of anaphoric pronouns children have in earlier stages. Because evidence from comprehension studies with children younger than 2;6 is essentially missing (see Sekerina, 2015, for a review on older children), the primary source of our current understanding of the acquisition of pronouns in younger children comes from corpus studies of children's early referential strategies in naturalistic interactions. These studies show that children use pronouns in their speech from the earliest word combinations, together with other types of referring expressions, including nouns, demonstratives, clitics, and null arguments (e.g. Chinese: Huang, 2011; English: Hughes & Allen, 2013; Rozendaal & Baker, 2010; French: Salazar Orvig, Marcos, Morgenstern, Hassan, Leber-Marin & Parès, 2010; Italian: Serratrice, 2005; Korean: Clancy, 1997). The main generalization is that two-year-olds are sensitive to the information flow in discourse and are more likely to use pronouns, clitics, and null arguments for referents that are accessible to the listener, whereas they are more likely to use lexical nouns for referents that are not accessible to the listener (for a review see Allen, Hughes & Skarabela, 2015).
These observations are, however, not sufficient to establish with any degree of confidence whether or not two-year-olds understand anaphoric pronouns and the specific information these forms encode. It is possible that children's production of pronouns is largely dependent on parental scaffolding and is limited to a few restricted contexts in which the forms are frequently used (e.g. Chiat, 1986; Karmiloff-Smith, 1981; Salazar Orvig & Morgenstern, 2015). For example, the pronoun it is nearly exclusively used in the object position, as in want it, see it, drop it, or cut it (e.g. Angiolillo & Goldin-Meadow, 1982; Chiat, 1986; Kirby & Becker, 2007). Children may thus consider it to be a suffix or they might ignore it altogether, given that the pronominal form constitutes less phonetic material than non-pronouns (Bloom, 1990; Gerken, 1991). Young children may thus initially learn and use pronouns as part of larger unanalyzed units and not as independent forms that signal a previously mentioned (given) referent as in the adult language (e.g. Chiat, 1986).
In fact, under closer examination, children's initial use of pronouns in spontaneous speech is limited and inconsistent (e.g. Salazar Orvig & Morgenstern, 2015). Instead of using pronouns, young children often rely on null arguments (e.g. I put __ on there.) or lexical nouns (e.g. I hurt my finger. My finger hurts.) to refer to given referents (see, e.g. Hughes & Allen, 2013). Conversely, two-year-olds, unlike older children and adults, may use pronouns when they introduce new referents (e.g. Campbell et al., 2000; Chiat, 1986; Demir, So, Özyürek & Goldin-Meadow, 2012; Hughes & Allen, 2013; Matthews et al., 2006; Rozendaal & Baker, 2010; Salazar Orvig et al., 2010; Serratrice, 2005). This raises the possibility that two-year-olds and younger children might in fact assign individual pronouns functions that are distinct from the adult language (Budwig, 1999; Chiat, 1986). Thus, there is currently no conclusive evidence that English-speaking children younger than 2;6 understand that pronouns are used anaphorically as substitutes for nouns or noun phrases and describe entities typically introduced earlier in the discourse.
To address this issue, we examined whether English-speaking two-year-olds and younger children understand the anaphoric function of pronouns. Specifically, we tested whether children at 1;6 and 2;0 understand that the referent of it encodes a previously linguistically mentioned (given) entity rather than a newly introduced visual competitor. We focused on the pronoun it because children use referential it in their speech as early as 1;6 (Brown, 1973; Chiat, 1986; Kirby & Becker, 2007).
To examine children's interpretation of the pronoun it, we used a cross-modal preferential-looking paradigm with two scenes: the participants first saw a single object (e.g. a ball) introduced with an indefinite noun phrase (e.g. Look, a ball!), followed by another scene involving the same object (i.e. the ball) and the image of a new object (e.g. a hat). The participants then heard a test sentence including either a definite noun referring to the old object from the first scene (e.g. the ball), an indefinite noun referring to a newly introduced object (e.g. a hat), or it. There was also a silent control condition in which the same visual scene was presented without a test sentence. If the participants understood the anaphoric function of it, their looks to the referent from the first scene were predicted to be higher in the pronoun than the silent condition. Similarly, we predicted that the participants would interpret the referent of the definite noun as the old object, but they would interpret the referent of the indefinite noun as the newly introduced object. We first tested a control group of adults to assess whether the set-up yields results corresponding to our predictions and also to establish a baseline for comparison for children's behavior (Experiment 1). We then tested children aged 2;0 (Experiment 2) and 1;6 (Experiment 3).
EXPERIMENT 1
METHOD
Participants
Twelve adult participants (M = 20 years; Range: 18–25 years) took part in this study (six males). All were native speakers of English studying at a Scottish university. They received a voucher to a local café for their participation.
Materials
Visual stimuli. The stimuli consisted of still images of twelve highly familiar inanimate objects (i.e. ball, hat, sock, car, shoe, cup, spoon, book, chair, star, house, and bus). The objects were chosen because they depicted items whose labels are commonly known by two-year-old children according to estimates of receptive vocabulary in the LEX database (Dale & Fenson, 1996). The images were presented in a video. Each video lasted 10 s and included two scenes. In Scene 1, one of the test objects (e.g. a ball) was presented in the center of the screen for 4 s. In Scene 2, the image from Scene 1 (i.e. the ball) was re-introduced simultaneously with a new image (e.g. a hat), and they both remained on screen for 6 s. The two images were presented on the left and right sides of the screen, separated by a distance of 50 cm.
Auditory stimuli. The auditory stimuli used in Scene 1 included twelve monosyllabic English nouns naming the twelve familiar objects (i.e. ball, hat, sock, car, shoe, cup, spoon, book, chair, star, house, and bus). They were recorded in two carrier phrases: Look, a ___ and Oh, a ___ (e.g. Look, a ball!). The stimuli used in Scene 2 included the pronoun it and the twelve nouns embedded in a definite or indefinite noun phrase in two carrier sentences: Can you find ___? and Can you see ___? (e.g. Can you find a ball / the ball / it?). All stimuli were read by a female native speaker of Standard Scottish English in child-directed speech. The stimuli were digitally recorded in a soundproofed room at 22050 Hz, using 16-bit mono sampling.
Procedure
The study was carried out with the ethical approval of the Ethics Committee at the University. Before the study, informed consent was obtained from all participants.
The experiment was conducted in a semi-dark test room. The order and presentation of the stimuli were controlled using Habit X running on a Macintosh computer in a control room. Images were displayed on a large television screen, and auditory stimuli were delivered from the built-in loudspeakers of the TV set. Looking times towards each image in Scene 2 were recorded at a rate of 25 frames per second by a hidden remote-controlled video camera positioned centrally under the television screen.
The participants sat on a chair placed approximately one meter in front of the TV screen. Each participant was presented with a total of sixteen trials, fully randomized, with four trials in each of the four conditions. In Scene 1, a single object was displayed. After a silence of 1 s, a female voice introduced the object with Look or Oh, followed by an indefinite noun referring to the object (e.g. a ball!). Scene 2 began 4 s into the trial, and displayed two objects simultaneously for the remainder of the trial. After a 2 s silence in this scene (i.e. at 6 s after the onset of Scene 1), participants heard a question Can you see ___? or Can you find ___?, with one of three types of label (i.e. the pronoun it, a definite noun, or an indefinite noun). The fourth condition involved no auditory stimulus. The target word (except in the ‘silent’ trials) began at 6·671 s from the onset of Scene 1 (i.e. 2·671 s after the onset of Scene 2) (see Figure 1 for details of the experimental timeline). The test trials were counterbalanced for the position of the target image (half of the target images appeared on the left and half of them appeared on the right).
The trials were initiated by an experimenter in an adjacent control room when the participant showed central fixation to the monitor. Each trial was separated by an attention-getting sequence that presented an animation of moving bubbles with the soundtrack of children's laughter.
Coding
Participants' looks towards each side of the screen were coded for every trial of Scene 2. Videos were coded off-line using a frame-by-frame analysis. Coders were blind to the side of the target image. For each frame, coders assessed whether the participant was fixating their gaze to the left side of the monitor, to the right, or elsewhere. Inter-coder reliability for two coders was assessed for a random sample of 15% of the data. The two coders achieved 96% agreement, with a Cohen's kappa of ·937.
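The reliability statistic reported above can be illustrated with a minimal sketch of Cohen's kappa for two coders' frame-by-frame labels. The label codes (L, R, E for left, right, elsewhere) and the ten example frames are hypothetical and chosen only for illustration; they are not the study's data, and the function is not the authors' analysis script.

```python
from collections import Counter


def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa for two equal-length sequences of categorical labels."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("need two non-empty label sequences of equal length")
    n = len(coder_a)
    # Observed agreement: share of frames on which both coders gave the same label.
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Chance agreement: summed products of the coders' marginal label proportions.
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1 - expected)


# Hypothetical frame labels: L = left, R = right, E = elsewhere.
coder_1 = list("LLLLRRRREE")
coder_2 = list("LLLRRRRREE")
kappa = cohens_kappa(coder_1, coder_2)
print(f"kappa = {kappa:.3f}")
```

Kappa corrects raw percentage agreement for the agreement expected by chance, which is why it is reported alongside the 96% figure.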
RESULTS AND DISCUSSION
We measured the proportion of looks to the ‘given’ referent calculated as the total number of looks to the given referent divided by the total number of looks to given and new referents. This measure was taken for two temporal windows of analysis: the baseline window before the experimental word and the response window after the experimental word. The baseline window began at the onset of Scene 2 and lasted until the onset of the experimental word 2·671 s later (i.e. between 4 s and 6·671 s after the onset of Scene 1). Given the target age group in this study, we adopted a post-stimulus response window typical for young children and applied it to all age groups (Fernald, Zangl, Portillo & Marchman, 2008). The response window was set between 0·329 s and 2 s after the onset of the target word (i.e. between 7 s and 8·6 s after the onset of Scene 1 and between 3 s and 4·6 s after the onset of Scene 2) (see Figure 1 for details on timing of the baseline and response windows). We then calculated a difference score for each trial, defined as the proportion of given referent looks in the response window minus the proportion of given referent looks in the baseline. A positive difference score indicates that the participants shifted their attention to the given referent after hearing the critical word. A negative difference score indicates a shift to the new referent. As predicted, adults responded with increased looks toward the given referent in the definite noun condition (Mean change score = 0·430, SD = ·142) and the pronoun condition (Mean change score = 0·428, SD = ·204), with increased looks toward the new referent in the indefinite noun condition (Mean change score = –0·346, SD = ·235), and no clear direction preference in the silent condition (Mean change score = 0·049, SD = ·174, one-sample t(11) against 0 = 0·97, p = ·35) (see right panel of Figure 2).
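The per-trial difference score defined above can be sketched as follows. This assumes each Scene 2 trial is stored as a list of frame labels coded at 25 frames per second; the label names ("given", "new", "away"), the function names, and the example trial are hypothetical. The window boundaries are the Scene 2-relative values stated in the text (baseline 0 to 2·671 s, response 3 to 4·6 s).

```python
FPS = 25  # video coding rate reported in the Procedure


def proportion_given(frames, start_s, end_s, fps=FPS):
    """Share of looks to the given referent among looks to either image
    within the window [start_s, end_s), in seconds from Scene 2 onset."""
    window = frames[round(start_s * fps):round(end_s * fps)]
    given = window.count("given")
    new = window.count("new")
    if given + new == 0:
        return None  # no look to either image in this window
    return given / (given + new)


def difference_score(frames, baseline=(0.0, 2.671), response=(3.0, 4.6)):
    """Response-window proportion minus baseline proportion for one trial."""
    base = proportion_given(frames, *baseline)
    resp = proportion_given(frames, *response)
    if base is None or resp is None:
        return None  # trial cannot be scored
    return resp - base


# Hypothetical 6 s trial (150 frames): an even split during the baseline,
# then mostly looks to the given referent in the response window.
trial = (
    ["given"] * 30 + ["new"] * 30 + ["away"] * 7   # baseline: frames 0-66
    + ["away"] * 8                                  # gap: frames 67-74
    + ["given"] * 30 + ["new"] * 10                 # response: frames 75-114
    + ["away"] * 35                                 # remainder of Scene 2
)
score = difference_score(trial)
print(score)
```

Frames coded as looks elsewhere are excluded from both numerator and denominator, so the score reflects only the relative preference between the two images.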
A one-way repeated-measures ANOVA revealed a significant effect of Condition (definite vs. indefinite vs. pronoun vs. silent) (F(3,33) = 51·50, p < ·001). Holm pairwise comparisons showed that, compared to the silent condition, the difference score was significantly higher for the ‘definite’ condition (p < ·001) and the ‘pronoun’ condition (p < ·001), and significantly lower for the ‘indefinite’ condition (p < ·001).
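For readers unfamiliar with the Holm procedure used for the pairwise comparisons, the sketch below shows the standard step-down adjustment. The three input p-values are hypothetical placeholders, not values from this study, and the number of comparisons in the family is an assumption, since the text does not specify it.

```python
def holm_adjust(p_values):
    """Holm step-down adjusted p-values, returned in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The smallest p is multiplied by m, the next by m - 1, and so on.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity: an adjusted p never falls below an earlier one.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Hypothetical raw p-values for three comparisons against a control condition.
raw = [0.01, 0.04, 0.03]
adjusted = holm_adjust(raw)
print(adjusted)
```

A comparison is declared significant when its adjusted p-value falls below the chosen alpha level, which controls the family-wise error rate less conservatively than a plain Bonferroni correction.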
The results show that adults correctly identified the target object in all three linguistic conditions, including the pronoun condition: they responded with increased looks toward the given referent upon hearing a definite noun and a pronoun. This demonstrates that the paradigm is successful at targeting the predicted mappings between the three different types of linguistic structures, including the pronoun it, and the visual stimuli. The results from adults can thus be used as a baseline for comparison with children's responses. We next examined children's understanding of the anaphoric it at 2;0 using the same method.
EXPERIMENT 2
METHOD
Participants
Sixteen two-year-olds (M = 2;0; Range: 1;11–2;2) participated in the experiment (10 boys). All children came from English-only or English-dominant middle-class families. Parents/carers received a voucher to a local café for their participation.
Materials
Visual and auditory stimuli. As in Experiment 1.
Procedure
The study was carried out with the ethical approval of the Ethics Committee at the University. Informed consent was obtained from parents or carers of the participants prior to the experiment. Parents were also asked to fill out a short vocabulary questionnaire to check that all children were familiar with the target words.
The child participant was seated on their parent's lap in a chair placed approximately one meter in front of the visual monitor. The parents were instructed not to interact with their child, but to sit back and relax while listening to masking music via headphones. The rest of the procedure was identical to that described in Experiment 1.
Coding
As in Experiment 1.
RESULTS AND DISCUSSION
Like adults, two-year-olds responded with increased looks toward the given referent in the definite noun condition (Mean change score = 0·278, SD = ·187) and the pronoun condition (Mean change score = 0·275, SD = ·263) (see middle panel of Figure 2). There was a preference for the new referent in the indefinite noun condition (Mean change score = –0·203, SD = ·146) and no preference in the silent condition (Mean change score = –0·086, SD = ·201, one-sample t(15) against 0 = 1·72, p = ·11). A one-way repeated-measures ANOVA showed that the looking response differed significantly as a function of Condition (F(3,45) = 23·71, p < ·001). According to Holm pairwise comparisons, the difference score was significantly higher for the ‘definite’ condition (p < ·001) and the ‘pronoun’ condition (p = ·002) than the silent condition.
These results indicate that two-year-olds correctly map the pronoun it onto the referent that was linguistically introduced in the previous scene, even in the presence of a visual competitor. This shows that at two years of age children understand the pronoun it anaphorically. To further explore the developmental trajectory of anaphoric reference, the same experiment was conducted with children at 1;6.
EXPERIMENT 3
METHOD
Participants
Sixteen children aged 1;6 (M = 1;6; Range: 1;5–1;7) were included in the analysis (10 boys). Two additional children participated, but were excluded due to restlessness. All children came from English-only or English-dominant middle-class families.
Materials
Visual and auditory stimuli. As in Experiment 1.
Procedure
As in Experiment 2.
Coding
As in Experiment 1.
RESULTS AND DISCUSSION
The difference scores indicated that children at 1;6 increased their looks toward the given referent in the definite noun condition (Mean change score = 0·294, SD = ·209), but changes were much less pronounced in the pronoun condition (Mean change score = 0·066, SD = ·186), the silent condition (Mean change score = 0·100, SD = ·195), and the indefinite noun condition (Mean change score = –0·063, SD = ·192). The results are illustrated in the left panel of Figure 2. A one-way repeated-measures ANOVA indicated that their looking response differed significantly by Condition (F(3,45) = 8·31, p < ·001). However, Holm pairwise comparisons showed that none of the conditions with the critical stimulus word (definite noun, indefinite noun, or pronoun) was significantly different from the ‘silent’ condition. The difference score for pronouns was significantly lower than that for the definite noun (p = ·002).
In order to compare the children's performance with that of our adult participants, we ran a two-way mixed ANOVA with Group (adult vs. 2;0 vs. 1;6) and Condition (definite vs. indefinite vs. pronoun vs. silent) as factors. There was a significant main effect of Condition (F(3,123) = 60·73, p < ·001) and a significant Group × Condition interaction (F(6,123) = 7·70, p < ·001). Post-hoc Tukey's pairwise comparisons of difference scores in the pronoun condition indicated that preference for the given referents was significantly lower in children aged 1;6 than in adults (p < ·001) and children aged 2;0 (p = ·028). There was no significant difference in pronoun difference scores between the two-year-olds and the adults (p = ·181). The results indicate that the understanding of pronouns at 1;6 differs from that of two-year-olds and adults in that children at 1;6 do not yet link the pronoun it to a previously mentioned referent in a context with a visual competitor.
Figure 2 summarizes the looking preference for the given referent in the four conditions across the three populations and illustrates that two-year-olds, but not children at 1;6, approach the adults' looking patterns in their interpretation of the pronoun it.
GENERAL DISCUSSION
Children use pronouns in their speech from the earliest word combinations (Brown, 1973; Chiat, 1986; Kirby & Becker, 2007). Yet, it is not clear from these early utterances whether they understand that pronouns are used as substitutes for nouns and entities in the discourse. The aim of this study was to examine whether two-year-olds understand the anaphoric function of pronouns, focusing on the interpretation of the pronoun it in children at 1;6 and 2;0. The results of the experiment showed that two-year-olds looked significantly more to the given object in the pronoun condition compared to the silent and indefinite noun conditions. This performance was similar to that of the adult participants, but different from that of children at 1;6, who did not show a preference for given referents in response to the pronoun it. This demonstrates that some time between 1;6 and 2;0, children come to understand that it refers to a highly accessible referent introduced in the prior context.
The youngest participants in our study failed to map the pronoun it to a previously introduced antecedent in a context with another visually accessible inanimate competitor. This finding contrasts with the observation that children at 1;6 directly replace full noun phrases with the pronoun it in spontaneous speech (Kirby & Becker, 2007). Under closer examination, it turns out, however, that the young children in Kirby and Becker (2007) primarily used it to refer to an object in the environment rather than as a substitute for a linguistic form in the prior discourse (p. 582), indicating that the use and interpretation of the anaphoric it is initially limited. This may be related to children's limited exposure to pronouns in their input: several studies of child-directed speech report that English-speaking parents often represent given referents with definite nouns (Rozendaal & Baker, 2010). In fact, definite nouns are primarily used in child-directed speech for given referents (Rozendaal & Baker, 2008). Young children may thus first associate previously mentioned referents with definite nouns rather than pronouns. This conclusion also finds some support in our study since the children at 1;6 more reliably interpreted the definite noun as a referent for the previously mentioned noun.
As this study shows, however, at the age of two, English-speaking children begin to understand pronouns anaphorically. What can, then, account for the mismatch between their comprehension of pronouns and the inconsistent use of these forms in their speech when they often rely on null arguments or lexical nouns instead? The asymmetry in comprehension and production in the early stages of language development is well documented across various linguistic domains (see e.g. Clark, 2003; Hendriks & Koster, 2010). Young children, for example, tend to have more limited productive than receptive vocabularies: they understand more words than they produce (e.g. Benedict, 1979; Hoff, 2013). Huttenlocher (1974) relates the early production delays to the difference between recall and recognition of words versus objects. This distinction is likely to play a role in the context of referring expressions too. Two-year-olds may thus recognize that it is used to signal a highly accessible referent. But when it comes to production, they are presented with the added demands of recognizing a referent and recalling the target form from amongst several competing options since various linguistic expressions may be used to refer to a particular object or event. This account fits well with the proposal that young children often leave out subject arguments as a result of their limited abilities to plan and produce speech (Bloom, 1990; Gerken, 1991). It would also explain why children at 2;6 produce forms inconsistently in response to the demands of the context: their production tends to be more variable in experimental studies than in naturalistic conversations in familiar environments (Allen et al., 2015; Bergmann et al., 2012).
The above account implicitly builds on the assumption that the child is aware of various linguistic options to choose from. But it is also possible that the two-year-old may not yet have fully grasped how to conventionally signal accessibility in their language. This interpretation finds support in spontaneous speech studies showing that two-year-olds maximally exploit discourse context in their production. For instance, a two-year-old may fail to represent a new referent with a lexical noun and use a null argument instead when the child and their interlocutor are both involved in joint attention (Skarabela, 2007). As a result, children's early utterances differ from the observed conventionalized patterns in the adult language (e.g. Skarabela, Allen & Scott-Phillips, 2013). There may thus be a mismatch between the child's understanding of what a linguistic convention (i.e. pronoun) communicates and how a specific function or information (e.g. highly accessible referent) gets conventionally encoded in the adult language.
In conclusion, this study demonstrates that English-speaking two-year-olds, like adults, understand that it refers to a previously linguistically introduced referent, even in the presence of a visually accessible competitor. There is no evidence that this knowledge is established in children at 1;6. Future research will aim to identify the sources of the developmental change from 1;6 to 2;0 and whether the inconsistency in two-year-olds' use of pronouns in their speech is related to children's limited processing and planning abilities or their developing awareness of linguistic conventions in their language.