A growing body of evidence indicates that various morphosyntactic cues for lexical categories such as nouns and verbs are available in input (e.g. Mintz, 2003; 2006; Redington, Chater & Finch, 1998). Such information could potentially act as a cue in forming lexical categories, given that several studies have indicated that preverbal infants have the ability to detect various patterns in auditory linguistic input (e.g. Mintz, 2003; 2006; Saffran, Aslin & Newport, 1996; Shady, Gerken & Jusczyk, 1995; Shi, Werker & Morgan, 1999). For instance, a corpus analysis by Mintz (2003; 2006) revealed that a distributional analysis using frequent sentence frames as contexts can successfully group words into their respective lexical categories such as nouns and verbs. When novel words were presented in frequently used noun or verb sentence frames during the familiarization phase of a preferential listening experiment, English-learning children aged 1;0 showed longer looking times for the novel words presented in ungrammatical test sentences than those presented in grammatical ones (Mintz, 2006). Interestingly, a significant difference in children's looking time between the ungrammatical and the grammatical sentences was found for the verb frames but not for the noun frames, which may reflect the difference in corpus frequency between the noun and the verb frames in the input. These findings suggest that the children categorized the novel words based on morphosyntactic information in the input. 
A bigram study by Hohle, Weissenborn, Kiefer, Schulz & Schmitz (2004) found that German-learning children aged 1;3 categorized novel words following a determiner as nouns, suggesting that they use the morphosyntactic contexts of novel words to categorize them. While these studies show that young children are able to use morphosyntactic information in input to categorize novel words, it has yet to be determined whether children use such morphosyntactic information when learning the meaning of novel words.
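To make the idea of a frequent-frame distributional analysis concrete, the following is a minimal sketch, not the procedure of any of the cited studies. It assumes that a frame is a pair of words with exactly one word between them and that the words filling the most frequent frames are grouped together. The function name, the toy utterances and the `top_n` cut-off are hypothetical.

```python
from collections import Counter, defaultdict


def frequent_frames(utterances, top_n=2):
    """Group words by the most frequent A_x_B frames in which they occur.

    A frame is the pair (A, B) surrounding one intervening word x.
    Returns a dict mapping each of the top_n frames to the set of
    words observed in its middle slot.
    """
    frame_counts = Counter()
    frame_words = defaultdict(set)
    for utterance in utterances:
        tokens = utterance.lower().split()
        # Slide over every three-word sequence within the utterance.
        for a, x, b in zip(tokens, tokens[1:], tokens[2:]):
            frame_counts[(a, b)] += 1
            frame_words[(a, b)].add(x)
    return {frame: frame_words[frame]
            for frame, _ in frame_counts.most_common(top_n)}


# Hypothetical toy input, for illustration only.
toy_corpus = [
    "you want it",
    "you see it",
    "you like it",
    "the dog is",
    "the cat is",
]
groups = frequent_frames(toy_corpus, top_n=2)
```

In this toy input the frame "you _ it" collects verb-like words and "the _ is" collects noun-like words, which is the kind of grouping the corpus analyses describe on a much larger scale.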
There is an ongoing debate as to whether children's representations of morphosyntactic information are abstract enough to guide early verb learning (Dittmar, Abbot-Smith, Lieven & Tomasello, 2008; Gertner, Fisher & Eisengart, 2006; Gleitman, 1990; Tomasello, 2003). According to the item-based account (Tomasello, 2003), children's early morphosyntactic knowledge of verbs is verb-specific and constructed around the specific environments in which verbs occur. Their representations gradually become abstract, based on accumulating individual exemplars. Therefore, their early representations of verb morphosyntactic information would not be abstract enough to guide word learning. By contrast, the rule-based account (e.g. Gertner et al., 2006; Gleitman, 1990; Naigles, Hoff & Vear, 2009) assumes that children's morphosyntactic knowledge is abstract from early on and constrains children's search space for the meaning of novel words through syntactic bootstrapping. Most evidence for the item-based account stems from studies with a production task or an act-out comprehension task indicating that children under three years of age are very conservative about extending newly learned verbs beyond the particular argument structures and morphology in which the verbs were modeled during the teaching trials (e.g. Akhtar, 1999; Oluguin & Tomasello, 1993). There is also plenty of evidence from spontaneous speech data indicating children's limited productivity of early verb use (e.g. Lieven, Behrens, Speares & Tomasello, 2003; see also Naigles et al., 2009, for a comprehensive review). 
However, based on diary data on first verb use from eight English-speaking children, Naigles et al. (2009) have argued that children producing their first verbs are not as conservative as the item-based account has claimed. In fact, the children in their diary study used their first verbs flexibly in multiple sentence frames, including different subjects, objects and prepositions as well as some changes in morphology, within the first ten instances of use, suggesting that children's early verb morphosyntactic knowledge is not item-based, at least by the time they begin to produce their first verbs.
While Naigles et al.'s diary speech data support the rule-based account, much of the evidence for this account comes from comprehension studies using children's looking time measures (e.g. Gertner et al., 2006; Naigles, 1990). For instance, using an Intermodal Preferential Looking (IPL) paradigm, Naigles (1990) has shown that English-speaking children aged 2;1–2;4 who heard a novel verb in a transitive sentence frame looked longer at transitive action events, whereas those who heard the verb in an intransitive sentence looked longer at intransitive action events. A more recent IPL study by Gertner et al. (2006) has demonstrated that English-speaking children aged 1;9 and 2;1 use word order to interpret transitive sentences containing novel verbs. Gertner et al. took these findings as evidence for early abstract morphosyntactic representation in children under age 2;6.
Recently, Dittmar et al. (2008) conducted a similar IPL study with German-speaking children aged 1;9 to compare their results with those of Gertner et al. (2006). They found that the German-speaking children performed similarly to those in Gertner et al.'s study only when a practice phase similar to the test phase was given. They concluded that the practice phase in Gertner et al.'s study must have taught the children important information relevant to the test phase, and suggested that weak representations, as opposed to strong, fully abstract representations, may be sufficient for successful performance in a comprehension task with preferential looking. However, as Dittmar et al. noted, crucial empirical data for evaluating this explanation are lacking. More research is needed to examine the nature of morphosyntactic representation in children younger than 2;6. Furthermore, there are disagreements regarding the age at which children begin to use morphosyntactic information reliably in word-mapping research (Echols & Marti, 2004; Brandone, Pence, Golinkoff & Hirsh-Pasek, 2007; Imai, Haryu & Okada, 2005; Imai et al., 2008; Maguire, Hirsh-Pasek, Golinkoff & Brandone, 2008; Oshima-Takane, Satin & Tint, 2008).
Disagreements are especially notable in verb-mapping research compared to noun-mapping research. For instance, Imai et al. (2008) reported that English-, Japanese- and Chinese-speaking children were able to use morphosyntactic cues to learn a novel verb at age 5;0 but not at age 3;0. By contrast, Echols & Marti (2004) reported that English-speaking children were able to do so by age 1;6. However, evidence for children's ability to use morphosyntactic cues in verb-mapping tasks before the age of 2;0 is scant (Bernal, Lidz, Millotte & Christophe, 2007; Gertner et al., 2006; Echols & Marti, 2004; Oshima-Takane et al., 2008) as compared to evidence for noun-mapping tasks (e.g. Echols & Marti, 2004; Fennell, 2006; Hollich et al., 2000; Pruden, Hirsh-Pasek, Golinkoff & Hennon, 2006; Trehub & Shefield, 2007).
The aim of the present study was to provide evidence that children under age 2;0 can use verb morphosyntactic cues in a word-mapping task that requires the extension of newly learned verbs to new instances of the same action with a different agent. The use of verb morphosyntactic cues in this task would provide evidence for children's strong representation of verb morphosyntactic information. Of particular interest was whether resource limitations could account for the disagreements about the age at which children use morphosyntactic cues reliably in verb mapping. A resource limitation hypothesis has been proposed to explain the difficulty that young infants have using phonetic details in word–object mapping (Fennell, 2006; Werker & Fennell, 2004). According to this view, infants can use phonetic details to map phonetically similar novel words onto novel objects when a word-mapping task is simple. However, they are not able to do this in a difficult task that requires too much of their cognitive resources. The notion that limited cognitive resources affect children's performance in production and comprehension is not new in the field of language acquisition. Researchers agree that young children are likely to omit sentence parts such as the subject when their cognitive processing abilities are exceeded; for instance, when new verbs or negation are added to the sentence or when they are asked to repeat long sentences (e.g. Bloom, 1993; Valian, Prasada & Scarpa, 2006). Limited working memory capacity in children also affects comprehension and the learning of new words; the constraining effects of working memory on word learning continue into adolescence (Gathercole, 2006). 
There is also evidence that young children are able to retrieve object labels when they can see the objects but not when they cannot see them, even though they know the labels (Dapretto & Bjork, 2000). These findings suggest that difficulty in verb learning in young children may be due not to a lack or weakness of their representations but to their limited cognitive resources. The present study investigated whether young children's difficulty in using morphosyntactic details in verb learning can be explained by their limited cognitive resources.
Imai et al.'s (2005) study was among the first to investigate whether children can use morphosyntactic information in the input to learn novel words when more than one interpretation is possible (i.e. agent, action and patient interpretations). Their experiments with a novel word embedded in a single syntactic frame showed that Japanese-speaking three-year-olds were unable to map a novel verb onto an unfamiliar transitive action but could map a novel noun onto an unfamiliar object without difficulty. The children also had difficulty generalizing a novel verb to an action when the action was the same but was presented with a different object from that in the teaching phase. Imai et al.'s (2008) subsequent study with English-, Japanese- and Chinese-speaking children showed the same results except that Chinese-speaking five-year-olds needed pragmatic information in addition to morphosyntactic cues in order to map novel verbs onto actions. Based on these results, Imai and her colleagues concluded that the three-year-olds' failure to generalize novel verbs to actions when the object was changed reflected their limited knowledge of the novel verb meaning. They suggested that, unlike five-year-olds and adults, three-year-olds are not able to learn the full meaning of novel verbs quickly.
Recent studies with simpler word-mapping tasks and less demanding procedures have shown that children under age 2;0 can use morphosyntactic information in verb learning (Bernal et al., 2007; Echols & Marti, 2004; Oshima-Takane et al., 2008). These findings suggest that the cognitive load in Imai et al.'s verb-mapping task may have been too heavy for three-year-olds, and thus, according to the resource limitation hypothesis, this could be a major reason why they failed to map the novel words to actions. The IPL procedure used by Echols & Marti (2004) was less demanding in terms of cognitive load as children's visual fixations were measured during the test phase to assess their comprehension of novel words. In Imai et al.'s task, children were asked to point to the correct screen or to answer yes/no questions. Furthermore, the visual events used in Echols & Marti's (2004) study were simpler because only the agent and the action were switched. As a result, only two interpretations (agent and action) were possible instead of all three (agent, action and patient). This less demanding nature of the word-mapping task must have facilitated the children's use of morphosyntactic information in verb mapping.
Several studies have reported that, by age 2;0, English-speaking children are able to use morphosyntactic cues in learning new nouns but not new verbs unless morphosyntactic cues coincide with perceptual cues or children's preferences (e.g. Brandone et al., 2007; Pruden et al., 2006). While Echols & Marti (2004) provide evidence that English-speaking children are able to use morphosyntactic cues in noun mapping at age 1;1 and in verb mapping at age 1;6, their design does not clearly rule out the effect of visual novelty on children's looking behavior. Unlike a conventional IPL paradigm (e.g. Naigles, 1990), children were only familiarized with one agent engaging in an action and were then tested with two events, one with the familiarized agent engaging in a novel action and the other with a novel agent engaging in the familiarized action. Therefore, it is possible that the infants aged 1;1 and 1;6 simply looked at the target screen because they preferred to look at the novel aspect of the event (i.e. the novel action or agent) rather than the familiarized agent or action. Furthermore, the children's looking preference at age 1;1 coincided with the target screen for the noun condition and their looking preference at age 1;6 coincided with the target screen for the verb condition. Therefore, it would be difficult to draw any definite conclusions from the results of the verb-mapping task. A more plausible interpretation might be that English-speaking children aged 1;6 can use morphosyntactic cues reliably in the noun-mapping task. A study controlling for perceptual salience and novelty effects is needed to determine whether children in fact rely on morphosyntactic cues in mapping novel verbs onto actions.
Using a similar IPL design, Bernal et al. (2007) reported that French-speaking children aged 1;11 are able to distinguish function words associated with verbs from those associated with nouns. They used pointing behavior as the dependent measure instead of looking behavior because children in their previous IPL study shifted their gaze back and forth between the two screens and did not show clear looking patterns. However, the looking behavior could be especially noisy when the IPL task has more than one correct answer, as was the case in their noun context condition. During the test phase, the children were shown the same familiar object performing the familiar action on one side and the novel action on the other side. Thus, both screens were correct if the children interpreted the word as a noun referring to the object, although they were expected to point to the screen that did not match the verb meaning. In addition, an overt behavior such as pointing may not be a good dependent measure for children under age 2;0. This was likely the case given that approximately one-third of the children were excluded from their final sample because they were unwilling to point during the warm-up and training sessions. Even those included in the final sample did not point at all, on average, 20% of the time during the testing phase.
The habituation paradigm using a switch design offers several advantages over a pointing procedure. Like an IPL paradigm, a switch habituation procedure can eliminate effects of social support such as pragmatic cues and social encouragement. It also measures children's looking time, which reduces children's cognitive load relative to a pointing procedure. Furthermore, it controls for the novelty effects of visual and linguistic stimuli used in the test trials better than the IPL paradigm because children are familiarized with all aspects of the visual scenes paired with novel words during the habituation trials. In addition, the switch design is less cognitively demanding for young children than the IPL paradigm because they are only presented with one screen instead of two (Casasola & Cohen, 2000; but see Yoshida, Fennell, Swingley & Werker, 2009, for a contrary argument).
In fact, the habituation switch design has been successfully used to study early word mapping in children as young as 1;2 (Casasola & Cohen, 2000; Casasola & Wilbourn, 2004; Casasola, Wilbourn & Yang, 2006; Kobayashi, Mugitani & Amano, 2006; Werker, Cohen, Lloyd, Casasola & Stager, 1998). For example, Werker et al. (1998) have shown that English-speaking infants are able to learn the association between novel words and objects by age 1;2. Casasola & Cohen (2000) reported that English-speaking children learn the association between novel words and transitive actions by age 1;6. However, these studies presented children with two different objects engaging in the same action (word–object mapping) or with the same objects engaging in two different actions (word–action mapping) and, therefore, only one interpretation (object or action) was possible.
In a recent cross-linguistic study, Katerelos, Poulin-Dubois & Oshima-Takane (2003; 2010) examined children's preferences for mapping a novel isolated word onto an action or agent when both interpretations were equally possible. Children in this study were habituated to two events in which an animal-like or a vehicle-like object engaged in a jumping-like or bouncing-like action, each paired with a novel isolated word. Unlike previous switch habituation studies, three switch test trials (i.e. word switch, agent switch and action switch) were employed to examine children's preferences for mapping a novel word onto an agent or action. They found that English-, French- and Japanese-speaking children aged 1;6–1;8 looked significantly longer when the agent was switched but not when the action was switched, indicating that they prefer to map the novel word onto the agent when given the choice of assigning a novel isolated word to an agent or action.
The design and findings reported by Katerelos et al. (2003; 2010) provide an interesting opportunity to test children's ability to use morphosyntactic cues in word learning. If children alter their preference to map the novel word onto an action when it is presented in a verb sentence frame, this would provide evidence that they can use morphosyntactic cues in word learning. The present study used this idea to investigate whether Japanese-speaking children aged 1;8 can use verb morphosyntactic cues to map a novel word onto an action instead of an agent when both the agent and the action interpretations are equally possible. In addition, we investigated whether their performance depended on the complexity of the verb-mapping task as predicted by the resource limitation hypothesis. In order to compare the results with the findings from Japanese-speaking children aged 1;8 in Katerelos et al.'s (2003) study, Experiment 1 used the same experimental set-up and apparatus except for a few modifications described below. Experiment 2 examined whether performance at age 1;8 would improve significantly when the task was simplified as predicted by the resource limitation hypothesis. Experiment 3 examined children's responses to the same visual stimuli in the absence of linguistic stimuli to test whether differences in perceptual salience or preference could account for the performance in Experiments 1 and 2.
EXPERIMENT 1
METHOD
Participants
Forty-two Japanese-speaking children (17 boys, 25 girls) with a mean age of 20.80 months (range: 20.03–21.85) participated in the study. Nine additional children (3 boys, 6 girls) were tested but were excluded from the final sample for the following reasons: experimental error (1), fussiness (1), no habituation (4) and failure to meet the task engagement criteria described below (3). Since the Japanese version of the MacArthur-Bates Communicative Development Inventories (Fenson, Dale, Reznick, Bates, Thal & Pethick, 1994) was not available at the time, the Tsumori standardized developmental inventory (Tsumori & Inage, 1974) was administered by the experimenter after the habituation experiment in order to estimate the children's language development level. Children's mean language age was 21.6 months with a range of 15–29.5. All children lived in Isesaki City, Gunma Prefecture, Japan.
Stimuli and apparatus
The visual stimuli were adapted from Katerelos et al.'s (2003) study, in which computer-animated drawings of a pink animal-like figure and a green and red vehicle-like figure engaged in distinct actions. However, eyes were added to the vehicle-like figure to minimize the animate–inanimate difference between the two figures, given that the Japanese language often uses different verbs to make animate–inanimate distinctions (see Figure 1). In each movie, the figures moved from the left side of the screen towards a blue wall located in the middle of the screen. Each figure then engaged in a jumping-like or a bouncing-like action. A green screen came down at the beginning of each event to separate the events. Each event lasted 9 s and was repeated up to three times per trial. Thus, the maximum duration of a single trial was 27 s.
Each action event was paired with the novel word moke /moke/ or seta /seta/ embedded in an intransitive syntactic frame.
Moke/seta-shi-te(i)ru-yo
moke/seta-do-Present progressive-final particle
‘(It) is moke/seta-ing.’
The subjects of the sentences were dropped as in Imai et al.'s (2005) study. The linguistic stimuli were presented twice during each 9-s event. The first occurred at 1 s while the green screen was down, and the second took place at 6·5 s while the action was being performed. The duration of the green screen was thus increased from 1 s in Katerelos et al.'s (2003) study to 2 s so that the sentence would fit in. This modification was based on Tomasello & Kruger's (1992) finding that children learned verbs better when their mother modeled verbs before or after the referred action occurred. However, the second stimulus was presented before the completion of the action in order to use the same intransitive verb sentence frame (i.e. present progressive). The event–sentence pairing was counterbalanced such that half of the children heard the novel word seta paired with the animal bouncing event, whereas the other half heard the novel word moke.
An attention-getter in which a green circle expanded and contracted on a black background with a ‘bing’ sound was used to capture children's attention and redirect them to the screen whenever they looked away from the screen for more than 1 s, as in other similar studies (e.g. Casasola & Cohen, 2000). Once children looked back at the screen, the next trial was presented. To control for fatigue, children were presented at the beginning (pretest) and end (post-test) of the experiment with an additional 27 s movie in which a blue geometric object with moving appendages slid across the screen. Children were excluded from the final sample if the looking time at the post-test was less than 25% of that of the pretest or if it was less than 5 s.
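The task engagement criterion just described can be written as a small check. This is a sketch only: it assumes that the 5 s minimum applies to the post-test looking time, and the function and parameter names are hypothetical.

```python
def fails_engagement_criteria(pretest_s: float,
                              posttest_s: float,
                              min_ratio: float = 0.25,
                              min_posttest_s: float = 5.0) -> bool:
    """Return True if a child would be excluded for fatigue.

    A child is excluded when post-test looking time is less than
    min_ratio of pretest looking time, or less than min_posttest_s
    seconds (assumed here to refer to the post-test).
    """
    too_large_a_drop = posttest_s < min_ratio * pretest_s
    too_short_a_look = posttest_s < min_posttest_s
    return too_large_a_drop or too_short_a_look
```

For example, a child who looked for 27 s at pretest and 6 s at post-test would be excluded under this reading, because 6 s is less than 25% of 27 s (6·75 s).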
The experimental set-up and apparatus were identical to those in Katerelos et al.'s (2003) study except that Habit 2002 Beta 1 OS 9 was used instead of Habit version 7.8 (Cohen, 2002). The stimuli were shown on a Sony Trinitron Multiscan E230 17-inch monitor, placed 117 cm away from the chair in which children were seated. Sony SRS-Z750PC speakers were placed directly above the monitor, behind a small mesh opening made in the black panel. A Sony DCR-TRV17K digital video camera was placed behind a small opening in the panel, about 20 cm above the monitor. The camera was connected to an Aiwa TV-14GT33 television to allow the experimenter to code the child's eye fixations on the monitor on-line. To minimize distractions, the experimenter and the set-up were hidden behind a black wooden panel.
Design and procedure
An infant-controlled habituation paradigm with a switch design was used to teach children the novel words. First, children watched two habituation events a maximum of twenty times until their total looking time during any sample of four consecutive habituation trials was less than 50% of their total looking time during the first four habituation trials. Children who did not meet this criterion (i.e. those who did not have a 50% reduction in looking time by twenty trials) were classified as non-habituators and were excluded from the final sample. The two habituation events were presented in semi-random order with no more than two consecutive presentations of each. Once children were habituated to the initial events, they were presented with three switch test trials and their looking time during each trial was measured in order to determine how they interpreted the novel words. In the switch test trials, the agent or the action or both elements paired with one of the habituation events were ‘switched’ to evaluate which association the child made with the novel word (see Table 1). For instance, in the agent switch trial, the agents of the two events were switched, while all other aspects (action, word) associated with the event remained the same. It was thus possible to determine which pairing (word–event, word–agent or word–action) represented a violation of the children's expectations, as they would look longer (i.e. recover) at the combination perceived as novel. The test trial with the switched element to which the children recover would indicate their initial word learning association.
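The habituation criterion described above can be sketched as a sliding-window check. The sketch is an illustration, not the logic of the Habit software: it assumes that the comparison window may overlap the first four trials (so the earliest window tested covers trials 2 to 5), and all names are hypothetical.

```python
from typing import Optional, Sequence


def trials_to_habituation(looking_times: Sequence[float],
                          window: int = 4,
                          criterion: float = 0.5,
                          max_trials: int = 20) -> Optional[int]:
    """Return the 1-based trial on which the criterion is first met.

    The criterion is met when total looking time over `window`
    consecutive trials falls below `criterion` times the total for
    the first `window` trials. Returns None for a non-habituator.
    """
    times = list(looking_times)[:max_trials]
    if len(times) <= window:
        return None
    reference_total = sum(times[:window])
    # Test every window of consecutive trials ending on trial 5 or later.
    for end in range(window + 1, len(times) + 1):
        window_total = sum(times[end - window:end])
        if window_total < criterion * reference_total:
            return end
    return None


# Hypothetical looking times in seconds, for illustration only.
example_times = [20, 20, 20, 20, 10, 8, 6, 5]
habituated_on = trials_to_habituation(example_times)
```

A child whose function result is None after twenty trials would correspond to a non-habituator in the terms used above.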
All children underwent the word switch trials before the agent and action switch trials, as in Katerelos et al.'s (2003) study, to determine whether they had learned an associative link between the original visual events and the novel words (see footnote 1). The order of the agent and action switch trials was counterbalanced. It was assumed that if children had made an associative link between the original visual events and the novel words, they would look longer at the word switch trials than at the baseline. These children were classified as recoverers. Those who did not were classified as non-recoverers because it was unclear whether they learned the original sentence–event combinations. In addition, those who looked at the word switch trial for less than 5 s were included in the non-recoverer group. Since each distinctive action occurred 5 s into the event, this ensured that not only those who mapped the novel words onto the agents but also those who mapped them onto the actions could notice that the word had been switched.
In Katerelos et al.'s (2003) study, one of the habituation events was used as the baseline. However, the average looking time of the last four habituation trials was used as a baseline in the present study because Japanese-speaking children in the pilot study often looked away before the action in the word switch trial was shown, making it impossible to determine whether they had learned the original event–sentence combinations in the habituation trials. Given that the same linguistic stimulus was presented across test trials while the green screen was down, it is plausible that children lost interest because they assumed that the baseline and the subsequent test trials were identical.
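The classification of recoverers and non-recoverers, using the baseline defined above, can be sketched as follows. The function names and the example values are hypothetical, and treating a looking time exactly equal to the baseline as non-recovery is an assumption.

```python
from statistics import mean
from typing import Sequence


def baseline_looking_time(habituation_times: Sequence[float],
                          n_last: int = 4) -> float:
    """Mean looking time (s) over the last n_last habituation trials."""
    return mean(habituation_times[-n_last:])


def is_recoverer(habituation_times: Sequence[float],
                 word_switch_s: float,
                 min_look_s: float = 5.0) -> bool:
    """Classify a child as a recoverer on the word switch trial.

    A recoverer looked at the word switch trial for at least
    min_look_s seconds and for longer than the baseline.
    """
    baseline = baseline_looking_time(habituation_times)
    return word_switch_s >= min_look_s and word_switch_s > baseline


# Hypothetical habituation looking times in seconds.
habituation = [20, 18, 15, 12, 9, 8, 7, 8]
```

With these hypothetical values the baseline is 8 s, so a word switch looking time of 12 s counts as recovery and one of 6 s does not.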
In order to assess inter-rater reliability, all testing sessions were videotaped, and a second coder did off-line coding on 20% of the original sample (randomly selected). The Pearson product-moment correlations between the on-line and off-line coding ranged from 0·95 to 0·99, with a mean of 0·98.
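As an illustration of the reliability statistic, the sketch below computes a Pearson product-moment correlation between two coders' looking times for one session. It is a generic implementation of the standard formula, not the authors' analysis script, and the names are hypothetical.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)
```

In use, `x` would hold the on-line coder's looking time for each trial of a session and `y` the off-line coder's times for the same trials, giving one correlation per child.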
RESULTS
The looking time data for the habituation trials were analyzed for 42 children with a two-way analysis of variance (ANOVA) with gender (males and females) as a between-subjects factor and habituation blocks (the first four trials vs. the last four trials) as a within-subjects factor. There was a significant main effect of the habituation block (F(1, 40)=626·52, p<0·001; see footnote 2). The mean total looking time was 78·8 s (SD=18·4) for the first four trials and 32·2 s (SD=10·6) for the last four trials. This confirmed the expectation that children's total looking time would drop substantially by the last four habituation trials. There was no main effect of gender nor was there an interaction between gender and block. t-tests performed on the total habituation looking times (males: M=135·22 s, SD=59·90; females: M=131·97 s, SD=53·55) and the total number of habituation trials (males: M=8·94, SD=2·81; females: M=9·00, SD=3·15) indicated that there were also no significant gender differences in either of these measures.
Figure 2 presents the looking times during the test trials and the baseline looking time. The looking times from the Japanese sample in Katerelos et al.'s (2003, 2010) study are also included to compare children's looking patterns when the novel words were presented in isolation versus in a verb sentence frame. A two-way ANOVA with gender (males and females) as a between-subjects factor and trials (baseline, word switch, agent switch and action switch) as a within-subjects factor was performed on the data. The results indicated that none of the effects were significant (ps>0·05). The finding that there was no main effect of trials was surprising. This indicates that even the mean looking times of the word switch trial were not significantly longer than the baseline looking time. In contrast, in the single word experiment (Katerelos et al., 2003, 2010), the children recovered at both the word and the agent switch trials but not at the action switch trial, as shown in Figure 2. The current findings suggest that the Japanese-speaking children aged 1;8 had not learned the original pairings when the novel words were embedded in a verb sentence frame.
However, a closer examination of the individual data revealed that not all of the children had failed to learn the original sentence–event pairings. A substantial number of children looked longer at the word switch trial than the baseline (recoverers), suggesting that these children had learned the original pairings during the habituation phase. We analyzed their data separately from those who did not look longer at the word switch trial than the baseline (non-recoverers). Sixteen children met our inclusion criteria for the recoverer group. Their looking time data were examined in a two-way ANOVA with gender as a between-subjects factor and trials as a within-subjects factor. This analysis produced a significant main effect for the trials (F(3, 42)=5·26, p=0·004). Since there was no main effect of gender and no interaction effect of trial and gender, subsequent analyses were performed on the male and female data combined.
Figure 3 shows the mean looking times of the three test trials and the baseline looking time for recoverers and non-recoverers. For the recoverer group, children looked reliably longer at the word switch trial (M=16·19 s, SD=6·53) than the baseline (M=8·03 s, SD=2·56), (LSD t(15)=5·22, one-tailed, p<0·001). Thus, these children were able to differentiate the habituated event–sentence pairings from the new event–sentence pairings. They also looked significantly longer at the action switch trial (M=14·24 s, SD=9·37) than the baseline (LSD t(15)=2·59, one-tailed, p=0·0105). However, they did not look longer at the agent switch trial (M=9·99 s, SD=8·96) than at the baseline (p>0·05). These results indicate that the children in the recoverer group associated the novel word with the action rather than with the agent.
Twenty-six children were classified as non-recoverers (i.e. they habituated but did not recover or did not look longer than 5 s at the word switch trial; see Footnote 3). None of them had a post-test looking time of less than 5 s or less than 25% of their pretest looking time, indicating that their performance in the word switch trial was not due to fatigue. Their looking time data were further analyzed to determine why they did not recover at the word switch trial. A two-way ANOVA with gender as a between-subjects factor and trials as a within-subjects factor revealed a significant main effect of trial but no other significant effects. As indicated in Figure 3, the mean looking time at the word switch trial (M=3·42 s, SD=1·89) was significantly shorter than the baseline (M=8·12 s, SD=2·73, one-tailed, LSD t(25)=7·52, p<0·001). Although there were no significant differences (p>0·05), the mean looking times at the agent switch trial (M=8·00 s, SD=7·21) and action switch trial (M=6·75 s, SD=5·40) were shorter than the baseline. Therefore, we may conclude that the non-recoverers did not learn the original pairings between the linguistic stimuli and the visual events for the agent or the action, as they did not notice the mismatch between the linguistic stimuli given while the green screen was down and the visual event that followed. Interestingly, their mean looking times at the agent and action switch trials were longer than their mean looking time at the word switch trial (one-tailed, LSD t(25)=−3·30, p=0·0015 and t(25)=−2·87, p=0·004). Since the recoverer and non-recoverer groups did not significantly differ in the total habituation looking time (M: 135·99 s vs. 131·63 s), total number of habituation trials (M: 9·44 vs. 8·69), and baseline looking time (M: 8·03 s vs. 8·12 s), it is possible that the non-recoverers habituated to the visual events but not to the pairings of linguistic stimuli with the visual events during the habituation phase. As a result, the word switch test event would have been treated as a habituation event and the children would have lost interest, thus resulting in a decreased looking time.
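The grouping rule and the fatigue check described above can be sketched as follows. This is only one possible reading of the criteria stated in the text (recovery means looking longer at the word switch trial than at baseline and longer than 5 s; fatigue means a post-test looking time under 5 s or under 25% of the pretest). The class name, function names and example values are hypothetical.

```python
from dataclasses import dataclass

MIN_WORD_SWITCH_S = 5.0       # word switch looking time must exceed 5 s
FATIGUE_MIN_POSTTEST_S = 5.0  # post-test below 5 s is taken as a sign of fatigue
FATIGUE_MIN_RATIO = 0.25      # post-test below 25% of pretest is taken as fatigue


@dataclass
class ChildLooking:
    """Looking times (s) for one child; field names are illustrative."""
    pretest: float
    baseline: float
    word_switch: float
    posttest: float


def is_recoverer(child: ChildLooking) -> bool:
    # Assumed operationalization: longer than baseline AND longer than 5 s.
    return (child.word_switch > child.baseline
            and child.word_switch > MIN_WORD_SWITCH_S)


def shows_fatigue(child: ChildLooking) -> bool:
    return (child.posttest < FATIGUE_MIN_POSTTEST_S
            or child.posttest < FATIGUE_MIN_RATIO * child.pretest)


# Hypothetical children; NOT data from the study.
child_a = ChildLooking(pretest=20.0, baseline=8.0, word_switch=16.0, posttest=18.0)
child_b = ChildLooking(pretest=20.0, baseline=8.0, word_switch=3.5, posttest=17.0)

for name, child in [("A", child_a), ("B", child_b)]:
    group = "recoverer" if is_recoverer(child) else "non-recoverer"
    print(name, group, "fatigued" if shows_fatigue(child) else "not fatigued")
```

Under this reading, a non-recoverer who still looks long at the post-test, like the hypothetical child B, cannot be explained by fatigue alone.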
There was no significant difference in language development level (Tsumori language age) between the recoverers (M=20·88, SD=3·46) and the non-recoverers (M=22·04, SD=3·84) (p>0·05). Furthermore, neither group showed significant correlations between language age and either of the habituation looking measures (ps>0·05). However, they showed quite different correlation patterns in the total habituation looking time and the test trial looking times. As shown in Table 2, the recoverer group showed significant positive correlations between the action switch looking time and both habituation measures: the total habituation looking time (r=0·62, p=0·010) and the total number of habituation trials (r=0·58, p=0·019). This finding suggests that children who looked longer at the action switch trial were also those who tended to look longer during the habituation phase. This makes sense given that these children had to look longer at the habituation events in order to associate the novel word with the action because the distinctive actions occurred at 5 s. Conversely, the non-recoverer group did not show significant correlations between any of the habituation measures and switch test trial measures. This finding supports the interpretation that the children in the non-recoverer group did not learn the pairings between the visual events and the sentences during the habituation phase.
Note (Table 2): * p<0·05, ** p<0·01 (two-tailed).
DISCUSSION
The results of Experiment 1 showed that Japanese children aged 1 ; 8 who learned the original sentence–visual event pairings during the habituation phase (i.e. recovered at the word switch trial) mapped novel words onto the action and not onto the agent. In Katerelos et al. (Reference Katerelos, Poulin-Dubois and Oshima-Takane2010), Japanese-speaking children aged 1 ; 8 looked significantly longer at the agent switch trial but not at the action switch trial, indicating that they tended to map the novel word onto the agent when both the agent and action interpretations were available. The present study replicated a large part of the methods and procedures employed in their study. The most prominent difference between the two studies was that the novel linguistic stimuli presented in the previous study were single words, whereas the present study provided morphosyntactic cues in an intransitive verb sentence frame. Thus, it seems reasonable to infer that this extra information facilitated the mapping of novel words onto actions rather than agents. This constitutes the first evidence that a single intransitive verb sentence frame is sufficient for children aged 1 ; 8 to map novel words onto actions when the agent and action interpretations are equally possible. However, approximately 60% of the children did not show clear evidence that they learned the original sentence–visual event pairings during the habituation phase as they did not recover at the word switch trial. This finding suggests that the mapping task used in Experiment 1 was too difficult for the majority of children aged 1 ; 8 and that evidence for the ability to use a verb sentence frame by age 1 ; 8 is limited.
In the present task, children heard a novel verb embedded in a single syntactic frame. In Echols & Marti's study (Reference Echols, Marti, Hall and Waxman2004), English-speaking children aged 1 ; 6 heard a novel verb in two syntactic frames with two different verb inflections: one indicating the present progressive aspect -ing and the other indicating the third person present tense -s (It's gepping; see? It geps.). It is possible that many children at this age need to hear a novel verb in more than one syntactic frame in order to learn the meaning of the verb. In fact, most previous verb learning studies have used more than one sentence frame during the teaching phase (Behrend, Reference Behrend1990; Bernal et al., Reference Bernal, Lidz, Millotte and Christophe2007; Dittmar et al., Reference Dittmar, Abbot-Smith, Lieven and Tomasello2008; Gertner et al., Reference Gertner, Fisher and Eisengart2006).
Another possible source of difficulty is the use of a sentence with a null subject. Imai et al. (Reference Imai, Lianjing, Haryu, Okada, Hirsh-Pasek, Golinkoff and Shigematsu2008) reported that English-speaking children were unable to learn new verbs when they were presented in a null subject and object sentence. However, the use of a transitive sentence frame with an overt subject and an overt object did not improve Japanese three-year-olds' performance in their study whatsoever. This is consistent with the observation that Japanese-speaking children frequently hear sentences without subjects and objects. Guerriero, Oshima-Takane & Kuriyama (Reference Guerriero, Oshima-Takane and Kuriyama2006) reported that Japanese-speaking mothers dropped arguments in sentences 56–72% of the time when talking to their children aged 1 ; 9, whereas English-speaking mothers dropped them 0–23% of the time. These findings suggest that a sentence frame with null subject or null object is unlikely to make a sentence-processing task more difficult for Japanese-speaking children.
A third possible factor that might make the mapping task difficult for children aged 1 ; 8 is that the first linguistic stimulus was presented before the visual event was shown (i.e. during the blank screen). This was done to present a novel verb in the same syntactic frame before and during the action. Thus, some children may have been unable to relate this first linguistic stimulus to the visual event that followed. Tomasello & Kruger (Reference Tomasello and Kruger1992) reported that children learned verbs better when mothers modeled verbs before the referred action occurred or after the action was completed, rather than while the action was being performed. This suggests that children are able to map verbs onto the referred action even though they hear the verbs before the referred actions occur. However, the mothers in their study modeled verbs when the action was about to take place. For instance, the mother stated her intention to perform the action (‘Now I'm going to roll it’), or she inferred the child's intention to perform it (e.g. ‘Are you going to roll it?’). Therefore, it is possible that the majority of children in Experiment 1 could not relate the first linguistic stimulus to the visual events that occurred a few seconds later because there was no pragmatic or contextual support.
Oshima-Takane et al. (Reference Oshima-Takane, Satin, Tint, Chan, Jacob and Kapia2008) examined this possibility by conducting a habituation experiment with French- and English-speaking children aged 1 ; 8 using the same design and a similar set-up with a novel word presented twice in an ongoing action event. The results indicated that both French- and English-speaking children were able to map a novel word onto the action and not onto the agent. Furthermore, approximately 70% of the children clearly learned the initial pairings of the visual events and sentences during the habituation phase, suggesting that presenting novel verbs while the events were ongoing made the word–action mapping task easier than the task used in Experiment 1. These results indicate that it is too difficult for children of this age to relate linguistic stimuli to action events shown a few seconds later without pragmatic or social support. This could have been a major reason why the majority of Japanese-speaking children in Experiment 1 failed to use morphosyntactic cues in word–action mapping. Experiment 2 addressed this directly by presenting Japanese-speaking children with linguistic stimuli during ongoing events, as did Oshima-Takane et al. (Reference Oshima-Takane, Satin, Tint, Chan, Jacob and Kapia2008).
In Experiment 2, we further simplified the word-mapping task by shortening the beginning of the movie used by Oshima-Takane et al. (Reference Oshima-Takane, Satin, Tint, Chan, Jacob and Kapia2008) by 3 s. This made children less likely to look away before the first linguistic stimulus and before the first distinctive action at 1·5 s. Those who associated the novel words with the agents during the habituation phase might see the agent in the event and look away before hearing the first linguistic stimulus during the test trials. These children would not look longer at any of the switch test trials and would be classified as non-recoverers. It is possible, then, that the recoverers in Oshima-Takane et al.'s study showed a looking pattern consistent with the action interpretation simply because the children who made the agent interpretation were not included in the recoverer group. In that case, their findings would be confounded by the exclusion of these children. Unless this possibility is ruled out, evidence for the ability of children aged 1 ; 8 to use verb morphosyntactic cues is limited. Experiment 2 was designed to examine this possibility. In addition, we predicted that children who were given the simplified mapping task in Experiment 2 would outperform those who were given Oshima-Takane et al.'s (Reference Oshima-Takane, Satin, Tint, Chan, Jacob and Kapia2008) task, following the resource limitation hypothesis.
EXPERIMENT 2
METHOD
Participants
Sixteen Japanese-speaking children (8 boys, 8 girls) with a mean age of 20·38 months (range: 19·99–21·30) participated in the study. Ten additional children (4 boys, 6 girls) were tested but were excluded from the final sample because of fussiness (6) or failure to habituate (4). Parents were asked to complete the Japanese version of the MacArthur Communicative Development Inventory (CDI; Ogura & Watamaki, Reference Ogura and Watamaki2004). The mean comprehension and production scores were 233 (range: 84–405) and 74·4 (range: 12–309), respectively. All children lived in Kyoto, Japan.
Stimuli and apparatus
Visual stimuli were the same movie clips as those used in Oshima-Takane et al.'s (Reference Oshima-Takane, Satin, Tint, Chan, Jacob and Kapia2008) study with French- and English-speaking children, with some modifications. In their study, the animal- and vehicle-like figures were modified slightly to look more unfamiliar than those in Experiment 1. In particular, the vehicle-like figure was made more similar to the animal-like figure in size, orientation and salience (see Figure 4). Furthermore, the same event was shown occurring alternately in the two possible directions (i.e. left-to-right and right-to-left) so that infants could not associate a novel word with a particular direction. For the movie clips in Experiment 2, the first 3 s of the event where the object was moving toward the wall were removed. Instead, a static image of the object near the wall was shown for 1 s before the object (agent) initiated the jumping- or bouncing-like action. In this way, children who associated the novel words with agents would no longer be labeled non-recoverers simply because they looked away before the agent performed the action. Each event began with a 1 s green screen as in Oshima-Takane et al.'s study. Thus, the duration of each event was 5·5 s and the maximum duration of a trial was 22 s.
Sentences with a null subject (‘Hora! Moke/seta-shiteiru yo’ meaning ‘Look! (It)'s moke/seta-ing’), were presented once instead of twice during each 5·5 s event. ‘Hora’ occurred at 1·5 s, and ‘moke/seta-shiteiru yo’ took place at 3·5 s while the action was being performed. The pretest, post-test, attention-getter movie clips, and the habituation criterion were the same as those in Experiment 1. The first trial of the test phase was used as the baseline, which was one of the habituation events as in Oshima-Takane et al.'s (Reference Oshima-Takane, Satin, Tint, Chan, Jacob and Kapia2008) study. The three switch test trials (i.e. word, agent and action) were presented using the same procedure as in Experiment 1. Eight different movie clips, combining different agents, actions and novel words, were created using the same verb–syntactic sentence frame to counterbalance the effects of agents (2), actions (2) and novel words (2), as in Oshima-Takane et al.'s (Reference Oshima-Takane, Satin, Tint, Chan, Jacob and Kapia2008) study. There was one female and one male participant assigned to each movie clip in the final sample (N=16).
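The counterbalancing arithmetic above (2 agents × 2 actions × 2 novel words = 8 clips, with one female and one male participant per clip, N=16) can be made concrete with a short sketch. The labels are taken from the stimulus descriptions in the text, but the way each clip is represented here is an illustrative assumption, not the authors' actual stimulus coding.

```python
from itertools import product

agents = ["animal-like", "vehicle-like"]
actions = ["jump-like", "bounce-like"]
novel_words = ["moke", "seta"]

# Every combination of agent, action and novel word yields one clip version.
clips = list(product(agents, actions, novel_words))

# One female and one male participant per clip version.
assignments = [(clip, gender) for clip in clips for gender in ("female", "male")]

print(len(clips), len(assignments))  # 8 16
for clip, gender in assignments[:4]:
    print(gender, clip)
```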
The experimental set-up and apparatus used were similar to those in Experiment 1. The stimuli were shown using the Habit OS X program (Cohen, Atkinson & Chaput, Reference Cohen, Atkinson and Chaput2004) installed on a Power Macintosh G4. The child sat on a parent's lap approximately 120 cm from a Mitsubishi RDT191S 19-inch monitor in a dimly lit testing room (2·5 m×2·5 m). Roland MS50 speakers were placed directly above the monitor, behind a black curtain. A Sony DCR-PC120 digital video camera was placed behind a small opening in the black curtain, approximately 15 cm below the monitor. The camera was connected to a Sony PVM-20N1J television in an adjacent control room to allow the experimenter to code the child's eye fixation online. The experimenter and experimental set-up were hidden from the child by the black curtain to minimize any distractions.
To assess inter-rater reliability, all testing sessions were videotaped, and a second coder coded 20% of the original sample (randomly selected) off-line. The Pearson product-moment correlations between the on-line and off-line coding ranged from 0·935 to 0·990, with a mean of 0·963.
RESULTS
The mean total habituation looking time was 213·78 s (SD=59·30) and the mean number of habituation trials was 13·13 (SD=3·53). Children's looking times during the habituation phase were analyzed with a two-way ANOVA with gender as a between-subjects factor and habituation blocks as a within-subject factor. There was a significant main effect of the habituation block (F(1, 14)=1113·94, p<0·001). The mean total looking time for the first four trials was 21·60 s (SD=2·19) and 9·55 s (SD=1·51) for the last four trials. This means that children's total looking time dropped substantially in the last block of habituation trials as expected. There was no main effect of gender nor was there an interaction between gender and block. A t-test performed on the total habituation looking time (males: M=214·89 s, SD=67·04; females: M=212·68 s, SD=55·13) and the total number of habituation trials (males: M=12·63, SD=3·54; females: M=13·63, SD=3·70) indicated that there were no significant gender differences in either measure (p>0·05).
Children's looking times during the test trials were examined in a two-way ANOVA with gender as a between-subjects factor and trials as a within-subjects factor. There was a significant main effect for the trials (F(3, 42)=4·869, p=0·005), but no other effects were significant. Since there was no main effect of gender and no interaction effect between trials and gender, the subsequent analyses were performed on the male and female data combined.
As shown in Figure 5, children looked significantly longer at the word switch trial (M=14·91 s, SD=6·42) than at the baseline (M=8·32 s, SD=6·29), (LSD t(15)=−3·398, one-tailed, p=0·002). This indicates that the children aged 1 ; 8 were able to differentiate the habituated event–sentence pairings from the new event–sentence pairings. Children looked significantly longer at the action switch trial (M=14·67 s, SD=6·58) than the baseline (LSD t(15)=−3·252, one-tailed, p=0·0025). They also looked significantly longer at the action switch trial than the agent switch trial (M=9·59 s, SD=6·72), (LSD t(15)=2·498, one-tailed, p=0·0125). However, they did not look significantly longer at the agent switch trial than at the baseline (p>0·05). These results indicate that they mapped the novel word onto the action and not onto the agent. Evidently, these results were not due to fatigue because they looked significantly longer at the post-test (M=18·32 s, SD=6·29) than at the baseline (t(15)=−4·557, one-tailed, p<0·001).
Only two out of 16 children in Experiment 2 failed to recover at the word switch trial (non-recoverers). The remaining 14 children looked longer at the word switch trial than the baseline (recoverers), indicating that they learned the original sentence–visual event pairings during the habituation phase. The two non-recoverers showed a significantly longer looking time at the baseline than the recoverers (non-recoverers: M=18·75 s, SD=5·44; recoverers: M=6·83 s, SD=4·92, one-tailed, t(14)=−3·182, p=0·004). This indicates that they failed to recover at the word switch trial because they did not fully habituate. Furthermore, the non-recoverers tended to have a lower mean CDI comprehension score than the recoverer group, and the difference was marginally significant (M=149, SD=35·36 vs. M=245, SD=97·42; one-tailed, t(14)=1·537, p=0·100). This suggests that the language development of the non-recoverers may have been less advanced than that of the recoverers.
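For readers who wish to check the baseline comparison from the reported summary statistics, the sketch below computes an independent-samples t statistic with a pooled variance. That the authors used a pooled-variance (Student) test is an assumption, although it is consistent with the reported 14 degrees of freedom. The function name `pooled_t` is hypothetical.

```python
import math


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Student's t for two independent groups, from summary statistics."""
    df = n1 + n2 - 2
    # Pooled variance weights each group's variance by its degrees of freedom.
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se, df


# Baseline looking times reported in the text:
# non-recoverers (n=2): M=18.75, SD=5.44; recoverers (n=14): M=6.83, SD=4.92.
t, df = pooled_t(18.75, 5.44, 2, 6.83, 4.92, 14)
print(f"t({df}) = {t:.2f}")  # t(14) = 3.18
```

The magnitude agrees with the reported value of 3·182 up to rounding of the published means and standard deviations. The sign depends only on the order in which the two groups are entered.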
DISCUSSION
The results of Experiment 2 demonstrated that the Japanese-speaking children aged 1 ; 8 mapped the novel word onto the action and not onto the agent. Furthermore, only 2 out of 16 children (12·5%) failed to learn the original sentence–visual event pairings. This stands in stark contrast to the results of Experiment 1 and those of Oshima-Takane et al.'s (Reference Oshima-Takane, Satin, Tint, Chan, Jacob and Kapia2008) study, where 30–60% of the children failed to do so. The results of Experiments 1 and 2 together suggest that hearing the linguistic stimulus in an ongoing context helps children use morphosyntactic cues in a verb-mapping task when no pragmatic or social support is available (see also Oshima-Takane et al., Reference Oshima-Takane, Satin, Tint, Chan, Jacob and Kapia2008) and that it is too difficult for children aged 1 ; 8 to relate linguistic stimuli to a visual dynamic event shown a few seconds later. This could explain why the majority of children aged 1 ; 8 failed to provide clear evidence that they learned the original sentence–visual event pairings during the habituation phase in Experiment 1. The use of a null subject in the sentence and the use of a single syntactic frame do not explain the findings in Experiment 1 because the same single null-subject sentence frame was used in Experiment 2.
Shortening the visual events before the first linguistic stimulus by 1·5 s helped reduce the chance that children would look away before hearing the first linguistic stimulus. This reduced the number of non-recoverers in Experiment 2 and made it unlikely that children who associated the novel word with the agent would show very short looking times during the switch test trials. Nonetheless, the results showed a looking pattern consistent with an action interpretation. This finding ruled out the possibility that the looking patterns of the recoverer group in Experiment 1 were consistent with the action interpretation simply because the children who associated the novel words with the agents tended to look away before the actions began and were classified as non-recoverers. However, the question remains whether this finding simply reflects a difference in perceptual salience between the action and the agent switches used in Experiment 2, because several studies have shown that children under age 2 ; 0 have difficulty in mapping novel verbs onto actions unless perceptual cues coincide with linguistic cues (e.g. Brandone et al., Reference Brandone, Pence, Golinkoff and Hirsh-Pasek2007). Experiment 3 examined this by conducting the same switch habituation experiment but with a linguistic stimulus that did not contain a novel word.
EXPERIMENT 3
METHOD
Participants
Sixteen Japanese-speaking children (8 boys, 8 girls) with a mean age of 19·86 months (range: 19·59–20·35) participated in the study. Three additional children (2 boys, 1 girl) were tested but were excluded from the final sample because of fussiness. Parents were asked to complete the Japanese version of the MacArthur Communicative Development Inventory (Ogura & Watamaki, Reference Ogura and Watamaki2004). The mean comprehension and production scores were 216·44 (range: 33–380) and 73·50 (range: 4–224), respectively. All children lived in Kyoto, Japan.
Design and procedure
The linguistic stimuli were removed except for the attention-getter ‘hora’, in order to assess the baseline response to the animations. The pretest and post-test and the attention-getter movie clips, the experimental room and apparatus used were the same as those for Experiment 2.
After the pretest, children were presented with two habituation events alternately during the habituation phase. The order of the two events was counterbalanced across children. Therefore, there were a total of four versions of movie clips with two different combinations of agents (animal-like/vehicle-like) and actions (jump-like/bounce-like). The habituation criterion was the same as in Experiment 2. In the test phase, children were first presented with one of the habituation events as the baseline as in Experiment 2. Then, they were given the agent and the action switch test trials (see Table 3). The order of the agent and the action switch trials was counterbalanced. If children were to look longer at the action and/or the agent switch trial than at the baseline, this would indicate that they noticed the action and/or agent switch. If there were no difference between the action and the agent switch looking times, this would indicate no difference in perceptual salience or in children's preference between the action and the agent changes.
A different coder did off-line coding on 20% of the original sample (randomly selected), and the Pearson product-moment correlations with the on-line coding ranged from 0·95 to 0·99, with a mean of 0·98.
RESULTS AND DISCUSSION
The mean total habituation looking time was 182·26 s (SD=61·50) and the mean number of habituation trials was 11·63 (SD=3·30). A t-test performed on the total habituation looking time (males: M=205·95 s, SD=65·11; females: M=158·58 s, SD=50·83) and the total number of habituation trials (males: M=12·75, SD=3·54; females: M=10·50, SD=2·83) indicated that there were no significant gender differences in either measure (ps>0·05).
Means and standard errors of children's looking times during the test trials are presented in Figure 6. A two-way ANOVA with gender as a between-subjects factor and trials as a within-subjects factor showed no main effects or interaction effect between trials and gender (ps>0·05). Therefore, there was no difference in perceptual salience or preference between the agent and the action switch trials, confirming that the agent and the action interpretations are equally possible. Furthermore, children did not notice either the agent or the action switch. The results were not due to fatigue because the children looked significantly longer at the post-test (M=18·23 s, SD=6·22) than at the baseline (M=8·58 s, SD=4·59, t(15)=−4·87, one-tailed, p<0·001). These findings, together with those of Experiment 2, suggest that children aged 1 ; 8 are able to map novel verbs to actions without relying on perceptual information.
GENERAL DISCUSSION
The present study demonstrated that Japanese-speaking children presented with novel words in a single intransitive verb sentence frame with a null subject can rapidly map novel words to actions by age 1 ; 8. The fact that they are able to do so in the verb-mapping task provides strong evidence that Japanese-speaking children are able to use morphosyntactic information in verb learning by age 1 ; 8, in particular as the agents and the actions used in the visual events did not differ in perceptual salience or preference. These results corroborate and extend previous findings that children aged under 2 ; 0 can use morphosyntactic information such as inflectional morphology, word order and function words to constrain the meaning of novel verbs (Bernal et al., Reference Bernal, Lidz, Millotte and Christophe2007; Echols & Marti, Reference Echols, Marti, Hall and Waxman2004; Gertner et al., Reference Gertner, Fisher and Eisengart2006; Oshima-Takane et al., Reference Oshima-Takane, Satin, Tint, Chan, Jacob and Kapia2008). While these findings support the rule-based account, the present results also showed that the complexity of the word-mapping task greatly influences the performance of children aged 1 ; 8. In Experiment 1, approximately 60% of the children did not show clear evidence that they learned the original pairings of sentences and visual events. However, Oshima-Takane et al.'s (Reference Oshima-Takane, Satin, Tint, Chan, Jacob and Kapia2008) study showed that the proportion of non-recoverers was halved (to 30%) when the novel verb sentence was presented in an ongoing context. In Experiment 2, the proportion of non-recoverers was further reduced to 12·5% by shortening the time before the first linguistic stimulus was presented. These findings indicate that, as predicted by the resource limitation hypothesis, when the word-mapping task is too demanding, children aged 1 ; 8 have difficulty using morphosyntactic cues effectively because their cognitive resources are limited. However, they can use morphosyntactic information in a simple verb-mapping task that has reduced memory load and attention span requirements.
This interpretation does not support the item-based account because the failure of children aged 1 ; 8 to use morphosyntactic cues is due to their limited cognitive resources, rather than the absence of abstract morphosyntactic representation. There were only two non-recoverers in Experiment 2 and they tended to have a lower mean vocabulary score than the recoverer group. However, the non-recoverers and recoverers in Experiment 1 did not differ in language test scores, suggesting that children's success in word mapping requires more cognitive resources such as working memory and attention skills in addition to their morphosyntactic knowledge when the task is demanding. Therefore, the non-recoverers in Experiment 1 were more likely to fail to learn the original word–event pairings during the habituation phase because their cognitive resources (e.g. working memory and attention skills) were less mature than those of the recoverer group. Similarly, it is possible that the children in Dittmar et al.'s (Reference Dittmar, Abbot-Smith, Lieven and Tomasello2008) study needed the practice phase to prime relevant morphosyntactic information (i.e. word order) in order to interpret transitive sentences correctly in the test phase due to their limited working memory capacity.
In the present study, we investigated the child's ability to map a novel verb onto an action in complex visual events without social cues or contextual support. While many children (non-recoverers) did not learn the novel verbs in Experiment 1, this does not mean that they cannot rapidly map a novel verb onto an action in natural contexts because there are many more ways to engage children's attention and to compensate for their limited working memory to support the mapping of words to visual events than there are in a controlled experiment, as in the present study. For instance, the mother may provide non-linguistic cues such as pointing and gestures to attract and sustain the child's attention to the visual event (Tomasello & Farrar, Reference Tomasello and Farrar1986; Zukow-Goldring, Reference Zukow-Goldring1996). Even without such non-linguistic cues, the child may not lose interest in watching the whole event and would be capable of relating the linguistic stimuli to the event that follows if the person who utters the sentence is his/her mother and/or if some rich context is given (e.g. the child hears her mother telling a third person what happens in a movie from behind the screen before the child views it). In order to understand the nature of early morphosyntactic representations in children, further research is needed to specify the conditions under which children are able to use the morphosyntactic cues in word learning. It would be interesting to investigate what types of social or contextual cues are helpful to children of this age by systematically adding these cues to the word-mapping task.
One question that remains to be answered is whether the three-year-olds in Imai et al.'s (Reference Imai, Lianjing, Haryu, Okada, Hirsh-Pasek, Golinkoff and Shigematsu2008) study failed to use verb morphosyntactic cues reliably because of their limited cognitive resources. Children younger than 3 ; 0 are able to use verb morphosyntactic cues reliably in simpler verb-mapping tasks, as the present study demonstrates. One may argue that the present study examined children's ‘recognitory comprehension’ that did not necessarily involve referential understanding of the novel words (Oviatt, Reference Oviatt1980), whereas the study by Imai et al. examined more adult-like lexical knowledge that could be generalized to new contexts. This explanation is unlikely, however, because the present mapping task requires an ability to generalize the verbs to new instances with different agents and the same actions (i.e. agent switch), but not to those with the same agents and different actions (i.e. action switch). Although the range of the generalizations examined was limited to the agents and actions appearing in the habituation phase in order to control for novelty effects, children's representation of morphosyntactic information must be abstract enough to perform the present verb-mapping task successfully.
One major difficulty with Imai et al.'s (Reference Imai, Lianjing, Haryu, Okada, Hirsh-Pasek, Golinkoff and Shigematsu2008) verb learning task is that the verb learning situation it presents is very different from those that children typically encounter in daily life. When learning novel verbs denoting the actions performed on artifact objects, children typically observe actions associated with the intended function of the artifacts (e.g. cutting an apple with a knife or cutting paper with scissors) and are able to see what result has been accomplished. Therefore, even though the objects involved in the novel actions are different, children would be able to extend the newly learned verbs to different objects without much difficulty. However, the novel actions taught in Imai et al.'s task had no such functional relationships with the novel objects. Furthermore, children were asked to extend the novel verb to a new object after observing the novel action event with a single object. Such an artificial verb learning situation would make it extremely difficult for young word learners to infer the intended meaning of the novel verbs. As a result, they are likely to be very conservative in the extension of newly learned verbs. Behrend (Reference Behrend1990) reported that English-speaking three-year-olds were able to generalize novel action verbs easily to new objects when the actions performed with the original object (e.g. spaghetti server) and the new objects (e.g. barbecue fork) produced the same result (e.g. twirling it to collect a bunch of tangled yarn lying on a tape). There is also evidence that three-year-olds' generalization of a novel verb to the same action with new instruments improved significantly when they were shown the same action performed on multiple instruments instead of always on the same instrument during the teaching phase (Forbes & Farrar, Reference Forbes and Farrar1995). 
These findings suggest that the three-year-olds' poor performance in Imai et al.'s verb-mapping task was not likely to be an accurate reflection of their capability to quickly learn the full meaning of the novel verbs. Instead, they failed to extend the novel verbs to the actions presented with different objects because the task was too demanding for three-year-olds and consumed too much of their cognitive resources, as predicted by the resource limitation hypothesis.
In sum, the present study demonstrates that Japanese-speaking children aged 1 ; 8 can rapidly map a novel word onto an action when it is presented in a single intransitive verb sentence frame with a null subject and when there are no differences in perceptual salience between the agent and action switches in the task. While more research is needed to understand the nature of morphosyntactic representation in children under 2 ; 0, the present study provides evidence that their representation is strong enough to guide verb mapping onto actions when their cognitive resources are not excessively taxed. To test the robustness of the present finding, future research should include a single word or a noun condition as a control condition in a single experiment with the same visual events.