Introduction
This study examines Mandarin-speaking children's knowledge of event semantics, and considers how the acquisition data shed light on the properties of Mandarin, and of language in general, in interpreting spatial modifiers. Place adverbials such as behind the museum in (1) modify motion events (Davidson, 1967; Parsons, 1990), and the interpretation and word order of these spatial modifiers interact with different event-denoting verbs in complex ways. For instance, in (1), the spatial modifier behind the museum can be associated either with the Agent Mary, meaning that it is Mary who was behind the museum, or with the Theme her kite, meaning that the kite flew behind the museum while Mary herself may have been elsewhere. The two interpretations are shown in (1i) and (1ii). Spatial modifiers in Mandarin Chinese demonstrate the same type of ambiguity. For example, the spatial prepositional phrase with zai ‘at’ in (2), zai chuang shang ‘on the bed’, can be oriented either towards the Agent Zhangsan or towards the Theme liang-zhang xiangpian ‘two photos’ (Teng, 1975; Fan, 1982). The ambiguity hinges on the verbs, or the types of event that they embody, as spatial modifiers of other types of verb do not demonstrate this type of ambiguity.
(1) Mary flew her kite behind the museum.
i. Mary did something behind the museum.
ii. The kite flew behind the museum. (Parsons, 1990, p. 118)
(2) Zhangsan zai chuang shang tie-le liang-zhang xiangpian.Footnote 1
Zhangsan at bed top paste-Perf two-cl photos
i. ‘Zhangsan was on the bed and pasted two photos.’
ii. ‘Two photos were pasted on the bed by Zhangsan.’ (Teng, 1975)
The examples above illustrate the complex interactions between spatial modifiers and verbs. In what follows, we will show that the interactions are governed by principles of event semantics, with Mandarin paralleling English in this regard (Deng, 2014; Deng & Yip, 2015), and we will report an empirical study that shows how Mandarin-speaking children interpret and place spatial modifiers headed by zai ‘at’ in a sentence. We first introduce the semantic principles of subevent modification and aspect shift. We then review relevant acquisition studies and raise our research questions. After that, we introduce the experimental methods used in this study and report our results. Finally, we discuss our findings and make some concluding remarks.
Theoretical background
The classification of verbs based on event type is the foundation of event semantics. Four major event types, namely State, Activity, Achievement, and Accomplishment, have been used to classify verbs (Vendler, 1957; Dowty, 1979; Parsons, 1990; Smith, 1991; Rothstein, 2004). Accomplishments and Achievements have a terminal point given by the inherent properties of the events, while States and Activities do not. The feature [± extended minimal events] distinguishes Accomplishment and Activity on the one hand from Achievement and State on the other (Rothstein, 2004, p. 194). An Accomplishment such as build a house is an event extending over an interval, whereas an Achievement such as recognize is nearly instantaneous and therefore not extendable. The shortest possible state such as love can hold at an instant, but the shortest possible activity such as run must extend over an interval. This classification of verbs based on event type has been fruitfully used to account for a wide range of linguistic phenomena across languages (Smith, 1991).
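As an informal illustration, the two-feature classification just described can be sketched as a small lookup in Python. The names EventType, telic, and extended are illustrative labels chosen here, not terms taken from the cited works: telic stands for having an inherent terminal point, and extended stands for the feature [± extended minimal events].

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EventType:
    name: str
    telic: bool     # assumption: label for "has an inherent terminal point"
    extended: bool  # assumption: label for [+/- extended minimal events]


# The four classes, cross-classified by the two features described in the text.
EVENT_TYPES = (
    EventType("State", telic=False, extended=False),
    EventType("Activity", telic=False, extended=True),
    EventType("Achievement", telic=True, extended=False),
    EventType("Accomplishment", telic=True, extended=True),
)


def classify(telic: bool, extended: bool) -> str:
    """Return the event type that carries the given feature values."""
    for event_type in EVENT_TYPES:
        if event_type.telic == telic and event_type.extended == extended:
            return event_type.name
    raise ValueError("no event type with these feature values")


# build a house: terminal point, extends over an interval
print(classify(telic=True, extended=True))
# love: no terminal point, can hold at an instant
print(classify(telic=False, extended=False))
```

The sketch only restates the feature matrix; it makes no claim about how individual verbs are assigned their feature values.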
Besides event type, we will introduce event structure, aspectual head, aspect shift, and subevent modification, and show how these event-semantic principles and primitives explain the interpretation of Mandarin spatial modifiers. According to Pustejovsky (1991, 1995), an Accomplishment has an event structure made up of two subevents, process and state, and process is the aspectual head because it is more prominent in the event structure. The motivation for positing the aspectual head is to explain cases where an adverbial modifies the head of an event rather than the entire event (Pustejovsky, 1995, pp. 74–5). For instance, in (3), the manner adverbial quietly modifies the process of drawing, rather than the result state of the picture coming into existence. The process is therefore taken to be the head of the Accomplishment.
(3) Mary quietly drew a picture. (Pustejovsky, 1995, pp. 74–5)
In the domain of spatial modifiers, either of the two subevents in the event structure of Accomplishments may be modified by the spatial modifier, giving rise to ambiguity in interpreting the spatial modifier as shown in (1) (Dowty, 1979; Parsons, 1990). This mechanism is called ‘subevent modification’. Deng (2014) applied subevent modification to the ambiguity of the spatial modifiers headed by zai ‘at’ with placement verbs in Mandarin. Placement verbs, e.g., fang ‘place’, hua ‘draw, paint’, and gua ‘hang’, encode events in which an Agent causes a Theme to move to a Goal. When placement verbs like tie ‘paste’ as shown in (2), repeated as (4) below, co-occur with a preverbal prepositional phrase headed by zai ‘at’ (zai-PP), both the locational reading (4i) and the directional reading (4ii) are available (see Teng, 1975; Li & Thompson, 1981; Fan, 1982).
(4) Zhangsan [zai chuang shang] tie-le liang-zhang xiangpian.
Zhangsan at bed top paste-Perf two-cl photos
i. ‘Zhangsan was on the bed and pasted two photos.’ (location)
ii. ‘Two photos were pasted on the bed by Zhangsan.’ (direction)
As Accomplishments, placement verbs such as tie ‘paste’ have an event structure which has two subevents: the process of sticking, and the state resulting from the sticking. In (4), if the zai-PP modifies the first subevent, the sentence means that the process of Zhangsan sticking photos took place on the bed, which is the locational reading. If it modifies the second subevent, the sentence gets its directional reading: as a result of sticking, the photos ended up on the bed.
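The way subevent modification derives the two readings of (4) can be sketched as follows. This is a deliberately simplified illustration: the class Subevent, the function zai_pp_readings, and the English paraphrases are assumptions introduced here, not a formalization proposed in the works cited.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Subevent:
    kind: str           # "process" or "state"
    predicated_of: str  # simplified: the participant the subevent is about


def zai_pp_readings(subevents, place):
    """List one reading per subevent that the zai-PP could modify."""
    readings = []
    for sub in subevents:
        if sub.kind == "process":
            # Modifying the process locates the activity itself.
            readings.append(
                ("location", f"the process involving {sub.predicated_of} takes place at {place}")
            )
        elif sub.kind == "state":
            # Modifying the result state locates where the Theme ends up.
            readings.append(
                ("direction", f"{sub.predicated_of} ends up at {place}")
            )
    return readings


# tie 'paste' as an Accomplishment: a process followed by a result state.
TIE = (Subevent("process", "Zhangsan"), Subevent("state", "the two photos"))

for label, paraphrase in zai_pp_readings(TIE, "the bed"):
    print(label, "->", paraphrase)
```

An event structure with two subevents yields two candidate readings, whereas a structure with a single process subevent would yield only the locational one under this sketch.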
It has been pointed out that, in English, a verb is not bound to a fixed event type, but can shift between two event types (Dowty, 1979; Smith, 1991; Fong, 1997; Nam, 2000; Rothstein, 2004). Of the two types, one is considered to be more prototypical and is called the basic-level event type, whereas the other is called the derived-level event type (Smith, 1991, p. 18). Posture verbs specify the configuration of an entity in relation to a reference entity, and they show State–Achievement alternation in combination with non-directional prepositional phrases (PPs), giving rise to ambiguity (Fong, 1997, pp. 72–4). For instance, a zai-PP following the Mandarin posture verb zuo ‘sit’ is ambiguous between a locational reading (5i) and a directional reading (5ii) (Li & Thompson, 1981; Fong, 1997).
(5) Ta zuo [zai shafa shang].
he sit at sofa top
i. ‘He sat / was sitting on the sofa.’ (location)
ii. ‘He sat down on the sofa.’ (direction)
The ambiguity can be explained by aspect shift of the posture verb. Posture verbs are normally dynamic state verbs (see Bach, 1986). However, a posture verb such as zuo ‘sit’ in (5) can shift into an Achievement. When it is a State, the co-occurring zai-PP modifies the state, and yields the static location reading (5i). When zuo ‘sit’ is understood as an Achievement, the spatial PP modifies the result state in the event structure of Achievement, giving rise to the change-of-location reading (5ii).
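The aspect-shift account of (5) can be summarized in a few lines of Python. The function and constant names below are hypothetical, and the sketch deliberately takes no position on which of the two event types is the basic-level one.

```python
# Event types that a posture verb such as zuo 'sit' can denote under aspect shift.
POSTURE_EVENT_TYPES = ("State", "Achievement")


def postverbal_zai_reading(event_type: str) -> str:
    """Reading of a postverbal zai-PP given the event type of the posture verb."""
    if event_type == "State":
        return "location"   # the PP modifies the state itself, as in (5i)
    if event_type == "Achievement":
        return "direction"  # the PP modifies the result state, as in (5ii)
    raise ValueError(f"posture verbs are not analyzed as {event_type} here")


def posture_verb_readings() -> dict:
    """Map each available event type to the reading it gives rise to."""
    return {etype: postverbal_zai_reading(etype) for etype in POSTURE_EVENT_TYPES}


print(posture_verb_readings())
```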
To summarize so far, semantic principles of subevent modification and aspect shift account for the ambiguity of sentences involving spatial modifiers headed by zai ‘at’ in Mandarin. In what follows, we will show that the irregular word order of zai-PPs can be captured by event semantics. A basic distinction for spatial expressions across languages is that between stative-locational expressions and movement-directional ones (Nam, 2000; Cinque, 2010). In Mandarin, word order can be used to mark the difference: preverbal zai-PPs express static location, whereas postverbal zai-PPs convey the goal or result location of a movement, as shown in (6a) and (6b) (Wang, 1957, 1980; Tai, 1975).Footnote 2 The word-order regularity can be represented as in (7).
(6a) Ta [zai mabei shang] tiao. (location)
he at horseback top jump
‘He was jumping on the horse's back.’
(6b) Ta tiao [zai mabei shang]. (direction)
he jump at horseback top
‘He jumped onto the horse.’
(7a) Location-V
(7b) V-Goal
However, the regularity in (7) is disrupted by zai-PPs occurring with posture verbs and placement verbs, as shown by (4) and (5). (4) under the directional reading and (5) under the locational reading demonstrate that some preverbal zai-PPs express Goal and some postverbal ones convey Location. These exceptions are counterexamples to the iconicity account of Tai (1975, 1985) who proposed that the word-order regularity ‘Location-V’ and ‘V-Goal’ reflects the temporal sequence of events in the real world. This iconicity account cannot explain why the Goal element, which happens last in a motion event, can occur before the placement verb, as in (4), and why the locational element, which represents the general location of the event, can appear after the posture verb, as in (5). These exceptions also challenge a syntactic account which equates the syntactic position of complement after a Chinese verb with a directional reading (Mulder & Sybesma, 1992; Huang, 1994). This approach cannot account for the locational reading for zai-PPs after posture verbs, as in (5).Footnote 3 However, the systematic exceptions to the word-order regularity can be explained by principles of aspect shift and subevent modification as shown above.
By examining several classes of verbs, Deng (2014) observed that postverbal zai-PPs can only be licensed by verbs that are dynamic States, or have a state component in their event structures. For instance, non-motional process verbs such as chi ‘eat’ are Activities which do not have a state component in their event structures and thus are incompatible with postverbal zai-PPs, as in (8).
(8) *Ta chi zai guanzi li.
he eat at restaurant inside
Intended meaning: ‘He ate in the restaurant’.
Following Liu (2009), Deng (2014) made the generalization that zai ‘at’ follows a restricted set of verbs including posture verbs, placement verbs, displacement verbs, and manner-of-motion verbs, while other types of verb have to take the directional morpheme dao ‘reach, to’ to express change of location. The restriction on verbs taking postverbal zai-PPs, and the division of labor between zai ‘at’ and dao ‘reach, to’ in postverbal position, demonstrate language-specific features.
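The generalization about postverbal zai can be pictured as a class-based lookup. The sketch below is only a schematic restatement: the dictionary VERB_CLASS and the function allows_postverbal_zai are hypothetical names, and the small lexicon contains only verbs whose class is stated explicitly in this paper.

```python
# Verb classes that, per the generalization above, may be followed by zai 'at'.
POSTVERBAL_ZAI_CLASSES = frozenset(
    {"posture", "placement", "displacement", "manner-of-motion"}
)

# Illustrative mini-lexicon (assumption: class labels follow the paper's own wording).
VERB_CLASS = {
    "zuo": "posture",               # 'sit'
    "tang": "posture",              # 'lie'
    "fang": "placement",            # 'place'
    "hua": "placement",             # 'draw, paint'
    "gua": "placement",             # 'hang'
    "tie": "placement",             # 'paste'
    "chi": "non-motional process",  # 'eat', an Activity without a state component
}


def allows_postverbal_zai(verb: str) -> bool:
    """True if the verb's class is in the restricted set that licenses V-zai."""
    return VERB_CLASS[verb] in POSTVERBAL_ZAI_CLASSES


print(allows_postverbal_zai("zuo"))  # posture verb
print(allows_postverbal_zai("chi"))  # non-motional process verb, cf. (8)
```

A class-level lookup of this kind is coarser than the actual lexical facts; it is meant only to make the shape of the restriction explicit.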
Research questions
Our primary research question is whether children show knowledge of event semantics in interpreting spatial modifiers. In the previous section, we have shown that the interpretation of some zai-PPs is governed by semantic principles of aspect shift and subevent modification, whereas the placement of zai-PPs is constrained by language-specific factors. The abstract semantic principles are not transparently manifested in the input, and children are not systematically informed of the ambiguity of zai-PPs with placement and posture verbs. The word order of zai-PPs, by contrast, is represented in the input, though the regular word orders and the systematic exceptions occur with different frequencies. How do Mandarin-speaking children acquire adult-like knowledge of the interpretation and word order of zai-PPs? By exploring the effects of general semantic principles and of language-specific properties on child language, this study will illuminate the debate on input vs. inherent principles.
There is ongoing debate as to the relative contribution of nature and nurture in child language acquisition. An influential view is the poverty of the stimulus argument that genetically encoded Universal Grammar (UG) accounts for the fact that children born in different linguistic communities acquire language in a rapid and uniform manner despite extensive variability of the input (Crain, 1991; Gleitman & Newport, 1995). However, some argue that input plays a determinant role. Tomasello and his colleagues’ experiments show that two- and three-year-old children use novel verbs only in structures that they have heard from the experimenter, and few of them can use the verb in a structure that is not found in the input (Tomasello, 2000a, 2000b, 2009). Increasingly, researchers emphasize the combined role of nature and nurture. Pinker (1984, p. 42) argued that a child's learning strategies are a combination of semantic bootstrapping and distributional learning: once a basic scaffolding of semantically induced rules and lexical items is in place, other things can be learned by observing their distribution within the structures that children understand. By studying children's null subjects, Yang (2002, pp. 22–4) argued that statistical learning is most suitable for modeling the gradualness of language development, proposing that an innate UG provides the hypothesis space and that statistical learning provides the learning mechanism.
Properties of the input are argued to play a significant role in shaping language acquisition. When input exemplifies a constant distributional pattern of a structure, it provides unambiguous evidence that the structure is used in a certain way. Slobin (1997a, 1997b) suggests that consistent cues in input facilitate language acquisition, whereas inconsistent cues confuse children. The word order of Mandarin zai-PPs has inconsistent cues in the input: while the interpretation of zai-PPs is governed by some general semantic principles, their distribution has language-specific idiosyncrasies. If inconsistency slows down acquisition, we expect the placement of zai-PPs and the division of labor between zai and dao ‘reach, to’ in the postverbal position to pose problems for language acquisition, as various types of verbs interact with zai-PPs inconsistently. The expectation is supported by some studies on acquisition of a first language (L1), and a second language (L2), as well as bilingual language acquisition. In Cantonese, another dialect of Chinese typologically close to Mandarin, hai ‘at’ is the counterpart of Mandarin zai ‘at’. Cantonese-speaking children up to age 5;0 interpreted as Goal the preverbal PPs headed by hai ‘at’ with transitive displacement verbs such as ngon ‘push’ in an act-out experiment, and two- and three-year-olds made word-order errors, putting Location PPs after motion verbs in a production task: Cantonese L1 learners have difficulty grasping the word orders of Location and Goal (Cheung, 1991, ch. 5). Non-target-like V-Location word order with activity verbs like sik6 ‘eat’ was found in the utterances produced by six Cantonese–English bilingual children aged between 1;3 and 4;6 (Yip & Matthews, 2007, pp. 190–9).
In L2 acquisition, non-target locational postverbal zai-PPs are also found in the compositions written by L2 learners of Mandarin from a variety of linguistic backgrounds (Zhou, 2011, pp. 96–9). All verbs involving change of location can take postverbal dao ‘reach, to’, but only verbs from a limited set can take zai. Corpus studies have shown that monolingual Mandarin-speaking children overuse zai where dao ‘reach, to’ should be used (Hsieh, 2010; Deng & Yip, 2015). Given the complex cues, preschool children may not have fully acquired the placement of zai-PPs and the division of labor between zai and dao in the postverbal position.
However, there is conflicting evidence. Based on two corpora in the CHILDES database (MacWhinney, 2000), Deng and Yip (2015) found that, despite inconsistent cues from the input, monolingual Mandarin-speaking children place zai-PPs in the correct positions. In the adult input from the Beijing corpus (Tardif, 1993), which has longitudinal data from ten Mandarin-speaking children aged between 1;9 and 2;3, at least 11.3% of preverbal zai-PPs expressed Goal and 19.2% of postverbal zai-PPs expressed Location, serving as inconsistent cues to the ‘Location-V’ and ‘V-Goal’ word-order regularity. Nonetheless, children demonstrated adult-like placement of zai-PPs: all the preverbal and postverbal zai-PPs in the two corpora conformed to adult grammar. Moreover, despite the Location-V and V-Goal word-order regularity in adult input in the Beijing corpus, Deng and Yip (2015) found that all preverbal zai-PPs with placement verbs (3 tokens) produced by children younger than 2;3 expressed Goal, and all their postverbal zai-PPs with posture verbs (3 tokens) expressed Location. Clearly, children were not constrained by the word-order regularity in Mandarin in the initial stage of acquisition. Instead, principles of event semantics guide them to go beyond statistical learning.
Taken together, these studies suggest that L1 learners of Cantonese, L2 learners of Mandarin, and Cantonese–English bilingual children have difficulty with the placement of spatial PPs, and L1 learners of Mandarin have trouble with the division of labor between zai ‘at’ and dao ‘reach, to’ in postverbal position. However, Mandarin monolingual children made no mistakes with the word order of zai-PPs, even before 2;3. Nonetheless, production data from corpus studies may lead to overestimation: children may have a non-adult-like grammar but fail to demonstrate it in the limited sampling of the corpus recording. For instance, in Deng and Yip (2015), there are only 10 tokens of preverbal zai-PPs and 10 tokens of postverbal ones produced by ten children before 2;3. Even though no mistake with regard to the placement of zai-PPs was spotted in this limited sample, we cannot guarantee that children have fully acquired the word order before 2;3. Furthermore, we cannot be sure that young children have knowledge of the ambiguity of spatial PPs merely from observing their productions in a corpus. Controlled experimentation, on the other hand, taps their underlying knowledge. These concerns provide the motivation to conduct experiments to further investigate Mandarin-speaking children's placement and interpretation of zai-PPs.
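The sampling concern can be made concrete with a back-of-the-envelope calculation, which is not part of the corpus study itself. Under the simplifying assumption (made here only for illustration) that the 10 tokens are independent, observing zero errors is still compatible with a fairly high underlying error rate. The function name below is hypothetical.

```python
def upper_bound_error_rate(n_tokens: int, confidence: float = 0.95) -> float:
    """Exact one-sided upper bound on the error rate after zero errors in n tokens.

    Solves (1 - p) ** n_tokens = 1 - confidence for p, i.e. the largest error
    rate under which an error-free sample of this size is still not surprising.
    Assumes independent tokens, which child corpus data need not satisfy.
    """
    return 1 - (1 - confidence) ** (1 / n_tokens)


bound = upper_bound_error_rate(10)
print(f"{bound:.3f}")  # roughly 0.26
```

In other words, ten error-free tokens cannot by themselves exclude an underlying error rate of roughly one in four at the 95% level, which is why a controlled experiment is needed.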
Besides aiming to evaluate the roles of inherent principles and input in acquiring Mandarin spatial modification, we also ask how acquisition data shed light on spatial modification in Mandarin, and on language in general. Child language researchers have long tried to bridge developmental studies and theoretical contentions. Snyder and Stromswold (1997), for instance, used the relative acquisition sequence of the double object dative and the to-dative to evaluate the relative structural complexity of the two constructions. For this study, the acquisition data will also be used to resolve some theoretical issues. We raise three specific questions:
1. Is there acquisition evidence to prove the effects of the semantic principles of aspect shift and subevent modification?
2. What is the aspectual head of Accomplishments, placement verbs in particular, in Mandarin?
3. What is the basic-level event type for Mandarin posture verbs that can shift between State and Achievement?
Following Parsons (1990) and Fong (1997), we hypothesize that aspect shift and subevent modification are universal semantic principles. If they are universal, children should be sensitive to the ambiguity of spatial modifiers at an early age. The connection between language universals and child language acquisition has been drawn since Jakobson (1968). In recent acquisition studies, Crain (2008) found that the semantic principles that govern the interpretation of disjunction instantiated by ‘or’ in certain structures emerge in early child language without decisive evidence from experience, and are common to all human languages. Similarly, the present study will test the youngest Mandarin-speaking children possible to determine whether the multiple readings of spatial PPs headed by zai before placement verbs and after posture verbs are accessible to them. If two- and three-year-olds demonstrate such knowledge, it is reasonable to conclude that the mechanisms of subevent modification and aspect shift are fundamental for human children, and may belong to language universals. We will also investigate children's mental representation of the event semantics of posture verbs and placement verbs, and compare children's representation with that of adults. Child data will give insight into the theoretical issues of aspectual head and basic-level event type.
Method
To explore whether event-semantic principles guide Mandarin-acquiring children's interpretation of zai-PPs, and whether the language-specific distribution pattern of zai poses difficulty for them, three experimental methods were used: (a) modified forced choice, (b) video choice, and (c) elicited production.
(a) The modified forced choice (MFC) task combines elements of the felicity judgment task (Chierchia, Crain, Guasti, Gualmini, & Meroni, 2001) and the forced choice grammaticality judgment task (Demuth, Machobane, & Moloi, 2003), which both require participants to choose one out of two sentences that is better in describing a situation. By juxtaposing two minimally contrasted sentences in one trial, the two tasks make the contrast clear and engage the child actively in comparing the two sentence forms. Similarly, our MFC task presents two sentences in each trial as alternative descriptions of a specific situation. A typical trial presents a picture or a video clip in the middle of the screen, and shows the cartoon figures Mickey and Minnie in two corners, as in Figure 1. The experimenter clicks the two characters one by one, and plays recordings of two test sentences prerecorded by a man and a woman, respectively. The child is then asked to reward the cartoon figure(s) who ‘said it right’. The crucial modification in our MFC task is to offer the child a third option: both are correct.Footnote 4 Children are given this option for two reasons. First, it is difficult to elicit rejection directly from children due to their ‘yes’ bias, as our pilot experiment shows. However, if children are given the both-are-right option and they only choose one from the two test sentences, it provides evidence that children reject the other. While this is only indirect evidence of rejection, all behavioral experiments can only test underlying competence indirectly. This method is the best that we can come up with for children as young as three. Second, the MFC task allows us to use two correct sentences in one trial to observe whether the participant accepts one of the two.
Suppose we want to test whether children accept sentence X. If X is paired with an obviously incorrect alternative Y, children may choose X not because it is grammatical for them, but because its competitor Y is worse. Therefore, in some trials X is paired with another good sentence Z. In this scenario, if children still choose X, or accept both, this is strong evidence that X is grammatical for them. We therefore consider this modification of the forced-choice technique more effective in revealing children's grammar. One possible objection is that giving the children a third choice might lead to another type of ‘yes’ bias: namely, the children will always choose the option that both sentences are correct. However, some children in our pilot experiment overused the both-are-right option in some test trials but not in the filler items, suggesting that their overuse was due to a lack of linguistic competence rather than to the ‘yes’ bias.
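The inference from an MFC response to acceptance and indirect rejection can be sketched as follows. The function interpret_mfc_response and its return format are hypothetical and serve only to make the reasoning above explicit; they do not describe the actual scoring procedure.

```python
def interpret_mfc_response(pair, rewarded):
    """Interpret which sentence(s) a child rewarded in one MFC trial.

    pair:     the two sentence labels presented in the trial, e.g. ("X", "Z")
    rewarded: the label(s) the child rewarded; both labels means 'both are right'
    """
    first, second = pair
    presented = {first, second}
    rewarded = set(rewarded)
    if not rewarded or not rewarded <= presented:
        raise ValueError("the child must reward one or both of the presented sentences")
    return {
        "accepted": rewarded,
        # Choosing only one sentence although 'both' was available counts as
        # indirect evidence that the other sentence was rejected.
        "indirectly_rejected": presented - rewarded,
    }


# X paired with another good sentence Z: rewarding both shows that X is accepted.
print(interpret_mfc_response(("X", "Z"), {"X", "Z"}))
# X paired with Y: rewarding only X is indirect evidence against Y.
print(interpret_mfc_response(("X", "Y"), {"X"}))
```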
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20220321092651042-0394:S0305000917000496:S0305000917000496_fig1g.jpeg?pub-status=live)
Figure 1. A typical trial on the computer screen for the modified forced choice (MFC) task.
(b) A video choice task is used to supplement the MFC task. In this task, children are asked to choose between two video clips to match with a target sentence. For example, in a typical trial, the participant was asked to pick out the video that showed the situation conveyed by the sentence in (9). In Figure 2, the video on the left of the screen shows a woman moving from the state of standing to the state of lying on the bed, and in the video on the right, the woman maintains a recumbent position on the bed. Participants who accepted both readings picked out both videos; those who had only the locational reading picked out the one on the right; and those who had only the directional reading opted for the one on the left. Again, by comparing the two minimally contrasting scenes, or the locational reading and the directional reading of the ‘posture V-zai’ structure, side by side, the child will actively consider which reading is correct for him or her. Children as young as 2;9 were shown to be able to pick out the correct video among three video clips (Deng, 2010), so we expect that two-year-olds can handle the processing load in this task.
(9) Ayi tang zai chuang shang.
aunt lie at bed top
i. ‘Auntie is lying on the bed.’ (location)
ii. ‘Auntie lay down on the bed.’ (direction)
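The mapping from a participant's video choice to the reading(s) attributed to him or her, as described for the trial with (9), can be sketched as below. The function code_video_choice and its parameter directional_side are hypothetical names; the parameter reflects the fact that the side showing the motion video may differ from trial to trial.

```python
def code_video_choice(picked, directional_side="left"):
    """Return the readings attributed to a participant for one video choice trial.

    picked:           the side(s) the participant pointed to, e.g. {"left"}
    directional_side: the side showing the change-of-posture (motion) video;
                      the other side shows the static posture video
    """
    sides = {"left", "right"}
    picked = set(picked)
    if directional_side not in sides or not picked or not picked <= sides:
        raise ValueError("sides must be 'left' and/or 'right'")
    locational_side = (sides - {directional_side}).pop()
    readings = set()
    if directional_side in picked:
        readings.add("direction")
    if locational_side in picked:
        readings.add("location")
    return readings


# In the trial for (9), the motion video is on the left of the screen.
print(code_video_choice({"right"}))          # only the locational reading
print(code_video_choice({"left", "right"}))  # both readings accepted
```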
(c) An elicited production task is also used to gather converging evidence. Elicited production reveals aspects of children's grammars by having them produce particular sentence structures that are uniquely felicitous for a context (Thornton, 1996).
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20220321092651042-0394:S0305000917000496:S0305000917000496_fig2g.jpeg?pub-status=live)
Figure 2. A typical trial on the computer screen for the video choice task.
Participants
Five groups of children (two-, three-, four-, five-, and six-year-olds) with an adult control group participated in our experiment. Child participants were recruited from two kindergartens in Shenzhen, a major city in southern China. Children in Shenzhen use Putonghua at schools, and may also receive some input from their parents’ dialects at home. Before the experiments, consent forms were distributed, together with questionnaires asking for information on the children's linguistic background. After screening out bi- or multi-dialectal children, 98 were considered monolingual speakers of Mandarin, and only their data were included for this study. Twenty adults, half being teachers from the two kindergartens in Shenzhen and half from Beijing, were tested as the control group. They were born in northern China and have been exposed to Mandarin since birth or primary school. Detailed participant information is shown in Table 1.
Table 1. Participant Information
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20220321092651042-0394:S0305000917000496:S0305000917000496_tab1.gif?pub-status=live)
Materials
As mentioned earlier, our experiment comprises three tasks: the modified forced choice (MFC), video choice, and elicited production. Our stimuli are distributed in these three tasks as summarized in Table 2.
Table 2. Distribution of the Test Materials
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20220321092651042-0394:S0305000917000496:S0305000917000496_tab2.gif?pub-status=live)
The MFC task was used to test (a) participants’ interpretation of the postverbal zai-PPs with posture verbs, (b) their interpretation of the preverbal zai-PPs with placement verbs, and (c) their knowledge of the word order of zai-PPs. It contains five types of sentence pairs, each with three tokens, as listed in the ‘Appendix’.
(a) The first type tests participants’ interpretation of postverbal zai-PPs with posture verbs. As the locational reading is predicted to be preferred, this task aimed to investigate whether children accept the directional reading. Three pairs of sentences were designed, each containing a ‘posture V-Goal/Location’ sentence and a ‘Location-posture V’ sentence, as shown in (10). The video clip for this trial showed a motion event in which a woman sat down on a chair. (10a) under the directional reading matches the video, while (10b) does not. Participants who interpreted (10a) as having a directional reading were expected to choose it. Otherwise, they would choose one or the other randomly.
(10a) Ayi zuo zai yizi shang.
aunt sit at chair top
i. ‘Auntie is sitting on the chair.’ (location)
ii. ‘Auntie sat down on the chair.’ (direction)
(10b) Ayi zai yizi shang zuo.
aunt at chair top sit
‘Auntie is sitting on the chair.’
(b) The second type of sentence pairs examines whether participants accept the locational reading for preverbal zai-PPs with placement verbs. We contrasted two types of ‘zai-placement V’ structures, as exemplified in (11). The video clip showed a motion event in which a woman sat on a bed and hung a piece of clothing onto the wardrobe, as shown in Figure 3. The sentence in (11a) is a correct description of the video under the locational reading, but is a mismatch under the directional reading. The sentence in (11b) can only have the directional reading, as real-world knowledge tells us that the Agent cannot possibly be located on the wardrobe. We called (11a) the neutral type and (11b) the biased type. Given the biased type which is unambiguously compatible with the video, if participants still chose the neutral type, then they had a locational construal of the neutral sentence.
(11a) Ayi zai chuang shang gua-le yi-jian yifu.
aunt at bed top hang-Perf one-cl clothes
i. ‘Auntie hung a piece of clothing while she was on the bed.’ (location)
ii. ‘Auntie hung a piece of clothing onto the bed.’ (direction)
(11b) Ayi zai yigui shang gua-le yi-jian yifu.
aunt at wardrobe top hang-Perf one-cl clothes
‘Auntie hung a piece of clothing onto the wardrobe.’
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20220321092651042-0394:S0305000917000496:S0305000917000496_fig3g.jpeg?pub-status=live)
Figure 3. The video clip used in testing the locational reading for preverbal zai-PPs with placement verbs.
(c) The next three types of sentence pairs test the word order of zai-PPs. In the first type, a target ‘Location-V’ sentence and a non-target ‘V-Location’ sentence were juxtaposed. One example is shown in (12). The second type of sentence pairs examines whether participants reject inappropriate preverbal Goal-denoting zai-PPs. In each pair of test sentences, a ‘V-Goal’ sentence and a non-target ‘Goal-V’ sentence were contrasted, as shown in (13). The last type of sentence pairs was used to test whether participants were sensitive to the division of labor between postverbal zai ‘at’ and dao ‘reach, to’. Each trial tested participants’ responses to an ungrammatical V-zai sentence and a grammatical V-dao sentence, as in (14).
(12a) Xiaopengyou-men zai hai li wan.
kids-plu at sea inside play
‘The kids are playing in the sea.’
(12b) *Xiaopengyou-men wan zai hai li.
kids-plu play at sea inside
Intended: ‘The kids are playing in the sea.’
(13a) Qiu diao zai di shang.
ball drop at ground top
‘The ball fell to the ground.’
(13b) *Qiu zai di shang diao.
ball at ground top drop
Intended: ‘The ball fell to the ground.’
(14a) Ayi pa dao tizi shang-mian.
auntie climb to ladder top-face
‘Auntie climbed to the top of the ladder.’
(14b) *Ayi pa zai tizi shang-mian.
auntie climb at ladder top-face
Intended: ‘Auntie climbed to the top of the ladder.’
Apart from the 15 test items, there were 53 fillers (used to test other structures not reported in this paper) and 10 control items in the MFC task. In the control items, shown in the ‘Appendix’, sentences with basic Mandarin word orders were contrasted with sentences with non-canonical word orders. All the test and control trials were randomly divided into four sessions. Trials in each session were randomized, and the position of the correct answer was counterbalanced between the left and right corners of the computer screen.
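The random division into sessions, the within-session randomization, and the left/right counterbalancing can be sketched in Python. This is only an illustration under stated assumptions: the item labels, the fixed seed, the round-robin split, and the helper name `build_sessions` are hypothetical and are not taken from the study's materials.

```python
import random

def build_sessions(items, n_sessions=4, seed=0):
    """Split items randomly into sessions, randomize trial order, and
    balance the screen side on which the correct answer appears."""
    rng = random.Random(seed)      # fixed seed (assumption) so the sketch is reproducible
    pool = list(items)
    rng.shuffle(pool)              # random assignment of items to sessions
    sessions = []
    for i in range(n_sessions):
        session_items = pool[i::n_sessions]
        # Alternate sides so left/right counts differ by at most one per session.
        sides = (["left", "right"] * (len(session_items) // 2 + 1))[:len(session_items)]
        rng.shuffle(sides)         # which item gets which side is random
        trials = list(zip(session_items, sides))
        rng.shuffle(trials)        # randomize presentation order within the session
        sessions.append(trials)
    return sessions

# Hypothetical labels for the 15 test items and 10 control items.
items = [f"test_{i}" for i in range(1, 16)] + [f"control_{i}" for i in range(1, 11)]
for number, session in enumerate(build_sessions(items), start=1):
    print(number, session)
```

Balancing the sides within each session, rather than over the whole item pool, keeps any single session from favoring one corner of the screen.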
The video choice and elicited production tasks were used to supplement the MFC task in testing participants’ interpretation of the ‘Posture verb-zai’ structure. As outlined above, when posture verbs encode States, their postverbal zai-PPs are locational, and when they shift into Achievements, their postverbal zai-PPs are directional. As the event structure of Achievement has a state component (Pustejovsky, Reference Pustejovsky1991, Reference Pustejovsky1995), the achievement/directional reading entails the state/locational reading. The entailment relationship between the directional reading and the locational reading of such sentences may be a confounding factor in the MFC test. Acceptance of a sentence in that test is not conclusive evidence for the achievement/directional reading, because the video also shows the result state and is consistent with the state reading. We therefore designed a video choice task and an elicited production task to rule out the confounding element. A typical test sentence in the video choice task is shown in (9), with the corresponding video stimuli in Figure 2. The three test trials were interspersed with two fillers, and randomized, with the positions of the correct video on the screen (left or right) counterbalanced.
In the elicited production task, among eight trials used for various purposes, one trial tested whether participants employed the ‘posture V-zai’ word order to express direction. The video showed a woman starting to sit down on a bench, but her buttocks did not touch the bench, even at the end of the video. The question in (15a) was asked to elicit the target answer in (15b). Under this scenario, a ‘posture V-zai’ sequence produced by the participant expresses direction, because the actional verb gan ‘do’ in the question elicits a non-state answer. Moreover, the video did not show the result state of the action of sitting, so a ‘posture V-zai’ sequence produced by the participant refers not to a state but to a change of state.
(15a) Ayi gangcai xiang gan shenme?
aunt just now want do what
‘What did auntie want to do just now?’
(15b) Xiang zuo zai dengzi shang.
want sit at bench top
‘(She) wanted to sit down on the bench.’
The three experimental tasks involve videos produced first-hand by the first author, and photos selected from sources that allow free downloads. All test materials were incorporated into PowerPoint slides.
Procedure
The pool of test items was divided into four parallel parts tested in four separate sessions. Each child had at least a one-day interval between the four sessions, and it took around 15 minutes for one child to finish one session. They were tested individually in a quiet room in the kindergarten, and were audio- and video-recorded. The adult controls participated in the experiment in two sessions, each lasting around 25 minutes.
Each session of the MFC task started with some training. In the training phase, Mickey and Minnie were introduced to the participants and they were told that the two characters were foreigners who had just started to learn Mandarin, and were therefore prone to make mistakes. This type of interaction establishes language rather than content as a topic (McDaniel & Cairns, Reference McDaniel, Cairns, McDaniel, Mckee and Cairns1996). The participants were then invited to judge who spoke Mandarin correctly in describing a picture or a video. There were three training items for Session 1. To prevent the child from forming the bias that one cartoon character always says the right thing, in one trial, Mickey Mouse said the right thing, and in another trial, Minnie was right. In the third trial, both cartoon characters said the right thing. This trial informs the participant that the task has a third option, the both-are-correct option. After the training, the participant was told that the two characters would start a competition. If one character was judged to speak correct Mandarin, the experimenter would give away a reward with a tick. In the end, whoever got more ticks would win the competition. This design not only makes the task attractive, but also makes it natural for the experimenter to record the participant's responses. The test phase followed the same pattern as the training phase except that the experimenter did not give any feedback. The experimenter just let the participant view the video clip or picture, played the two sound files and then asked them which character produced the correct sentence. If there was no response from the child, the experimenter would repeat both sentences and ask the question again.
Session 1 also comprised the video choice task and the elicited production task. In each trial of the elicited production task, a video clip or a picture was shown and the participant was asked what had happened. The participant was given three chances to produce the answer. The threshold was set at three because children needed to be provided with enough chances to produce the target answer on the one hand, and excessive repetition might cause them to lose confidence on the other. In each trial of the video choice task, the participant was asked to make a choice by matching a target sentence with one or both video clips. The participant was first given two training items, including one designed to let the participant know that the target answer could be ‘both are right’. If the participant did not choose one video after both videos were shown, the experimenter repeated the question and replayed the videos.
Coding
In the MFC task, ten control items involving sentences with obvious word order errors were used to screen out participants who had trouble understanding the task or who lagged significantly behind normal children in language development. Since there were three choices for each trial, the probability of a correct response by chance was 33% for each trial, and hence for each participant's responses overall. Participants whose correct rates for control items reached 70% (seven correct answers out of ten) were considered capable of understanding the task. Of 96 children tested in the MFC task, 19 did not meet this criterion for accuracy and therefore their results were excluded.Footnote 5 The general information for the remaining 77 child participants whose data were included for further analysis is shown in Table 3. All 20 adult participants’ correct rates for control items reached 100%.
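To see how strict the seven-out-of-ten criterion is relative to chance, the probability of passing by pure guessing can be worked out from the binomial distribution. The sketch below assumes independent trials and an equal one-in-three chance of a correct response on each control item. The helper name `prob_at_least` is ours and is not part of the study.

```python
from math import comb

def prob_at_least(k_min, n, p):
    """Binomial probability of k_min or more successes in n independent trials."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(k_min, n + 1))

chance = 1 / 3                                  # three response options per trial
p_pass_by_guessing = prob_at_least(7, 10, chance)
print(round(p_pass_by_guessing, 4))             # 0.0197
```

Under these assumptions, a participant who guessed on every control item would meet the criterion only about 2% of the time.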
Table 3. Participant Information for the MFC Task
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20220321092651042-0394:S0305000917000496:S0305000917000496_tab3.gif?pub-status=live)
The acceptance rate for each sentence type was calculated: for example, if the participant accepted one out of three tokens of a sentence type, the acceptance rate is 0.33. A participant with two acceptance responses out of three is considered to accept the relevant sentence type. The acceptance rates of the age groups were compared using a one-way ANOVA test to see whether the effect of age is significant. A post-hoc Bonferroni test was also conducted to determine whether the differences between each pair of groups are significant. As only two two-year-olds’ data could be used in the MFC task, we did not calculate standard deviation or carry out statistical analysis for this group.
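The two measures described above, the per-participant acceptance rate and the two-out-of-three acceptance criterion, can be illustrated with a short Python sketch. The function names and the Boolean encoding of responses are assumptions made for illustration only.

```python
def acceptance_rate(responses):
    """Proportion of accepted tokens; `responses` holds one Boolean per token."""
    return sum(responses) / len(responses)

def accepts_sentence_type(responses, min_accepted=2):
    """True if the participant accepted at least `min_accepted` tokens."""
    return sum(responses) >= min_accepted

one_of_three = [True, False, False]
two_of_three = [True, True, False]

print(round(acceptance_rate(one_of_three), 2))   # 0.33
print(accepts_sentence_type(one_of_three))       # False
print(accepts_sentence_type(two_of_three))       # True
```

Group means are averages of the per-participant rates, whereas the percentages of participants are based on the Boolean criterion, so the two figures need not coincide.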
For the video choice task, the numbers 1, 2, and 3 were used to code left-is-right, right-is-right, and both-are-right responses, respectively. Based on these numbers, the acceptance rate of each interpretation was calculated. All 98 children shown in Table 1 chose the correct video in the two control items, suggesting that they were all able to do the task. All 98 children also participated in the elicited production task. In that task, a response was coded as 1 if the participant's answer matched the target answer, and as 0 if the participant failed to provide an answer or the answer did not match the target. The percentage of correct answers for each age group was then calculated. After the first author had coded the data, half of the child and adult responses were coded independently by a second coder; inter-coder agreement was 96% for the children and 100% for the adults. The audio and video files were used to double-check the notes, and to transcribe the answers accurately.
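The step from the 1/2/3 response codes to acceptance rates for each reading can be sketched as follows. Two points are assumptions made for illustration: that the side showing the locational video is recorded for each trial, and that a both-are-right response counts as acceptance of both readings. The names `reading_rates` and `locational_sides` are hypothetical.

```python
LEFT, RIGHT, BOTH = 1, 2, 3   # response codes as described in the text

def reading_rates(responses, locational_sides):
    """Return (locational rate, directional rate) for one participant.

    responses: one code (1, 2, or 3) per trial.
    locational_sides: for each trial, LEFT or RIGHT, the side on which the
    video matching the locational reading was shown (assumed to be recorded).
    """
    locational = directional = 0
    for code, loc_side in zip(responses, locational_sides):
        if code == BOTH:              # assumption: counts toward both readings
            locational += 1
            directional += 1
        elif code == loc_side:
            locational += 1
        else:
            directional += 1
    n = len(responses)
    return locational / n, directional / n

# Hypothetical participant: picks the locational video, then both, then the directional video.
print(reading_rates([LEFT, BOTH, RIGHT], [LEFT, RIGHT, LEFT]))
```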
Results
Interpretation of posture V-zai
As shown in Table 2, three tasks were used to test participants’ interpretation of the ‘posture V-zai’ structure. The MFC task tests whether participants presented with an event involving change of location accept a ‘posture V-zai’ sentence (under the directional reading) in the presence of a ‘Location-posture V’ sentence. The results are summarized in Table 4. The decimals represent the group means: for example, the average acceptance rate of the directional reading for the three-year-old group is 0.56. The percentages in the table are the acceptance rates calculated based on the individual data. For instance, at the individual level, 60% of three-year-olds accepted the directional reading at least two out of three times. We used the at-least-2-out-of-3-trials criterion for acceptance rates, because it allows for some within-subject variation arising from performance factors such as fatigue and lapses of concentration. We also provide in parentheses the percentages of participants who accepted the structure/interpretation three out of three times. The acceptance rates did not increase much with age. A one-way ANOVA showed no significant effect of age (F(4,87) = 1.418, p = .235, ηp² = .061). Even adults did not accept the directional reading of the ‘posture V-zai’ structure 100% of the time, suggesting that it is not the preferred reading for this structure.
Table 4. Acceptance Rate of the Directional Reading of the Posture V-zai Structure by Age Group in the MFC Task
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20220321092651042-0394:S0305000917000496:S0305000917000496_tab4.gif?pub-status=live)
Notes. In this table and tables hereafter, mean = group mean, SD = standard deviation, % = percentage of participants who chose the target sentence at least two out of three times, and the percentage in parentheses = the percentage of participants who chose the target sentence three times out of three trials.
In the video choice task, the results summarized in Table 5 show that the locational reading is the preferred reading for children older than 3;0 and for adults. All of the adults were considered to have the locational reading, but only 30% were considered to have the directional reading. The preference was also found in children older than 3;0. We also found that 55% of the adults only had the locational reading, as they consistently chose videos that matched the locational reading, while none chose only the directional reading. However, some children, especially two- and three-year-olds, only had the directional reading. There is also a developmental trend (shown in Figure 4) in which acceptance of the directional reading decreases and acceptance of the locational reading increases as children grow older. Taking the choice of both videos in a trial, or inconsistent choices across the three trials, as the criterion for having both readings, 45% of adults were considered to have both readings, compared with 65% of three-year-olds. A one-way ANOVA indicated significant differences among the five age groups for the locational reading (F(4,109) = 5.021, p = .001, ηp² = .156). Post-hoc tests showed that adults accepted the locational reading significantly more often than three-year-olds (p < .001) and four-year-olds (p = .026). However, the effect of age was not significant for the directional reading (F(4,109) = 1.404, p = .238, ηp² = .049).Footnote 6 The statistics suggest that children, especially those under 5;0, did not accept the locational reading to the same extent as the adults. Paired-sample t-tests indicated a significant difference between acceptance rates of the locational reading and the directional reading for four-year-olds, five-year-olds, and adults (t(25) = 2.553, p = .017; t(29) = 3.466, p = .002; t(19) = 6.629, p < .001, respectively). Children begin to show an adult-like preference for the locational reading over the directional reading after 4;0.
We also found that in accepting the directional reading, the less preferred reading, six-year-olds and adults have larger within-group variation as compared to the locational reading (SD 0.50 vs. 0.24 for six-year-olds; 0.43 vs. 0.07 for adults). The result suggests that adults and six-year-olds have considerable individual variation in the availability of the directional reading.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20220321092651042-0394:S0305000917000496:S0305000917000496_fig4g.jpeg?pub-status=live)
Figure 4. Percentages of participants who accepted the two readings of the posture V-zai structure across ages.
Table 5. Acceptance Rate for Each Reading of the Posture V-zai Structure by Age Group in the Video Choice Task
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20220321092651042-0394:S0305000917000496:S0305000917000496_tab5.gif?pub-status=live)
Notes. *** = significantly different from adults, at p < .001; * = significantly different from adults, at p < .05.
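For a one-way between-subjects ANOVA, partial eta squared can be recovered from the F statistic and its degrees of freedom, since ηp² = (F × df_effect) / (F × df_effect + df_error). The short Python sketch below applies this identity to the two F values reported for the video choice task. The function name is ours, and the check assumes the reported values are rounded to three decimals.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared implied by an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)

# Locational reading: F(4,109) = 5.021
print(round(partial_eta_squared(5.021, 4, 109), 3))   # 0.156
# Directional reading: F(4,109) = 1.404
print(round(partial_eta_squared(1.404, 4, 109), 3))   # 0.049
```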
In the elicited production task, the participant watched a video showing a woman about to sit on a bench but stopping before her hips hit the bench. When asked what the woman wanted to do, as in (15a), 31 out of 98 children (32%) and 14 out of 20 adults (70%) answered (15b). Naturally, these participants have a directional reading since gan shenme ‘do what’ elicits events rather than states. Summarizing the findings from the three tasks, it can be concluded that the ‘posture V-zai’ structure does have a directional reading for both children and adults, but adults and children older than 4;0 prefer the locational reading.
Interpretation of zai-placement V
In the MFC task, we contrasted two types of ‘zai-placement V’ structures: the neutral type, as exemplified in (11a), which is a correct description of the video stimulus under the locational reading, but a mismatch under the directional reading, and the biased type, as shown in (11b), which is biased towards the directional reading by real-world knowledge. Given the biased type, which is unambiguously compatible with the video, if participants still accept the neutral type, it will be very strong evidence that the locational reading is available for them. Table 6 shows that children's acceptance rates for the locational reading (of the neutral sentence type) are higher than those of adults, whereas their acceptance rates for the directional reading (of the biased sentence type) are lower. A one-way ANOVA indicated that the effect of age was not significant in accepting the locational reading (F(4,90) = 0.264, p = .901, ηp² = .012), while it was significant in accepting the directional reading (F(4,90) = 2.719, p = .035, ηp² = .108).Footnote 7
Table 6. Acceptance Rate for Each Reading of the zai-placement V Structure by Age Group
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20220321092651042-0394:S0305000917000496:S0305000917000496_tab6.gif?pub-status=live)
Children's acceptance rate of the directional reading is not as high as that of adults, because some children prefer the locational reading. Some four- to six-year-olds even consistently interpret preverbal zai-PPs as locational. However, the acceptance rates for the locational reading of the neutral type were less than 35% for each age group, suggesting that both children and adults preferred the directional reading to the locational reading.
Word order of zai
If Mandarin-speaking children have acquired the word order in expressing Location and Goal, they should be able to reject postverbal locational zai-PPs (*V-Location), as exemplified in (12b), and preverbal directional zai-PPs (*Goal-V), as shown by (13b). In the juxtaposition of a grammatical Location-V sentence and a *V-Location one, if participants chose the Location-V sentence, this was deemed an implicit rejection of the *V-Location sentence. Similarly, participants’ choice of the V-Goal sentence in the presence of *Goal-V was considered an implicit rejection of the latter. Participants’ rejection rates for the two types of non-canonical word orders are shown in Table 7. It was found that children rejected *Goal-V more readily than *V-Location. A one-way ANOVA showed that the effect of age in rejecting *V-Location was significant (F(4,90) = 3.289, p = .015, ηp² = .128), and post-hoc tests revealed a significant difference between four-year-olds and adults (p = .018). The results clearly suggest that children, especially those under the age of five, have trouble rejecting non-target postverbal Location. As to participants’ performance with *Goal-V, the one-way ANOVA showed that the effect of age was not significant (F(4,90) = 1.581, p = .186, ηp² = .066).Footnote 8 At a very early age, children seem to be aware of the distribution pattern that Goal should not be in the preverbal position.
Table 7. The Rejection of Non-canonical V-Location and Goal-V by Participants across Ages
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20220321092651042-0394:S0305000917000496:S0305000917000496_tab7.gif?pub-status=live)
Note.* = significantly different from adults, at p < .05.
To express result location, many verbs need to be followed by directional phrases headed by dao ‘reach, to’ rather than zai ‘at’. In three trials designed to see whether participants accepted an incorrect V-zai sentence in the presence of a correct V-dao one, participants who chose only the V-dao sentence in at least two trials were deemed able to distinguish between zai and dao in the postverbal position. Table 8 shows that some children as old as six were not able to do so. There were significant differences as a function of age (F(4,87) = 10.474, p < .001, ηp² = .325). Post-hoc tests revealed significant differences between adults and three-, four-, five-, and six-year-olds (p = .016, p < .001, p < .001, p < .001, respectively). Children up to six have not fully acquired the division of labor between zai and dao.
Table 8. Distinguishing zai and dao in Postverbal Position by Participants across Ages
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20220321092651042-0394:S0305000917000496:S0305000917000496_tab8.gif?pub-status=live)
Notes. * = significantly different from adults, at p < .05; *** = significantly different from adults, at p < .001.
Discussion
Event semantics
Subevent modification gives rise to ambiguities with spatial modifiers if the predicate has a complex event structure (Dowty, Reference Dowty1979; Parsons, Reference Parsons1990). The children in our study showed early awareness of the ambiguity of zai-PPs with placement verbs. They have access to the two readings for zai-PPs with placement verbs at 3;0. As discussed earlier, preverbal zai-PPs may modify either subevent of the placement event, giving rise to a locational reading and a directional reading. Sometimes, even though two interpretations are theoretically possible, real-world knowledge will bias us towards a particular reading. In our MFC task, a sentence that is pragmatically neutral and compatible with the video stimuli in its locational reading is contrasted with a sentence that can only have the directional reading as dictated by real-world knowledge, and is unambiguously compatible with the video. All adults accepted the second type of sentence as correct. Given the contrast with a correct sentence, a small proportion of the adults still accepted the locational reading of the neutral sentence at least twice, as shown in Table 6, indicating that adults have the locational reading alongside the strongly preferred directional reading. The same tendency is also found for children of all age groups. The children's mean acceptance rates of the locational reading are even slightly higher than those of the adults, though the difference is not statistically significant. We argue that adults are more biased by their real-world knowledge towards the preferred reading than children. Given the preferred directional reading of the biased type, adults tend to dismiss the locational reading of the neutral type. Children, on the other hand, without well-established real-world knowledge, are more likely to consider the locational reading of the neutral type, since the locational reading is part of their grammar.
While children as young as 3;0 demonstrated adult-like acceptance of the locational reading, their acceptance of the directional reading is not as high as that of adults, probably because of the influence of the Location-V word order regularity in the input, to be discussed later. The directional reading is preferred to the locational one for both children and adults, perhaps due to the salience of the result state in the event structure of placement verbs.
The acquisition data showing that Mandarin-speaking children prefer the directional reading of the preverbal spatial modifiers of the placement verbs, just as adults do, have some theoretical implications. As introduced earlier, the process subevent in English Accomplishments is normally the head of the event structure (Pustejovsky, Reference Pustejovsky1995). However, Mandarin speakers tend to interpret the preverbal zai-PP as modifying the result state rather than the process in the event structure of a Mandarin placement verb, also an Accomplishment, suggesting the head status of the result state in the event structure of this type of verb, as shown in (16). This echoes Klein, Li, and Hendriks’ (Reference Klein, Li and Hendriks2000) insight that English is more ‘action-oriented’ while Chinese is ‘result-oriented’ in their discussion of 2-phase verbs (Accomplishment and Achievement verbs, which have a source phase and a target phase) in the two languages.
Aspect shift figures prominently in the syntactic behaviors of verbs across languages (Dowty, Reference Dowty1979; Smith Reference Smith1991; Fong, Reference Fong1997; Rothstein, Reference Rothstein2004). The availability of the two readings for posture verbs results from aspect shift. The data in the current study suggest that children have knowledge of aspect shift at a very early age. The results from MFC, elicited production, and video choice tasks all show that, besides the locational reading for the posture V-zai structure, adults and children have the directional reading, even though it is the less preferred reading for them, and children's acceptance of the directional reading did not differ significantly from adults’. In the video choice task, 65% of three-year-olds were considered to have both readings, as they chose both videos in each trial, or chose inconsistently among the three trials, and 45% of adults were considered to have both readings. The data clearly demonstrate that both the directional and locational reading are available to children as young as 3;0, indicating knowledge of aspect shift. Three-year-olds were even more likely to have both readings than adults, which may be explained by the fact that adults tend to dismiss the reading that is not preferred due to the bias from their real-world knowledge, as shown by the results for the ‘zai-placement V’ structure discussed above. The video choice task also reveals that the adults and children older than 4;0 preferred the locational reading, suggesting that the basic-level event type for the posture verb is State, and its use as an Achievement is derived. In other words, posture verbs are most often conceived as state verbs by Mandarin speakers. Even though the two- and three-year-olds had both readings, they did not show significant preference for the locational reading over the directional reading. 
Besides the more basic semantic principle of aspect shift, some of them were probably also influenced by the word order regularity to map Goal to the postverbal position, and thus tended to interpret zai-PPs after posture verbs as directional.
To summarize, children as young as 3;0 show awareness of the two readings of zai-PPs with placement and posture verbs. The multiple readings can be best explained in adult grammar by subevent modification and aspect shift. Therefore, we infer that children have adult-like knowledge of these notions. Early demonstration of such knowledge supports the psychological reality of the mental processes of subevent modification and aspect shift. In the child language literature, language universals are assumed to be acquired early (Jakobson, Reference Jakobson1968; Crain, Reference Crain2008). In this case, Mandarin-speaking children demonstrate adult-like interpretation of them at a young age, lending support to aspect shift and subevent modification as universal semantic principles. However, whether these universal principles are part of the genetically engraved UG or acquired early in life, and whether they are domain-general or specific to language await future study. Even if not every Mandarin-speaking child is systematically informed of the ambiguity of zai-PPs with placement verbs and posture verbs, they encounter this type of sentence unambiguously matched with one of the two readings in a real-life context in their daily lives. Therefore, there is input to trigger the learning of these semantic principles after birth. To prove that they are innate rather than learned after birth, we need to test still younger children with little experience with language, for instance one-year-olds. Whether these semantic principles are specific to language or governed by general cognitive principles is also an issue. Though we claim them to be linguistic principles, they could be the by-product of some cognitive principles of how to represent events in our mind, which could be innate or formulated based on children's observation of daily events.
Input
In what follows, we will argue that, besides inherent knowledge of event semantics, input also plays a role in child acquisition of the interpretation and distribution of zai-PPs. There are at least two levels to show the effect of input on acquisition: at the level of input frequency, the high frequency of a structure may cause children to overuse it; at the level of input transparency, inconsistent cues in the input make a structure harder to acquire than other structures with consistent and transparent cues. Our study confirms the effect of input on acquisition at these two levels. First, input frequency influences the interpretation of the ambiguous structure. In our experiments, some children had exclusively directional readings for the postverbal zai-PPs with posture verbs, and some had exclusively locational readings for the preverbal zai-PPs with placement verbs, which can be explained by the influence of the Location-V and V-Goal word orders in Mandarin. This word order pattern is very frequent in adult input: more than 80% of preverbal zai-PPs denote Location and more than 80% of postverbal ones denote Goal (Deng & Yip, Reference Deng and Yip2015). The high frequency of the two word orders seems to exert a strong influence on some children in the video choice task: 25% of the two-year-olds, 14% of the three-year-olds, 8% of the four-year-olds, and 7% of the five-year-olds, as shown in Table 5, only had the directional reading for the posture V-zai structure. The initial preference for the directional reading is probably influenced by the high frequency of the V-Goal word order in the input, which may lead some children to think that postverbal zai-PPs are always directional. Moreover, in the MFC task, 8% of the four-year-olds, 4% of the five-year-olds, and 22% of the six-year-olds, as shown in Table 6, consistently interpreted the zai-placement V structure as locational, which may be explained by the Location-V word order regularity in the input.
However, we do not rule out another possibility, that these children misparsed the preverbal prepositional zai in input as a verb forming a serial verb construction with the following placement verb, since zai has both verbal and prepositional usages due to grammaticalization (Peyraube, Reference Peyraube1994).
Second, input transparency influences the acquisition of the word order of zai-PPs. The placement of Goal-denoting zai-PPs and locational ones in Mandarin is inconsistent. The otherwise consistent pattern of Location-V and V-Goal is disrupted by preverbal Goal-denoting zai-PPs with placement verbs and postverbal locational zai-PPs with posture verbs. As shown in Deng and Yip (Reference Deng and Yip2015), at least 11.3% of preverbal zai-PPs and 19.2% of postverbal ones in the adult input are exceptions to the word order regularity. Besides, zai-PPs follow verbs from a restricted set: only verb classes that have a dynamic/resultative state component in the event structure can take them. The inconsistent cues are predicted to complicate the acquisition task. This was not borne out by the naturalistic production data from child language corpora (Deng & Yip, Reference Deng and Yip2015). However, the limited sampling of the corpus recording may explain the absence of non-target word order. In our comprehension task, four-year-olds showed significant difference from adults in accepting non-canonical postverbal locational zai-PPs under the experimental setting. The finding is in line with the production data from monolingual Cantonese-speaking children, Cantonese–English bilingual children, and L2 learners of Mandarin, who produced non-target V-Location orders (Cheung, Reference Cheung1991; Yip & Matthews, Reference Yip and Matthews2007; Zhou, Reference Zhou2011). The inconsistency of the word order of zai explains the late acquisition of the placement of locational zai-PPs. Children have difficulty in rejecting postverbal Location, while rejecting the non-canonical preverbal Goal is easier for them. The difference can be attributed to the difference in the number of inconsistent cues in the preverbal and postverbal positions: there are more postverbal locational zai-PPs than preverbal Goal-denoting zai-PPs in adult input (19.2% vs. 11.3%) as shown by Deng and Yip (Reference Deng and Yip2015); apart from Goal-denoting zai-PPs, other result elements in Mandarin are generally excluded from the preverbal position, whereas the postverbal position accommodates a variety of elements, including Result, Frequency, Duration, and Manner (see Huang, Reference Huang, Lust, Suñer and Whitman1994). As Result elements most frequently occur postverbally, children are good at rejecting non-canonical preverbal result location. Children did not clearly distinguish zai ‘at’ and dao ‘reach, to’ in the postverbal position: up to six, they accepted zai-PPs following verbs that can only be followed by dao. The comprehension data echo the naturalistic production data reported in Hsieh (Reference Hsieh, Wilder and Åfarli2010) and Deng and Yip (Reference Deng and Yip2015). Again, the late acquisition is explained by the fact that the division of labor between zai ‘at’ and dao ‘reach, to’ in postverbal position is not transparent in the input.
The distribution of zai-PPs and the division of labor between zai ‘at’ and dao ‘reach, to’ are difficult to learn because they are determined by their co-occurring verbs, and learners have to master knowledge of the semantics of each verb and its membership in a verb class through experience. Postverbal zai follows a limited number of verb classes, and children have to learn whether a verb belongs to these classes or not. This task is difficult for at least two reasons. First of all, children have to learn through experience the semantics and syntactic properties of each verb. Different languages may have different mappings from lexical semantics to syntax: for instance, the verb for ‘sneeze’ is unergative in Italian and Dutch, unaccusative in Eastern Pomo, and flexible in Choctaw (Rosen, 1984). A further source of complexity is frequent aspect shift in Mandarin, which makes it difficult for learners to pin down some verbs’ event types (see Deng, 2014).
In sum, our child data demonstrate the combined forces of inherent principles of subevent modification and aspect shift, on the one hand, and input-based distributional learning, on the other, in language acquisition (see Pinker, 1984; Yang, 2002, 2004). In addition, learning the syntax and semantics of each verb through experience is an indispensable part of language acquisition.
Conclusions
This study lends developmental support to some event-semantics principles as language universals. If the semantic principles of aspect shift and subevent modification are universal, and the interpretation of Mandarin zai-PPs with posture verbs and placement verbs hinges on these universals, children are expected to develop adult-like representations very early. It turns out that young children access the two readings of the ‘posture V-zai’ structure, and those of the ‘zai-placement V’ structure in an adult-like way, supporting aspect shift and subevent modification as universal semantic principles. The placement of Goal-denoting zai-PPs and Location-denoting ones in Mandarin is inconsistent, and zai-PPs follow a restricted set of verbs. If children's learning mechanism involves statistical/distributional learning, the inconsistent pattern of zai-PP placement in the input will complicate the acquisition of the distribution of zai-PPs. We tested whether children could reject inappropriate postverbal locational zai-PPs, and make a distinction between postverbal zai and dao. Many children failed to do so, suggesting that inconsistent input delays the full acquisition of the distribution of zai. Our child data demonstrate the combined forces of inherent principles and input-based distributional learning in language acquisition.
Our child and adult data also suggest two further points. First, in the event structure of a Mandarin placement verb, the subevent of result state is more salient, as children and adults prefer to interpret zai-PPs preceding the placement verb as the result location. Second, the default event type for Mandarin posture verbs is State, since children and adults prefer the locational reading of postverbal zai-PPs.
Acknowledgements
We have benefited from discussions with a number of colleagues: Stephen Matthews, Boping Yuan, Thomas Lee, Candice Cheung, Ziyin Mai, Jiahui Yang, Haoze Li, Aijun Huang, and Zhuang Wu. We gratefully acknowledge the support from participants in Meidi Kindergarten, Fengdan Yayuan Kindergarten, and Tsinghua University. We have also made significant improvements, thanks to the anonymous JCL reviewers and editors, especially Prof. Cecile De Cat. The following colleagues have contributed to this study in various ways: Donald White, Hinny Wong, Haoyan Ge, Jing Yang, Jess Law, Yu'an Yang, Tracy Au, Hannah Lam, Li Yang, and Qian Guo. This research is supported by a start-up grant to the Bilingualism and Language Disorders Laboratory at the Shenzhen Research Institute of the Chinese University of Hong Kong (CUHK), CUHK funding for the University of Cambridge-CUHK Joint Laboratory for Bilingualism and the CUHK-Peking University-University System of Taiwan Joint Research Centre for Language and Human Complexity, and two General Research Fund projects funded by the Hong Kong Research Grants Council (Project nos. 14413514 and 146632016).
Appendix
Test items for the modified forced choice task (Footnote 9)
1. Interpretation of the posture V-zai structure
(1a) 阿姨坐在椅子上。
Ayi zuo zai yizi shang.
auntie sit at chair top
‘Auntie sat down on the chair.’
(1b) 阿姨在椅子上坐。
Ayi zai yizi shang zuo.
auntie at chair top sit
‘Auntie is sitting on the chair.’
(2a) 阿姨在地上蹲。
Ayi zai di shang dun.
auntie at ground top squat
‘Auntie is squatting on the ground.’
(2b) 阿姨蹲在地上。
Ayi dun zai di shang.
auntie squat at ground top
‘Auntie squatted down on the ground.’
(3a) 阿姨躺在床上。
Ayi tang zai chuang shang.
auntie lie at bed top
‘Auntie lay down on the bed.’
(3b) 阿姨在床上躺。
Ayi zai chuang shang tang.
auntie at bed top lie
‘Auntie is lying on the bed.’
2. Interpretation of the zai-placement V structure
(4a) 阿姨在墙上贴了一张相片。
Ayi zai qiang shang tie-le yi-zhang xiangpian.
auntie at wall top stick-Perf one-CL photo
‘Auntie stuck a photo onto the wall.’
(4b) 阿姨在沙发上贴了一张相片。
Ayi zai shafa shang tie-le yi-zhang xiangpian.
auntie at sofa top stick-Perf one-CL photo
‘Auntie stuck a photo on the sofa.’
(5a) 阿姨在床上挂了一件衣服。
Ayi zai chuang shang gua-le yi-jian yifu.
auntie at bed top hang-Perf one-CL clothes
‘Auntie hung a piece of clothing on the bed.’
(5b) 阿姨在衣柜上挂了一件衣服。
Ayi zai yigui shang gua-le yi-jian yifu.
auntie at wardrobe top hang-Perf one-CL clothes
‘Auntie hung a piece of clothing on (the door of) the wardrobe.’
(6a) 阿姨在纸上画画。
Ayi zai zhi shang hua hua.
auntie at paper top draw picture
‘Auntie drew on the paper.’
(6b) 阿姨在床上画画。
Ayi zai chuang shang hua hua.
auntie at bed top draw picture
‘Auntie drew on the bed.’
3. *V-Location
(7a) *小朋友划船在水里。
Xiaopengyou hua-chuan zai shui li.
kid row-boat at water inside
Intended meaning: ‘Kids are rowing in the water.’
(7b) 小朋友在水里划船。
Xiaopengyou zai shui li hua-chuan.
kid at water inside row-boat
‘Kids are rowing in the water.’
(8a) 小朋友们在海里玩。
Xiaopengyou-men zai hai li wan.
kid-PLU at sea inside play
‘Kids are playing in the sea.’
(8b) *小朋友们玩在海里。
Xiaopengyou-men wan zai hai li.
kid-PLU play at sea inside
Intended meaning: ‘Kids are playing in the sea.’
(9a) *叔叔吃在桌子上。
Shushu chi zai zhuozi shang.
uncle eat at table top
Intended meaning: ‘The man is eating at the table.’
(9b) 叔叔在桌子上吃。
Shushu zai zhuozi shang chi.
uncle at table top eat
‘The man is eating at the table.’
4. *Goal-V
(10a) #阿姨在椅子上站。
Ayi zai yizi shang zhan.
auntie at chair top stand
Intended meaning: ‘Auntie stood on the chair.’
(10b) 阿姨站在椅子上。
Ayi zhan zai yizi shang.
auntie stand at chair top
‘Auntie stood on the chair.’
(11a) 球掉在地上。
Qiu diao zai di shang.
ball drop at ground top
‘The ball fell to the ground.’
(11b) *球在地上掉。
Qiu zai di shang diao.
ball at ground top drop
Intended meaning: ‘The ball fell to the ground.’
(12a) 阿姨跳到床上。
Ayi tiao dao chuang shang.
auntie jump to bed top
‘Auntie jumped onto the bed.’
(12b) #阿姨在床上跳。
Ayi zai chuang shang tiao.
auntie at bed top jump
Intended meaning: ‘Auntie jumped onto the bed.’
5. Division of labor between zai ‘at’ and dao ‘to’ in the postverbal position
(13a) 阿姨把维尼熊拿到腿上。
Ayi ba weinixiong na dao tui shang.
auntie BA Winnie the Pooh take to leg top
‘Auntie took Winnie the Pooh onto her lap.’
(13b) *阿姨把维尼熊拿在腿上。
Ayi ba weinixiong na zai tui shang.
auntie BA Winnie the Pooh take at leg top
Intended meaning: ‘Auntie took Winnie the Pooh onto her lap.’
(14a) *阿姨走在外面去了。
Ayi zou zai wai-mian qu le.
auntie walk at outside-face go LE
Intended meaning: ‘Auntie walked out.’
(14b) 阿姨走到外面去了。
Ayi zou dao wai-mian qu le.
auntie walk to outside-face go LE
‘Auntie walked out.’
(15a) 阿姨爬到梯子上面。
Ayi pa dao tizi shang-mian.
auntie climb to ladder top-face
‘Auntie climbed to the top of the ladder.’
(15b) *阿姨爬在梯子上面。
Ayi pa zai tizi shang-mian.
auntie climb at ladder top-face
Intended meaning: ‘Auntie climbed to the top of the ladder.’
6. Sample control item
(16a) *马路过小朋友。
Malu guo xiaopengyou.
road cross kid
Intended meaning: ‘Kids crossed the road.’
(16b) 小朋友过马路。
Xiaopengyou guo malu.
kid cross road
‘Kids crossed the road.’