INTRODUCTION
The most remarkable design feature of human language is its generativity, creativity, or productivity, namely, that an infinite number of sentences can be created and interpreted using a finite number of grammatical rules and a finite number of simple terms (Chomsky, 1965; Hockett, 1960). It is a safe generalization that we do not quite know how children acquire syntactic knowledge and, in particular, by what processes they acquire the basic principles of syntactic productivity.
The difficulty in explaining acquisition can be traced to some extent to the extremely complex syntactic systems that mainstream linguistics posited for many years, in particular the Chomskian transformational grammars. Lately, however, there has been an important convergence between two central approaches to syntactic structure, namely, Chomsky's Minimalist Program (Chomsky, 1995) and Dependency Grammar (Tesnière, 1959), in their basic assumptions of how to characterize syntactic structure (Osborne, Putnam & Gross, 2011), and the emergent theoretical system is significantly simpler. The shared assumption is that syntactic structure is built up by the iteration of a single binary relation between pairs of words, variously called Merge or Dependency. In dependency terminology, an atom of syntactic structure consists of two words, one of which is a head (governor) of the other, its dependent. The head-word determines the occurrence in the sentence, positioning, morphological form, and semantic role of the dependent word. Semantically, the dependent's role is to modify the lexical meaning of the head. Syntactic combinations are endocentric, meaning that the head-word determines the grammatical category of the combination (in the X-bar schema it is said to ‘project to the phrase’, and in the Minimalist Program ‘to label the combination’). The syntactic structure of the sentence is built by taking all words of a sentence and combining them pairwise by iterating the merge/dependency relation, relating each word as a dependent to one single head, except for the ‘root’ of the sentence, which has no head. The syntactic tree specifies the order in which the sentence-meaning is built up compositionally from the bottom up.
Syntactic rules are lexical-specific and are stored in the lexicon in the form of words' a priori semantic and syntactic potential for combining with other words (their valency, Allerton, 1982; or their logical-functional structure, Kaplan & Bresnan, 1982). In Minimalism there is a second, rather marginal operation called Move, which theoreticians say can also be incorporated under Merge (Cormack & Smith, 2001).
Syntax is closely connected with semantics, hence the use of syntactic structure for the semantic reading of the sentence is relatively straightforward. Syntactic relations encode the logical-semantic relation between the two words so that one of the words is a predicate and the other, its argument. In a complement relation (such as verb–direct object) the predicate is the head and the argument its dependent; in an adjunct relation (such as attributive adjective–noun) the predicate is the dependent and the argument is the syntactic head. As predicates are logical-semantic functions on variable arguments and are similar to mathematical functions such as f(x) or f(x,y), for a given predicate word, its logical argument(s) can take any number of different values in an actual sentence, hence the same predicate-specific coding rule expressing the predicate–argument relation generates an infinite number of different word combinations.
The great advantage of such a theory of syntax for developmental modelling is that it is simple and feasible as a cognitive system employed by human speakers. When transformational grammar dominated the field, it seemed impossible to learn it from the linguistic input, due to its invisible deep structure and transformations. Now that the Minimalist Program posits that structure is ‘projected’ from the lexicon, syntax is based on the concrete combinatory potential of individual words. The system is rather simple; it includes the words and a single combining operation, Merge, which, together, are sufficient to create simple sentences. Indeed, Chomsky is not reluctant to say that under Minimalist assumptions, ‘we expect that languages are “learnable,” because there is little to learn’ (Chomsky, 2000, p. 124). The sole element of syntax which may be difficult to learn from the input is the element of recursion by which one sentence is incorporated in another (Hauser, Chomsky & Fitch, 2002). Recursion not being a process relevant for the basics of syntax, we might summarize that, according to present-day mainstream linguistic theory, syntax is relatively learnable.
The new proposal I wish to put forward is that children acquire the knowledge to generate a hierarchical syntactic structure by the operation Merge or Dependency – a rather simple mechanism compared to the rich machinery assumed in the earlier literature. This provides an alternative to usage-based or Construction Grammar accounts that assume a chunk-like, unanalyzed, non-hierarchical syntactic representation for the early stages of acquisition.
What is needed is a testable model that would account for the learning of syntactic fundamentals from the parental speech. It has been suggested before (Brooks & Kempe, 2012; Green, 1997; MacWhinney, 1982; Ninio, 2006, 2011; O'Grady, 2005; Powers, 2002; Radford, 2000; Robinson, 1986; Van Langendonck, 1987) that children's word combinations are produced as syntactic Merge or Dependency couples. In the absence of systematic testing, this hypothesis has not yet been given much attention in the developmental literature.
Research questions and hypotheses
We are going to test a model of syntactic development according to which the parental input contains an easily available source from which to learn the basic principles of syntax: sentences two words long. The hypotheses are that two-word sentences in the input demonstrate the fundamental nature of syntax in a transparent fashion, and that children indeed learn from them syntactic rules for the expression of specific predicate–argument relations. To test the first of these hypotheses, we explored the two-word input of the major grammatical relations of English, which are the subject–verb, verb–object, and verb–indirect object grammatical relations (Andrews, 1985). Grammatical relations are subtypes of the general dependency relation. English sentences have to be built around a tensed verb accompanied by one or more of the verb's obligatory complements, generating the clausal core (Foley & Van Valin, 1985; Givón, 1997). Hence these relations are crucial for constructing sentences.
In the Dependency/Merge tradition, to be able to produce syntactically structured sentences children need to master three basic principles:
1 the units of syntax are two and only two words in a binary relation; sentences of all lengths and complexity are built from such atoms;
2 syntactic combinations are asymmetrical, with one of the words of the pair – the head – determining the grammatical category of the combination and carrying the bulk of the semantic content, with the other word – the dependent – merely adding further specification to the semantics of the head;
3 syntactic relations are, as a rule, not between two specific words but between a given predicate word and its variable argument(s). The arguments may take an infinite number of possible values, so that the predicate resembles a mathematical function such as f(x) or f(x,y).
This is the basis of the generativity of syntactic rules which are predicate-specific but can take any number of different argument terms.
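The function analogy in the third principle can be made concrete with a minimal sketch. The class and method names (`Predicate`, `combine`) are illustrative assumptions and not part of the model's formal apparatus; the sketch only shows that a single verb-specific rule with an open argument slot yields as many distinct combinations as there are argument values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Predicate:
    """A verb-specific coding rule with one open argument slot (hypothetical)."""
    verb: str
    relation: str  # "VO": argument follows the verb; "SV": argument precedes it

    def combine(self, argument: str) -> str:
        # The head (the verb) fixes the position of its dependent.
        if self.relation == "SV":
            return f"{argument} {self.verb}"
        return f"{self.verb} {argument}"


take = Predicate(verb="take", relation="VO")
# One rule, many values of x: the argument behaves like x in f(x).
combos = [take.combine(x) for x in ["that", "it", "this"]]
print(combos)
```

The rule itself never changes; only the value filling the argument slot does, which is the sense in which the rule is predicate-specific yet generative.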
We believe that two-word utterances in the linguistic input, and in particular ones with a core syntactic relation, demonstrate the principles of syntax in a transparent way.
In the first place, two-word utterances contain the shortest possible sentences in which syntactic connectivity is observed. In fact, these are naturally occurring syntactic atoms, which are available to the learner without needing to be segmented out of longer utterances. Trivially, they demonstrate that syntax is between two words.
The very shortness of these utterances may hold the key to the discovery of the asymmetrical nature of the syntactic relationship. Two-word utterances expressing a predicate–argument relation have one word for the predicate – i.e. the verb – and that leaves only a single word for the logical argument – e.g. the verb's direct object. Words that can serve as single-word referential expressions are, for example, pronouns. The hypothesis is that while single-word verbs are canonical, single-word referential expressions will tend to be defective, with a restricted ability to serve as linguistic signs or to enter into syntactic connections. The contrast between the fully functional predicate and the defective argument may transparently demonstrate the asymmetrical nature of syntactic relations.
The basis for this argument is to be found in the nature of linguistic signs, and, in particular, in the relations of signs to their objects. All signs stand for something, their signified or object. However, the way signs achieve this function varies. The logician Peirce (1865/1982), in his influential work on semiotics or the theory of signs, defined three types of sign by how they connect with their object. Symbols are arbitrary signs, relying on conventional use to determine what their objects are. Examples of symbolic linguistic signs are common nouns, verbs, adjectives, and adverbs, that is, words that function by indicating membership in a symbolic category. Icons signify via a resemblance or similarity to their signified, sharing some specific properties with it. The typical icon is the photograph. Among linguistic signs, onomatopoetic words such as meow are iconic, and so are complements of a request to imitate that model the locution to be imitated, such as in the sentence Say ‘please’, as the sign in this case is identical with its signified. Lastly, indexes (or indices) signify by indicating or pointing to an object, not by defining one through a symbolic category or by resembling it in some respect. Peirce says that anything which focuses the listener's attention on something is an index, such as a knock on the door or a pointed finger. Indexical linguistic signs are personal pronouns such as I, you, him, demonstrative pronouns such as this, that, those, and also proper nouns such as Mary, Carl, Bobby, that is, all referential expressions whose meaning is deictic and whose referents can only be established by taking the speaker, listener, and other contextual information into consideration. The least intuitive is the claim that proper nouns are indices.
However, Peirce explicitly included designations such as proper names in the category of indexical signs, and in fact proper names are considered deictic by most authorities (e.g. Donnellan, 1971) because of their non-symbolic, indicating relationship with their objects that requires contextual information to be determined.
Returning to two-word sentences, we expect the single-word complement of verbs to be mostly indexes such as pronouns and proper names; maybe icons such as modelled sounds for imitation; and very seldom to be symbolic signs such as common nouns or members of the other open classes. The reason is that common nouns referring to specific objects usually require a determiner in English, hence cannot be single-word expressions; single-word occurrences are reserved for unusual usages such as a plural noun referring to a type of object and not to specific objects (e.g. in Like bananas?). In very short phrases, specific referential objects are usually expressed by pronouns, demonstratives, proper names, or other indexical signs; we expect the same to occur in two-word sentences with a syntactic relation expressed in them.
There is a good reason why indexical signs are so short relative to referential expressions that make use of symbolic signs. Indexes operate by direct reference, by pointing to some entity, and not by symbolizing or describing it (Kaplan, 1989; Levinson, 2006). This makes them defective linguistic signs and, in particular, makes their semantics almost non-existent. Indeed, indexical signs are very similar to gestures (Goldin-Meadow, 2007).
Gesture–word combinations in parental speech will help us see why the combination of a symbolic sign and an indexical sign is asymmetrical. Let us assume for the sake of the argument that we are considering a combination between a verb and a pointing gesture. The combination is clearly asymmetrical: the verb carries the majority of the meaning of this communication, for instance by requesting a type of action; the pointing gesture adds some supplementary information on the parameters of this action. Two-word utterances combining a verb with an index are not very far from the combination of verb and gesture. Here, too, the verb carries the bulk of the meaning of the utterance. For instance, a parent may say Push! and point to a button on some toy; she may also say Push that! in the same circumstances. Whether or not she uses the indexical sign, she needs to provide gestural and contextual support so that the child will know what she wants him to push. The contribution of the pronoun that is as supplementary as that of a gesture. Just as the combination of a verb and a gesture is asymmetrical, so is the combination of the verb and the index. The combination is not only asymmetrical but also clearly endocentric – it is a verb phrase, not a head-less concatenation of two words.
Assuming that the child already knows the vocabulary items and can guess the meaning of the sentence from the non-linguistic context (Macnamara, 1972), the respective roles of the two words can easily be identified.
The two-word sentence containing a verb and an index appears as the combination of an act of reference to an object and an act of predicating for it an argument role in the event-description given by the verb. For instance, in the two-word sentence Take that, which is a request to take some indicated object, the requested action of taking is encoded by the verb take, and the object on which the action of taking is to be performed is coded by the pointer-word that. As the semantic-logical role of the object is dependent on the meaning of the verb, the relation can only be conceptualized as that of asymmetrical dependency.
Because of the restriction of indexical signs to the almost-gestural function of pointing to some entity, all combinations they enter into with a symbolic sign such as a verb are dominated by the word with the full symbolic semantics. If we find that the majority of parental two-word sentences expressing core syntax have an indexical term for the complement, it implies that a child can learn from such word combinations that syntax is a fundamentally asymmetrical combination of two words.
From the same sentences, children can also learn the third basic principle of syntax, namely, that a syntactic relation is between a predicate word and one of its semantic arguments which, in principle, may take an infinite number of different values in different sentences. The child needs to realize that the semantics of the predicate resembles a mathematical function such as f(x), with x being the variable argument. This is the basis of the generativity of syntactic rules which are predicate-specific but can take any number of different argument terms. It is often assumed in the acquisition literature that there is a learnability problem caused by the fact that in any given sentence the value of the semantic argument is fixed, determined by the current value of the argument-expression. That is, a child may hear the parental sentence makes music to describe what a music-box does, and may mistakenly assume that this is a frozen combination, to be rote-learned, which can only be used with respect to music-making. The question of how children go from specific input sentences to generalizations involving variable elements is called the projection problem in the literature (Peters, 1972). In many publications, it has been proposed that the earliest word-combinations are acquired as word-specific, rote-learned phrases, usable only in the narrowly defined context in which they were acquired (e.g. MacWhinney, 1982; Pine & Lieven, 1993). In order to turn such phrases into the productive scheme of adult use, children were thought to need to acquire a number of word-specific exemplars of the same construction with different fillers, such as makes noise, makes tea, and so on, the set ultimately undergoing processes of abstraction, generalization, scheme formation, categorization, and the like until the logical argument of makes is recognized as a variable.
If our hypothesis is correct and we find that the majority of parental two-word sentences expressing core syntax have an indexical term for the complement, it implies that a child can immediately learn from such accessible word combinations that syntax involves the combination of a predicate with a variable argument. Pronouns and other indexical referential expressions are inherently variable, referring to a different entity in each different context of use. Indexical signs are deictic terms (or ‘shifters’), namely, referential expressions whose interpretation depends on the context in which they are uttered. The deictic word that and the other indexical expressions do not have a constant meaning; in one context, that refers to a piece of puzzle, in another context, to a block. This means that the verb's argument is immediately defined as a variable. Crucially, this would make the two-word syntactic couple such as take that the basis for flexible reuse and generalization to other contexts, without the need to abstract out the variable from individual specific exemplars. If the two-word syntactic atoms contain natural variables for dependents, a single input sentence can teach the syntactic behaviour of the relevant verb in any context, serving as an immediate abstract rule.
The following presents two studies that explore this model of syntactic development. In Study 1 we test the hypothesis that two-word sentences in the parental input expressing a core grammatical relation, namely subject–verb, verb–object, and verb–indirect object combinations, mostly use indexical terms for the verb's complement. Such sentences could demonstrate the fundamental nature of syntax in a transparent fashion. In Study 2 we shall test the hypothesis that children indeed learn from parental two-word sentences verb-specific rules for the expression of specific predicate–argument relations.
STUDY 1
METHOD
Participants
We used an already constructed large corpus or collection of transcribed sentences, representing English-language parental child-directed speech, which was built and annotated in a previous stage of this study (Ninio, 2011). This corpus represents the linguistic input that young children receive when acquiring syntax. We used parental speech and not the existing corpora of adult-addressed language such as the Penn Treebank Project's collection of texts from the Wall Street Journal, as there are grounds to believe that child-directed parental speech forms a special speech register with its own unique characteristics. We systematically sampled the English transcripts in the CHILDES (Child Language Data Exchange System) archive (MacWhinney, 2000). CHILDES is a public domain database for corpora on first and second language acquisition. The publicly available, shared archive contains documentation of the speech of more than 500 English-speaking parents addressed to their young children. Although each separate study is by necessity limited in its coverage of the phenomenon, the different studies pooled together can provide the requisite solid database for generalization.
The use of pooled corpora of unrelated parents as a representation of the linguistic input is a relatively conventional move in child language research (e.g. Goodman, Dale & Li, 2008; Huttenlocher, Haight, Bryk, Seltzer & Lyons, 1991; Lee & Naigles, 2005; Zamuner, Gerken & Hammond, 2005). Multiple speakers of child-directed speech may provide a good estimate of the total linguistic input to which children are exposed, which includes, besides the speech of the individual mother or father, also the speech of grandparents, aunts and uncles, older siblings, and other family members, neighbours, care professionals, and so forth, represented in our corpus by the speech of mothers and fathers unrelated to the individual child. The pooled database represents the language behaviour exhibited by the community as a whole when addressing young children. This research strategy has its own existence and justification in the field of linguistics where it is known as corpus-based linguistics. Corpus-based linguistics is applied in cases when the focus of interest is not an individual speaker (or writer) but the central tendencies of the language variety. In building our corpora, we followed closely the principles established in linguistics for constructing systematically assembled large corpora (Francis & Kučera, 1979).
The CHILDES archive stores the transcribed observations collected in various different research projects, each with its individual population and methodology. We selected projects among the ones available using the criteria that the observations were of normally developing young children with no diagnosed hearing or speech problems, and of their parents, native speakers of English, their speech produced in the context of naturalistic, dyadic parent–child interaction. We restricted the child's age during the observed period to below three years and six months. Each parent was selected individually, so that from the same research project involving the same target child, we included either the mother, or the father, or both parents as separate speakers, as long as either or both passed the criteria for inclusion.
This process resulted in the selection of parents and children from thirty-three research projects in the CHILDES archive: the British projects Belfast, Howe, Korman, Manchester, and Wells, and the American projects Bates, Bernstein-Ratner, Bliss, Bloom 1970 and 1973, Brent, Brown, Clark, Cornell, Demetras, Feldman, Gleason, Harvard Home-School, Higginson, Kuczaj, MacWhinney, McMillan, Morisset, New England, Peters-Wilson, Post, Rollins, Sachs, Suppes, Tardif, Valian, Van Houten, and Warren-Leubecker (MacWhinney, 2000). From these projects, we selected the observational studies of 471 different parent–child dyads involving a target child of the correct age range, namely, below 3;6.
In 35 of the studies there were two active parents interacting with the target child, resulting in a parental sample of 506 different parents.
In order to avoid severely unequal contributions to the pooled corpus, the number of utterances included from each parent was restricted to a maximum of 3,000. We have excluded the speech of parents addressed to other adults present in the observational session or on the telephone, as this speech may be ignored by young children because of unfamiliar subjects. All transcribed dialogue and the action and other contextual comments were checked in order to ascertain that we include only spontaneous utterances from target parent to target child.
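The per-parent cap can be sketched as a simple filter. This is a minimal illustration under stated assumptions: the `(parent_id, text)` input format and the function name `cap_per_parent` are hypothetical, and keeping the first utterances in transcript order is an assumption, since the text does not say which utterances were retained when a parent exceeded the limit.

```python
from collections import defaultdict


def cap_per_parent(utterances, max_per_parent=3000):
    """Keep at most max_per_parent utterances per parent, in original order.

    utterances: iterable of (parent_id, text) pairs (hypothetical format).
    Assumption: the earliest utterances are the ones kept.
    """
    counts = defaultdict(int)
    kept = []
    for parent_id, text in utterances:
        if counts[parent_id] < max_per_parent:
            counts[parent_id] += 1
            kept.append((parent_id, text))
    return kept


# Toy example with a cap of 2 instead of 3,000.
sample = [("p1", "push that"), ("p1", "take it"),
          ("p1", "hold it"), ("p2", "you fell")]
capped = cap_per_parent(sample, max_per_parent=2)
print(capped)
```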
The resultant parental corpus contains almost 1·5 million (1,470,811) running words of transcribed speech based on naturalistic observations of interaction between parents and their young children, representing several hundred hours of transcribed speech. Most of the children addressed were under three years of age, and 93% of the parents in the sample talked to a child between one year and two and a half years of age in all or the majority of the observations we included in the corpus. The mean age of the children addressed was 2·25 years.
Syntactic annotation for core grammatical relations
As said above, the previous stage of the project included syntactic analysis. We manually annotated the parental corpus for the three core grammatical relations involving verbs, namely the subject–verb (SV), verb–object (VO), and verb–indirect object (VI) relations. We based our dependency analyses on the detailed descriptions of Hudson's English Word Grammar (Hudson, 1990), with some modifications so as to generate in all cases a strict tree structure; namely, we placed constraints on syntactic structure so as to restrict it to single-headedness, acyclicity, and projectivity. We also consulted descriptive grammars of English, in particular Quirk, Greenbaum, Leech, and Svartvik (1985). Hudson's system was chosen for annotation as his English Word Grammar (with its online update) is a highly regarded Dependency Grammar of English.
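The three structural constraints can be illustrated with a short well-formedness check. This is a sketch only, not the annotation procedure used in the study (which was manual); the representation, a list `heads` in which `heads[i]` is the index of word i's head and the root has `None`, and the function name `is_strict_tree` are assumptions made for illustration, and head indices are assumed to be in range.

```python
def is_strict_tree(heads):
    """Check one root, acyclicity, and projectivity of a dependency analysis.

    heads[i] is the index of word i's head; the root has None.
    Storing a single head per word builds in single-headedness.
    """
    n = len(heads)
    if sum(h is None for h in heads) != 1:
        return False  # exactly one root is required

    def ancestors(i):
        # Follow head links upward; return None if a cycle is found.
        seen = []
        while heads[i] is not None:
            i = heads[i]
            if i in seen:
                return None
            seen.append(i)
        return seen

    chains = [ancestors(i) for i in range(n)]
    if any(chain is None for chain in chains):
        return False  # acyclicity violated

    # Projectivity: every word between a head and its dependent
    # must itself be dominated by that head.
    for dep, head in enumerate(heads):
        if head is None:
            continue
        lo, hi = sorted((dep, head))
        for k in range(lo + 1, hi):
            if head not in chains[k]:
                return False
    return True


print(is_strict_tree([1, None]))   # "You fell": subject depends on verb
print(is_strict_tree([None, 0]))   # "Give Mommy": object depends on verb
```

For two-word sentences the check is trivial, which is part of the point: a single head–dependent pair is the smallest structure that satisfies all three constraints.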
Syntactic annotation was done by five graduate students at the Hebrew University with training in linguistics. It relied on extensive coding instructions and a very large collection of annotated exemplars. We checked for reliability by having three pairs of coders blindly recode 1,200 utterances produced by four different parents. A check of all reliability codes showed that the accuracy of each coder was above 98%, based on codes actually given by the relevant pair of coders. If we count the match between coders on the basis of all codes that were potentially possible (five SV, five VO, and three VI relations to be identified per utterance), the accuracy climbs to close to 100%. Throughout coding, all problem cases were discussed and resolved. Ultimately, each coded utterance was double-checked by another coder.
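The difference between the two ways of counting agreement can be sketched as follows. The exact formula used in the study is not given, so this is one plausible reading and should be treated as an assumption: agreement over codes actually given is the share of codes on which both coders coincide, while agreement over all potential codes also counts every slot that both coders left empty as a match. The function name and the code labels are hypothetical.

```python
SLOTS_PER_UTTERANCE = 5 + 5 + 3  # five SV, five VO, three VI potential codes


def agreement(coder_a, coder_b):
    """Return (agreement over codes given, agreement over all potential slots).

    coder_a, coder_b: lists of sets, one set of relation codes per utterance.
    Assumes at least one code was given overall.
    """
    given = matched = 0
    for a, b in zip(coder_a, coder_b):
        given += len(a | b)    # codes given by either coder
        matched += len(a & b)  # codes given by both coders
    mismatches = given - matched
    total_slots = SLOTS_PER_UTTERANCE * len(coder_a)
    return matched / given, (total_slots - mismatches) / total_slots


# Toy data: the coders agree on the first utterance and differ on the second.
coder_a = [{"SV1", "VO1"}, {"VO1"}]
coder_b = [{"SV1", "VO1"}, {"VI1"}]
on_given, on_slots = agreement(coder_a, coder_b)
print(on_given, on_slots)
```

Because most of the thirteen potential slots are empty in any one utterance, the second measure is necessarily higher than the first, which is the pattern the reliability figures show.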
We should mention that some of the transcribed observations of the CHILDES archive are annotated for syntactic relations, using an automatic syntactic parser which has been prepared for CHILDES users. These annotations were not appropriate for the present project and our coders coded the original un-annotated transcripts in the CHILDES basic format ASCII text files. For more details of the corpus building and the coding process performed at the previous stage of the study, see Chapter 2 of Ninio (2011).
Distribution of tokens of the clausal core in parents' speech
When the whole parental corpus was coded, it was found that there were 198,453 utterances in which there was at least one token of a core grammatical relation of subject–verb, verb–object, or verb–indirect object. We selected for the present study only 7,234 two-word sentences; that is, 3·65% of all sentences. There were 3,422 sentences expressing the SV relation, 4,180 expressing the VO relation, and 294 expressing the VI relation. The longer utterances were not considered further in the reported study.
Coding complement words for form class and classifying them into Peirce's categories of signs
We classified the words serving as subject, object, and indirect object by form class as pronouns (personal pronouns such as I, you, him, demonstrative pronouns such as this, that, those, and interrogative pronouns such as what, who, whom); proper nouns (e.g. Carl, Amy, and also Mommy, Dad); common nouns (bear, belly, birds); non-finite verb forms, namely gerunds, particles, and infinitives (trying, playing, going, go, and also to); and adverbs (again, down, out). In some cases the object word was a sound imitating animals or some vocalization being offered for imitation by the hearer; for instance, Say ‘moo!’ We classified these as the proper names of sounds.
As there were over 7,000 exemplars to code, classification into form classes was done mostly automatically, using a priori prepared lists of pronouns, non-finite verbs, and adverbs, the verb say as head, as well as a list of proper names culled from the transcribed observations. Common nouns were a leftover category. After the automatic search and annotation, results were hand checked and corrected. Blind recoding of the form-class classification reached a perfect 100% recoding reliability. The results are presented in Table 1.
We next separated the form-class codes into Peirce's categories of signs as indexical signs (pronouns and proper nouns) or symbolic signs (common nouns, non-finite verb forms, and adverbs). The results of this classification are also presented in Table 1.
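The two-step classification (form class first, then sign type) can be sketched as a list lookup with common nouns as the leftover category. The word lists below contain only examples mentioned in the text and stand in for the much larger lists actually used; the function name `classify`, the order of the checks, and the handling of the verb say are assumptions made for illustration.

```python
# Illustrative word lists (examples from the text, not the study's full lists).
PRONOUNS = {"i", "you", "him", "her", "it", "this", "that", "those",
            "what", "who", "whom"}
PROPER_NOUNS = {"carl", "amy", "mommy", "dad", "kermit"}
NONFINITE_VERBS = {"trying", "playing", "going", "go", "to"}
ADVERBS = {"again", "down", "out"}

# Mapping from form class to Peirce's sign type, as described in the text.
SIGN_TYPE = {
    "pronoun": "indexical",
    "proper noun": "indexical",
    "common noun": "symbolic",
    "non-finite verb": "symbolic",
    "adverb": "symbolic",
}


def classify(complement, head_verb=""):
    """Return (form class, sign type) for a single-word complement."""
    word = complement.lower()
    if word in PRONOUNS:
        form = "pronoun"
    elif word in PROPER_NOUNS:
        form = "proper noun"
    elif word in NONFINITE_VERBS:
        form = "non-finite verb"
    elif word in ADVERBS:
        form = "adverb"
    elif head_verb.lower() == "say":
        # Modelled sounds after "say" are treated as proper names of sounds.
        form = "proper noun"
    else:
        form = "common noun"  # leftover category
    return form, SIGN_TYPE[form]


print(classify("that", "open"))
print(classify("bananas", "like"))
print(classify("moo", "say"))
```

As in the study, output of such an automatic pass would still need to be checked and corrected by hand.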
RESULTS AND DISCUSSION
Table 1 presents the distribution of complement words in two-word subject–verb, verb–object, and verb–indirect object combinations, by sign type and form class in parental speech.
Table 1. Distribution of complement words in subject–verb (SV), verb–direct object (VO), and verb–indirect object (VI) combinations, by sign type and form class in sentences of two words
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:35185:20160412043525904-0280:S0305000913000470_tab1.gif?pub-status=live)
Subject–verb couplets in two-word parental utterances
We found that in subject–verb combinations the great majority of all subjects were indexical signs, that is, the words serving as subject were pronouns (94·5%) or proper nouns (3·5%), e.g. You fell, That tickles, Mommy cried. Table 2 presents some more examples of sentences using indexical signs as subjects, objects, and indirect objects. Indexical signs accounted together for 98% of all sentences in the subject–verb combination. Common noun subjects accounted for just 2% of the sentences.
Table 2. Examples of parental two-word utterances expressing the subject–verb (SV), verb–direct object (VO), and verb–indirect object (VI) grammatical relations with an indexical expression for the subject, direct object, or indirect object, respectively
We found similar results for the two other grammatical relations.
Verb–object couplets in two-word parental utterances
The words serving as the object were in 89·8% of the cases indexical signs (pronouns and proper nouns, e.g. Hold it, Open that, See Kermit?), and only 10·2% were symbolic signs (common nouns, non-finite verb forms, and adverbs). Some more examples of sentences using indexical signs as objects are presented in Table 2. Adverbs appeared as direct objects in sentences such as Want again?
Verb–indirect object couplets in two-word parental utterances
In the verb–indirect object relation, 100% of the indirect objects were pronouns and proper nouns, namely, indexical signs (e.g. Show her, Give Mommy). Some more examples of sentences using indexical signs as indirect objects are presented in Table 2.
Total core grammar in two-word parental utterances
To summarize, in the vast majority of the two-word core grammatical combinations the dependent (subject, object, or indirect object) is expressed by an indexical referential expression. Of the total 7,896 tokens of two-word sentences expressing core grammar combinations in parental speech, 7,414, or 93·9%, of the complements to the verb were indexical signs, mostly pronouns. This is to be expected given the length restriction which precludes multiword noun phrases.
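The summary percentage follows directly from the token counts reported above; the short check below uses only figures given in the text, with variable names chosen for illustration.

```python
# Token counts per relation, as reported for two-word sentences.
relation_counts = {"SV": 3422, "VO": 4180, "VI": 294}
total_tokens = sum(relation_counts.values())  # 7,896 tokens in all

indexical_tokens = 7414
share = 100 * indexical_tokens / total_tokens
print(f"{share:.1f}%")  # prints 93.9%
```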
The results strongly support the hypothesis that two-word sentences in the input containing core syntactic relations demonstrate the fundamental nature of syntax in a transparent fashion. Even if children only rely on such short sentences for building the foundations of a syntactic system, they may be able to learn that: (a) the units of syntax are two words in a binary relation; (b) syntactic combinations are asymmetrical, with one of the words of the pair – the head – determining the meaning and nature of the combination; and (c) a syntactic relation is between a given predicate word and one of its semantic arguments – the argument may take a different value in different contexts.
The hypothesis of lexical-specific learning of syntactic atoms
The present model of syntactic development is lexicalist, following the present consensus of mainstream generative grammar (e.g. Chomsky, Reference Chomsky1995). That is, instead of learning abstract schemas such as the subject–verb construction to which different verbs can be inserted, it is assumed that children apparently need to learn this pattern of word combination for each different verb separately. As syntactic rules are for the expression of specific predicate–argument relations, this means that for each verb the child needs to learn how to express each one of the verb's semantic arguments in the subject, object, or indirect object role, or maybe in some other syntactic role such as prepositional object. In other words, they need to learn for each verb and each syntactic dependent the coding of some verb-specific semantic argument to a surface position, case-marking, and other coding rules. For example, they need to learn that with the verb want the thing they want is expressed by a postverbal term, in the accusative case if it is a pronoun. Separately and independently, they need to learn that with the verb open, the thing opened is also expressed by a postverbal term, in the accusative case if it is a pronoun. The similarity in the two case roles (aka syntactic relations) causes the well-known facilitation in the learning curve for acquiring more and more different verbs with a direct object. The facilitation is built on similarity in the form of coding, not on semantic linking rules or abstract schemas. We shall test the hypothesis that children could learn such verb-specific mapping rules from parental two-word sentences modelling each verb and its core arguments, separately. 
We propose, therefore, that children learn the basics of syntax from the shortest possible sentences in the linguistic input, namely, parental two-word sentences, by a simple exemplar-learning process that, at the same time, teaches them flexibly usable syntactic atoms and the Dependency/Merge relation, without engaging in statistical computations. To test this hypothesis, we looked at young children's sentences containing the core grammatical relations, and compared them to parental two-word sentences expressing the same relations. Our hypothesis is that young children's core grammatical relations can be traced back to the two-word atoms in parental speech. More precisely, the quantitative hypothesis is that the verbs children use in their early subject–verb, verb–object, or verb–indirect object combinations, can be traced to the verbs parents use in the two-word sentences expressing the identical syntactic relation. As two-word sentences only constitute 3·6% of all parental sentences that contain exemplars of these core syntactic relations, it is a non-trivial prediction that the two-word input sentences cover most of the verbs of children's early syntax. Even if we find this hypothesis supported, we have not proven that children learned these patterns only from two-word sentences and not from longer parental utterances, but we will have demonstrated that the two-word input is a sufficient source for children to learn verb-specific syntax from, in this early stage of syntactic development.
STUDY 2
METHOD
Participants
The children's corpus was constructed in a previous stage of this project (Ninio, Reference Ninio2011, Chapter 2). We built a corpus of young children's multiword speech, using the same transcribed observations from which we took the parents' speech. Of the 471 children in the selected observations, 50 did not produce utterances with verbal grammar, resulting in a child sample of 421 different children. We limited the age of the children to three and a half years and restricted the contribution of each individual child to 300 multiword sentences, starting from the first observation in which they produced multiword utterances. Children's utterances were included only if they were spontaneous, namely, not immediate imitations of preceding adult utterances. For each utterance marked in the original transcriptions as one uttered by the child, we hand-checked the context to make certain that the line was indeed child speech (and not, for example, an action description or parental sentence erroneously marked as child speech). The size of the resulting pooled corpus is 194,359 running words. The mean age of the children was 2 years and 29 days with a standard deviation of 4 months 9 days.
In the children's corpus, all multiword sentences were considered, not only two-word sentences as in the parental input corpus. We did not measure the MLU (Mean Length of Utterance) of the children in the corpus and it was not a selection criterion. However, we do have in the corpus utterances of the three children in the original study on the basis of which the measure of MLU was constructed, namely, the children Adam, Eve, and Sarah from the Brown sample. The first 300 multiword utterances from the start of observations were at Stage I of grammatical development in all three children, according to Brown's own definitions (Reference Brown1973, pp. 74–80). Hence the method of sampling the first 300 multiword utterances of young children apparently resulted in a corpus representing so-called Stage I speech by young children acquiring English as their first language.
Syntactic annotation for core grammatical relations
In the previous stage of the project, we manually annotated the child corpus for the three core grammatical relations involving verbs, namely the subject–verb, verb–object, and verb–indirect object relations in a method identical to the annotation of the parental corpus.
Lemmatizing verbs for comparison
We lemmatized all verbs in the texts into their respective stem groups. Lemmatization is the grouping of related verb forms that share the same stem and differ only in inflection or spelling. For example, eat, eats, ate, eaten, and eating all belong to the stem group or lemma of eat. In case of irregular verbs changing their shape when inflected, such as am and was of the verb be, these forms were also included in the lemma of the relevant stem. This process neutralizes differences in morphological shape irrelevant for the syntactic behaviour of verbs, such as differences of tense, aspect, and person. We used the lemmas in order to trace the verbs children used to the verbs in parental two-word utterances, ignoring possible morphological differences. This analysis assumes that young children ignore the differences in morphological form between verbs belonging to the same lemma, so that they treat an inflected form such as sits as equivalent to an uninflected form such as sit. In actuality, in the present study the lemmatization had a very marginal effect on the results, as shown below.
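The grouping of verb forms into stem groups can be pictured with a minimal sketch. The lookup table below is a hypothetical, hand-built fragment containing only the forms mentioned in the text; it is an assumption made for illustration and not the procedure or resource actually used in the study.

```python
# Hypothetical hand-built lookup from verb form to lemma (stem group).
# Only forms mentioned in the text are listed; a real table would be far larger.
LEMMA_OF = {
    "eat": "eat", "eats": "eat", "ate": "eat", "eaten": "eat", "eating": "eat",
    "be": "be", "am": "be", "was": "be",
    "sit": "sit", "sits": "sit",
}


def lemmatize(verb_form: str) -> str:
    """Return the lemma of a verb form.

    Forms missing from the table are returned unchanged (lower-cased),
    which is an assumption of this sketch.
    """
    form = verb_form.lower()
    return LEMMA_OF.get(form, form)


if __name__ == "__main__":
    for form in ["ate", "eating", "was", "sits"]:
        print(form, "->", lemmatize(form))
```

Treating sits and sit as the same key is what allows a child's inflected form to be traced to an uninflected parental form, and vice versa.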
RESULTS
Subject, object, and indirect object in children's speech
In the children's corpus, we found a total of 25,796 tokens of the three core syntactic relations. Table 3 presents the distribution of children's core grammatical relations by syntactic type.
Table 3. Distribution of core syntactic relations in children's corpus
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:4591:20160412043525904-0280:S0305000913000470_tab3.gif?pub-status=live)
The child corpus was restricted to the speech of children below age three years and six months. The great majority of exemplars (82·7%) in this corpus was produced by children under two and a half years of age, most between two years and two and a half.
Proportion of children's productions of core grammatical relations which were attributable to parental two-word syntactic atoms
We estimated the proportion of children's productions of core grammatical relations which were attributable to parental two-word syntactic atoms of the same syntactic type headed by the same verbs, plus/minus lemmatization. A child's utterance was coded as accounted for by parental two-word atoms of the same type and the same verb if there was at least one parental utterance in the pooled corpus with the same verb stem. That is, child utterances were matched with the speech of all the parents in the pooled corpus, not specifically with their own parents' speech. Table 4 presents some examples of child utterances and the adult utterances that were considered to be of the same verb lemma.
Table 4. Examples of child utterances and parental utterances considered to be constructed with the same lemmatized verb stem in a given syntactic relation, for verb–direct object (VO), verb–indirect object (VI), and subject–verb (SV) combinations
Basing the equivalence on lemmatized verb stems made only a marginal difference to the proportion of child utterances which were deemed to have been accounted for by parental utterances with the same verbs. In the VO and VI relations, parental two-word utterances were almost exclusively imperatives, using the bare infinitive form of the verb. Children's multiword utterances were also mostly imperatives, as in do a mummy's body, or else they were statements and requests with a first person subject, as in I bring the fence and I want some tea. In such cases, the parents' stem form verb was a perfect match. The same happens when children describe a third person subject's action and use an uninflected verb, as in tiger bang his head. However, the methodology we have used also classifies as a match an inflected verb used by the children with an uninflected imperative said by parents, as in he bites my finger versus bite me. The opposite approximate-only match is exemplified by some utterances from the subject–verb set, for instance the child-produced piggie squeak and the parental it squeaks. In the majority of cases, however, children's verb forms had an exactly identical equivalent in some parental utterances, for instance, the children's oh that hurts has a same-form equivalent in the parental that hurts, and another form produced by children as in that hurt also has parental equivalents as in that hurt! We may summarize that in the great majority of cases, the lemmatization process made no difference to the coding of a child utterance as accounted for by at least one parental utterance in the corpus.
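The matching criterion described above can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the study's own code: the `Token` type, the small `LEMMAS` table, and the `coverage` function are hypothetical names, and the toy data reuse a few verbs from the examples in the text plus one invented placeholder verb. A child token counts as accounted for if at least one parental two-word token in the pooled corpus has the same relation and the same verb, compared either by lemma or by exact form.

```python
from typing import Iterable, NamedTuple


class Token(NamedTuple):
    relation: str  # "SV", "VO" or "VI"
    verb: str      # verb form as it appears in the utterance


# Hypothetical lemma table covering only the toy data below.
LEMMAS = {"bites": "bite", "squeaks": "squeak", "hurts": "hurt"}


def lemma(verb: str) -> str:
    """Map a verb form to its lemma; unknown forms map to themselves."""
    form = verb.lower()
    return LEMMAS.get(form, form)


def coverage(child: Iterable[Token], parent: Iterable[Token],
             lemmatized: bool = True) -> float:
    """Proportion of child tokens whose (relation, verb) pair occurs
    at least once in the pooled parental two-word tokens."""
    key = lemma if lemmatized else str.lower
    parental = {(t.relation, key(t.verb)) for t in parent}
    child_tokens = list(child)
    if not child_tokens:
        return 0.0
    matched = sum((t.relation, key(t.verb)) in parental for t in child_tokens)
    return matched / len(child_tokens)


# Toy data only; the resulting proportions are not the study's results.
parent_tokens = [Token("VO", "bite"), Token("SV", "squeaks"), Token("SV", "hurts")]
child_tokens = [
    Token("VO", "bites"),       # he bites my finger
    Token("SV", "squeak"),      # piggie squeak
    Token("SV", "hurts"),       # oh that hurts
    Token("VI", "unseenverb"),  # invented placeholder with no parental match
]

if __name__ == "__main__":
    print(coverage(child_tokens, parent_tokens, lemmatized=True))
    print(coverage(child_tokens, parent_tokens, lemmatized=False))
```

Running the function with and without lemmatization on the real token lists is one way to quantify how much the lemmatization step contributes to the match rate.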
Table 5 presents the proportion of children's productions of core grammatical relations which were attributable to parental two-word syntactic atoms of the same syntactic type headed by the same verbs for the SV (subject–verb), the VO (verb–object), and the VI (verb–indirect object) grammatical relations.
Table 5. Distribution of tokens of the subject–verb (SV), verb–direct object (VO), and verb–indirect object (VI) grammatical relation in child speech that could be accounted for by the two-word atoms of the same type in the input
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:38332:20160412043525904-0280:S0305000913000470_tab5.gif?pub-status=live)
In 96% of all tokens of the core grammatical relations produced by children, the verb used was one that occurred in parental two-word utterances with the same grammatical relation. The syntactic atoms not covered seem to represent low-frequency items which may be under-represented in the parental two-word sample. This result strongly supports the hypothesis that two-word parental speech may provide the input for the great majority of children's syntactic word combinations. The results signify that children at the beginning of the acquisition of syntax need not go further than the easiest, least complex input exemplars in order to master a great deal of basic syntax. It may not be required at this stage of learning that they attempt to process longer input sentences with their more complicated, less accessible structure. Parents' two-word utterances provide transparent information on one of the words being a logical variable; children do not have to collect a set of such sentences in order to abstract out a generalizable syntactic rule with a variable element that can be used in other contexts than the one in which the sentences were originally heard. Indexical complements are ‘shifters’; by definition they refer to a different entity at each mention. Mastering parents' two-word syntactic atoms is sufficient to teach children usable and reusable flexible word combinations.
GENERAL DISCUSSION
This study adds to our understanding of the mechanisms underlying child language development by pointing to the set of easily accessible, two-word utterances in the parental input as the possible sources of learning both the major principles of generative syntax and also immediately usable and generalizable schematic rules for the core grammar of verbs. Because of the special character of parental two-word utterances expressing verb–argument relations, simple item-specific learning of these exemplars would give not just a rote-learned utterance but already a productive schema for generating word combinations on the arguments as a category. Once the principle of a variable complement to the verb is internalized, the learner can use the same verb-specific schemas with any referential expression as the value of the variable. Such learning does not require segmentation, abstraction, or storage of unanalyzed exemplars; in fact, without any further statistical or other distributional analysis, comparison, or alignment of multiple sentences, without abstraction or generalization, these utterances provide the child with productive syntactic atoms with a variable element that can take any value in different circumstances. If we wish, we could view the productive schemas as verb-specific formulae with a variable slot for the subject, or for the object, similar to the frame-and-slot schemata suggested in the literature. Often it is thought that children get a variable slot when they collect many sentences, align them on shared words, and get the complements – the varying part – as a slot. In the proposed learning mechanism, the slots are given by the individual sentences of two-word input – and no distributional analysis is needed in order to extract them. The items which are syntactic atoms in the two-word input are already with a variable element, and not with a specific complement to be generalized over, as is assumed in the literature.
Syntactic relations represent autonomous syntax and are not Constructions
This theory is presented as an alternative to Construction Grammar. While the data themselves do not distinguish between the two theories, there are other reasons to prefer the theory presented here over Construction Grammar. Construction Grammar assumes that each syntactic structure/relation is associated with a particular meaning, but there are reasons to believe that this is not the case, especially when we consider children's early word combinations.
Children need to learn not only principles of syntactic connectivity but also, and in particular, individual predicates' lexical-specific combining behaviour. Syntactic relations – also called grammatical relations, case roles, grammatical cases – are certain prescribed ways of overtly encoding the verb's semantic arguments. The coding methods of English include word order, case-marking, and cross-referencing, namely agreement. Subjects typically appear in sentences in a preverbal position; objects and indirect objects in a postverbal position. Subjects are typically in a nominative case, which in Modern English means that the personal pronouns used are I, he, and she, and not me, him, and her. The latter are reserved for direct and indirect objects, as in the sentence She liked me but she liked him better. She – the subject – is in the nominative case and in the preverbal position; me and him – the direct objects – are in accusative case and placed postverbally. Syntactic roles such as direct object have other features besides coding (Andrews, Reference Andrews and Shopen1985), including determining the probability of expressions to be relativized, passivized, or topicalized, as described in the Accessibility Hierarchy of Keenan and Comrie (Reference Keenan and Comrie1977), but these are beyond our scope in the present paper.
In mainstream generative linguistics, grammatical relations are not thought to have defining semantic properties (e.g. Jackendoff, Reference Jackendoff1997). On the contrary, the motivation for positing such terms as subject or object is precisely because they are needed to label some purely formal and behavioural categories which are not semantically homogeneous. This is especially so regarding the core grammatical relations we are focusing on. When linguists claim syntax is fundamentally autonomous as there is no one-to-one mapping of form to meaning, they probably mostly refer to the fact that core grammatical relations are not associated with any particular meaning.
One of the defining properties of core grammatical relations is their wide semantic range, or, as it is called in some linguistic terminology, their restricted neutralization of semantic distinctions (Andrews, Reference Andrews and Shopen1985; Lyons, Reference Lyons1968: 439; Van Valin, Reference Van Valin and Van Valin1993). Van Valin's Role and Reference Grammar, for example, points out that grammatical relations neutralize the semantic macro-roles Actor and Undergoer, so that, for instance, the entity encoded as subject can be either. Givón (Reference Givón and Givón1997, pp. 2–3), who talks about the dissociation of grammatical case roles from semantic roles rather than neutralization of the latter, lists the multiple semantic roles of subjects and direct objects in English to illustrate the point. He shows that grammatical subjects can have many different semantic roles such as, variously, patient of state, patient of change, dative, and agent. Grammatical objects can have the semantic roles of the patient of state, patient of change, ablative, allative, ingressive, dative, and benefactive. Schlesinger (Reference Schlesinger, Aarts and Meyer1995) reviews some of the linguistic literature on semantics of direct objects in a chapter devoted to this topic and concludes that objects possess a practically infinite variety of semantic roles. If we want to consider a whole sentence-level construction rather than individual case roles, it should be mentioned that the subject–verb–object (SVO) pattern is similarly associated with a wide variety of meanings (Dowty, Reference Dowty1991). Lastly, the double-object construction of the ditransitive, which is often considered to be prototypically reserved for meanings associated with the transfer of possession, in actuality has, in English, quite a variety of different semantics, as pointed out by Jackendoff (Reference Jackendoff1997, p. 175).
Andrews (Reference Andrews and Shopen1985, p. 82) generalizes that the case roles of core grammatical relations, the so-called ‘syntactic cases’, always imply a great degree of semantic variability, and should best be viewed as expressing some abstract grammatical relation, not necessarily correlated with semantic roles or any other aspect of meaning. Our special interest in core grammatical relations, therefore, targets that part of English grammar in which there is, by definition, a dissociation of abstract syntactic entities from classes with homogeneous semantics. For these reasons, the possibility should be considered that although there is much merit to a construction-based theory of grammar, some syntactic patterns (i.e. the core grammatical relations) are not constructions, and parts of syntax are built in the atomic generative way, regardless of semantics. Such suggestions were made by Jackendoff (Reference Jackendoff1997), otherwise sympathetic to the Construction Grammar project. Jackendoff argues that basic phrase structure, structural case marking, and agreement are syntactically autonomous, and the relevant phenomena are better not included among the meaningful constructions covered by the theory.
For us, the crucial point is that a formal pattern encodes individual and verb-specific semantics rather than a meaning shared by the same pattern over all participating verbs. This means that each verb is to be learned individually in each of the core syntactic relations; there is no semantics common to the whole category, and hence no possibility of employing a linking rule that would apply to all possible verbs getting, e.g. a direct object. The significance for developmental theory is that core syntactic relations are not constructions as the term is used in Construction Grammar. Construction Grammar assumes that syntactic structures are symbolic units that combine a particular form with a particular meaning (cf. Goldberg Reference Goldberg1995, p. 4), while the subject–verb, verb–object, and verb–indirect object patterns are not associated with any particular meaning, nor have they typical or prototypical semantics. In order to learn such word combinations, children need to learn each verb separately, and not to learn the pattern as if it were an abstract ‘argument structure construction’, with semantics associated with the form as the whole. If a child would attempt to transfer to a new verb the association between thematic role and syntax of an already learned verb, this move would be misleading on very many occasions. For instance, many English-speaking children such as Bowerman's Eva begin the production of verb–object combinations with the verb want, in such sentences as want bottle, want juice, and want see (Bowerman, Reference Bowerman, Morehead and Morehead1976, p. 157). As Bowerman points out, the semantics of these word combinations is specific to the verb want and their meaning is the expression of a need for some object or action specified in the direct object. 
This semantic relation for the direct object cannot be generalized to other transitive verbs Eva used at this time as single-word utterances, which were names for actions such as wipe, push, open, close, bite, and throw, and names for states such as see and got (meaning ‘have’). Bowerman herself suggests this is the explanation for the fact that Eva did not start to combine her other verbs with a direct object until another month had passed. Nor do the different direct objects this child learns to express throughout the coming months accumulate to a verb–object construction with a prototypical semantics. The semantic roles of the object of want, of see, of wipe, and of push do not join together to a coherent prototypical meaning. Attempts to find such semantics for transitive word combinations either in parental or child speech by researchers working within the Construction Grammar tradition failed to find any prominent and especially frequent thematic role that could have been taken as a prototypical meaning for the constructions. For instance, Goldberg (Reference Goldberg2005) and her associates (Sethuraman & Goodman, Reference Sethuraman, Goodman and Clark2004) have reported that the pattern of direct objects in parental speech does not provide the required prototypical semantics to be considered a meaningful construction. Indeed, in none of the three core grammatical relations is there a most frequent parental verb that demonstrates prototypical semantics and can thus serve the hypothetical process by which the construction in the abstract gets associated with the relevant prototypical semantics. Moreover, Ninio (Reference Ninio2011, pp. 114–116) demonstrated that in parental child-addressed speech, the most frequent use of the subject–verb combination is grammatical or purely formal (76%) rather than semantic; namely, the great majority of subjects in parental speech are the formal subjects of auxiliary verbs and copulas, not ones filling some semantic role versus the finite verb that possesses them as its subject. Interestingly, this was also true of Stage I child speech. Other counter-arguments to the notion that core syntactic relations possess prototypical semantics can be found in Bowerman (Reference Bowerman1990), Rispoli (Reference Rispoli and Levy1994), and various chapters in the recent collection edited by Mueller-Gathercole (Reference Mueller-Gathercole2009).
The significance of these findings is that Construction Grammar is not an appropriate theoretical umbrella for a model of acquisition, when it comes to children's learning to produce their earliest multiword utterances. We need to explain how children learn to produce verb–object combinations such as Take this; Construction Grammar cannot help us as such combinations are not constructions. That is, when it comes to the acquisition of the primitive units of syntax, Construction Grammar is inappropriate as its primitive units are meaningful linguistic signals or meaning–form patterns – which the core syntactic combinations are not.
The learning model supported by these results
The results of Study 2 suggest that children may indeed learn from parental two-word utterances to generate their earliest set of core syntax, namely, subject–verb, verb–object, and verb–indirect object word combinations, using various words as the complements of the verbs. We cannot, with this methodology, prove that parental two-word input utterances were the source of children's earliest set of core syntax, but the overlap of the verbs used in the input and output sets shows that the possibility of such learning is not contradicted by the data. As an interesting support for such a learning process, in a recent study by McCune (unpublished observations), investigating the SV, VO, and SVO patterns in four children acquiring English, it was found that pronoun use spikes before the children begin to produce sentences with grammatical relations. This suggests that acquisition of pronouns is a condition for learning syntax, a finding whose significance becomes clear under the present model of acquisition.
After learning the basic dependency combinatory mechanism and the generation of two-word syntactic atoms, the next developmental task children face is extending the two-word syntactic structure to include more words, namely, more than a single dependency relation. That is, children need to learn to apply the dependency operation iteratively. This skill is needed in order to build syntactically connected three-word sentences, and, with repeated iterations, sentences of any length. Evidence for the acquisition of the principle of iteration of the head-dependent relation as a separate developmental stage comes from researchers such as Elbers (Reference Elbers1990), Hill (Reference Hill1984), and Powers (Reference Powers, Witruk and Friederici2002), who have observed that moving from a single two-word combination to a three-word sentence in which a second dependency relation is built on one of the original words, poses a special difficulty for some children. In particular, sometimes children repeat the shared word participating in the two combinations, creating a sequence of two separate dependencies instead of a combined one, such as in the sentence Take this, this ball (see also MacWhinney, Reference MacWhinney and Kuczaj1982, for a review of more research findings on such redundant combinations). With further development, the shared word occurs only once, generating the correct Take this ball. In addition, Ninio (Reference Ninio1994) reported on other difficulties children face in the production of their early three-word sentences, such as keeping one dependency relation open in working memory while dealing with the other one. It may take children much longer to learn to build various complex constructions, but once they master the necessary skills to construct binary syntactic atoms and to add another word by iterating the dependency relation, they should be able to generate a respectable set of English sentences.
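The step from a two-word atom to a three-word sentence can be illustrated with a minimal sketch of iterating the head-dependent relation. The representation (a mapping from each word to its single head, with the root mapped to `None`), the function names, and the particular attachment chosen for ball are all assumptions made for illustration; they are not an analysis taken from the study, and the sketch further assumes that no word occurs twice in the sentence.

```python
from typing import Dict, Optional

# A structure maps each word to its single head; the root has head None.
Structure = Dict[str, Optional[str]]


def is_well_formed(heads: Structure) -> bool:
    """Check for exactly one root, heads that are words of the sentence,
    and no cycles among the head links."""
    roots = [word for word, head in heads.items() if head is None]
    if len(roots) != 1:
        return False
    for word in heads:
        seen = set()
        current: Optional[str] = word
        while current is not None:
            if current in seen or current not in heads:
                return False
            seen.add(current)
            current = heads[current]
    return True


def add_dependency(heads: Structure, dependent: str, head: str) -> Structure:
    """Iterate the relation once: attach a new word to a word already present."""
    if head not in heads:
        raise ValueError(f"head {head!r} is not in the structure")
    if dependent in heads:
        raise ValueError(f"{dependent!r} already has a head")
    extended = dict(heads)
    extended[dependent] = head
    return extended


# Two-word atom "Take this": 'this' depends on the root 'take'.
atom: Structure = {"take": None, "this": "take"}

# Second relation built on the shared word 'this', which now occurs only once.
# Attaching 'ball' to 'this' is an illustrative choice, not a claim about
# the correct analysis of the noun phrase.
three_word = add_dependency(atom, "ball", "this")

if __name__ == "__main__":
    print(three_word, is_well_formed(three_word))
```

Each further call to `add_dependency` adds one word and one relation, so sentences of any length arise from repeating the same binary operation.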
The possibility that children can derive the fundamentals of syntactic knowledge from simple parental input suggests that syntax may be learnable without any help from innate knowledge encoded in the human genome. The parental exemplars are transparent sources of information and a child can learn that syntactic relations are binary, asymmetrical, and made up of arguments combining with predicates. It is possible that further studies will reveal the role of even earlier learning, more precisely the acquisition of predicates in the single-word stage, as an essential step in the learning process. Such a study is presently in progress (Ninio, unpublished observations) and will hopefully round off the empiricist learning model suggested in the present study.
The proposed learning process relies on a two-element communicative format in which first a joint focus of attention is established, then some attribute is predicated of this entity. This appears to be a general strategy for language acquisition as the format is identical in structure to the ‘naming game’ of the acquisition of vocabulary in which the first step of the format is to establish an object of joint focus of attention with some orienting utterance, e.g. look here!, then to predicate of the referent that it is a zebra, or ball, or keys (Brown, Reference Brown1973; Ninio & Bruner, Reference Ninio and Bruner1978). The crucial role of joint focus of attention in communication is well known; talk is one of the more important arrangements for people to enter an intersubjective mental world where they deal with matters which have captured their attention (Goffman, Reference Goffman1976). Language acquisition appears to follow the path of other cognitive processes such as visual and auditory perception, problem-solving, and more, that apply the basic procedure of establishing in working memory some visual or auditory ‘object of attention’ and then perform a perceptual or computational task on it (Kahneman, Treisman & Gibbs, Reference Kahneman, Treisman and Gibbs1992; Pylyshyn, Reference Pylyshyn1989; Scholl, Reference Scholl2001). Learning the basics of syntax from the parental input may turn out to be a remarkably simple task, not involving a heavy computational load, nor necessitating the genetic transmission of innate principles for it to work.