Having surveyed a wide range of posited universals and found them wanting, Evans & Levinson (E&L) propose instead that the “common patterns” observed in the organization of human languages are due to cognitive constraints and cultural factors. We offer empirical evidence in support of both these ideas. (See Fig. 1.)
Figure 1. Examples of child-directed speech in six languages. It is not necessary to be able to read, let alone understand, any of these languages to identify the most prominent structural characteristics common to these examples (see text for a hint). These characteristics should, therefore, be readily apparent to a prelinguistic infant, which is indeed the case, as the evidence we mention suggests. All the examples are from CHILDES corpora (MacWhinney, 2000).
One kind of common pattern is readily apparent in the six examples of child-directed speech in Figure 1, in each of which partial matches between successive utterances serve to highlight the structural regularities of the underlying language. Two universal principles facilitating the identification of such regularities can be traced to the work of Zellig Harris (1946; 1991). First, the discovery of language structure, from morphemes to phrases, can proceed by cross-utterance alignment and comparison (Edelman & Waterfall, 2007; Harris, 1946). Second, the fundamental task in describing a language is to state the departures from equiprobability in its sound- and word-sequences (Harris, 1991, p. 32; cf. Goldsmith, 2007).
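To make the first principle concrete, the sketch below (ours, and purely illustrative; the utterances are hypothetical and the alignment routine is not taken from any of the cited work) shows how aligning two successive utterances exposes a shared frame and the slot in which they differ:

```python
# Illustrative sketch: aligning two successive utterances exposes a shared
# frame ("where is the ...") and the substitutable slot ("doggie"/"ball").
from difflib import SequenceMatcher

def align(utt1, utt2):
    """Return the word sequences shared by the two utterances, in order."""
    w1, w2 = utt1.split(), utt2.split()
    matcher = SequenceMatcher(a=w1, b=w2)
    return [w1[m.a:m.a + m.size] for m in matcher.get_matching_blocks() if m.size]

# Two hypothetical consecutive child-directed utterances:
print(align("where is the doggie", "where is the ball"))
# -> [['where', 'is', 'the']]  -- the shared frame; the final word fills the slot
```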
These principles are precisely those used by the only two unsupervised algorithms currently capable of learning productive construction grammars from large-scale raw corpus data, ADIOS (Solan et al., 2005) and ConText (Waterfall et al., under review). Both algorithms bootstrap from completely unsegmented text to words and to phrase structure by recursively identifying candidate constructions in patterns of partial alignment between utterances in the training corpus. Furthermore, in both algorithms, candidate structures must pass a statistical significance test before they are added to the growing grammar and learning resumes (the algorithms differ in how they represent corpus data and in the kinds of significance tests they impose).
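The schematic sketch below illustrates this two-step loop; it is ours, not the ADIOS or ConText code, and the simple binomial test over recurring bigrams stands in for those algorithms' own representations and significance criteria:

```python
# Schematic sketch of the bootstrap loop: (1) collect candidate constructions
# from recurring partial alignments (here, word bigrams), (2) admit a candidate
# into the grammar only if it passes a significance test against chance.
from collections import Counter
from math import comb

def learn_candidates(utterances, alpha=0.01):
    words = [w for u in utterances for w in u.split()]
    unigram = Counter(words)
    total_words = len(words)

    # Step 1: count word bigrams across the whole corpus.
    bigrams = Counter()
    for u in utterances:
        ws = u.split()
        bigrams.update(zip(ws, ws[1:]))
    n_bigrams = sum(bigrams.values())

    grammar = []
    for (w1, w2), count in bigrams.items():
        if count < 2:                      # a candidate must recur
            continue
        # Step 2: binomial tail probability of seeing the pair this often
        # if its two words co-occurred only by chance.
        p_chance = (unigram[w1] / total_words) * (unigram[w2] / total_words)
        p_value = sum(comb(n_bigrams, k) * p_chance**k * (1 - p_chance)**(n_bigrams - k)
                      for k in range(count, n_bigrams + 1))
        if p_value < alpha:
            grammar.append((w1, w2))
    return grammar
```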
These algorithms exhibited a hitherto unrivaled – albeit still very far from perfect – capacity for language learning, as measured by (1) precision, or acceptability of novel generated utterances; (2) recall, or coverage of a withheld test corpus; (3) perplexity, or average uncertainty about the next lexical element in test utterances; and (4) performance in certain comprehension-related tasks (Edelman & Solan, under review; Edelman et al., 2005; 2004; Solan et al., 2005). They have been tested, to varying extents, in English, French, Hebrew, Mandarin, Spanish, and a few other languages. The learning algorithms proved particularly effective when applied to raw, transcribed, child-directed speech (MacWhinney, 2000), achieving precision of 54% and 63% in Mandarin and English, respectively, and recall of about 30% in both languages (Brodsky et al., 2007; Solan et al., 2003).
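For readers unfamiliar with these measures, the sketch below spells them out under simplifying assumptions of ours (binary acceptability and coverage judgments, and a bigram predictor); the cited studies use more elaborate versions of each:

```python
# Simplified versions of the three quantitative measures mentioned above.
import math

def precision(generated, is_acceptable):
    """Fraction of novel generated utterances judged acceptable."""
    return sum(is_acceptable(u) for u in generated) / len(generated)

def recall(test_corpus, covers):
    """Fraction of withheld test utterances that the learned grammar covers."""
    return sum(covers(u) for u in test_corpus) / len(test_corpus)

def perplexity(test_corpus, prob_next):
    """Average uncertainty about the next word, given the preceding one."""
    log_sum, n = 0.0, 0
    for utt in test_corpus:
        ws = utt.split()
        for prev, nxt in zip(ws, ws[1:]):
            log_sum += math.log2(prob_next(prev, nxt))
            n += 1
    return 2 ** (-log_sum / n)
```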
To the extent that human learners rely on the same principles of aligning and comparing potentially relatable utterances, one may put these principles forward as the source of parts of speech, phrase structure, and other structural “universals.” In other words, certain forms may be common across languages because they are easier to learn, given the algorithmic constraints on the learner.
Language acquisition becomes easier not only when linguistic forms match the algorithmic capabilities of the learner, but also when the learner's social environment is structured in various helpful ways. One possibility here is for mature speakers to embed structural cues in child-directed speech (CDS). Indeed, a growing body of evidence suggests that language acquisition is made easier than it would otherwise be by the way caregivers shape CDS during their interaction with children. One seemingly universal property of CDS is the prevalence of variation sets (Hoff-Ginsberg, 1990; Küntay & Slobin, 1996; Waterfall, 2006; under review) – partial alignment among phrases uttered in temporal proximity, of the kind illustrated in Figure 1. The proportion of CDS utterances contained in variation sets is surprisingly constant across languages: 22% in Mandarin, 20% in Turkish, and 25% in English (when variation sets are defined by requiring consecutive caregiver utterances to have at least two lexical items in common, in the same order; cf. Küntay & Slobin, 1996; this proportion grows to about 50% if a gap of two utterances is allowed between the partially matching ones). Furthermore, the lexical items (types) on which CDS utterances are aligned constitute a significant proportion of the corpus vocabulary, ranging from 9% in Mandarin to 32% in English.
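The criterion just described can be stated as a short procedure. The sketch below is our schematic rendering (a greedy check for two shared lexical items in the same order, with an optional gap between utterances), not the exact coding scheme used in the cited studies:

```python
# Schematic detector for variation sets in a sequence of caregiver utterances.
def shares_two_in_order(utt1, utt2):
    """Greedy check: do the utterances share at least two lexical items, in order?"""
    w1, w2 = utt1.split(), utt2.split()
    shared, j = 0, 0
    for w in w1:
        if w in w2[j:]:
            j = w2.index(w, j) + 1
            shared += 1
    return shared >= 2

def variation_set_utterances(utterances, gap=0):
    """Indices of utterances belonging to at least one variation set.

    gap=0 requires the partially matching utterances to be consecutive;
    gap=2 allows up to two intervening utterances (cf. the ~50% figure above).
    """
    in_set = set()
    for i, utt in enumerate(utterances):
        for j in range(i + 1, min(i + 2 + gap, len(utterances))):
            if shares_two_in_order(utt, utterances[j]):
                in_set.update((i, j))
    return in_set

# Proportion of CDS utterances contained in variation sets:
#   len(variation_set_utterances(cds_utterances, gap=0)) / len(cds_utterances)
```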
Crucially, the nouns and verbs in variation sets in CDS were shown to be related to children's verb and noun use at the same observation, as well as to their production of verbs, pronouns, and subcategorization frames four months later (Hoff-Ginsberg, 1990; Waterfall, 2006; under review). Moreover, experiments involving artificial language learning highlighted the causal role of variation sets: adults exposed to input which contained variation sets performed better in word segmentation and phrase boundary judgment tasks than a control group that heard the same utterances in a scrambled order, which had no variation sets (Onnis et al., 2008).
The convergence of the three lines of evidence mentioned – the ubiquity of variation sets in child-directed speech in widely different languages, their demonstrated effectiveness in facilitating acquisition, and the algorithmic revival of the principles of acquisition intuited by Harris – supports E&L's proposal concerning the origin of the observed universals. More research is needed to integrate the computational framework outlined here with models of social interaction during acquisition and with the neurobiological constraints on learning that undoubtedly contribute to the emergence of cognitive/cultural language universals.