
The neglected universals: Learnability constraints and discourse cues

Published online by Cambridge University Press: 26 October 2009

Heidi Waterfall
Affiliation:
Department of Psychology, Cornell University, Ithaca, NY 14853, and Department of Psychology, University of Chicago, Chicago, IL 60637. heidi.waterfall@gmail.com; http://kybele.psych.cornell.edu/~heidi
Shimon Edelman
Affiliation:
Department of Psychology, Cornell University, Ithaca, NY 14853, and Department of Brain and Cognitive Engineering, Korea University, Seoul 136-713, South Korea. se37@cornell.edu; http://kybele.psych.cornell.edu/~edelman

Abstract

Converging findings from English, Mandarin, and other languages suggest that the observed “universals” may be algorithmic in origin. First, the computational principles behind recently developed algorithms that acquire productive constructions from raw texts or transcribed child-directed speech impose a family resemblance on learnable languages. Second, child-directed speech is particularly rich in statistical (and social) cues that facilitate the learning of certain types of structures.

Type: Open Peer Commentary

Copyright © Cambridge University Press 2009

Having surveyed a wide range of posited universals and found them wanting, Evans & Levinson (E&L) propose instead that the “common patterns” observed in the organization of human languages are due to cognitive constraints and cultural factors. We offer empirical evidence in support of both these ideas. (See Fig. 1.)

Figure 1. Examples of child-directed speech in six languages. It is not necessary to be able to read, let alone understand, any of these languages to identify the most prominent structural characteristics common to these examples (see text for a hint). These characteristics should, therefore, be readily apparent to a prelinguistic infant, which is indeed the case, as the evidence we mention suggests. All the examples are from CHILDES corpora (MacWhinney 2000).

One kind of common pattern is readily apparent in the six examples of child-directed speech in Figure 1, in each of which partial matches between successive utterances serve to highlight the structural regularities of the underlying language. Two universal principles facilitating the identification of such regularities can be traced to the work of Zellig Harris (1946; 1991). First, the discovery of language structure, from morphemes to phrases, can proceed by cross-utterance alignment and comparison (Edelman & Waterfall 2007; Harris 1946). Second, the fundamental task in describing a language is to state the departures from equiprobability in its sound- and word-sequences (Harris 1991, p. 32; cf. Goldsmith 2007).
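To make the first of these principles concrete, the following minimal sketch (ours, for illustration only; it is not the implementation of any published algorithm) aligns two partially matching utterances word by word: the shared words form a candidate frame, and the positions where the utterances differ become variable slots. The function name and the example utterances are purely illustrative.

```python
# A minimal, illustrative sketch of cross-utterance alignment (hypothetical code,
# not the ADIOS or ConText implementation): shared words form a candidate frame,
# and positions where the utterances differ are treated as variable slots.
from difflib import SequenceMatcher

def align_utterances(utt_a, utt_b):
    """Return the shared word frame (with '_' slots) and the differing fillers."""
    a, b = utt_a.split(), utt_b.split()
    frame, fillers = [], []
    for op, i1, i2, j1, j2 in SequenceMatcher(a=a, b=b).get_opcodes():
        if op == "equal":
            frame.extend(a[i1:i2])          # words the two utterances share
        else:
            frame.append("_")               # a variable slot
            fillers.append((a[i1:i2], b[j1:j2]))
    return frame, fillers

# Example (cf. the kind of partial matches shown in Figure 1):
# align_utterances("where is the doggie", "where is the kitty")
# -> (['where', 'is', 'the', '_'], [(['doggie'], ['kitty'])])
```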

These principles are precisely those used by the only two unsupervised algorithms currently capable of learning productive construction grammars from large-scale raw corpus data, ADIOS (Solan et al. 2005) and ConText (Waterfall et al., under review). Both algorithms bootstrap from completely unsegmented text to words and to phrase structure by recursively identifying candidate constructions in patterns of partial alignment between utterances in the training corpus. Furthermore, in both algorithms, candidate structures must pass a statistical significance test before they are added to the growing grammar and learning resumes (the algorithms differ in how they represent corpus data and in the kinds of significance tests they impose).
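The flavor of such a significance filter can be conveyed by the following simplified stand-in (again our sketch; neither ADIOS nor ConText uses exactly this test): a candidate two-word construction is retained only if its co-occurrence count departs markedly from what independent word frequencies would predict, which is one way of operationalizing Harris's departure from equiprobability. The threshold value and the one-term log-likelihood score are illustrative assumptions.

```python
# A simplified stand-in for the statistical filter described above (illustrative
# only; not the actual ADIOS or ConText criterion). A candidate bigram is kept
# only if its observed count departs markedly from the count expected if the
# two words occurred independently of one another.
import math
from collections import Counter

def significant_bigrams(utterances, threshold=10.83):   # threshold is an assumption
    word_counts, bigram_counts = Counter(), Counter()
    n_tokens = 0
    for utt in utterances:
        tokens = utt.split()
        word_counts.update(tokens)
        bigram_counts.update(zip(tokens, tokens[1:]))
        n_tokens += len(tokens)
    accepted = {}
    for (w1, w2), observed in bigram_counts.items():
        expected = word_counts[w1] * word_counts[w2] / n_tokens
        # crude one-term log-likelihood score of the departure from independence
        score = 2.0 * observed * math.log(observed / expected) if observed > expected else 0.0
        if score >= threshold:
            accepted[(w1, w2)] = score
    return accepted
```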

These algorithms exhibited a hitherto unrivaled – albeit still very far from perfect – capacity for language learning, as measured by (1) precision, or the acceptability of novel generated utterances; (2) recall, or coverage of a withheld test corpus; (3) perplexity, or the average uncertainty about the next lexical element in test utterances; and (4) performance in certain comprehension-related tasks (Edelman & Solan, in press; Edelman et al. 2004; 2005; Solan et al. 2005). They have been tested, to varying extents, on English, French, Hebrew, Mandarin, Spanish, and a few other languages. The learning algorithms proved particularly effective when applied to raw, transcribed, child-directed speech (MacWhinney 2000), achieving precision of 54% and 63% in Mandarin and English, respectively, and recall of about 30% in both languages (Brodsky et al. 2007; Solan et al. 2003).
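The three quantitative measures just named can be written down schematically as follows (our formulation; the cited studies used more elaborate evaluation procedures). The acceptability and coverage judgments are passed in as callables, since how they are obtained – by human raters or by parsing with the learned grammar – is a separate question.

```python
# Schematic definitions of the evaluation measures named above (illustrative only;
# not the exact procedures used in the cited studies).
import math

def precision(generated_utterances, is_acceptable):
    """Fraction of novel generated utterances judged acceptable."""
    return sum(map(is_acceptable, generated_utterances)) / len(generated_utterances)

def recall(test_utterances, is_covered):
    """Fraction of withheld test utterances that the learned grammar covers."""
    return sum(map(is_covered, test_utterances)) / len(test_utterances)

def perplexity(next_token_probabilities):
    """Exponential of the average negative log-probability assigned to each
    successive lexical element in the test utterances."""
    n = len(next_token_probabilities)
    return math.exp(-sum(math.log(p) for p in next_token_probabilities) / n)
```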

To the extent that human learners rely on the same principles of aligning and comparing potentially relatable utterances, one may put these principles forward as the source of parts of speech, phrase structure, and other structural “universals.” In other words, certain forms may be common across languages because they are easier to learn, given the algorithmic constraints on the learner (see Footnote 1).

Language acquisition becomes easier not only when linguistic forms match the algorithmic capabilities of the learner, but also when the learner's social environment is structured in various helpful ways. One possibility here is for mature speakers to embed structural cues in child-directed speech (CDS). Indeed, a growing body of evidence suggests that language acquisition is made easier by the way caregivers shape CDS in their interactions with children (see Footnote 2). One seemingly universal property of CDS is the prevalence of variation sets (Hoff-Ginsberg 1990; Küntay & Slobin 1996; Waterfall 2006; under review) – partial alignment among phrases uttered in temporal proximity, of the kind illustrated in Figure 1. The proportion of CDS utterances contained in variation sets is surprisingly constant across languages: 22% in Mandarin, 20% in Turkish, and 25% in English (when variation sets are defined by requiring consecutive caregiver utterances to have in common at least two lexical items in the same order; cf. Küntay & Slobin 1996; this proportion grows to about 50% if a gap of two utterances is allowed between the partially matching ones). Furthermore, the lexical items (types) on which CDS utterances are aligned constitute a significant proportion of the corpus vocabulary, ranging from 9% in Mandarin to 32% in English.
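The working definition just given is simple enough to state as code. The sketch below (ours, for illustration; not the scoring procedure of the cited studies) flags an utterance as belonging to a variation set if it shares at least two lexical items, in the same order, with a nearby caregiver utterance; max_gap=0 gives the strict consecutive-utterance version and max_gap=2 the relaxed version mentioned above.

```python
# An illustrative detector for variation sets under the definition given in the
# text: two utterances partially match if they share at least two lexical items
# in the same order.
def shares_two_in_order(utt_a, utt_b):
    a, b = utt_a.split(), utt_b.split()
    # length of the longest common subsequence of words, via dynamic programming
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, wa in enumerate(a):
        for j, wb in enumerate(b):
            dp[i + 1][j + 1] = dp[i][j] + 1 if wa == wb else max(dp[i][j + 1], dp[i + 1][j])
    return dp[len(a)][len(b)] >= 2

def variation_set_proportion(utterances, max_gap=0):
    """Proportion of utterances that partially match another utterance at most
    max_gap utterances away (max_gap=0: consecutive; max_gap=2: relaxed)."""
    flagged = set()
    for i in range(len(utterances)):
        for j in range(i + 1, min(i + max_gap + 2, len(utterances))):
            if shares_two_in_order(utterances[i], utterances[j]):
                flagged.update((i, j))
    return len(flagged) / len(utterances) if utterances else 0.0
```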

Crucially, the nouns and verbs in variation sets in CDS were shown to be related to children's verb and noun use at the same observation, as well as to their production of verbs, pronouns, and subcategorization frames four months later (Hoff-Ginsberg 1990; Waterfall 2006; under review). Moreover, experiments involving artificial language learning highlighted the causal role of variation sets: adults exposed to input containing variation sets performed better on word segmentation and phrase-boundary judgment tasks than a control group that heard the same utterances in a scrambled order, which eliminated the variation sets (Onnis et al. 2008).

The convergence of the three lines of evidence mentioned here – the ubiquity of variation sets in child-directed speech across widely different languages, their demonstrated effectiveness in facilitating acquisition, and the algorithmic revival of the principles of acquisition intuited by Harris – supports E&L's proposal regarding the origin of observed universals. More research is needed to integrate the computational framework outlined here with models of social interaction during acquisition and with the neurobiological constraints on learning that undoubtedly contribute to the emergence of cognitive/cultural language universals.

Footnotes

1. Language may also be expected to evolve in the direction of a better fit between its structure and the learners' abilities (Christiansen & Chater 2008).

2. Social cues complement and reinforce structural ones in this context (Goldstein & Schwade 2008).

References

Brodsky, P., Waterfall, H. R. & Edelman, S. (2007) Characterizing motherese: On the computational structure of child-directed language. In: Proceedings of the 29th Cognitive Science Society Conference, ed. McNamara, D. S. & Trafton, J. G., pp. 833–38. Cognitive Science Society.
Christiansen, M. H. & Chater, N. (2008) Language as shaped by the brain. Behavioral and Brain Sciences 31(5):489–509; discussion 509–58.
Edelman, S. & Solan, Z. (in press) Translation using an automatically inferred structured language model.
Edelman, S., Solan, Z., Horn, D. & Ruppin, E. (2004) Bridging computational, formal and psycholinguistic approaches to language. In: Proceedings of the 26th Conference of the Cognitive Science Society, ed. Forbus, K., Gentner, D. & Regier, T., pp. 345–50. Erlbaum.
Edelman, S., Solan, Z., Horn, D. & Ruppin, E. (2005) Learning syntactic constructions from raw corpora. In: Proceedings of the 29th Annual Boston University Conference on Language Development, ed. Brugos, A., Clark-Cotton, M. R. & Ha, S., pp. 180–91. Cascadilla Press.
Edelman, S. & Waterfall, H. R. (2007) Behavioral and computational aspects of language and its acquisition. Physics of Life Reviews 4:253–77.
Goldsmith, J. A. (2007) Towards a new empiricism. In: Recherches linguistiques de Vincennes, vol. 36, ed. de Carvalho, J. B. Presses universitaires de Vincennes.
Goldstein, M. H. & Schwade, J. A. (2008) Social feedback to infants' babbling facilitates rapid phonological learning. Psychological Science 19:515–23.
Harris, Z. S. (1946) From morpheme to utterance. Language 22:161–83.
Harris, Z. S. (1991) A theory of language and information. Clarendon.
Hoff-Ginsberg, E. (1990) Maternal speech and the child's development of syntax: A further look. Journal of Child Language 17:85–99.
Küntay, A. & Slobin, D. (1996) Listening to a Turkish mother: Some puzzles for acquisition. In: Social interaction, social context, and language: Essays in honor of Susan Ervin-Tripp, ed. Slobin, D. & Gerhardt, J., pp. 265–86. Erlbaum.
MacWhinney, B., ed. (2000) The CHILDES Project: Tools for analyzing talk. Vol. 1: Transcription format and programs. Vol. 2: The Database. Erlbaum.
Onnis, L., Waterfall, H. R. & Edelman, S. (2008) Learn locally, act globally: Learning language from variation set cues. Cognition 109:423–30.
Solan, Z., Horn, D., Ruppin, E. & Edelman, S. (2005) Unsupervised learning of natural languages. Proceedings of the National Academy of Sciences USA 102:11629–34.
Solan, Z., Ruppin, E., Horn, D. & Edelman, S. (2003) Unsupervised efficient learning and representation of language structure. In: Proceedings of the 25th Conference of the Cognitive Science Society, ed. Alterman, R. & Kirsh, D., pp. 1106–11. Erlbaum.
Waterfall, H. R. (2006) A little change is a good thing: Feature theory, language acquisition and variation sets. Doctoral dissertation, University of Chicago.
Waterfall, H. R. (under review) A little change is a good thing: The relation of variation sets to children's noun, verb and verb-frame development.
Waterfall, H. R., Sandbank, B., Onnis, L. & Edelman, S. (under review) An empirical generative framework for computational modeling of language acquisition.