Non-Linguistic Constraints on the Acquisition of Phrase Structure

Jenny R. Saffran (jsaffran@facstaff.wisc.edu)
Department of Psychology; 1202 W. Johnson Street
Madison, WI 53706 USA

Abstract

To what extent is linguistic structure learnable from statistical information in the input? One set of cues which might assist in the discovery of hierarchical phrase structure given serially presented input are the dependencies, or predictive relationships, present within phrases. In order to determine whether adult learners can use this statistical information, subjects were exposed to artificial languages which either contained or violated the kinds of dependencies which characterize natural languages. The results suggest that adults possess learning mechanisms which detect and utilize statistical cues to phrase and hierarchical structure. A second experiment contrasted the acquisition of these linguistic systems with the same grammars implemented as non-linguistic input (sequences of non-linguistic sounds or shapes). These findings suggest that constraints on the mechanisms which highlight the statistical cues which are most characteristic of human languages are not specifically tailored for language learning.

Introduction

While the idea that surface distributional patterns point to pertinent linguistic structures holds a distinguished place in linguistic history (e.g., Bloomfield, 1933; Harris, 1951), statistical learning has only recently re-emerged as a potential contributing force in language acquisition (though see Maratsos & Chalkley, 1980). This renewed interest in statistical learning has been fueled by developments in computational modeling, by the widespread availability of large corpora of child-directed speech, and most recently by empirical research demonstrating that human subjects can perform statistical language learning tasks in laboratory experiments.
For example, computational algorithms can use the co-occurrence environments of words to discover form classes in large corpora (e.g., Cartwright & Brent, 1997; Finch & Chater, 1994; Mintz, 1996; Mintz, Newport, & Bever, 1995). Similarly, individual verb argument structures can be induced by models that track the co-occurrences of verbs and their arguments in the input (e.g., Schutze, 1994; Seidenberg & MacDonald, 1999). Extensive modeling work has also examined the statistical cues available for the discovery of word boundaries in continuous speech (e.g., Aslin, Woodward, LaMendola, & Bever, 1996; Brent & Cartwright, 1996; Cairns, Shillcock, Chater, & Levy, 1997; Christiansen, Allen, & Seidenberg, 1998; Perruchet & Vintner).

These models provide invaluable explorations of the extent to which statistical information is available, in principle, to language learners equipped with the right distributional tools. But are humans such learners? A wealth of statistical cues is useless unless humans can detect and use them. In fact, recent research suggests that humans are extremely good at some statistical language learning tasks, such as word segmentation (e.g., Aslin, Saffran, & Newport, 1998; Goodsitt, Morgan & Kuhl, 1993; Saffran, Aslin, & Newport, 1996; Saffran, Newport, & Aslin, 1996).

These results suggest that humans possess powerful statistical language learning mechanisms, which are likely to provide important contributions to the language learning process. At the same time, it is important to recognize that these mechanisms would not be useful in language acquisition unless they are somehow constrained or biased to perform only certain kinds of computations over certain kinds of input. The pertinent generalizations to be drawn from a linguistic corpus are awash in irrelevant information.
Any learning device without the right architectural, representational, or computational constraints risks being sidetracked by the massive number of misleading generalizations available in the input (e.g., Gleitman & Wanner, 1982; Pinker, 1984). There are an infinite number of linguistically irrelevant statistics that an overly powerful statistical learner could compute: for example, which words are presented third in sentences, or which words follow words whose second syllable begins with "th" (e.g., Pinker, 1989).

One way to avoid this combinatorial explosion would be to impose constraints on statistical learning so that learners perform only a subset of the logically possible computations. It is clear that learning in biological systems is limited by internal factors; there are species differences in which specific types of stimuli serve as privileged input (e.g., Garcia & Koelling, 1966; Marler, 1991). External factors also strongly bias learning, because input from structured domains consists of non-random information. In order for statistical learning accounts to succeed, learners must be similarly constrained: humans must be just the type of statistical learners who are best suited to acquire the type of input exemplified by natural languages, focusing on linguistically relevant statistics while ignoring the wealth of available irrelevant computations. Such constraints might arise from various sources, either specific to language or from more general cognitive and/or perceptual constraints on human learning.
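
The contrast drawn above between linguistically relevant and irrelevant statistics can be made concrete with a small sketch. It is an illustration only, not a model or measure taken from this paper: the toy corpus and the function names `third_word_counts` and `forward_conditional_probabilities` are hypothetical. The first function computes one of the irrelevant statistics mentioned in the text (which words occur third in a sentence); the second estimates a simple predictive relationship, the probability of the next word given the current word.

```python
from collections import Counter, defaultdict

# Hypothetical toy corpus (an assumption for illustration, not data from the paper).
CORPUS = [
    ["the", "dog", "chased", "the", "cat"],
    ["the", "cat", "saw", "a", "bird"],
    ["a", "bird", "chased", "the", "dog"],
]


def third_word_counts(corpus):
    """Count which words occur in third position (a linguistically irrelevant statistic)."""
    return Counter(sentence[2] for sentence in corpus if len(sentence) >= 3)


def forward_conditional_probabilities(corpus):
    """Estimate P(next word | current word) from adjacent pairs within each sentence."""
    pair_counts = defaultdict(Counter)
    for sentence in corpus:
        for current, following in zip(sentence, sentence[1:]):
            pair_counts[current][following] += 1

    probabilities = {}
    for current, followers in pair_counts.items():
        total = sum(followers.values())
        probabilities[current] = {
            word: count / total for word, count in followers.items()
        }
    return probabilities


if __name__ == "__main__":
    print(third_word_counts(CORPUS))
    for word, followers in forward_conditional_probabilities(CORPUS).items():
        print(word, followers)
```

Both statistics are equally easy to compute from the same input, which is the point of the argument: nothing in the data marks the second as more useful than the first, so the preference has to come from constraints on the learner.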