148x Filetype PDF File size 0.88 MB Source: aclanthology.org
Co-evolution of Language and of the Language Acquisition Device Ted Briscoe ejb¢cl, cam. ac. uk Computer Laboratory University of Cambridge Pembroke Street Cambridge CB2 3QG, UK Abstract rapid increase in gramlnatical complexity accompa- A new account of parameter setting dur- nying the transition from pidgin to creole languages. ing grammatical acquisition is presented in Prom the perspective of the parameters framework, terms of Generalized Categorial Grammar the Bioprogram Hypothesis claims that children are embedded in a default inheritance hierar- endowed genetically with a UG which, by default, chy, providing a natural partial ordering specifies the stereotypical core creole grammar, with on the setting of parameters. Experiments right-branching syntax and subject-verb-object or- show that several experimentally effective der, as in Saramaccan. Others working within the learners can be defined in this framework. parameters framework have proposed unmarked, de- Ew)lutionary simulations suggest that a fault parameters (e.g. Lightfoot, 1991), but the Bio- lea.rner with default initial settings for pa- program Hypothesis can be interpreted as towards rameters will emerge, provided that learn- one end of a continuum of proposals ranging from all ing is memory limited and the environment parameters initially unset to all set to default values. of linguistic adaptation contains an appro- 2 The Language Acquisition Device priate language. A model of the Language Acquisition Device (LAD) 1 Theoretical Background incorporates a UG with associated parameters, a parser, and an algorithm for updating initial param- Grmnnmtical acquisition proceeds on the basis of a eter settings on parse failure during learning. partial genotypic specifica.tion of (universal) grmn- 2.1 The Grammar (set) mar (UG) complemented with a learning procedure elmbling the child to complete this specification ap- Basic categorial grammar (CG) uses one rule of ap- propriately. The parameter setting frainework of plication which combines a functor category (con- Chomsky (1981) claims that learning involves fix- taining a slash) with an argument category to form ing the wdues of a finite set of finite-valued param- a derived category (with one less slashed argument eters to select a single fully-specified grammar from category). Grammatical constraints of order and within the space defined by the genotypic specifi- agreement are captured by only allowing directed cation of UG. Formal accounts of parameter set- application to adjacent matching categories. Gener- ting have been developed for small fragments but alized Categorial Grammar (GCG) extends CG with even in these search spaces contain local maxima further rule schemata) The rules of FA, BA, gen- and subset-superset relations which may cause a eralized weak permutation (P) and backward and learner to converge to an incorrect grammar (Clark, forward colnposition (I?C, BC) are given in Fig- 1992; Gibson and Wexler, 1994; Niyogi and Berwick, ure 1 (where X, Y and Z are category variables, 1995). The solution to these problems involves defin- [ is a vm'iable over slash and backslash, and ... ing d(,fault, umnarked initial values for (some) pa- denotes zero or more further flmctor arguments). rameters and/or ordering the setting of paraineters Once pernmtation is included, several semantically during learning. l\¥ood (1993) is a general introduction to Categorial Bickerton (1984) argues for the Bioprograin Hy- Grammar mid extensions to the basic theory. The most pothesis a.s an explanation for universal similarities closely related theories to that presented here are those between historically unrelated creoles, and for the of Steedman (e.g. 1988) and Hoffman (1995). 418 Forward Application: X/Y Y ~ X A y [X(y)] (y) ::~ X(y) Backward Application: Y X\Y ~ X A y [X(y)] (y) =~ X(y) Forward Composition: X/Y Y/Z ~ X/Z y [X(y)] A z [Y(z)] =~ A z [X(Y(z))] Backward Composition: Y\Z X\Y ~ X\Z z [Y(z)] A y [X(y)] ~ A z [X(Y(z))] (Generalized Weak) Permutation: (XIY1)... IY, ~ (XIYn)IYI... A Yn..-,Yl [X(yl ...,y,.)] =V A Yl,Y .... [X(yl ...,Yn)] Figure 1: GCG Rule Schemata Kim loves Sandy this system can handle (long-distance) scrambling NP (S\NP)/NP NP elegantly and generates mildly context-sensitive lan- kim' A y,x [love'(x y)] sandy' guages (Joshi et al, 1991). P The relationship between GCG as a theory of UG (S/NP)\NP (GCUG) and as a the specification of a particu- A x,y [love'(x y)] lar grammar is captured by embedding the theory -BA in a default inheritance hierarchy. This is repre- S/NP sented as a lattice of typed default feature structures A y [love'(kim' y)] (TDFSs) representing subsumption and default in- FA heritance relationships (Lascarides et al, 1996; Las- S carides and Copestake, 1996). The lattice defines love'(kim' sandy') intensionally the set of possible categories and rule schemata via type declarations on nodes. For ex- Figure 2: GCG Derivation for Kim loves Sandy ample, an intransitive verb might be treated as a subtype of verb, inheriting subject directionality by default from a type gendir (for general direction). equivalent derivations for Kim loves Sandy become For English, gendir is default right but the node of available, Figure 2 shows the non-conventional left- the (intransitive) functor category, where the direc- branching one. Composition also allows alterna- tionality of subject arguments is specified, overrides tive non-conventional semantically equivalent (left- this to left, reflecting the fact that English is pre- branching) derivations. dominantly right-branching, though subjects appear GCG as presented is inadequate as an account of to the left of the verb. A transitive verb would in- UG or of any individual grammar. In particular, herit structure from the type for intransitive verbs the definition of atomic categories needs extending and an extra NP argument with default directional- to deal with featural variation (e.g. Bouma and van ity specified by gendir, and so forth. 2 Noord, 1994), and the rule schemata, especially com- For the purposes of the evolutionary simulation position and weak permutation, must be restricted described in §3, GC(U)Gs are represented as a se- in various parametric ways so that overgeneration quence of p-settings (where p denotes principles or is prevented for specific languages. Nevertheless, parameters) based on a flat (ternary) sequential en- GCG does represent a plausible kernel of UG; Hoff- coding of such default inheritance lattices. The in- man (1995, 1996) explores the descriptive power of a 2Bouma and van Noord (1994) and others demon- very similar system, in which generalized weak per- strate that CGs can be embedded in a constraint-based mutation is not required because functor arguments representation. Briscoe (1997a,b) gives further details of are interpreted as multisets. She demonstrates that the encoding of GCG in TDFSs. 419 NP N S gen-dir subj-dir applic AT AT AT DR DL DT NP gendir applic S N subj-dir AT DR DT AT AT DL "applic NP N gen-dir subj-dir S DT AT AT DR DL AT Figure 3: Sequential encodings of the grammar fragment heritance hierarchy provides a partial ordering on postpositions in which specifiers and modifiers follow parameters, which is exploited in the learning pro- heads. There are other languages in the SOV family cedure. For example, the atomic categories, N, with less consistent left-branching syntax in which NP and S are each represented by a parameter en- specifiers and/or modifiers precede phrasal heads, coding the presence/absence or lack of specification some of which are attested. "German" is a more (T/F/?) of the category in the (U)G. Since they will complex SOV language in which the parameter verb- be unordered in the lattice their ordering in the se- second (v2) ensures that the surface order in main quential coding is arbitrary. However, the ordering clauses is usually SVO. 4 of the directional types gendir and subjdir (with There are 20 p-settings which determine the rule values L/R) is significant as the latter is a more spe- schemata available, the atomic category set, and so cific type. The distinctions between absolute, de- forth. In all, this CGUG defines just under 300 fault or unset specifications also form part of the grammars. Not all of the resulting languages are encoding (A/D/?). Figure 3 shows several equiva- (stringset) distinct and some are proper subsets of lent and equally correct sequential encodings of the other languages. "English" without the rule of per- fragment of the English type system outlined above. mutation results in a stringset-identical language, A set of grammars based on typological distinc- but the grammar assigns different derivations to tions defined by basic constituent order (e.g. Green- some strings, though the associated logical forms are berg, 1966; Hawkins, 1994) was constructed as a identical. "English" without composition results in (partial) GCUG with independently varying binary- a subset language. Some combinations of p-settings valued parameters. The eight basic language fami- result in 'impossible' grammars (or UGs). Others lies are defined in terms of the unmarked order of yield equivalent grammars, for example, different verb (V), subject (S) and objects (0) in clauses. combinations of default settings (for types and their Languages within families further specify the order subtypes) can define an identical category set. of modifiers and specifiers in phrases, the order of ad- The grammars defined generate (usually infinite) positions and further phrasal-level ordering param- stringsets of lexical syntactic categories. These eters. Figure 4 list the language-specific ordering strings are sentence types since each is equivalent parameters used to define the full set of grammars to a finite set of grammatical sentences formed by in (partial) order of generality, and gives examples selecting a lexical instance of each lexicai category. of settings based on familiar languages such as "En- Languages are represented as a finite subset of sen- glish", "German" and "Japanese". 3 "English" de- tence types generated by the associated grammar. fines an SVO language, with prepositions in which These represent a sample of degree-1 learning trig- specifiers, complementizers and some modifiers pre- gers for the language (e.g. Lightfoot, 1991). Subset cede heads of phrases. There are other grammars in languages are represented by 3-9 sentence types and the SVO family in which all modifers follow heads, 'full' languages by 12 sentence types. The construc- there are postpositions, and so forth. Not all combi- tions exemplified by each sentence type and their nations of parameter settings correspond to attested length are equivalent across all the languages defined languages and one entire language family (OVS) is by the grammar set, but the sequences of lexical cat- unattested. "Japanese" is an SOV language with egories can differ. For example, two SOV language 3Throughout double quotes around language names renditions of The man who Bill likes gave Fred a are used as convenient mnemonics for familiar combina- tions of parameters. Since not all aspects of these actual 4Representation of the vl/v2 parameter(s) in terms languages are represented in the grammars, conclusions of a type constraint determining allowable functor cate- about actual languages must be made with care. gories is discussed in more detail in Briscoe (1997b). 420 gen vl n subj obj v2 mod spec relcl adpos compl Engl R F R L R F R R R R R Ger R F R L L T R R R R R Jap L F L L L F L L L L ? Figure 4: The Grammar Set - Ordering Parameters present, one with premodifying and the other post- modifying relative clauses, both with a relative pro- noun at the right boundary of the relative clause, are shown below with the differing category highlighted. Bill likes who the-man a-present Fred gave 1. The Reduce Step: if the top 2 cells of the NP8 (S\NP,)\NPo Rc\(S\NPo) NPs\Rc NPo2 stack are occupied, NPol ((S\NPs)\NPo2)\NPol then try The-man Bill likes who a-present Fred gave a) Application, if match, then apply and goto NPs/Rc NPs (S\NPs)\NPo Rc\(S\NPo) NPo2 1), else b), NPol ((S\NPs)\NPo2)\NPol b) Combination, if match then apply and goto 1), else c), c) Permutation, if match then apply and goto 2.2 The Parser 1), else goto 2) The parser is a deterministic, bounded-context 2. The Shift Step: if the first cell of the Input stack-based shift-reduce algorithm. The parser op- Buffer is occupied, erates on two data structures, an input buffer or then pop it and move it onto the Stack to- queue, and a stack or push down store. The algo- gether with its associated lexical syntactic cat- rithm for the parser working with a GCG which in- egory and goto 1), cludes application, composition and permutation is else goto 3) given in Figure 5. This algorithm finds the most left- 3. The Halt Step: if only the top cell of the Stack branching derivation for a sentence type because Re- is occupied by a constituent of category S, duce is ordered before Shift. The category sequences then return Success, representing the sentence types in the data for the else return Fail entire language set are designed to be unambiguous relative to thi s 'greedy', deterministic algorithm, so The Match and Apply operation: if a binary it will always assign the appropriate logical form to rule schema matches the categories of the top 2 cells each sentence type. However, there are frequently al- of the Stack, then they are popped from the Stack ternative less left-branching derivations of the same and the new category formed by applying the rule logical form. schema is pushed onto the Stack. The parser is augmented with an algorithm which The Permutation operation: each time step lc) computes working memory load during an analy- is visited during the Reduce step, permutation is ap- sis (e.g. Baddeley, 1992). Limitations of working plied to one of the categories in the top 2 cells of the memory are modelled in the parser by associating a Stack until all possible permutations of the 2 cate- cost with each stack cell occupied during each step gories have been tried using the binary rules. The of a derivation, and recency and depth of process- number of possible permutation operations is finite ing effects are modelled by resetting this cost each and bounded by the maximum number of arguments time a reduction occurs: the working memory load of any functor category in the grammar. (WML) algorithm is given in Figure 6. Figure 7 gives the right-branching derivation for Kim loves Sandy, Figure 5: The Parsing Algorithm found by the parser utilising a grammar without per- mutation. The WML at each step is shown for this derivation. The overall WML (16) is higher than for the left-branching derivation (9). The WML algorithm ranks sentence types, and 421
no reviews yet
Please Login to review.