Source: aclanthology.org
DANISH FIELD GRAMMAR IN TYPED PROLOG

Henrik Rue
UNI-C, Danish Computing Center for Research and Education
Vermundsgade 5, DK 2100 Ø, Copenhagen, Denmark


ABSTRACT

This paper describes a field grammar for Danish and its implementation in a Prolog version with predeclared types. In comparison to the usual S -> NP VP schema, this kind of grammar, where the first rule is S -> CNF FF NF CF, enhances analysis efficiency because the fields specify constituents and syntactic function at the same time. The field grammar tradition is outlined, and an overview is given of the major rules of the Prolog program which implements the grammar.


FIELD GRAMMAR

A Syntactic Strategy

In terms of computational linguistics, field grammar may be viewed as a syntactic strategy which offers the user the immediate constituents while at the same time giving their syntactic functions and, in part at least, the functional sentence perspective. Field grammar furthermore facilitates the handling of discontinuous constituents, as will be shown.

Background

The field grammar of the Danish linguist Paul Diderichsen adequately describes constituent structure in Danish, while at the same time capturing both topicalization and syntactic roles. Diderichsen's grammar "Elementær dansk grammatik" (1946) was developed from the 1940s onwards with the intention that it should be used as a common framework for grammar teaching in secondary school as well as at university level. This grammar has since served as one cornerstone of Danish grammatical thought.

Diderichsen's grammar is distinguished by a high degree of formalization, and it is one of the aims of the work presented in this paper to see how much of the original formalism can be implemented directly as a Prolog program, and whether it is necessary to make substantial changes in the definition and inventory of fields in order to make an executable program.

Prolog Dialect

The Prolog dialect used is the Danish prototype of Borland's TurboProlog. This is a typed Prolog, and may be termed a hybrid between Prolog and Pascal. When seeing a sample grammar written in this dialect, one is impressed by the clarity it achieves: grammatical structures are statically described in the declaration of types. The dynamic part, which enables one to get at these structures, consists of the rules of the program. A further aim of this work, then, is to explore whether this clarity will also prevail in an elaborate grammar program.

Other Purposes

Apart from the purposes implicit in these aims, we believe that field theory offers a sound (read: economical) starting point for a great variety of parsing purposes. As mentioned, the theory offers a combination of constituent structure analysis with syntactic and thematic analysis. This will hold not only for the Scandinavian languages, but presumably also for other Germanic languages like English, where one might abandon the S -> NP VP schema in favour of something on the lines of the SVC, SVA, SV, SVO etc. clause patterns of Quirk et al. (1972).

In the work presented here, however, there is no exploitation of the topicalization facilities offered by the grammar.


A DANISH FIELD GRAMMAR

According to Diderichsen, the Danish sentence structure has four major fields: the connector field, the fundament field, the nexus field and the content field.

All four are present in main sentences

    S -> CONN FF NF CF

and three of them in subordinate ones:

    SS -> CONN S-NF CF

where all fields except the nexus field (NF or S-NF) may be empty.

The CONN is the field for conjunctions.

The FF (for Fundament Field, which is the Danish topicalization device) may contain any complete constituent, which is there as a result of a movement from its field in the sentence: 'Moderen giver drengen gaven' vs. 'Gaven giver moderen drengen' ('The mother gives the boy the gift'), where the second version differs in its thematic content only: it stresses the direct object as the theme.

The NF, for Nexus Field, contains a finite verb form, a possible subject, plus adverbials modifying the verb; the internal structure of the nexus field differs in main and subordinate clauses.

The CF, for Content Field, contains two possible infinite verb forms, the objects and predicates, plus adverbial and other modifiers.

The Grammar Declaration

So far the project has implemented field analysis of both main and subordinate sentences. However, not all topicalizations are handled yet: in questions, the fundament field may be empty too, but this is not incorporated in the program, as it remains to be seen whether an analysis with the finite verb topicalized, that is moved into the fundament field, would be more fit for the purpose.

Clause structure

The following declarations describe main and subordinate clauses and furthermore the internal structure of the major fields:

    S        = s( CONN, FUNDF, NEXUSF, CONTENTF );
               nil;
               s_s( CONN, NEXUSF_S, CONTENTF )

    CONN     = nil;
               konj( KONJ )

    FUNDF    = fundf_n( NOMINAL );      /* No nil */
               fundf_a( ADVERBIAL );
               fundf_i( INF );
               fundf_c( CONTENTF )

    NEXUSF   = nexusf( FINIT, SUBJ, NADV )

    NEXUSF_S = nexusf_s( SUBJ, NADV, FINIT )

    CONTENTF = nil;
               contentf( INFFLD, OBJFLD, CADVFLD )

These are the major fields. They may in turn be divided into subfields:

    INFFLD = nil;
             inffld( INF1, INF2 )

means that Danish has a possibility of two auxiliaries (the finite + one infinite), and implicitly that if INF2 is filled, then this will be the content verb. This treatment is not quite adequate, actually, but it follows Diderichsen's schema.

    OBJFLD = nil;
             objfld( NOMINAL, PREPG, NOMINAL )

is the object field, which at the moment contains a quick-and-dirty solution to the problem that the indirect object may be expressed by a prepositional phrase in Danish, the solution being the incorporation of an unwarranted PREP subfield.

It should be noted in passing that the connector field in Diderichsen's formalism is one of the places where the system will not be able to hold on to the original. This field is part of the schemata not only for sentences, but also for noun and adverbial phrases, where it may contain, among other things, a preposition. The system thus has to distinguish between the two types of connector fields in order to avoid the generation of spurious analysis results.

Discontinuous Verbal Particles

In Danish some verbs are either prefixed or obligatorily constructed with a particle, actually a preposition, which moves to the end of the sentence with all finite forms: 'oplade' ('charge') but 'han lader batteriet op' ('he charges the battery'); 'lukke op' ('open up') but 'han lukker døren op' ('he opens the-door up'). The same phenomenon exists in German: 'Peter gab sein Rauchen auf'. This is one of the places where field grammar shows its force as a syntactic strategy, because the phenomenon of discontinuity is handled in a straightforward way at the first level of analysis:

    CADVFLD = nil;
              cadvfld( CADF, CADF )

with

    CADF = nil;
           prep( PREP );
           cadf( ADVERBIAL )

where CADF is the field for, among other things, contential adverbs, but also for disjunct verbal particles. These are accommodated by splitting the original Diderichsen subfield for content adverbials into two further subfields, one of which will contain the verbal particle (if any), the other the regular content adverbials. This is sufficient for the declaration of the grammar; how our analysis handles the various fields will be shown in a later section.

Phrasal structure

Syntagmatic structures are also divided into fields. As the system stands, this is implemented for adverbial phrases, but not yet for noun phrases. The latter are at the moment structured in a way that is pretty much on the NP -> Det AdjP N lines. As regards adverbials, the structure given is only one of several possible:

    NOMINAL   = nil;
                nominal( ART, ADJEKTIVAL, SUBKERN,
                         PREPP, CS )

    ADVERBIAL = nil;
                adverbial( CONN, DEGREEF, SITUATF, ADVKERN,
                           PREPP, CS )

The CS is a symbol representing subordinate sentences, which have the form:

    CS = nil;
         cs( S, SYNT )

where S is the field structure, and SYNT the corresponding syntactical structure of the subordinate sentence represented by the token of the symbol type CS.

Verb phrases, on the other hand, do not exist as such. Instead we have:

    FINIT   = finit( VERB, VERB, TEMPG )
    INFINIT = infinit( VERB, VERB, TEMPG )
    VERB    = Symbol

which means that a verb, whether it be finite or infinite, is described by a structure which consists of 1) the verbal form itself as it is found in the sentence (the first 'VERB'), 2) a lexical unit (the second 'VERB', which will be found as a result of the analysis of the sentence, and which will leave the fields for infinite form empty) and 3) a complex description, TEMPG, of tense, aspect, voice, modality and the telic/atelic property of the situation described by the verb. This TEMPG is also used of the sentence as a whole.

In this way a 'FINIT' in a sentence will have either an auxiliary, a finite verb form missing the verbal prefix, or the full, finite form of the content verb in the first 'VERB' slot when field analysis is carried out. The result of the syntactical analysis which follows will be in the second 'VERB' slot.

Syntax

The system also comprises a syntactic part, based on traditional school grammar:

    SYNT = synt( SUBJ, VERB, NADV, SUBJPRED,
                 OBJ, OBJPRED, IOBJ, CADV, TEMPG )

where NADV and CADV are the adverbial modifiers of the nexus field and the content field respectively. The other mnemonics should be self-evident.

The Dictionary

As the dictionary of the system has not been given much attention yet, and as it works on a purely ad hoc basis, it will not be treated in this paper.


ANALYSIS

Analysis runs in two steps, one carrying out the field analysis, the other handling the syntactical interpretation of the result of the field analysis.

Field Analysis

Field analysis is carried out by a call to the following major rule:

    is_s( I, O, s( CONN, FUNDF, NEXUSF, CONTENTF ) ):-
        is_forb( I, I1, CONN, FEATC ),
        FEATC <> subord,
        is_fundf( I1, I2, FUNDF ),
        is_nexusf( I2, I3, NEXUSF ),
        is_contentf( I3, O, CONTENTF ).

which applies the following rules in order to succeed (or fail):

    is_fundf( I, O, fundf_n( NOMINAL ) ):-
        is_nomen( I, O, NOMINAL ), I <> O.

    is_fundf( I, O, fundf_a( ADVERBIAL ) ):-
        is_adverbial( I, O, ADVERBIAL, _ ),
        I <> O.

    is_nexusf( I, O, nexusf( FINIT, NOMINAL,
                             ADVERBIAL ) ):-
        is_finit( I, I1, FINIT ),
        is_nomen( I1, I2, NOMINAL, _, _ ),
        is_adverbial( I2, O, ADVERBIAL, _ ).

and

    is_contentf( I, O, contentf( INFFLD,
                                 OBJFLD, CADVFLD ) ):-
        is_inffld( I, I1, INFFLD ),
        is_objfld( I1, I2, OBJFLD ),
        is_cadvfld( I2, O, CADVFLD ),
        I <> O.

    is_contentf( I, I, nil ).

As a consequence of having a possible nil-filling for a major field, the content field, it becomes necessary to explode the number of rules which identify and collect compound verb forms; in other words, what is gained in the simplicity of the grammar is lost again in the number of rules.

Discontinuous Verbal Particles

As an example of the rules handling the major fields, we shall take a look at the rule which picks out discontinuous verbal particles.

The rules which handle the adverbial subfield of the content field contain a specification for the particles, as they allow for the class of prepositional adverbs:

    is_cadvfld( I, O, cadvfld( PREPG,
                               C_ADVERBIAL ) ):-
        is_advprep( I, I1, PREPG ),
        is_c_adverbial( I1, O, C_ADVERBIAL ),
        I <> O.

    is_cadvfld( I, O, cadvfld( C_ADVERBIAL,
                               PREPG ) ) :-
        is_c_adverbial( I, I1, C_ADVERBIAL ),
        is_advprep( I1, O, PREPG ),
        not_nom( O ), I <> O.

The prepositional adverbs are then picked up by the rule:

    is_advprep( I, O, prep( PREP ) ):-
        fronttoken( I, PREP, O ),
        dic_prep( X ), X = PREP.

which in fact is an ad hoc rule to circumvent the restrictions posed on the system by the typing facility. During syntactic analysis the disjunct particles are collected with the verb by the rule 'extract_disco_vpart', as will be demonstrated in the following.

Syntactic Analysis

There is one major clause for syntactic analysis, 'is_syn', which is called by the top level analysis clause 'start':

    start:-
        write("Skriv en sætning"),nl,      /* "Write a sentence" */
        readln( Line ),
        is_s( Line, "", S ),
        is_syn( S, SYNT ),
        nl, write("Feltanalyse:"),nl,      /* "Field analysis:" */
        skriv_s( S, 0 ), nl,
        nl, write("Syntaktisk analyse:"), nl,   /* "Syntactic analysis:" */
        skriv( SYNT, 0 ), nl, fail.

    is_syn( S, SYNT ):-
        extract_vg( S, VERB1, TEMPG ),
        extract_disco_vpart( VERB1, S, VERB ),
        extract_advg( S, NADV, CADV ),
        interpret_nominals( S, VERB, SUBJ,
                            SUBJPRED, OBJ,
                            OBJPRED, IOBJ ),
        collect_synt( VERB, NADV, SUBJ,
                      SUBJPRED, OBJ, OBJPRED,
                      IOBJ, CADV, TEMPG, SYNT ).

    is_syn( nil, nil ).

The claim was that field grammar facilitates syntactic analysis, and we shall now endeavour to support this claim by looking at the handling of the noun phrases.

The major rule is 'interpret_nominals', which has the form:

    interpret_nominals(
            s( _, FUNDF, NEXUSF, CONTENTF ),
            VERB, SUBJ, SUBJPRED,
            OBJ, OBJPRED, IOBJ ):-
        syn_nomfund( FUNDF, NEXUSF, CONTENTF,
                     VERB, SUBJ, SUBJPRED,
                     OBJ, OBJPRED, IOBJ ).

For transitive verbs the following version of a 'syn_nomfund' rule generates the filler in the fundament field as subject, and two fillers to the object and indirect object slots; if there is only one filler in the object subfield this will be the object:

    syn_nomfund( fundf_n( FUNDFN_I ),
                 nexusf( _, nil, _ ),
                 CONTENTF,
                 VERB, subj( FUNDFN_O ), nil,
                 OBJS, nil, IOBJS ):-
        trans_verb( VERB, DITRANS ),
        check_sentcomp( FUNDFN_I, FUNDFN_O ),
        extract_obj( nil, DITRANS, CONTENTF,
                     OBJS, IOBJS ),!.

where the interesting call is the one to 'extract_obj', where the following will match (the 'check_sentcomp' in the following rules should be disregarded, as it has nothing to do with the analysis of the arguments proper; it only activates a syntactic analysis of a possible clausal complement to the given nominal kernels):
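The 'extract_obj' rules announced above are not part of this section, and the block below is not a reconstruction of them. It is a separate illustration of the particle handling described earlier: a minimal sketch, in standard Prolog rather than the typed TurboProlog dialect of the paper, of how a particle found in the content adverbial field could be rejoined with the finite verb. It works on token lists instead of strings, keeps only the fields needed for 'han lader batteriet op', and leaves the connector field and the subject and adverbial slots of the nexus field fixed at nil. The toy lexicon, the predicate 'particle_verb' and the body given for 'extract_disco_vpart' are assumptions made for this illustration; the paper shows only the call to 'extract_disco_vpart', not its definition.

```prolog
% Toy lexicon (hypothetical entries, for illustration only).
dic_nomen(han).
dic_nomen(batteriet).
dic_finit(lader).
dic_prep(op).

% particle_verb(FiniteForm, Particle, LexicalUnit): hypothetical table
% pairing a finite form and its stranded particle with the full verb.
particle_verb(lader, op, oplade).

% Each field rule maps an input token list I to the remainder O,
% following the (I, O, Structure) argument pattern of the paper's rules.
is_nomen([W|O], O, nominal(W)) :- dic_nomen(W).
is_finit([W|O], O, finit(W))   :- dic_finit(W).
is_advprep([W|O], O, prep(W))  :- dic_prep(W).

% Main clause: fundament field, nexus field, content field.
% Simplification: only a nominal fundament and a bare finite verb.
is_s(I, O, s(nil, fundf_n(N), nexusf(F, nil, nil), ContentF)) :-
    is_nomen(I, I1, N),
    is_finit(I1, I2, F),
    is_contentf(I2, O, ContentF).

% Content field: an object, optionally followed by a verbal particle.
is_contentf(I, O, contentf(nil, objfld(Obj, nil, nil), cadvfld(Part, nil))) :-
    is_nomen(I, I1, Obj),
    is_advprep(I1, O, Part).
is_contentf(I, O, contentf(nil, objfld(Obj, nil, nil), nil)) :-
    is_nomen(I, O, Obj).
is_contentf(I, I, nil).

% Rejoin verb and particle if the lexicon licenses the pair;
% otherwise the verb form is passed through unchanged.
extract_disco_vpart(finit(V), s(_, _, _, contentf(_, _, cadvfld(prep(P), _))), Lex) :-
    particle_verb(V, P, Lex), !.
extract_disco_vpart(finit(V), _, V).

% Example query:
% ?- is_s([han, lader, batteriet, op], [], S),
%    S = s(_, _, nexusf(F, _, _), _),
%    extract_disco_vpart(F, S, Verb).
```

Under these assumptions the example query binds Verb to oplade, since the particle 'op' is found in the first subfield of the content adverbial field and the pair is listed in 'particle_verb'. With the tokens [han, lader, batteriet], where no particle is present, the second clause of 'extract_disco_vpart' applies and Verb is simply lader. The point of the sketch is that the particle needs no special search: the field analysis has already put it in a fixed slot.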