Newsgroups: rec.arts.int-fiction
Path: nntp.gmd.de!news.ruhr-uni-bochum.de!news.uni-stuttgart.de!rz.uni-karlsruhe.de!blackbush.xlink.net!sol.ctr.columbia.edu!news.uoregon.edu!arclight.uoregon.edu!dispatch.news.demon.net!demon!netcom.net.uk!netcom.com!librik
From: librik@netcom.com (David Librik)
Subject: Re: Parser construction (was: Inform in other languages)
Message-ID: <librikDryJyB.EwF@netcom.com>
Organization: Icy Waters Underground, Inc.
References: <8C0B45E.0512002A7E.uuout@almac.co.uk> <4nrsp6$3iv@tid.tid.es> <4nsus9$12k@news.lth.se> <4numum$90v@tid.tid.es> <4o1a3a$dmm@news.lth.se>
Date: Sat, 25 May 1996 11:01:23 GMT
Lines: 109
Sender: librik@netcom20.netcom.com

mol@marvin.df.lth.se (Magnus Olsson) writes:

>In article <4numum$90v@tid.tid.es>, PAZ SALGADO <japaz@tid.tid.es> wrote:
>>Magnus Olsson (mol@marvin.df.lth.se) wrote:
>>: > If you want to have a good parser forget the computer
>>: >parser studies and take your old grammar books!
>>
>>: Huh? The old grammar books tell you nothing about parsers. You could do
>>: well with some *new* grammar books, namely those dealing with generative
>>: (Chomsky) grammars.
>>
>>  Well, my old grammar book talks about Chomsky, perhaps I'm too young.

>Well, Chomsky did his work in the 50's if I'm not mistaken. I think
>generative grammar has been moved in and out of grammar textbooks
>since then.

The '60s, really, though Chomsky's original ideas were written up in
the late '50s.  He wrote a short and accessible book called _Syntactic
Structures_ that showed why Transformational Grammar was the way to go,
and his ideas were a big success.  By the late 1960s nobody was really
doing anything else when it came to syntax.

The trouble is that Chomsky's generative grammar (Transformational
Grammar) was shown in the 1970s to be computationally unreasonable.
Computer scientists showed that it was "equivalent to a Turing Machine
in generative power" (read: really tough to use for parsing), and
psychologists concluded that it doesn't match up with what we know
about language use.  This sent a lot of people (including Chomsky)
scurrying off to try to simplify the theory, often to make it more
amenable to use with computers. 

The first attempt at a computationally reasonable model of grammar
was developed at Stanford and Xerox PARC (the same place that
invented the Mouse and the Graphical User Interface, and then ignored
their inventions until Apple ripped them off).  This was Joan Bresnan's
"Lexical Functional Grammar."  The programmers at PARC were able to
do quite a lot of good natural-language processing with LFG as a tool.

The linguistics idea that I think would make a good tool for increasing
the power of Adventure parsers is Construction Grammar, developed by
Charles Fillmore (whose Case Grammar was big in natural-language parsing
in the '70s), Paul Kay, and Adele Goldberg.  Their idea is that not
just words have meaning, but constructions -- patterns of how words
appear together -- have their own meanings, independent of the words
that appear inside them.  Goldberg gives these English examples:

	push    the box      out of the room
	kick    the ball     away
	sneeze  the Kleenex  off the table

and claims that the *structure*, the *pattern* that these words fall into,
has its own semantic meaning of "cause directional motion."  Whatever verbs,
nouns, and directional phrases you can shove into this structure, you end up
with something whose meaning is more than just the meaning of the
individual parts -- there's nothing in "sneeze" "the Kleenex" and "off
the table" that implies the sneezing causes the Kleenex to move off the
table.  Also, constructions combine with each other -- each one carrying
its own bits of semantic meaning; as the constructions combine their
forms, so do their meanings combine, until you've built up a whole
sentence.  You can handle this combination by "unification" -- a familiar
technique to anyone who's taken a programming class that dealt with PROLOG
or database management.  Find a combination of constructions that unify
to form the input sentence, and the parallel combination of semantics will
give you the meaning of that sentence.  (For simplicity's sake, individual
words and parts of words are also "constructions," meaning there's only
one data type.)
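For the curious, here's a toy sketch of that unification step in Python.
Everything in it -- the nested-dict representation, the feature names --
is just my illustration, not any real Construction Grammar system:

```python
# Toy feature-structure unification.  Constructions are nested dicts;
# unify() merges two of them and fails (returns None) when they assign
# conflicting atomic values to the same feature.

def unify(a, b):
    """Merge feature structures a and b; return None on conflict."""
    result = dict(a)
    for key, value in b.items():
        if key not in result:
            result[key] = value
        elif isinstance(result[key], dict) and isinstance(value, dict):
            merged = unify(result[key], value)
            if merged is None:
                return None        # conflict inside a substructure
            result[key] = merged
        elif result[key] != value:
            return None            # conflicting atomic values
    return result

# "sneeze" as a lexical construction: just a verb with its own meaning.
sneeze = {"cat": "V", "sem": {"pred": "sneeze"}}

# The caused-motion construction contributes meaning of its own, on top
# of whatever verb fills its slot.
caused_motion = {"cat": "V", "sem": {"motion": "caused"}}

combined = unify(sneeze, caused_motion)
# combined now carries both the word's meaning and the construction's:
# {"cat": "V", "sem": {"pred": "sneeze", "motion": "caused"}}

# A noun can't fill the verb slot -- the structures fail to unify.
clash = unify({"cat": "N"}, caused_motion)   # None
```

A real system would use typed feature structures with variables, but
the merge-or-fail behavior is the heart of the technique.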

Now what does this mean for Interactive-fiction parsers?  It's pretty
clear that so far we've only been able to handle a few constructions:
the ones which appear in:

	JUMP
	OPEN BOTTLE
	SHOW THE STILETTO TO THE THIEF

Furthermore, a lot of different constructions have been parsed in
the same way, leaving it up to the verb-processing code to handle
the differences in meaning.  To see what I mean, notice that:

	Magnus kicked the ball behind the barn.
	David stabbed the boy behind the barn.

seem to have exactly the same form, but the second doesn't have
a "caused motion" meaning.  So, more than one construction can result
in the same surface form.  (In fact, I suspect the second sentence
is a combination of two constructions: a "transitive sentence"
construction which just lets you say things like "David stabbed
the boy," and another one which lets you augment a verb phrase with
an adverbial [prepositional phrase or adverb] describing where
something took place.)  In Infocom-like games, this is paralleled
by the fact that:

        GIVE THE VASE TO THE TROLL
   and  CONNECT THE RED PLUG TO THE BLACK SOCKET

are parsed exactly the same, and each verb has to figure out separately
exactly what to do with its arguments.  The "meaning" (i.e. what actions
to perform) is associated purely with the verb -- and in more object-
oriented I-F languages, with the nouns too.  A Construction Grammar
solution would let some of the meaning get handled by routines for
the syntactic constructions used in the sentence, and thus would
(1) make the individual verb/noun coding a lot simpler, (2) handle new
combinations of words better, and (3) let us start adding new constructions
to the language that the parser can understand, and thus finally get beyond
Infocom.
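To make (1)-(3) concrete, here's a toy Python sketch of a parser where
the construction table, not the verb code, supplies the semantic frame
and role names.  The lexicon, frame names, and role labels are all made
up for illustration:

```python
# Toy construction-based parser: each construction pairs a surface
# pattern with the semantic frame it contributes, so the verb routine
# no longer has to figure out what its arguments mean.

LEXICON = {"jump": "VERB", "open": "VERB", "give": "VERB",
           "bottle": "NOUN", "vase": "NOUN", "troll": "NOUN"}

CONSTRUCTIONS = [
    # (surface pattern, frame and role names the construction supplies)
    (("VERB",),                      {"frame": "intransitive",
                                      "roles": []}),
    (("VERB", "NOUN"),               {"frame": "transitive",
                                      "roles": ["patient"]}),
    (("VERB", "NOUN", "TO", "NOUN"), {"frame": "transfer",
                                      "roles": ["theme", "goal"]}),
]

def parse(words):
    """Map a token list to a meaning via the construction table."""
    shape = tuple("TO" if w == "to" else LEXICON[w] for w in words)
    for pattern, frame in CONSTRUCTIONS:
        if shape == pattern:
            nouns = [w for w in words if LEXICON.get(w) == "NOUN"]
            return {"verb": words[0],
                    "frame": frame["frame"],
                    "args": dict(zip(frame["roles"], nouns))}
    return None

# GIVE THE VASE TO THE TROLL (articles dropped for brevity):
meaning = parse(["give", "vase", "to", "troll"])
# {"verb": "give", "frame": "transfer",
#  "args": {"theme": "vase", "goal": "troll"}}
```

The verb's handler then receives named roles instead of raw positional
arguments, which is what makes point (1) -- simpler verb and noun
coding -- possible.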

- David Librik
librik@cs.Berkeley.edu
