X-Newsreader: Geminisoft Pimmy 3.2 Eng - www.geminisoft.com
From: "John Colagioia" <JColagioia@csi.com>
Newsgroups: rec.arts.int-fiction
Subject: Re: NLP Question
Date: Fri, 05 Jul 2002 17:47:14 -0400
References: <ts1V8.1521$x6.950@newsread1.prod.itd.earthlink.net> <3d24d781.19079434@news.east.cox.net> <vE4V8.344$Yx5.51@newsread1.prod.itd.earthlink.net> <5q5V8.424$Yx5.149@newsread1.prod.itd.earthlink.net>
MIME-Version: 1.0
Content-Type: text/plain
Content-Transfer-Encoding: 8bit
NNTP-Posting-Host: ool-182f30fa.dyn.optonline.net
X-Original-NNTP-Posting-Host: ool-182f30fa.dyn.optonline.net
Message-ID: <3d2613f8@excalibur.gbmtech.net>
X-Trace: excalibur.gbmtech.net 1025905656 ool-182f30fa.dyn.optonline.net (5 Jul 2002 17:47:36 -0400)
Organization: ProNet USA Inc.
Lines: 55
X-Authenticated-User: jnc
Path: news.duke.edu!newsgate.duke.edu!solaris.cc.vt.edu!news.vt.edu!netnews.com!nntp.abs.net!uunet!dca.uu.net!excalibur.gbmtech.net
Xref: news.duke.edu rec.arts.int-fiction:105763

"John" <nojgoalbyspam@hotmail.com> wrote:
>I guess what I need is a way to determine the nouns, verbs, etc. from a
>sentence.  From then I can pretty much deal with the input.  Is this what
>the infocom adventures did?  Or was it purely a matching of the game
>definition text with the user input?

The Infocom games included a specialized "quasi-parser."
Basically, it tried to match user input to a pre-existing
table of grammar, starting with the text of the verb.
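
To make that concrete, here is a rough sketch in Python
(not Infocom's actual code, and not how Inform spells it)
of a verb-first lookup against a grammar table.  The
GRAMMAR table, the NOUNS and FILLER sets, and the parse()
function are all names I made up for illustration:

```python
# Hypothetical grammar table: each verb maps to the patterns it accepts.
# "NOUN" is a slot that must be filled by a word from NOUNS; any other
# token (e.g. "in") must match the player's word literally.
GRAMMAR = {
    "take": [["NOUN"], ["NOUN", "from", "NOUN"]],
    "put": [["NOUN", "in", "NOUN"], ["NOUN", "on", "NOUN"]],
    "look": [[], ["at", "NOUN"]],
}
NOUNS = {"lamp", "sword", "box", "table"}
FILLER = {"the", "a", "an"}  # words the parser silently ignores


def parse(line):
    """Return (verb, pattern, nouns) for the first matching pattern, else None."""
    words = [w for w in line.lower().split() if w not in FILLER]
    if not words:
        return None
    verb, rest = words[0], words[1:]
    # Start from the verb, then try each of its grammar lines in order.
    for pattern in GRAMMAR.get(verb, []):
        if len(pattern) != len(rest):
            continue
        nouns = []
        for token, word in zip(pattern, rest):
            if token == "NOUN":
                if word not in NOUNS:
                    break
                nouns.append(word)
            elif token != word:
                break
        else:
            # Every token matched, so this grammar line wins.
            return (verb, pattern, nouns)
    return None
```

With this table, parse("put the lamp in the box") picks
the "NOUN in NOUN" line for "put", while an unknown verb
or an unknown noun simply fails to match anything.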

>Again, any pointers very appreciated.

For Infocom-style parsing, you probably won't do much
better than looking at the Inform standard library; you
can find the latest version at:

http://www.ifarchive.org/indexes/if-archiveXinfocomXcompilersXinform6Xlibrary.html

Inform targets the same virtual machine as Infocom did
and makes a good attempt at replicating most of the
"look and feel" of the games.

The problem, of course, is that the library is written
in fairly "heavy" Inform in some spots.  However:

- Grammar.H will give you a pretty good idea of how the
grammars were defined.
- English.H will illustrate the kinds of error messages
that are often needed.
- ParserM.H, particularly the Parser__Parse() function,
holds the actual guts of the parsing mechanism.  Despite
its bulk, it is somewhat disappointing when you read it,
considering how effective it seems.

The word "seems" is sort of a key.  You don't actually
have to understand anything outside of your limited
vocabulary, as long as you can fake it...

In the case of Infocom, Inform, TADS, and the other IF
systems, the parsing really boils down to a combination
of top-down and bottom-up techniques, and is therefore
little more sophisticated than your average compiler.

Does that help?  Or does it just muddy the issue?

Closer to the subject of "real" NLP, back in high
school, my senior-year English teacher ran us through
a few exercises to discuss grammar from a morphology
standpoint, rather than as a set of recursive
definitions; things like "an adverb precedes an
adjective or is directly before or after a verb; it
frequently ends in '-ly.'"  That sort of thing might be
up your alley.  Alas, I don't have a better reference
than that vague description, but perhaps someone better
read in such fields might.
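
For what it's worth, that one rule could be roughed out
in Python something like the following.  This is only my
guess at how you might code it, not anything from a real
NLP system; KNOWN_VERBS, KNOWN_ADJECTIVES, and
guess_adverbs() are made-up names, and the word lists
are tiny on purpose:

```python
# Hypothetical, deliberately tiny vocabularies for the sketch.
KNOWN_VERBS = {"open", "walk", "take"}
KNOWN_ADJECTIVES = {"red", "rusty"}


def guess_adverbs(sentence):
    """Guess adverbs by suffix ('-ly') plus position, per the rule of thumb."""
    words = sentence.lower().split()
    found = []
    for i, word in enumerate(words):
        if word in KNOWN_VERBS or word in KNOWN_ADJECTIVES:
            continue
        prev_word = words[i - 1] if i > 0 else None
        next_word = words[i + 1] if i + 1 < len(words) else None
        # "directly before or after a verb"
        near_verb = prev_word in KNOWN_VERBS or next_word in KNOWN_VERBS
        # "precedes an adjective"
        before_adjective = next_word in KNOWN_ADJECTIVES
        if word.endswith("ly") and (near_verb or before_adjective):
            found.append(word)
    return found
```

The position test is what keeps a noun like "lily" in
"take the lily" from being tagged just because of its
ending, whereas "slowly" in "walk slowly north" passes
both checks.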
