Newsgroups: rec.arts.int-fiction
Path: gmd.de!ira.uka.de!yale.edu!spool.mu.edu!darwin.sura.net!udel!rochester!cantaloupe.srv.cs.cmu.edu!crabapple.srv.cs.cmu.edu!mkant
From: mkant+@cs.cmu.edu (Mark Kantrowitz)
Subject: Re: Text Generation Similar to Code Generation
Message-ID: <BzMwFJ.JtD.1@cs.cmu.edu>
Sender: news@cs.cmu.edu (Usenet News System)
Nntp-Posting-Host: glinda.oz.cs.cmu.edu
Organization: School of Computer Science, Carnegie Mellon
References: <9754280@MVB.SAIC.COM>
Date: Tue, 22 Dec 1992 00:09:18 GMT
Lines: 58

In article <9754280@MVB.SAIC.COM> Whitten@Fwva.Saic.Com (David Whitten) writes:
>As I understand it, a compiler may generate inefficient or ugly code,
>but then a second program looks at what it generated and applies some
>pattern matching to it.  When the output matches specific patterns,
>then the machine code is replaced with shorter/faster/better code that
>does the same thing.  This program is called the peephole optimizer
>because it only looks at the code "through a peep hole" and doesn't
>really know much about the way it was created, just how it looks on
>output.

You're conflating two different concepts. Peephole optimization is a
form of bounded lookahead. Many NLG systems that are concerned with
text realization can be considered LL(k) transducers. Lookahead is
different from rewriting or revising generated text. See, for example,
Marie Meteer (Vaughan) and David McDonald, "A Model of Revision in
Natural Language Generation", ACL-86, pages 90-96, or Marie's more
recent paper in Cecile Paris's book. Another paper on this topic is
Richard Gabriel's "Deliberate Writing" in McDonald's 1988 book. In it
he describes the Yh system, which uses critics and repeated editing to
improve the generated text.
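
To make the compiler side of the analogy concrete, here is a minimal
Python sketch of a peephole pass of the kind David describes. The toy
instruction set, the three rewrite rules, and the function names are
all assumptions made up for illustration; they are not taken from any
real compiler or from any of the NLG systems mentioned above.

```python
# Instructions are tuples in a hypothetical toy instruction set, e.g.
# ("PUSH", "r1"), ("POP", "r1"), ("ADD", "r3", 0), ("MOV", "r4", "r5").

def peephole_pass(code):
    """One left-to-right pass that looks at no more than two instructions at a time."""
    out = []
    i = 0
    while i < len(code):
        a = code[i]
        b = code[i + 1] if i + 1 < len(code) else None
        # Rule 1: adding zero to a register does nothing.
        if a[0] == "ADD" and a[2] == 0:
            i += 1
            continue
        # Rule 2: moving a register to itself does nothing.
        if a[0] == "MOV" and a[1] == a[2]:
            i += 1
            continue
        # Rule 3: a push immediately followed by a pop of the same register cancels.
        if b is not None and a[0] == "PUSH" and b[0] == "POP" and a[1] == b[1]:
            i += 2
            continue
        out.append(a)
        i += 1
    return out

def peephole(code):
    """Repeat the pass until nothing changes, since one rewrite can expose another."""
    while True:
        new_code = peephole_pass(code)
        if new_code == code:
            return new_code
        code = new_code

program = [
    ("PUSH", "r1"), ("PUSH", "r2"), ("POP", "r2"), ("POP", "r1"),
    ("ADD", "r3", 0), ("MOV", "r4", "r5"),
]
optimized = peephole(program)
```

The pass never sees more than a two-instruction window and knows
nothing about how the code was produced. The revision systems cited
above are a different kind of thing: they go back over text that has
already been generated and edit it.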

>I'm sure the work being done by the OZ researchers has considered, or is
>considering something similar to this.  

Yes, this is one of the ideas involved in the GLINDA natural language
generation system. See my INLGWS92 paper in the list Peter posted for
details. 

> The problem comes from recognizing 'writing style' enough to make a
> collection of patterns that generates writing with a consistent style.

True, that's one of the reasons why style is a difficult topic. Just
coming up with a good definition of "style" is hard.

>Of course, anything more than a simple syntactic level style is a big
>problem. 
>You see mate, I kin tawk funnee an' youz kin get a fee-eel fur mein 
>siinktaktik stiile but th's ain't no how thee same as me wrrrritin' style.
> 
>I think we would be pushing the current state of IF simply to have 
>artificial personalities in an adventure which just had differing 
>syntactic styles.

I disagree. Syntactic style and lexical style are equally difficult.
GLINDA uses the same mechanisms to handle both. 

It isn't clear to me what you mean by "syntactic style".  The example
you give is one of dialectal variation. Dialectal variation involves
many factors: particular syntactic organization (e.g., your example
starts with a vocative and combines several clauses with
conjunctions), lexical selection (e.g., use of "mate", "ain't", "no
how"), pronunciation (e.g., "tawk" instead of "talk", "writin'",
"thee") and so on. Limiting the generator to just the syntactic level
of organization would yield stilted-sounding text.
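
As a rough illustration of how these levels differ, here is a small
Python sketch that applies word-level substitutions for two of them.
The rule tables and function names are hypothetical, and this is not
how GLINDA works; it only shows that lexical selection and
pronunciation can be treated by one uniform mechanism, while syntactic
organization cannot be captured by word substitution at all.

```python
import re

# Hypothetical rule tables, invented for this example only.
LEXICAL = {"friend": "mate", "is not": "ain't"}           # word choice
PRONUNCIATION = {"talk": "tawk", "writing": "writin'"}    # respelling

def apply_rules(text, rules):
    """Replace each whole-word occurrence of a key with its variant."""
    for plain, variant in rules.items():
        text = re.sub(r"\b" + re.escape(plain) + r"\b", variant, text)
    return text

def dialectize(text):
    """Apply lexical rules first, then pronunciation rules, with the same mechanism."""
    return apply_rules(apply_rules(text, LEXICAL), PRONUNCIATION)

sample = dialectize("my friend is not writing")
```

Nothing in this sketch can add a vocative or reorganize clauses, which
is the syntactic part of the variation in the quoted example.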

--mark
Oz Interactive Fiction Project

