Newsgroups: rec.arts.int-fiction
Path: gmd.de!ira.uka.de!scsing.switch.ch!univ-lyon1.fr!ghost.dsi.unimi.it!batcomputer!caen!destroyer!sol.ctr.columbia.edu!howland.reston.ans.net!spool.mu.edu!uwm.edu!linac!att!mcdchg!chinet!jorn
From: jorn@chinet.chi.il.us (Jorn Barger)
Subject: "Was: Barger@ILS" Chapter 4 (CBR)
Organization: Chinet - Public Access UNIX
Date: Thu, 28 Jan 1993 18:39:46 GMT
Message-ID: <C1KuIA.HBq@chinet.chi.il.us>
Lines: 121

====================================================================
                        "Was: Barger@ILS"
                   (memoirs of an a.i. hacker)
                          by Jorn Barger

                Chapter 4: Case Based Reasoning
====================================================================

If you enumerate all the interesting categories of 'human histories' on
indexcards, all the stories of emotion, every plausible configuration of
events you'll ever need in an interactive fiction (elf meets orc, villain
plots counterattack, boy kisses girl), and lay them out on the floor (of
a *gym*, figure... it's gonna be a lot of cards!), so that as far as
possible the most similar ones are closest together... what's the overall
pattern?

If you then stretch lengths of string from each card to all its most
similar neighbors, and also to some distant neighbors that are maybe very
different, but still similar in some important ways (analogous,
contradictory)... is there a simple pattern to the crosslinks?

This, in fact, is the central problem of CBR, one Schank's been pursuing
his whole career.

Long before 1989, it seems, the Yale school's focus of interest had
shifted from SPG&U to Schank's next book "Dynamic Memory".  DynaMem is
credited with opening up the whole field of CBR, introducing the model of
an indexed network of storylike cases. (The 'dynamic' part-- creating a
memory that revises itself-- is still a dream.  Just creating a case-
based memory structure turns out to be a plenty big problem.)

DynaMem suggested three sorts of longdistance link between indexcard-
stories (which stories it lumpily termed Memory Organization Packets or
MOPs):
1) Story X1 is one scene within more-complex story Y1
2) Story X2 is just like the more-obvious story Y2, but with a different,
unexpected ending (an "expectation failure")
3) Story X3 is a simplified version of story Y3.

#1 is totally critical for IF: stories have to maintain their identity
through any quantity of inflation with added detail, and if two stories
share a detail, they should both have access to that detail's own
stories.

#2 was suggested by real-life observation of the sorts of stories that
we're naturally reminded of in the course of the day-- we'll always
remember the story of the time when things went haywire, not least so we
can make sure *that* doesn't happen again.  A planning algorithm in IF
must similarly consider all the ways the planned story could go wrong, so
it will have to use something equivalent to links like these.

#3 is the abst-spec (abstraction-specialization) link of traditional
abstraction hierarchies, but its application to stories is not at all
simple.  One approach might be to remove 'scenes', one by one, to create
more general stories, so that Y3 = X3 + X1.  (Story X3 is
boy-talks-to-girl, scene X1 is girl-winks-at-boy.)
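
For concreteness, here is a minimal Python sketch of the three link types.
The class, field and function names (Story, scenes, surprises, simpler,
drop_scene) are my own inventions for illustration, not anything taken
from DynaMem, and the 'girl-walks-away' ending is a made-up example:

```python
from dataclasses import dataclass, field

@dataclass(eq=False)  # compare cards by identity, not by contents
class Story:
    """One indexcard. All field names are hypothetical."""
    name: str
    scenes: list = field(default_factory=list)     # link 1: stories contained as scenes
    surprises: list = field(default_factory=list)  # link 2: same setup, unexpected ending
    simpler: list = field(default_factory=list)    # link 3: simplified versions

def drop_scene(story, scene):
    """Generalize a story by removing one scene (Y3 minus X1 gives X3)."""
    remaining = [s for s in story.scenes if s is not scene]
    general = Story(name=f"{story.name} minus {scene.name}", scenes=remaining)
    story.simpler.append(general)
    return general

wink = Story("girl-winks-at-boy")                 # scene X1
talk = Story("boy-talks-to-girl")
y3 = Story("boy-talks-to-girl-who-winks", scenes=[talk, wink])
x3 = drop_scene(y3, wink)                         # the more general story

# An expectation failure hangs off the story it was expected to follow.
talk.surprises.append(Story("boy-talks-to-girl, girl-walks-away"))
```

Because a scene is itself a Story, any two stories that share the wink
also share whatever is filed under the wink, which is the property #1
asks for.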

DynaMem also speculated about a possibly-perpendicular hierarchy of
"Thematic Organization Packets" that account for occasions when a story
in one domain reminds us metaphorically of a story from a totally
different domain-- her wink was like a *door swinging open*....  So
stories X and Y, from different domains, may both point to an indexcard
that presents a common abstraction of their themes: *access-level-
increased*, in this case.
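
A toy sketch of that cross-domain reminding, in Python.  The table, the
domain labels and the function name reminded_of are all hypothetical, and
the 'retaliation' theme is invented just to give the lookup something to
skip over:

```python
# story name -> (domain, theme); every entry here is an illustrative guess
STORIES = {
    "girl-winks-at-boy":           ("romance",      "access-level-increased"),
    "door-swings-open":            ("architecture", "access-level-increased"),
    "villain-plots-counterattack": ("war",          "retaliation"),
}

def reminded_of(name):
    """Stories from *other* domains that share this story's theme card."""
    domain, theme = STORIES[name]
    return [other for other, (d, t) in STORIES.items()
            if other != name and t == theme and d != domain]

metaphors = reminded_of("girl-winks-at-boy")
```

The theme plays the part of the shared indexcard: neither story points at
the other directly, both point at the abstraction.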


The first case-based domain I was assigned to turned out to be
meteorology, *weather*.  We had (or thought we had) some money from the
Navy to write training software for their weather-school.  We were just
going to make a modest casebase of weather stories, from balmy summer
afternoon to hurricane tsunami.  Students would look at a weather map and
abstract out its features (by hand), and the tutor would retrieve closely
matching cases.

Weather stories can be expressed pretty simply in terms of temperature
trend, wind trend, cloudcover trend, precipitation trend, etc.  Each of
these 'slots' needs only a handful of possible fillers (the fillers for
the slot TempTrend might be rising, falling and steady), and a way to
search for the best match among the known cases.  Before we really got
started on the retrieval problem, the funding dried up, but in retrospect
I can ask myself, how could we have built the storybase as a network of
weather histories?
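
We never got as far as writing the retrieval code, so what follows is
only a sketch of the simplest thing that might have worked: count the
slots where a probe agrees with each stored case and return the best
scorer.  The slot names, the three sample cases and their fillers are
assumptions made up for this example.

```python
# Hypothetical casebase: each case fills the same handful of trend slots
# with one of "rising", "falling" or "steady".
CASES = {
    "balmy summer afternoon":   {"temp": "steady",  "wind": "steady",
                                 "cloud": "steady", "precip": "steady"},
    "approaching thunderstorm": {"temp": "falling", "wind": "rising",
                                 "cloud": "rising", "precip": "rising"},
    "clearing after rain":      {"temp": "rising",  "wind": "falling",
                                 "cloud": "falling", "precip": "falling"},
}

def match_score(probe, case):
    """Number of slots in the probe whose filler agrees with the case."""
    return sum(1 for slot, filler in probe.items() if case.get(slot) == filler)

def retrieve(probe, cases=CASES):
    """Name of the stored case that agrees with the probe on the most slots."""
    return max(cases, key=lambda name: match_score(probe, cases[name]))

# A student reads three features off the weather map by hand:
probe = {"temp": "falling", "wind": "rising", "cloud": "rising"}
best = retrieve(probe)
```

A flat scan like this gives every slot equal weight and says nothing
about how the cases relate to one another, which is exactly the question
the network idea is meant to answer.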

The concept of scene is easily generalized to the much-mushier 'feature',
so 'cloudy' might be a 'scene' within low-black-cumulonimbus-followed-by-
thunder-and-torrential-rain.  It would be quite natural to present the
student with a weather situation along with the various ways it might go
next: "Remember that time it was sunny and clear, and then a minute later
all hell broke loose?  (It could happen again...)"

In theory, 'simpler' weather-stories might treat one or two of the trend-
dimensions (eg, temp and cloudcover) and ignore the others: more complex
stories could then be created additively. (Some combinations would never
occur in real life-- blank spaces on the meteorological 'gym floor'.)

Alternatively, a 'simpler' story might follow *all* the trends, but for a
shorter span of storytime, or in less finely-resolved detail.  So the
story "overcast" is a simplification of "overcast, then clearing later",
and also of "overcast with a couple of places where the sun peeked
through".
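
The two kinds of simplification can be sketched as two small operations
on a story, if we assume (my assumption, not a settled design) that a
story is a list of moments and each moment maps a trend-dimension to a
filler.  The names project and truncate are hypothetical:

```python
def project(story, slots):
    """Simplify by ignoring dimensions: keep only the named slots."""
    return [{s: v for s, v in moment.items() if s in slots} for moment in story]

def truncate(story, n):
    """Simplify by shortening: keep only the first n moments of storytime."""
    return story[:n]

overcast_then_clearing = [
    {"cloud": "overcast", "temp": "steady"},
    {"cloud": "clearing", "temp": "rising"},
]

overcast = truncate(overcast_then_clearing, 1)            # shorter span
cloud_only = project(overcast_then_clearing, {"cloud"})   # fewer dimensions
```

Running either operation backwards (adding a slot, or appending a moment)
is the 'additive' way of building the more complex stories.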

Choosing one of these is choosing an 'indexing scheme', effectively
determining *the sorts of menus* that should be presented to a human user
trying to navigate the storybase.  So in the latter case, at any point
you'd be offered a menu of scenes-that-might-happen-next, and in the
former case a menu of other-features-that-apply-in-the-same-moment.   
There's a human-factors element to this, in that you must avoid overlong
menus without resorting to confusingly abstract menu-items.
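
Here is a rough Python sketch of how each indexing scheme would generate
its menu.  The toy storybase, the feature names and both function names
are assumptions for illustration; each story is a sequence of moments,
and each moment is a set of features.

```python
STORIES = [
    [frozenset({"sunny", "calm"}), frozenset({"cloudy", "windy"}),
     frozenset({"thunder", "rain"})],
    [frozenset({"sunny", "calm"}), frozenset({"sunny", "windy"})],
]

def next_scene_menu(stories, current):
    """Scenes-that-might-happen-next: whatever followed `current` in any story."""
    menu = set()
    for story in stories:
        for now, later in zip(story, story[1:]):
            if now == current:
                menu.add(later)
    return menu

def same_moment_menu(stories, feature):
    """Other-features-that-apply-in-the-same-moment as `feature`."""
    menu = set()
    for story in stories:
        for moment in story:
            if feature in moment:
                menu |= moment - {feature}
    return menu

after_sunny = next_scene_menu(STORIES, frozenset({"sunny", "calm"}))
with_windy = same_moment_menu(STORIES, "windy")
```

Neither function does anything about menu length; with a real storybase
you would need some grouping or ranking on top to keep the menus short.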

The 'indexing problem' is subtly different from the representation
problem: you can sufficiently *represent* a weather story, but still not
know where to index it so you can find it when you need it.  SPG&U was
mainly about representation; DynaMem shifted the focus to the general
theory of indexing.

[Was this too 'heady'?  Let me know if you found it useful.  If you need
more examples, check out DynaMem, or "Tell Me a Story".  For programming
mechanics, look at "Inside Case-Based Reasoning" by Riesbeck and Schank.]

Next up: Parsers (maybe), and ASK systems

Jorn Barger   jorn@chinet.chi.il.us   (was: barger@ils.nwu.edu)
