PICKLE specification version 1
("Platform-Independent Composite Kumquat Lickable Format")

Written by Andrew Plotkin
Updated 12/25/95 (merry greed-day!)

	Introduction

PICKLE is a way to encapsulate many resources of different types into a
single file, which can easily be transferred between systems. The
archetypical use is an enhanced text adventure, containing a program (in a
platform-independent format such as Z-code) and a set of graphics and
sound resources (in platform-independent formats such as GIF and AIFF.)

(Smart-alecks may point out that this is what the MIME (Multipurpose
Internet Mail Extensions) standard is for. I know. I feel that
while MIME is simple to parse, it's not *trivial*, and it's a little too
powerful for this use. It also worries about encoding as well as format,
which I'd rather not.)

What's the point? This format was inspired by the following remark in
the Z-machine specification: "The interpreter is simply expected [by
the story file] to know where to find them [the sounds or graphics.]"

The de facto standard set by Infocom is to put the resources in a separate
directory, one per file. I think this is terrible. It places the burden of
keeping all the game resources together on the user, instead of on the
game author. The convenience of downloading a single story file and
knowing that it will work has, I think, been taken for granted. PICKLE is
an attempt to extend that convenience to games which include more
resources than just a story file. 

(Note that this is very close to the idea of stand-alone games. PICKLE
solves a potential problem here as well: how do you create a stand-alone
game from an arbitrary number of resource files? How does the player
determine what to include? Making a stand-alone game from a PICKLE file is
as easy as making one from a single story file. And PICKLE provides a
model, if not an actual format, for assembling many resources into one
package.)

Although I was thinking of the Z-machine when I invented this stuff, it can
equally well be used by any interpreter system that uses the concept of
"the story file requests some data, and the interpreter is expected to know
where to find it." If the story file has more low-level control (such as 
access to direct file manipulation) then PICKLE is of less use. But then, I
hope that future interpreter designers will take the existence of PICKLE
into account.


	What a PICKLE File Is

Conceptually, a PICKLE file is a collection of chunks. Each chunk has a
*use*, a *number*, and a *format*. A chunk's use might be executable,
text, image, sound, animation, and so on. A chunk whose use is image might
have a format of GIF, JPEG, ASCII-art, and so on. The number of a chunk is
just a label, used to request it. The idea is that the file can provide
several alternative formats for a given resource, by containing chunks
with the same use and number, but different formats. All such chunks
should have the same content.

There are only two operations defined on a PICKLE file: you can request a
chunk, or check for the existence of a usable chunk.

When you wish to request a chunk, you request it by use and number. For
example, you might request image number 4. The interpreter (or whatever is
parsing the PICKLE file) would look through the file for a chunk whose use
is image and whose number is 4. There may be no such chunk, or one, or
several:

* If there is none, the interpreter should fail gracefully, displaying an
error message, and go on with what it's doing (if possible.) Do not assume
that "nothing" will happen. The error message may be obtrusive or cause
the interpreter to exit.
* If there is one such chunk, the interpreter may or may not understand the
format of the chunk. If it does, it will give it to you, to do with as you
please. If not, it will again fail gracefully and display an error.
* If there are several chunks which have the given use and number, the
interpreter may pick *any one of them*. The interpreter will of course try
to pick a chunk whose format it understands, and to pick one which will be
the nicest (a GIF image is nicer than an ASCII-art one, for example.)
However, if it has more than one nice option, there is no guarantee as to
which it will pick. Whichever chunk it picks, it will return it or fail
with an error, as above.
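
The three cases above can be sketched in Python. This is only an
illustration, not part of the format: the names Chunk, ChunkError,
PREFERRED and request_chunk are all made up for the example, and the
preference table is an assumption about what one particular interpreter
might understand.

```python
from typing import NamedTuple

class Chunk(NamedTuple):
    use: str      # e.g. "pict"
    number: int   # the label used to request the chunk
    fmt: str      # major format, e.g. "giff"
    data: bytes

class ChunkError(Exception):
    """Raised when a request cannot be satisfied (hypothetical name)."""

# Hypothetical preference table: the formats this interpreter understands
# for each use, nicest first.
PREFERRED = {"pict": ["giff", "jpeg", "text"], "audi": ["aiff", "text"]}

def request_chunk(chunks, use, number):
    """Return one usable chunk with the given use and number, or raise."""
    candidates = [c for c in chunks if c.use == use and c.number == number]
    if not candidates:
        raise ChunkError(f"no chunk {use!r} number {number}")
    known = PREFERRED.get(use, [])
    usable = [c for c in candidates if c.fmt in known]
    if not usable:
        raise ChunkError(f"no understood format for {use!r} number {number}")
    # The nicest format wins. Ties are broken arbitrarily (here: first found),
    # which the caller must not rely on.
    return min(usable, key=lambda c: known.index(c.fmt))
```

A caller would catch ChunkError, show a message, and carry on if it can.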

When you wish to check for a chunk, you again check by use and number. The
interpreter then tells you whether there is a chunk with that use and
number, and a format the interpreter understands. This means that if the
interpreter says "yes", you may request that use and number safely. If the
interpreter says "no", you know that such a request will produce an error.
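
As a rough Python sketch of the check operation: the function name
check_chunk, the UNDERSTOOD set, and the tuple layout of the descriptors
are assumptions made for this example only.

```python
# Hypothetical set of (use, major format) pairs this interpreter understands.
UNDERSTOOD = {("pict", "giff"), ("pict", "jpeg"), ("audi", "aiff")}

def check_chunk(descriptors, use, number):
    """True if some chunk has this use and number in an understood format.

    `descriptors` is an iterable of (use, number, major_format) tuples.
    """
    return any(
        u == use and n == number and (u, fmt) in UNDERSTOOD
        for (u, n, fmt) in descriptors
    )
```

Note that a chunk which exists only in a format the interpreter does not
understand counts as "no", exactly as if it were missing.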

There is no restriction as to what chunks may be in a PICKLE file. (In
particular, numbered chunks of the same use need not be sequential.) It is
even legal to have more than one chunk with the same use, number, and
format. There is no point to it, however, unless you like leaving
decisions to the whim of the interpreter.

Who is "you" in this discussion? It is, of course, possible to write a
simple interpreter which accepts command-line input, requests chunks, and
displays the results (or plays them, or whatever.) However, the default
action of a user-level PICKLE interpreter should be to *request executable
chunk number zero, and execute it.* This executable chunk may then request
further chunks in due course.

Note that there may be more than one executable chunk zero. Theoretically,
you could create a PICKLE file which contains both a TADS and a Z-code
program, with the interpreter running whichever it knows how to run. (In
practice, if I ever see this done with a full-length game, I will run
gibbering for the hills.)

A simple example, in case the above verbosity wasn't *quite* enough for
you. The Z-machine version 6 understands an opcode @draw_picture(num,y,x)
which means to draw the picture number *num* at coordinates (*x*,*y*) in
the game window. We put together a PICKLE file containing the following
chunks:

	use         #  (format)     : contents
	------------------------------------------
	executable  0  (Z-code V6)  : A game file
	image       1  (GIF)        : Picture #1
	image       1  (JPEG)       : Picture #1
	image       2  (GIF)        : Picture #2
	image       2  (JPEG)       : Picture #2
			etc...
			
The player loads this into his interpreter, which starts executing the game
(since it is executable 0.) When the Z-machine emulator reaches a
@draw_picture(num,y,x) opcode, it requests image *num* from the
interpreter, takes what is returned, and draws it on the screen. Simple,
and will work as long as the interpreter is capable of drawing either GIFs
or JPEGs.

Not all languages have the capacity to check for the existence of chunks.
The Z-machine has an opcode to check how many images it holds, but
although it can play sounds, it has no opcode to check for the existence
of sounds. This is tough noogies on the Z-machine. If someone deletes all
the sound chunks from a game that expects them, the player will get error
messages.

Standards go two ways, so here is a list of various things promised and
assumed by various members of the PICKLE-using community:

	The file creator guarantees, 
	and thus the reader library assumes,
that the order of the chunks is not significant. (This is why the reader
library API has no functions to find the first, second, ... nth chunk.
The interpreter doesn't have to care.)
	The file creator guarantees, 
	and thus the reader library and game file assume,
that two chunks with the same use and number (but different formats)
have the same content -- the same image or sound or whatever. (This is
why the game file doesn't have to care what formats are available. It
can just request a use and number, and trust that the interpreter will
get the right information out to the player.)


	The PICKLE Format

After all this discussion of the semantics, the syntax is very simple. A
PICKLE file consists of a header, followed by one chunk descriptor for
each chunk, followed by the chunks themselves.

(A NUM is a 32-bit integer, stored most significant byte first (that is,
big-endian). A TYPE is a 32-bit value representing four ASCII characters,
stored first character first, in the
manner common on the Macintosh. TYPEs are generally manipulated as long
integers, so every bit of every byte is significant. The characters can in
theory be any 8-bit values. All the TYPEs I suggest in this document will
be made of lower-case letters, but interpreters should *not* take it on
themselves to ignore capitalization differences or skip unprintable
characters.)
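
A small Python sketch of the two primitive encodings. The helper names
are invented for the example, and treating a NUM as unsigned is an
assumption; the text above only says "32-bit integer".

```python
import struct

def pack_num(value):
    """Encode a NUM: 32 bits, most significant byte first (assumed unsigned)."""
    return struct.pack(">I", value)

def unpack_num(data):
    """Decode a NUM from the first four bytes of `data`."""
    return struct.unpack(">I", data[:4])[0]

def pack_type(chars):
    """Encode a TYPE: exactly four bytes, first character first."""
    if len(chars) != 4:
        raise ValueError("a TYPE is exactly four bytes")
    return bytes(chars)

def type_as_int(chars):
    """View a TYPE as a 32-bit integer; every bit of every byte counts."""
    return int.from_bytes(chars, "big")
```

Comparing TYPEs as raw bytes (or as integers) automatically keeps them
case-sensitive, which is what the paragraph above asks for.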

Header: (16 bytes)
	TYPE: 'pikl' (that is, hex value 70696b6c)
	NUM : the version number of the PICKLE file (1 for the version 
			described in this document)
	NUM : number of chunks in the file
	NUM : length of the file (in bytes)
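
Reading the header might look like this in Python. It is a sketch under
the layout just given; parse_header and the HEADER constant are
hypothetical names, and rejecting any version other than 1 is a choice
made for the example.

```python
import struct

# TYPE 'pikl', then version, chunk count, file length: 16 bytes, big-endian.
HEADER = struct.Struct(">4sIII")

def parse_header(data):
    """Return (version, chunk_count, file_length) or raise ValueError."""
    if len(data) < HEADER.size:
        raise ValueError("file too short for a PICKLE header")
    magic, version, count, length = HEADER.unpack_from(data, 0)
    if magic != b"pikl":
        raise ValueError("not a PICKLE file")
    if version != 1:
        raise ValueError(f"unsupported PICKLE version {version}")
    return version, count, length
```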

Descriptor: (24 bytes for each) (The descriptors are not in any 
guaranteed order.)
	TYPE: use of chunk
	NUM : number of chunk
	TYPE: major format of chunk
	NUM : minor format of chunk
	NUM : position of beginning of chunk data (in bytes, from the 
			beginning of the PICKLE file)
	NUM : length of chunk (in bytes)
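
A Python sketch of reading the descriptor table, assuming it starts
immediately after the 16-byte header. Descriptor, DESCRIPTOR and
parse_descriptors are names invented for the example.

```python
import struct
from typing import NamedTuple

class Descriptor(NamedTuple):
    use: bytes     # TYPE: use of chunk
    number: int    # NUM : number of chunk
    major: bytes   # TYPE: major format
    minor: int     # NUM : minor format
    offset: int    # NUM : position of chunk data from start of file
    length: int    # NUM : length of chunk data in bytes

DESCRIPTOR = struct.Struct(">4sI4sIII")  # 24 bytes, big-endian
HEADER_SIZE = 16

def parse_descriptors(data, count):
    """Read `count` descriptors that follow the header."""
    end = HEADER_SIZE + count * DESCRIPTOR.size
    if len(data) < end:
        raise ValueError("file too short for its descriptor table")
    return [
        Descriptor(*DESCRIPTOR.unpack_from(data, HEADER_SIZE + i * DESCRIPTOR.size))
        for i in range(count)
    ]
```

The data for a descriptor d is then data[d.offset : d.offset + d.length].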

Chunks: 
	All the chunk data stuck together. The chunks do not have to be 
		in the same order as their descriptors. 
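
Going the other way, a file can be assembled roughly like this. The
function build_pickle and its tuple argument are assumptions for the
example; this sketch happens to write chunk data in descriptor order,
which is allowed but not required.

```python
import struct

def build_pickle(chunks):
    """Assemble a version-1 PICKLE file.

    `chunks` is a list of (use, number, major, minor, data) tuples, where
    use and major are four-byte TYPEs given as bytes.
    """
    header_size, desc_size = 16, 24
    offset = header_size + desc_size * len(chunks)  # where chunk data starts
    descriptors, body = [], []
    for use, number, major, minor, data in chunks:
        descriptors.append(
            struct.pack(">4sI4sIII", use, number, major, minor, offset, len(data))
        )
        body.append(data)
        offset += len(data)
    # After the loop, `offset` equals the total length of the file.
    header = struct.pack(">4sIII", b"pikl", 1, len(chunks), offset)
    return header + b"".join(descriptors) + b"".join(body)
```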
		
Note that the use of a chunk is represented by a 32-bit TYPE. The format is
represented by 64 bits, a TYPE and a NUM. (The meaning of the NUM is
determined by the TYPE; it will usually be a version number.)

Here are some suggested uses and formats. I have no suggestion for most of
the version numbers, because in most cases I don't know much about the
format or the version numbers it's had. If there's a question mark, you
should do some research and start some net discussion before you make any
PICKLE files containing that type. If a format has never had different
versions, just use 0 in the minor format field.

 Use 	 Major format, minor format
------------------------------------
'exec' : Executable chunk. 
		'zcod' 1-8 : Z-code file, with the version (1 through 8 these days) 
			stored in the minor format field.
		'tads' ? : TADS game file
		'hugo' ? : HUGO game file

'text' : Text. (No idea why anyone would want one, but we might as well 
			define it.)
		'text' 0 : ASCII text, with newline (ctrl-J, '\012') characters 
			delineating the ends of paragraphs (not lines). Characters 
			with the high bit set are legal, and should be interpreted via 
			ISO 8859 Latin-1.
			
'pict' : Two-D image.
		'giff' ? : GIF (I made up the other 'f'.) Versions are 87 and 89,
			maybe?
		'jpeg' ? : JPEG
		'text' 0 : ASCII text, as described above, which the interpreter 
			will display in a fixed-width font. Newlines (ctrl-J, '\012')
			can go at the end of each line, as is usual for ASCII graphics.

'audi' : Sound.
		'aiff' ? : AIFF
		'idat' 0 : Infocom DAT sound format, as described in Stefan 
			Jokisch's sound format article. The data should be unsigned;
			that is, values range from 0 to 255, with 128 in the middle.
		'imid' 0 : Infocom MID sound format, as described in the same
			article. 
		'midi' ? : General MIDI file
		'text' 0 : ASCII text, as described above. (Which might contain the
			message "You hear a horrid scream!".)

'anim' : Two-D animation (possibly with built-in sound).
		'mpeg' ? : MPEG
		'qktm' 0 : QuickTime (flattened)
			
Note that the format 'text' can be used in several ways. This is a simple
way to ensure, say, that sounds will be "audible" even on a machine with
no sound capacity. (The interpreter might display the 'text' contents in a
separate window.) However, you are certainly under no obligation to
provide a 'text' equivalent for every sound and picture. Your language may
have much better mechanisms to detect and work around the capacities of
the player's machine.

A few other formats may also be flexible enough to work under several
different uses. (One might have an 'audi' 'qktm', which would be a
QuickTime movie with only audio tracks.) However, it is not really
required that a given format TYPE mean the same thing in different uses.
It is merely convenient.

Finally, note that it is very simple to outfit an existing text game
interpreter (Z-code, TADS, etc) to accept PICKLE-packed game files as well
as normal game files. If the first four bytes of the file are 'pikl', you
know you have a PICKLE file; you then scan through the chunk descriptors
looking for use 'exec', number 0, format 'zcod' (or whatever.) If you find
one, you pull out its offset and length, read it from the file, and go on
as usual. If not, you display an error.
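
The procedure in the last paragraph, sketched in Python. The function
name extract_story and its `major` parameter are hypothetical; a real
interpreter would also want to check the version number and the minor
format.

```python
import struct

def extract_story(data, major=b"zcod"):
    """Return the bare story file, unwrapping a PICKLE container if present."""
    if data[:4] != b"pikl":
        return data  # an ordinary game file; use it as-is
    _magic, _version, count, _length = struct.unpack_from(">4sIII", data, 0)
    for i in range(count):
        # Descriptors are 24 bytes each and follow the 16-byte header.
        use, number, fmt, _minor, offset, length = struct.unpack_from(
            ">4sI4sIII", data, 16 + 24 * i
        )
        if use == b"exec" and number == 0 and fmt == major:
            return data[offset:offset + length]
    raise ValueError("no usable executable chunk 0 in PICKLE file")
```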

