$Header: README,v 1.4 89/09/14 20:31:55 chip Rel $
$Log:	README,v $
# Revision 1.4  89/09/14  20:31:55  20:31:55  chip (Chip Chapin)
# Release 1-2.  Supports -f and -l options for formatting the output.
# Updates primarily brl.c, bible.c, and bible.1.
# 
# Revision 1.3  89/09/08  13:21:54  13:21:54  chip (Chip Chapin)
# Better error checking on verse syntax; automatic test suite.
# 
# Revision 1.2  89/09/05  19:18:27  19:18:27  chip (Chip Chapin)
# Initial release.
# 

	README 
	Bible Retrieval System 
	Chip Chapin, Hewlett Packard Company
	Initial release, September 5, 1989


The Bible Retrieval System (BRS) consists of a textual database of the
Authorized ("King James") Version of the Old and New Testaments, a set
of libraries for finding and retrieving text, and a program ("bible")
which uses the libraries to retrieve Bible passages given references
on the command line or from standard input.  A man page is provided.

Other applications could easily be constructed using the libraries.
In fact the text storage library could be used for any type of textual
data, providing useful indexing, compression and buffering functions.

While the raw Bible text consumes over 4.4 megabytes, the BRS stores
it in a special compressed form, requiring less than 1.8 megabytes.
Despite the compression, retrieval is very fast.  A buffering scheme
makes second and following references to a particular region of the
text almost instantaneous.  Buffers are retained until a pre-set
memory limit is reached.  The least-recently-used buffers are then
reused if more are needed.
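
To illustrate the buffering policy just described, here is a minimal
sketch in C.  It is not the code from tsl.c.  The names (struct cache,
cache_get, MAX_BUFFERS) are hypothetical, a fixed number of slots
stands in for the pre-set memory limit, and it assumes each buffer
holds one region of uncompressed text identified by a number.  The
step that would actually fill the buffer is left as a comment.

```c
#include <assert.h>

#define MAX_BUFFERS 4   /* hypothetical stand-in for the memory limit */

struct buffer {
    long window;              /* region held in this slot, -1 if empty */
    unsigned long last_used;  /* clock value at the most recent access */
};

struct cache {
    struct buffer buf[MAX_BUFFERS];
    unsigned long clock;      /* incremented on every lookup */
    unsigned long misses;     /* lookups that had to (re)fill a slot */
};

void cache_init(struct cache *c)
{
    int i;

    for (i = 0; i < MAX_BUFFERS; i++) {
        c->buf[i].window = -1;
        c->buf[i].last_used = 0;
    }
    c->clock = 0;
    c->misses = 0;
}

/* Return the slot holding `window`, reusing the LRU slot on a miss. */
int cache_get(struct cache *c, long window)
{
    int i, victim = 0;

    assert(window >= 0);
    c->clock++;

    /* Hit: the region is already buffered, so just refresh its age. */
    for (i = 0; i < MAX_BUFFERS; i++) {
        if (c->buf[i].window == window) {
            c->buf[i].last_used = c->clock;
            return i;
        }
    }

    /* Miss: take the first empty slot, else the least recently used. */
    for (i = 0; i < MAX_BUFFERS; i++) {
        if (c->buf[i].window < 0) {
            victim = i;
            break;
        }
        if (c->buf[i].last_used < c->buf[victim].last_used)
            victim = i;
    }

    /* A real implementation would read and uncompress the text here. */
    c->buf[victim].window = window;
    c->buf[victim].last_used = c->clock;
    c->misses++;
    return victim;
}
```

Repeated calls to cache_get() for the same region return the same slot
without counting a miss, which is why second and following references
are cheap.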


The Compression Scheme

The text is compressed using a modified version of the
Lempel-Ziv-Welch "compress" program.  The modification is very simple,
and consists merely of forcing compress to emit a checkpoint after
each fixed-size block of input bytes, which I call a "window".  One can thus
easily determine which compressed "window" contains a particular byte
of the original text.  By keeping track of the locations of the
checkpoints in the compressed data, it is then possible to uncompress
only the windows that are needed.  By the way, the uncompression is
done by a subroutine within the library -- no exec's or temporary
files are used.
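
The arithmetic behind the window lookup can be sketched in a few lines
of C.  This is an illustration only.  The structure and function names
are made up here and are not taken from tsl.h, and the sketch assumes
that every window except possibly the last covers exactly window_size
bytes of the original text.

```c
#include <assert.h>

/* Hypothetical index; field names are illustrative, not from tsl.h. */
struct window_index {
    long window_size;        /* original-text bytes per window */
    long nwindows;           /* number of windows in the data file */
    const long *checkpoint;  /* checkpoint[i]: where window i starts
                                in the compressed data */
};

/* Window containing byte `pos` of the original text. */
long window_of(const struct window_index *ix, long pos)
{
    assert(ix->window_size > 0 && pos >= 0);
    return pos / ix->window_size;
}

/* Position of byte `pos` inside its uncompressed window. */
long offset_in_window(const struct window_index *ix, long pos)
{
    assert(ix->window_size > 0 && pos >= 0);
    return pos % ix->window_size;
}

/* Where to start reading compressed data in order to recover `pos`. */
long compressed_start(const struct window_index *ix, long pos)
{
    long w = window_of(ix, pos);

    assert(w < ix->nwindows);
    return ix->checkpoint[w];
}
```

With 64Kbyte windows, for example, byte 70000 of the original text
falls in window 1 at offset 4464, so only that one window needs to be
uncompressed to reach it.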

Windows can be any size -- the size is stored in the data file and the
retrieval routines treat the file accordingly.  In the default
configuration, the windows are 64Kbytes, which was shown by experiment
to offer a reasonable compromise between efficient compression and
efficient buffer management.  If you want to experiment, you can
change the window size by editing the argument to "squish" in the
Makefile.


Installation

You'll want to create a directory to work in, then "cd" to it and
proceed.  If you have received a compressed tar file, then unpack the
archive with the following command:
	
		$ zcat bible.tar.Z | tar xvf -

Now execute "make" and wait for a while.  

		$ make

When make has completed, verify that the program and data files were
built correctly by executing the automatic test suite:

		$ make test
		
This will automatically run the tests and compare the results to a set
of standard results distributed with the package.  If all goes well
you should see:

   Running Test Suite (results in test.results)... test suite completed
   Comparing results to standard results
     results OK.

Now you can install the program, data, and man files into proper
system locations ("/usr/local/...") by executing:

		$ make install
		
if you have the proper permissions.  If you wish to install them
somewhere else, edit the Makefile and change the DEST variable, or
install the files by hand.


The Libraries

The Bible Retrieval System is intended to be more than just the
"bible" retrieval program.  Two libraries of routines are provided in
the BRS that may be used to construct other applications.

The "Text Storage Library" (TSL) routines could be used for *any*
textual data file; they are entirely independent of the structure of
the Bible.  They support the use of the windowed compression scheme on
any text, with fast retrieval of any particular line of the text.  For
this release, no separate documentation is provided for the TSL -- see
the files tsl.c, tsl.h, tsl-index.c and buildindex.c.
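
One plausible way to get fast retrieval of a particular line is to
record the byte offset at which each line of the original text begins.
The sketch below builds such a table.  It is a guess at the general
technique, not a description of tsl-index.c or buildindex.c, and the
function name build_line_index is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Record the starting byte offset of each line of text[0..len-1]
 * in offsets[], storing at most `max` entries.  Returns the total
 * number of lines found, which may exceed `max`.
 */
size_t build_line_index(const char *text, size_t len,
                        long *offsets, size_t max)
{
    size_t i, nlines = 0;
    int at_line_start = 1;

    assert(text != NULL || len == 0);
    for (i = 0; i < len; i++) {
        if (at_line_start) {
            /* First byte of a new line: remember where it begins. */
            if (nlines < max)
                offsets[nlines] = (long)i;
            nlines++;
            at_line_start = 0;
        }
        if (text[i] == '\n')
            at_line_start = 1;
    }
    return nlines;
}
```

Given a line's starting offset, dividing by the window size tells
which compressed window holds it, so a line number can be turned into
a single window to uncompress.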

The "Bible Retrieval Library" (BRL) includes routines that are
specifically oriented to the Book-Chapter-Verse structure of the Bible
text.  They are, however, independent of the storage structure of the
textual data, leaving that to the TSL.  The BRL routines make
retrieval programs such as "bible" extremely simple.  For this release
no separate documentation is provided for the BRL -- see brl.c, brl.h,
brl-index.c and bible.c.

Actually, there's also a third library of sorts.  "Compresslib"
contains a routine which may be called to uncompress a buffer of
LZW-compressed data.


Some Personal Notes...

In 1979, as the owner of "Chapin Associates" in San Diego, I started a
project to create an affordable computer-based retrieval system for
Bible text.  Working in UCSD Pascal on a PDP-11/03 with 60Kbytes of
memory and two 500Kbyte RX02 floppy drives, with my associates Neil
Fraser and Jan Denser, we succeeded in prototyping a system that used
word-level Huffman-coding for the text of the New Testament.
Unfortunately, pressed between economics and the limitations of the
available hardware, I wound up abandoning the effort in 1980.

In early 1989 I gained access to one of the available freeware Bible
retrieval programs for the PC.  I immediately decided that the time
had come to "close the loop" on this particular personal dream, with
Unix as the target environment.  There really aren't any serious
technical obstacles any more to producing an acceptable Bible
retrieval implementation for Unix systems.  So I snatched the Bible
text, spent a few weekends and evenings at my workstation, and here it
is.  LZW compression is so much nicer and easier than the word-level
Huffman coding!  And it's great being able to count memory and disk
storage in MBytes instead of KBytes.  Even so, I've really tried to
keep the system's use of resources to a minimum.  But I don't think
1.7+ megabytes of data is too high a price to pay nowadays in most
Unix environments.

So... I hope others find these tools useful.

Chip Chapin
Hewlett Packard Company, California Language Lab (HP/CBO/NSS/CSG/DLD/LMO/CLL)
September 5, 1989

--------------------------------------------------------------------
uucp:      ... {allegra,decvax,ihnp4,ucbvax} !hplabs!hpda!chip
	or ... uunet!hp-sde!hpda!chip
Internet:  chip%hpda@hplabs.hp.com or chip%hpda@hp-sde.hp.com
HPDesk:    chip (hpda) /HPUNIX/UX
USMail:    MS47LZ; 19420 Homestead Ave; Cupertino, CA  95014
Phone:     408/447-5735    Fax: 408/973-8455    HPTelnet: 1-447-5735
--------------------------------------------------------------------


