Genezzo-Contrib-Clustered 
=========================

Shared data cluster support for Genezzo

Genezzo is an extensible database with SQL and DBI support.  It is written
in Perl.  This module overrides basic routines inside Genezzo via Havok
SysHooks.  The override routines add support for shared data clusters:
transactions, distributed locking, undo, and recovery.

INSTALLATION

To install this module type the following:

   perl Makefile.PL
   make
   make test
   make install

To prepare for use of Genezzo::Clustered, run the following:

  ./genprepundo.pl

  gendba.pl
  >@havok.sql
  >@syshook.sql
  >@clustered.sql

LIMITATIONS

  This is pre-alpha software; don't use it to store any data you hope
  to see again!

  Transactions, rollback, etc. are not fully implemented.  Process death
  and the cleanup it requires are not detected.

DEPENDENCIES

This module requires these other modules and libraries:

  Genezzo
  FreezeThaw

  OpenDLM

SEE ALSO

  For more information, please visit the Genezzo homepage
  at http://www.genezzo.com

  also 
  http://eric_rollins.home.mindspring.com/genezzo/ClusteredGenezzoDesign.html
  http://eric_rollins.home.mindspring.com/genezzo/cluster.html
  http://opendlm.sourceforge.net/

TODO

  1)   [Genezzo fix] DB Block header size for filesystem devices needs to be
       a multiple of operating system block size.  Otherwise corruptions
       will occur. 

  2)   [Genezzo update] Need an exception mechanism when SQL is run to catch
       errors such as running out of disk space.  Currently multi-row
       SQL statements can leave partial results on disk when the system
       runs out of disk space.
      
  3)   Use new exception mechanism (above) to handle deadlock with DLM.
       Currently program exits on any error from OpenDLM.  Need to modify
       Inline::C interface to provide deadlock return code distinct from
       other errors.  Need to modify GLock Perl code to throw SQL exception.

  4)   [Genezzo update or documentation] Need to be able to add metadata to
       DB data blocks.  Need to update block checksum, etc. when this is done.

  5)   Use block metadata (above) to add process_id to each modified block,
       and later clear process_id.  This is done in or near 
       Genezzo::BufCa::DirtyScalar::STORE.  May need to register callback
       on different routine to prevent endless recursion.

  6)   [Genezzo update] Need way to invalidate entire buffer cache.
       Simply release all blocks, don't write them to disk (disk write
       should already have been done in commit case).

  7)   Invalidate buffer cache (using new mechanism above) before releasing 
       all locks on commit or rollback.  Actually only invalidate 
       non-system tablespace portion, see below.

  8)   [Genezzo update] add a way for command-line parameters (esp.
       gnz_home and undo_filename) and prefs to be available inside
       syshook packages.

  9)   Use command-line parameters (from above) for gnz_home and undo_filename.
       Can't use prefs since we may be starting a corrupt database which
       requires recovery before prefs can be read.  May still be able
       to use information stored in header of default datafile.  Header
       is not normally written at runtime (and isn't covered by undo),
       and is assumed non-corrupt.

  10)  [Genezzo fix] fix "mystery" writes.  These bogus writes create
       unnecessary write locks, generate unnecessary undo, and confuse
       rollback code.  Solution may be to attach syshook to new attachment
       point instead of Genezzo::BufCa::DirtyScalar::STORE.

  11)  [Genezzo update] complete tablespace support.

  12)  Use tablespace support (above) to restrict locking to non-system
       tablespace tables.  Otherwise we need to lock additional blocks
       (for system tables) already read at startup prior to syshook
       initialization.  And locking system tables prevents other
       instances from running in parallel.

  13)  After the above, complete rest of development per Design Document:
   a)  Detect blocks needing recovery via process_id in block
   b)  Recover dead processes
   c)  etc...
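
  As an illustration of item 3, the following is a minimal Perl sketch of
  how a deadlock return code, once it is distinct from other lock errors,
  might be turned into a catchable exception while other errors stay fatal.
  Everything named here is an assumption made for illustration: the
  LOCK_* constants, the Sketch::Deadlock class, and the check_lock_result
  and run_statement routines are hypothetical and are not part of Genezzo,
  the GLock code, or OpenDLM.

```perl
use strict;
use warnings;
use Scalar::Util qw(blessed);

# Hypothetical return codes.  The real values returned by OpenDLM through
# the Inline::C interface are not given in this README.
use constant {
    LOCK_OK       => 0,
    LOCK_DEADLOCK => 1,
    LOCK_ERROR    => 2,
};

{
    # Hypothetical exception class used to signal a deadlock.
    package Sketch::Deadlock;
    sub new     { my ($class, %args) = @_; return bless {%args}, $class }
    sub message { my ($self) = @_; return $self->{message} }
}

# Map a lock return code to success, a deadlock exception, or a fatal error.
sub check_lock_result {
    my ($rc, $lock_name) = @_;
    return 1 if $rc == LOCK_OK;
    die Sketch::Deadlock->new(message => "deadlock on $lock_name")
        if $rc == LOCK_DEADLOCK;
    die "lock error $rc on $lock_name\n";
}

# Run a statement.  A deadlock is reported as a status so the caller can
# roll back and retry; any other error is rethrown unchanged.
sub run_statement {
    my ($code) = @_;
    my $ok = eval { $code->(); 1 };
    return 'ok' if $ok;
    my $err = $@;
    return 'deadlock' if blessed($err) && $err->isa('Sketch::Deadlock');
    die $err;
}

# Usage: a simulated deadlock is caught instead of ending the program.
my $status = run_statement(sub { check_lock_result(LOCK_DEADLOCK, 'block 42') });
print "$status\n";    # prints "deadlock"
```

  The sketch keeps the decision about what to do on deadlock (roll back,
  retry, report a SQL error) in the caller, which is one possible design;
  the Design Document may call for something different.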

COPYRIGHT AND LICENCE

    Copyright (C) 2005 by Eric Rollins.  All rights reserved.

    This program is free software; you can redistribute it and/or modify
    it under the terms of the GNU General Public License as published by
    the Free Software Foundation; either version 2 of the License, or
    any later version.

    This program is distributed in the hope that it will be useful,
    but WITHOUT ANY WARRANTY; without even the implied warranty of
    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
    GNU General Public License for more details.

    You should have received a copy of the GNU General Public License
    along with this program; if not, write to the Free Software
    Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA

Address bug reports and comments to rollins@acm.org
