Journaling for ext2fs, alpha release 0.0.1

The "danger, danger, danger!" release.

Released 4 September 1999,
Stephen Tweedie <sct@redhat.com>


*** Nobody accepts any responsibility if the use of this code damages
*** your filesystem, corrupts data, creates a black hole or turns you
*** into a sperm whale.  If I had a lawyer he'd probably have told me to
*** say this.  You have been warned.


Introduction
------------

OK, folks, here is the first journaled ext2 release.  It is an alpha
release.  It is incomplete, and has not been substantially tested.
However, it works, it appears to be usable, and it has never lost me any
data.  (With one caveat --- see below.)

What is journaling?

    * It means you don't have to fsck after a crash.  Basically.

What works?

    * Journaling to a journal file on the journaled filesystem

    * Automatic recovery when the filesystem is remounted

    * All VFS operations (including quota) should be journaled

    * All data updates are also journaled


What is left to be done?

    * Journaling of metadata only.  Currently everything is journaled,
      including data, resulting in a performance drop as all data gets
      written twice.

      Journaling of metadata only is supported but is not enabled.  It
      turns out to involve several extra complications in the journaling
      buffer state, so I'm testing the simpler case first to get that
      reliable on its own.

    * Journaling to an off-filesystem device, e.g. NVRAM

    * Automatic reclamation of unlinked but still-referenced files on
      reboot

    * Error recovery.  You will see that the source is marked quite
      carefully where there are potential IO or memory allocation errors
      which can disrupt things, but the code to respond to that (either
      to remount the fs readonly or to abort and panic) remains to be
      added. 

    * Decent documentation!

    * A few internal cleanups: migrating the extra buffer_head fields to
      a separate jfs_buffer_info field in particular.

    * Mounting a jfs volume readonly and then remounting readwrite DOES
      NOT WORK.  It corrupts your filesystem, predictably.  I'm not sure
      why, yet, but in the meantime you cannot use jfs for a
      readonly-mounted root filesystem.  Of course, journaling should
      let you mount it read-write straight away, but I haven't
      experimented with that yet!

    * e2fsprogs tools.  e2fsck needs to know about the journal (but see
      below). 

How to apply
------------

This README should have come with two diffs:

  -rw-rw-r--   1 sct      sct        217634 Sep  3 19:19 kdb-v0.5-2.2.2-sct
  -rw-rw-r--   1 sct      sct        337673 Sep  4 18:58 linux-2.2.2-ext3.diff

The first one is a copy of SGI's kdb kernel debugger patches for the
2.2.2 (yes, 2.2.2!) kernel.  Apply this first if you want kdb.  The
second patch is the ext3 filesystem.  If you apply this without the kdb
diff, you will get a couple of rejects (the ext3 diff includes a kdb
module for interrogating jfs data structures) --- ignore those.
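
The order described above can be sketched as a small script.  This is
only an illustration: KERNEL_SRC, PATCH_DIR, WANT_KDB, RUN and the
apply_diff helper are names made up for the example, and the -p1 strip
level is an assumption about how the diffs were generated, so check the
paths inside the diffs first.  By default it only prints what it would
do.

```shell
#!/bin/sh
# Sketch: apply the kdb diff (optional) and then the ext3 diff.
# All variable names and the -p1 strip level are assumptions.
KERNEL_SRC="${KERNEL_SRC:-/usr/src/linux}"   # unpacked 2.2.2 tree
PATCH_DIR="${PATCH_DIR:-$PWD}"               # where the two diffs live
WANT_KDB="${WANT_KDB:-1}"                    # 0 = skip the kdb diff
RUN="${RUN:-0}"                              # 1 = really run patch

apply_diff() {
    if [ "$RUN" = "1" ]; then
        (cd "$KERNEL_SRC" && patch -p1 < "$1")
    else
        echo "would run: cd $KERNEL_SRC && patch -p1 < $1"
    fi
}

# kdb goes first, and only if you want it.
if [ "$WANT_KDB" = "1" ]; then
    apply_diff "$PATCH_DIR/kdb-v0.5-2.2.2-sct"
fi

# Without kdb, expect a couple of rejects here; patch will then
# report failure even though the rejects can be ignored.
apply_diff "$PATCH_DIR/linux-2.2.2-ext3.diff"
```

Run it once as-is to see the commands, then again with RUN=1 when you
are happy with them.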

Why 2.2.2?  Because that was the most reliable version when I started
this work --- 2.2.3+ had stability problems until things got settled
around 2.2.6/7, and there were a few fundamental VFS changes by then.
Top priority now is to port jfs/ext3 forward against a 2.2.12 kernel.

If you can't apply kernel patches, stop reading this now.  Right now.

Now, configure the kernel, saying YES to "Enable Second extended fs
development code" (I *assume* you want it!), and build it.


What next?
----------

Now, you want to make a journaled filesystem (recommended) or journal an
existing one (for the exceptionally stupid/brave).  Great.  Go right
ahead, make a new ext2 filesystem if you need to, and mount the
filesystem you want to journal.

Be aware that the jfs patch does _not_ change the ext2 code.  Rather, it
makes a copy of ext2 called ext3, and all the fancy footwork takes place
in that.  You don't have to run ext3 on all your valuable filesystems:
just use it on the throwaway ones.

Now, create a journal file.  I don't know how big it should be yet: the
rules of thumb have yet to be established!  However, try (say) 2MB for a
small filesystem on a 486; maybe up to 30MB on a big 18G 10krpm
Cheetah.  Or whatever you want.  You'll need to make sure that the file
is preallocated, so use something like:

	dd if=/dev/zero of=/mnt/sparefs/journal.dat bs=1k count=10000

assuming you want a 10MB journal on an ext2 filesystem with 1k blocks,
mounted on /mnt/sparefs.  You need to find the journal inode's inode
number, too:

	ls -i /mnt/sparefs/journal.dat

For a newly created filesystem, this will probably show

        12 journal.dat

OK, 12 is the expected number for a clean fs.
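
The two steps above (preallocate the file, then read back its inode
number) can be combined in one small script.  This is a sketch, not
part of the release: TARGET_DIR, JOURNAL, BLOCKS and JOURNAL_INO are
names invented for the example, and TARGET_DIR defaults to a scratch
directory so you can try it without touching a real filesystem (the
inode number you get there will of course not be 12).

```shell
#!/bin/sh
# Sketch: preallocate a journal file and capture its inode number.
# TARGET_DIR is hypothetical; point it at the mounted filesystem
# (e.g. /mnt/sparefs) when doing this for real.
TARGET_DIR="${TARGET_DIR:-$(mktemp -d)}"
JOURNAL="$TARGET_DIR/journal.dat"
BLOCKS=10000        # number of 1k blocks, as in the dd example above

# Write real zeroes so that every block of the file is allocated.
dd if=/dev/zero of="$JOURNAL" bs=1k count="$BLOCKS" 2>/dev/null

# ls -i prints "<inode> <name>"; keep only the inode number.
JOURNAL_INO=$(ls -i "$JOURNAL" | awk '{print $1}')

echo "journal file: $JOURNAL"
echo "journal size: $((BLOCKS * 1024)) bytes"
echo "mount option: journal=$JOURNAL_INO"
```

With bs=1k and count=10000 the file comes to 10000 * 1024 = 10240000
bytes, which is the "10MB" referred to above.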

Now, umount as ext2.  Take a deep breath.  Now mount as ext3, giving it
the inode number of the file to be mounted as a journal:

	mount -t ext3 /dev/sdb2 /mnt/sparefs -o journal=12

Bingo.  That's it.  Enjoy!
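
If you would rather script the switch-over, something like the
following would do.  It is a sketch under assumptions: DEV, MNT,
JOURNAL_INO, RUN and the step helper are hypothetical names, and the
defaults are just the example values used above.  It prints the
commands unless RUN=1 is set, since the real thing needs root and a
kernel with ext3 built in.

```shell
#!/bin/sh
# Sketch: umount as ext2, then mount as ext3 with the journal inode.
# All names below are made up for this example.
DEV="${DEV:-/dev/sdb2}"
MNT="${MNT:-/mnt/sparefs}"
JOURNAL_INO="${JOURNAL_INO:-12}"    # from "ls -i journal.dat"
RUN="${RUN:-0}"                     # 1 = really run the commands

# Either run a command or just show it.
step() {
    if [ "$RUN" = "1" ]; then
        "$@"
    else
        echo "would run: $*"
    fi
}

step umount "$MNT"
step mount -t ext3 "$DEV" "$MNT" -o "journal=$JOURNAL_INO"
```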


How to fsck
-----------

Right now, e2fsck will reject an uncleanly unmounted ext3 partition.
However, if you umount an ext3 filesystem cleanly, ext3 will clear the
compatibility flags which tell e2fsck not to bother it, and you will
then be able to run e2fsck on it quite happily.

However, the whole point is that you don't HAVE to run e2fsck after a
crash, right?
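
If you do want to run e2fsck anyway, a guard like the one below keeps
you from pointing it at a volume that is still mounted.  This is a
sketch: DEV, MOUNTS_FILE and is_mounted are hypothetical names, and
reading /proc/mounts is an assumption about your setup.  Note that it
only tells you the device is not mounted right now; it cannot tell
whether the last unmount was clean.

```shell
#!/bin/sh
# Sketch: refuse to fsck an ext3 volume that is still mounted.
# DEV, MOUNTS_FILE and is_mounted are names made up for this example.
DEV="${DEV:-/dev/sdb2}"
MOUNTS_FILE="${MOUNTS_FILE:-/proc/mounts}"

# Succeeds if the device appears in the first column of the mount table.
is_mounted() {
    awk -v dev="$1" '$1 == dev { found = 1 } END { exit !found }' "$2"
}

if is_mounted "$DEV" "$MOUNTS_FILE" 2>/dev/null; then
    echo "$DEV is still mounted; umount it cleanly before running e2fsck"
else
    # -f forces a check even if the filesystem is marked clean.
    echo "would run: e2fsck -f $DEV"
fi
```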


Known Bugs
----------

Lots of stuff is missing, in particular the ext3-aware fsck tools.
However, the only known bug right now is the mount-readonly bug referred
to above.  All of the other bugs are currently unknown.  Good luck
finding them.


Disclaimer
----------

Oh, I already did that.


Final note
----------

At the time of writing, I am just about to leave for Linux Kongress for
a week.  No doubt my mailbox will start overflowing the moment I switch
my computer off.  I'll try to look into bug reports and
problems, but I certainly can't guarantee to reply to every mail message
I get in response to this initial release.  If your problem can be
reproduced, I'll try to fix it in the next release.  If it still remains
after that, then start to pester me.

Enjoy.
--Stephen.
