                                 Lesson 10a.

                   Playing with files and their attributes.

  A computer file has three parts: the data in it, its "attributes", and
its position in the file store. Let's use the model of a box file in a
Dickensian office. You can think of the attributes as the label on the
outside of the box, and the data in the file as the papers in the box.
The third part is the position of the box in the strong-room of
Mr. Dickens' legal advisers, which corresponds to the position of the
file in the computer's file-store.

  Let us take these subjects in reverse order. Almost all computers have a
file store - the exceptions are the dedicated processors which are
pre-programmed with an algorithm to do a specific task such as controlling
the washing machine or printing a bus ticket. The position of the
file is sometimes known as the "path". As an example, the path of the
file into which I'm writing this lesson is:-

/usr/chris/c-notes/Lesson.10a/Lesson.txt

We need not go into the exact mechanism used by the operating system to
translate this string of characters into the sector number in a track on
a particular disk platter. It is not really relevant to the C language.
Suffice it to say that when you refer to a file using its path the o/s
does the calculations for you. You don't have to concern yourself with
any of the details, except to note that the strings of characters between
the slashes are the names of directory files. The last string is the name
of the file to which we are - presumably - going to refer. This name is
stored in the last directory mentioned in the path. In this case the
directory is "Lesson.10a" and the file is "Lesson.txt", to which, with the
help of the unix utility vi, I am currently referring.

  So much for the path. Just remember that the whole string of characters
starting at the first "/" is needed to identify the required file uniquely.
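
  The split between the directory part and the file name can be sketched
in a few lines of C. This is only an illustration written in ANSI style:
the names base_name and dir_name are hypothetical helpers invented for
this sketch, not library routines, and the sketch assumes that '/' is the
directory delimiter.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
** Return a pointer to the file-name part of a path, i.e. whatever
** follows the last '/'. If there is no '/', the whole string is the name.
** ( base_name is a hypothetical helper, not a library routine. )
*/
const char *base_name ( const char *path )
{
  const char *last_slash;

  assert ( path != NULL );
  last_slash = strrchr ( path, '/' );
  return last_slash ? last_slash + 1 : path;
  }

/*
** Copy the directory part of a path into dir, which holds dir_size bytes.
** Returns 0 on success, -1 if the buffer is too small.
** A path with no '/' yields ".", a file directly under the root yields "/".
*/
int dir_name ( const char *path, char *dir, size_t dir_size )
{
  const char *name = base_name ( path );
  size_t length = ( size_t ) ( name - path );  /* includes the final '/' */

  if ( length == 0 )                           /* no slash at all */
  {
    if ( dir_size < 2 ) return -1;
    strcpy ( dir, "." );
    return 0;
    }
  if ( length > 1 ) length--;      /* drop the final '/', keep a lone "/" */
  if ( length + 1 > dir_size ) return -1;
  memcpy ( dir, path, length );
  dir[ length ] = '\0';
  return 0;
  }
```

Given the path of this lesson, base_name yields "Lesson.txt" and dir_name
yields "/usr/chris/c-notes/Lesson.10a".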

  Now for the attributes. These Notes are supposed to be about the C
programming language, and the notion of file attributes is really nearer
to the heart of the computer's operating system than to the heart of the
programming language. However, it is impossible to write a C program
properly if you are unaware of the correct techniques used to access the
file system and the files therein. So here's a little program which
accesses the attributes.

 /* -------------------------------cut here--------------------------------- */

#ident "@(#) stat.c  - Return the file status."

#include <stdio.h>
#include <stdlib.h>     /* exit()   */
#include <unistd.h>     /* getopt() */
#include <sys/types.h>
#include <sys/stat.h>
#include <errno.h>
#include <time.h>

print_time ( m, f, t )
char *m, *f;
time_t t;
{
	struct tm *time_data;
	char *time_string;

  time_data = localtime ( &t );
  time_string = asctime ( time_data );
  printf ( "The %s time of file %s is %s\n\n", m, f, time_string );
  }

usage()
{
	printf ( "usage: stat -<opt> file, where opt is one of \"acdfgilnmrsu\"\n" );
	exit ( 1 );
	}

static char *getopt_string = { "acdfgilnmrsu" };

main ( argc, argv ) int argc; char **argv;
{
  struct stat buf;
  int ch, i;
	extern char *optarg;
	extern int optind, opterr;

  if ( argc != 3 ) usage ();  /* Expect exactly one option and one file. */

  if ( stat ( argv[2], &buf ))
  {
    perror ( "Stat" );
    exit ( 1 );
    }

	if (( ch = getopt ( argc, argv, getopt_string )) != EOF )
	{
		switch ( ch )
		{
case 'a': print_time ( "access", argv[2], buf.st_atime );       /* Access   */
          break;
case 'c': print_time ( "status change", argv[2], buf.st_ctime ); /* Changed */
          break;
case 'd': printf ( "%ld\n", (long) buf.st_dev ); break;
case 'f': print_time ( "modification", argv[2], buf.st_mtime ); /* Modified */
          break;
case 'g': printf ( "%ld\n", (long) buf.st_gid ); break;
case 'i': printf ( "%ld\n", (long) buf.st_ino ); break;
case 'l':                               /* 'l' and 'n' both give the links */
case 'n': printf ( "%ld\n", (long) buf.st_nlink ); break;
case 'm': printf ( "%lo\n", (unsigned long) buf.st_mode ); break;
case 'r': printf ( "%ld\n", (long) buf.st_rdev ); break;
case 's': printf ( "%ld\n", (long) buf.st_size ); break;
case 'u': printf ( "%ld\n", (long) buf.st_uid ); break;
default: usage (); exit (1 );
			}
		}
	else
	{
		usage();
		}
	}

 /* -------------------------------cut here--------------------------------- */

  Yes, you guessed right! There is a little "program maintenance" to do before
this little toy could be called a "useful utility".

	1) One should be able to enter more than one option at a time.
	2) The other data elements in the stat structure should have more
		 meaningful messages, similar perhaps to the time ones.
	3) More than one file.
	4) Use the gnu "getopt for long options" from the Free Software Foundation.
	
	Go to it. You will learn two things from the exercise.

	1) How stat works.
	2) How getopt works.

	These are standard unix functions and are fully documented in both the
Programmer's Reference Manual and the C Language Interfaces. This latter book
is the official AT&T reference manual for the System V libraries. It's
at least as readable as any other unix documentation.
ISBN 0-13-109661-3.

  Now for the other example for learning about file i/o.

	This program is a demonstration of how to use the buffered file i/o
system. Unix does all the buffering for you and is really ever so helpful.
You are given routines which are simple to use, and they all return useful
values so that you can check for abnormalities. For ordinary file i/o there
is no need to use the more complex unbuffered routines.
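
	As a small illustration of checking those return values, here is a
sketch of a bounded copy between two streams. The name copy_bytes and the
512 byte buffer are assumptions made for this example. It uses fread and
fwrite rather than the getc and putc used in the listing which follows.

```c
#include <assert.h>
#include <stdio.h>

/*
** Copy at most limit bytes from in to out using the buffered routines.
** Returns the number of bytes copied, or -1 if either stream reported
** an error. ( copy_bytes is a hypothetical helper name. )
*/
long copy_bytes ( FILE *in, FILE *out, long limit )
{
  char buffer[ 512 ];
  long copied = 0;

  assert ( in != NULL && out != NULL && limit >= 0 );

  while ( copied < limit )
  {
    size_t wanted = sizeof buffer;
    size_t got;

    if (( long ) wanted > limit - copied ) wanted = ( size_t ) ( limit - copied );
    got = fread ( buffer, 1, wanted, in );
    if ( got == 0 ) break;                      /* end of file, or an error */
    if ( fwrite ( buffer, 1, got, out ) != got ) return -1;
    copied += ( long ) got;
    }

  /* fread returns 0 for both end of file and error, so ask which it was. */
  if ( ferror ( in ) || ferror ( out )) return -1;
  return copied;
  }
```

A return value smaller than limit simply means that the input ran out first.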

	So let me present "dog" - the reverse of "cat". "dog" cuts files up into
equal-sized fragments. I had to write it recently to allow me to transfer
quite a large archive file from one computer to another using floppy disks.
The program is simple to use; the usage line is self-explanatory. It's the
equivalent of "split" for binary data files such as compressed archives.

Usage:
dog -f filename [ -s size of output fractions ] [ -o output directory ]

 /* -------------------------------cut here--------------------------------- */

#ident "@(#) dog.c - Opposite of cat. Splits a file into sections."

#include <stdio.h>
#include <assert.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <ustat.h>
#include <signal.h>
#include <string.h>
#include <malloc.h>
#include <unistd.h>

/*
** This is set so that it is convenient for a 720kb 
** diskette. Change it if your diskette size is different.
*/

#define DEFAULT_PUPPY_SIZE 720000

#define BLOCK_SIZE 512
#define PATH_LENGTH 128
#define FILE_NAME_SIZE 14

#define DIRECTORY_DELIMITER '/'

typedef enum { FALSE, TRUE } boolean;

static char *errors[] =
{
  { "\n%s: The size of the input file %s ( %d ) is smaller\n\
than the size of the output fractions ( %d ).\n\
No need to use %s!\n\n" },
  { "\n%s: Insufficient space in destination file system.\n\n" },
  { "\nYour ulimit for files is currently %ld,\n\
which is smaller than the size ( %ld ) of the output files.\n\
Please ask your system administrator to raise the value for you.\n\n" },
  };

char *usage_message =
"Usage:\n\
%s -f filename [ -s size of output fractions ] [ -o output directory ]\n";

static void usage ( progname )
char *progname;
{
  fprintf ( stderr, usage_message, progname );
  exit ( 1 );
  }

#if !defined(NDEBUG)

/*
** This concept is very useful. It allows you to use assert to cause a core
** dump if any of the file i/o operations fail. This is frequently
** the author's problem. Using the stack tracing facilities in sdb allows
** you to find out where things went wrong.
*/

static exit ( n )
int n;
{
	assert ( !n );
	}

/*
** This allows you to use a signal from the keyboard to stop execution
** and make a core dump. Very useful for interrupting endless loops.
*/

catch()
{
  abort();
  }
#endif

main ( argc, argv )
int argc;
char **argv;
{
  FILE *in, *out;
  struct stat input_stat_buffer, output_stat_buffer;
  struct ustat ustat_buffer;
  long ulimit ();
  unsigned int puppy_count,
               puppies;
	int flag = 0;
  unsigned long int byte_count,
                    max_puppy_size,
                    puppy_size,
                    remaining_bytes,
                    current_ulimit,
                    file_size;

  char input_file_name[ FILE_NAME_SIZE + 1 ],
       output_file_name[ FILE_NAME_SIZE + 1 ],
       input_path_name[ PATH_LENGTH ],
       output_path_name[ PATH_LENGTH ],
       *input_file_name_p,
       *input_path_name_p,
       *output_file_name_p,
       *output_path_name_p,
       *complete_output_file_name_p,
       *progname,
       *default_output_file_name = "puppy";
  int c;
  extern int optind, opterr, getopt();
  extern char *getcwd(), *getenv(), *optarg;
  boolean got_input_file, got_size, got_output_directory;

#if !defined(NDEBUG)
  signal ( SIGINT, catch );
#endif

  progname = argv[0];
  got_input_file = got_size = got_output_directory = FALSE;

  if ( argc < 3 || argc > 7 ) usage ( progname );
  while (( c = getopt ( argc, argv, "f:o:s:" )) != EOF )
  {
    switch ( c )
    {
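case 'f':
      got_input_file = TRUE;
      strncpy ( input_path_name, optarg, PATH_LENGTH - 1 );
      input_path_name[ PATH_LENGTH - 1 ] = '\0';  /* strncpy may not terminate */
      break;
case 'o':
      got_output_directory = TRUE;
      strncpy ( output_path_name, optarg, PATH_LENGTH - 1 );
      output_path_name[ PATH_LENGTH - 1 ] = '\0';
      break;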
case 's':
      got_size = TRUE;
      if ( 1 != sscanf ( optarg, "%lu", &max_puppy_size ) || 0 == max_puppy_size )
        usage( progname );
      break;
default:
      usage ( progname );
      }
    }

  if ( !got_size ) max_puppy_size = DEFAULT_PUPPY_SIZE;

  if ( !got_input_file )
  {
    usage ( progname );
    }

  /*
  ** Here is where we do all the checking to see that the
  ** operation is possible.
  ** ( Many people would consider this to be overkill. )
  **
  ** Check the following:
  ** 0) That the file to be cut up is in fact larger than the
  **    output file size.
  ** 1) Permissions on the input file.
  ** 2) Permissions on the output directory.
  ** 3) That there is sufficient space in the file system which 
  **    contains the directory to which we are going to write.
  ** 4) That there are sufficient inodes available.
  ** 5) That the ulimit is not going to be exceeded.
  */

  /*
  ** Parse the input file name into separate path and filename components.
  */

  if (( input_file_name_p = strrchr ( input_path_name, DIRECTORY_DELIMITER )))
  {
    input_file_name_p++;
    }
  else
  {
    input_file_name_p = input_path_name;
    }

  strncpy ( input_file_name, input_file_name_p, FILE_NAME_SIZE );
  input_file_name[ FILE_NAME_SIZE ] = '\0';  /* Buffer is FILE_NAME_SIZE + 1. */

  /*
  ** Check ( 0 ) That the input file is larger than the output fractions.
  */

  if ( stat ( input_path_name, &input_stat_buffer ))
  {
    perror ( progname );
    usage( progname );
    }

  if ( input_stat_buffer.st_size <= max_puppy_size )
  {
    fprintf ( stderr,
              errors[0],
              progname,
	            input_path_name,
              input_stat_buffer.st_size,
              max_puppy_size,
              progname
              );
    exit ( 1 );
    }

  /*
  ** Check ( 1 ) that user is allowed read access to the input file.
  */

  if ( access ( input_path_name, R_OK ))
  {
    char *error_message;

    error_message = ( char *) malloc ( strlen ( input_path_name ) +
                                       strlen ( progname ) + 4);
    sprintf ( error_message, "\n%s: %s", progname, input_path_name );
    perror ( error_message );
    free ( error_message );
    exit ( 1 );
    }

  /*
  ** Check ( 2 ) that user is allowed write access to the output directory.
  */

  if ( !got_output_directory )
	{
		if (!(output_path_name_p = getcwd ( output_path_name, PATH_LENGTH )))
		{
			perror ( progname );
			exit ( 1 );
			}
		}
	else
	{
		 output_path_name_p = output_path_name;
		 }

  if ( stat ( output_path_name_p, &output_stat_buffer ))
	{
		perror ( progname );
		exit ( 1 );
		}
	
	if ( access ( output_path_name, W_OK | X_OK ))
  {
    char *error_message;

    error_message = ( char *) malloc ( strlen ( output_path_name ) +
                                       strlen ( progname ) + 4);
    sprintf ( error_message, "\n%s: %s", progname, output_path_name );
    perror ( error_message );
    free ( error_message );
    exit ( 1 );
    }

  /*
  ** Check ( 3 ) that there is sufficient space
  ** in the file system for the output files,
  ** which together are the same size as the
  ** input file. If there is no output directory
  ** specified, then the output files are placed 
  ** in the present working directory.
  */

  if ( ustat ( output_stat_buffer.st_dev, &ustat_buffer ))
  {
    perror ( progname );
    exit ( 1 );
    }
  
  /*
  ** unix, with its charming inconsistencies, returns the 
  ** size of the free disk space measured in blocks.
  */

  if ((daddr_t) BLOCK_SIZE * ustat_buffer.f_tfree < input_stat_buffer.st_size )
  {
    fprintf ( stderr,
              errors[1],
              progname
              );
    exit ( 1 );
    }

  /*
  ** Put code to test for sufficient inodes in here.
  ** ( This is unix o/s specific, and its need is somewhat
  ** nebulous, so I'm leaving it as an exercise for the reader.
  ** If you are working in a unix environment, it's a worthwhile
  ** exercise for you to do, because you will gain an appreciation
  ** of the construction of the unix file system. )
  */

	/*
	** Calculate the number of output files which are going to be produced,
	** and the size of each. Any odd bytes left over go into the last file.
	*/

  puppies = input_stat_buffer.st_size / max_puppy_size;

  if ( 0 != ( input_stat_buffer.st_size % max_puppy_size))
  {
    puppies++;
    }

  /*
  ** These must be set even when the input is an exact multiple of the
  ** fraction size, as they are used unconditionally further down.
  */

  puppy_size = input_stat_buffer.st_size / puppies;
  remaining_bytes = input_stat_buffer.st_size % puppies;

	/*
	** Discover the maximum file size allowed.
	*/

  current_ulimit = ulimit ( 1, 0L  );
  file_size = puppy_size + remaining_bytes;

  if ( current_ulimit * BLOCK_SIZE < file_size )
  {
    fprintf ( stderr, errors[2], current_ulimit, file_size );
    exit ( 1 );
    }

  /*
  ** Now, at last, we can open the input file and do the job in hand.
  */

  if (( FILE * ) NULL == ( in = fopen ( input_path_name, "r" )))
  {
    perror ( progname );
    exit ( 1 );
    }

  output_file_name_p = ( strlen ( input_file_name_p ) + 2 >= FILE_NAME_SIZE )
                     ? default_output_file_name
                     : input_file_name_p;

  complete_output_file_name_p = malloc ( strlen ( output_file_name_p ) +
                                         strlen ( output_path_name_p ) + 4 );

  /*
  ** Derive file names of puppies, and whelp.
  */

  for ( puppy_count = 0; puppy_count < puppies; puppy_count++ )
  {
    sprintf ( complete_output_file_name_p,   /* Create the name of the */
              "%s/%s%.2d",                   /* output file.           */
              output_path_name_p,
              output_file_name_p,
              puppy_count
              );
    printf ( "%s\n", complete_output_file_name_p );  /* Display the name. */
    fflush ( stdout );                  

    if (( out = fopen ( complete_output_file_name_p, "w" )) == (FILE *) NULL )
    {
      fprintf ( stderr,
                "%s: Can't open %s for output\n",
                progname,
                complete_output_file_name_p
                );
      free ( complete_output_file_name_p );
      exit ( 1 );
      }

    for ( byte_count = 0; byte_count < puppy_size; byte_count++ )
    {
      if (( flag = getc ( in )) == EOF ) break;   /* Never write EOF out. */
      if (( flag = putc ( flag, out )) == EOF ) break;
      }

    if ( ferror ( in ) || ferror ( out ) )
    {
      perror ( progname );
      exit ( 1 );
      }
    fclose ( out );
    }

  if (( out = fopen ( complete_output_file_name_p, "a+" )) == (FILE *) NULL )
  {
    fprintf ( stderr,
              "%s: Can't open %s for output\n", progname,
              complete_output_file_name_p
              );
    free ( complete_output_file_name_p );
    exit ( 1 );
    }

  for ( byte_count = 0; byte_count < remaining_bytes; byte_count++ )
  {
    if (( flag = getc ( in )) == EOF ) break;   /* Never write EOF out. */
    if (( flag = putc ( flag, out )) == EOF ) break;
    }

  free ( complete_output_file_name_p );

  if ( ferror ( in ) || ferror ( out ) )
  {
    perror ( progname );
    exit ( 1 );
    }

	flag = fclose ( out );   /* Let's be pedantic, just in case there */
  flag |= fclose ( in );   /* is a self appointed expert watching.  */

  if ( flag == EOF )
  {
    perror ( progname );
    exit ( 1 );
    }

  return 0;    /* At last, we can return "success" to the shell! */
  }

 /* -------------------------------cut here--------------------------------- */

  I know it looks long and tortuous, but - and it is a big BUT - all that
checking is what converts a quick and dirty little program into a
fully-fledged "industrial-strength" utility. Study it and see if you can
find any places in the code where it will fall apart at the seams. If you
can find something amiss, please tell me about it. I have tested it as well
as the author of a program can test his own work. I created a file-system on
a floppy and mounted it and then filled it almost full. dog reported that
there was insufficient room to proceed. I then emptied the floppy file
system, and started dog up. Fiddling with the floppy drive door-catch while
dog was running caused i/o errors which dog reported correctly. Doing this,
by the way, will almost certainly ruin the file system on the floppy disk.
Don't forget to re-format it and create a new filesystem before putting it
away in the box!
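
	One part which is easy to test away from the floppy drive is the
arithmetic which decides how many puppies there will be and how big each one
is. Here it is pulled out into a function as a sketch. The names
struct litter and plan_litter are hypothetical, invented for this
illustration. The sketch always computes all three values, including the
case where the input is an exact multiple of the fraction size.

```c
#include <assert.h>

/* The three numbers dog needs before it starts writing. */
struct litter
{
  unsigned long puppies;          /* number of output files              */
  unsigned long puppy_size;       /* bytes written to each output file   */
  unsigned long remaining_bytes;  /* odd bytes appended to the last file */
  };

/*
** Work out how to cut file_size bytes into fractions which are
** no bigger than max_puppy_size, give or take the odd bytes.
** ( plan_litter is a hypothetical helper name. )
*/
struct litter plan_litter ( unsigned long file_size,
                            unsigned long max_puppy_size )
{
  struct litter plan;

  assert ( file_size > 0 && max_puppy_size > 0 );

  plan.puppies = file_size / max_puppy_size;
  if ( file_size % max_puppy_size != 0 ) plan.puppies++;    /* round up */

  plan.puppy_size = file_size / plan.puppies;
  plan.remaining_bytes = file_size % plan.puppies;
  return plan;
  }
```

For a 2000000 byte file and the default 720000 byte fraction this gives
three puppies of 666666 bytes each, with 2 odd bytes left over for the last
one.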

	As only the Almighty is perfect, there is a little inconsistency which
will become apparent if you try to port dog to the ubiquitous single user
system. See if you can find it.

To compile just say:-

$ cc -DNDEBUG -o dog dog.c

	I know that this is really just the start of a lesson about using the file
system and the files therein. This is the reason this lesson is known as
Lesson 10a. Lesson 10b will deal with locally buffered i/o and file locking.


Copyright notice:-

(c) 1993 Christopher Sawtell.

I assert the right to be known as the author, and owner of the
intellectual property rights of all the files in this material,
except for the quoted examples which have their individual
copyright notices. Permission is granted for onward copying,
storage, but not modification, of this course in electronic data
retrieval systems, and its use for personal study only, provided
all the copyright notices are left in the text and are printed
in full on any subsequent paper reproduction.

In other words you may pass it around to your friends and print it
out in full on paper, but you may not steal my text and pretend
you wrote it, change the text in any way, or print it as a bound book.

