
                 USENET Sources Archiver             

                 @(#)README	2.3 5/9/91 

             Copyright (c) 1989, 1990, 1991 by Kent Landfield.

   Permission is hereby granted to copy, distribute or otherwise 
   use any part of this package as long as you do not try to make 
   money from it or pretend that you wrote it.  The copyright 
   notice must be maintained in any copy made.
  
   If you make modifications to this software that you feel 
   increase its usefulness for the rest of the community, please 
   email the changes, enhancements, bug fixes as well as any and 
   all ideas to me. This software is going to be maintained and 
   enhanced as deemed necessary by the community.
  		
			-Kent+
  		   uunet!sparky!kent
                   kent@sparky.imd.sterling.com
  
------------------------------------------------------------------
                       DISCLAIMER
------------------------------------------------------------------
Use of this software constitutes acceptance for use in an AS IS 
condition. There are NO warranties with regard to this software.  
In no event shall the author be liable for any damages whatsoever 
arising out of or in connection with the use or performance of this 
software.  Any use of this software is at the user's own risk.
-------------------------------------------------------------------


When made, this package currently contains 3 executables:

	o  rkive    - a USENET newsgroup archiver,
	o  article  - print formatted news article header information, and
        o  ckconfig - an rkive configuration file check program.


This package was initially designed for archiving comp.sources.all newsgroups.
It does, however, support archiving of non-moderated, non-sources newsgroups.

                        -----
                        rkive 
                        -----

rkive reads a configuration file to determine such things as:

	o where the news directory resides,
	o where each newsgroup is to be archived,
	o the type of archiving to be done for each newsgroup,
	o the ownership and modes of the archived members,

as well as additional optional features such as:

        o which users/accounts to mail the archived member information to,
	o the location and format of log files, 
	o the location and format of index files,
	o the compression program to use (if desired). 

It is intended that rkive be run by cron on a daily basis. In this manner,
software is archived and available for retrieval from the archives on the
day it reaches the machine.  It allows for the archives to be managed by
the same or different people (or accounts).  It supports the building
of indexes for later review or to interface to the netlib type of mail
retrieval software. It also supports mailing notifications of the archiving
to a specified list of users or aliases. The indexes and log file formats
are specifiable by the person configuring the rkive configuration file.

Please read the file INSTALL for specifics on installation of rkive and
its associated applications.  What follows is a little background information
that might be helpful to read prior to reading INSTALL.

---------------------------
Archive Member Compression:
---------------------------
If you wish to have your archived articles compressed you may do so by
specifying the disk path to the compression program as the value for
COMPRESS in the rkive configuration file. It is important that *if* you
use a compression program other than "compress" or "pack", you add
an entry to the compression routine table in the file suffix.h.
Currently, this program recognizes just the ".z" and ".Z" suffixes.
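
As an illustration only, a suffix lookup of this kind could be sketched
as below. This is NOT the table from suffix.h; the names comp_suffix,
comp_table and suffix_for are hypothetical. It only relies on the fact
that compress appends ".Z" and pack appends ".z".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical table entry; not the actual layout used in suffix.h. */
struct comp_suffix {
    const char *program;   /* basename of the compression program */
    const char *suffix;    /* suffix that program appends */
};

static const struct comp_suffix comp_table[] = {
    { "compress", ".Z" },
    { "pack",     ".z" },
};

/*
 * Return the suffix for the compression program named by path
 * (the value given for COMPRESS), or NULL if it is not in the table.
 */
const char *suffix_for(const char *path)
{
    const char *base;
    size_t i;

    assert(path != NULL);
    base = strrchr(path, '/');
    base = (base != NULL) ? base + 1 : path;

    for (i = 0; i < sizeof comp_table / sizeof comp_table[0]; i++)
        if (strcmp(base, comp_table[i].program) == 0)
            return comp_table[i].suffix;
    return NULL;
}
```

With a table like this, supporting another compression program means
adding one more { program, suffix } pair.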

----------------
REPOST Handling:
----------------
Warning:
	Repost handling is not a configurable parameter within the 
	rkive configuration file at this time.

ADD_REPOST_SUFFIX define added.
    This define allows the administrator to configure the software to
    add "-repost" (or whatever is defined in REPOST_SUFFIX) to the
    end of all files that are marked as REPOST by the newsgroup moderator.
    The suffix is added prior to compression. This feature should only be 
    configured/exist on systems whose filename limits are greater than 14.

MV_ORIGINAL define added.
    This define allows the administrator to configure the software to
    move the original article into an "originals" directory in the 
    problems directory. The inbound reposted article is placed into 
    the archive in the correct position.

If neither define is specified then the inbound article is placed into 
the archive in the correct position only if the initial article is not 
in the archive.  Otherwise the reposted article is placed in the problems 
directory as normal duplicate articles are now.
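
To see why ADD_REPOST_SUFFIX only makes sense where filenames may be
longer than 14 characters, consider the sketch below. It is not taken
from the rkive sources; repost_name and SHORT_NAME_MAX are hypothetical
names, and the "-repost" value is the default mentioned above. The
suffix alone takes 7 characters, and a compression suffix such as ".Z"
would add 2 more afterwards.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define REPOST_SUFFIX  "-repost"  /* default named in the text above */
#define SHORT_NAME_MAX 14         /* classic 14-character filename limit */

/*
 * Write name + REPOST_SUFFIX into out (outlen bytes).
 * Returns 0 on success, or -1 if the result would exceed name_max
 * characters or would not fit in out.
 */
int repost_name(const char *name, size_t name_max, char *out, size_t outlen)
{
    size_t need;

    assert(name != NULL && out != NULL);
    need = strlen(name) + strlen(REPOST_SUFFIX);
    if (need > name_max || need >= outlen)
        return -1;
    snprintf(out, outlen, "%s%s", name, REPOST_SUFFIX);
    return 0;
}
```

Called with a name_max of SHORT_NAME_MAX, a name such as "v23i014x"
(8 characters) is already rejected, since 8 + 7 exceeds 14.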

-----------------
PATCHES Handling:
-----------------
rkive supports the auxiliary header "Patch-To:". The Patch-To: line
exists for articles that are patches to previously posted software. 
It appears only in articles that are posted as "Official" patches.
The initial postings do not contain the Patch-To: auxiliary header
line.

Auxiliary Headers For Patch Postings:

	Submitted-by: Kent Landfield <kent@sparky>
	Posting-number: Volume 23, Issue 14
->	Patch-To: rkive: Volume 19, Issue 98-101
	Archive-name: rkive/patch1

Patch-To: syntax
	Patch-To: package-name: Volume X, Issue x[-y,z]

Patch-To: examples. These are examples only and do not reflect the
actual volume/issue numbering for rkive.

In the first example, the article that contains the following line
is a patch to a single part posting.
	Patch-To: rkive: Volume 19, Issue 98

In this example, the 98-101 indicates that the patch applies to
a multi-part posting. The '-' is used to mean "article A through article
B, inclusive".
	Patch-To: rkive: Volume 19, Issue 98-101

If a patch applies to multiple part postings that are not consecutive, the
',' is used to separate the part issue numbers. It is possible to mix both
',' and '-' on a single Patch-To: line.
	Patch-To: rkive: Volume 99, Issue 122,125,126,127
	Patch-To: rkive: Volume 22, Issue 122,125-127
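
The syntax above can be parsed along the following lines. This is a
sketch only and not rkive's own parser; parse_patch_to, struct patch_to,
struct issue_range and MAX_RANGES are hypothetical names. It assumes the
package name contains no blanks or colons and that the value has already
been separated from the "Patch-To:" header name.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_RANGES 16

struct issue_range { int first, last; };   /* inclusive range */

struct patch_to {
    char package[64];
    int  volume;
    int  nranges;
    struct issue_range ranges[MAX_RANGES];
};

/*
 * Parse the value of a Patch-To: line, for example
 *     "rkive: Volume 22, Issue 122,125-127"
 * Returns 0 on success, -1 on a syntax error.
 */
int parse_patch_to(const char *value, struct patch_to *pt)
{
    const char *p;
    char *end;
    int consumed = 0;

    assert(value != NULL && pt != NULL);
    memset(pt, 0, sizeof *pt);

    /* %n records how far the fixed part of the line extends. */
    if (sscanf(value, " %63[^: ]: Volume %d, Issue %n",
               pt->package, &pt->volume, &consumed) != 2 || consumed == 0)
        return -1;

    p = value + consumed;
    for (;;) {
        struct issue_range r;

        if (pt->nranges == MAX_RANGES || !isdigit((unsigned char)*p))
            return -1;
        r.first = r.last = (int)strtol(p, &end, 10);
        p = end;
        if (*p == '-') {                   /* "x-y": x through y */
            p++;
            if (!isdigit((unsigned char)*p))
                return -1;
            r.last = (int)strtol(p, &end, 10);
            p = end;
            if (r.last < r.first)
                return -1;
        }
        pt->ranges[pt->nranges++] = r;
        if (*p != ',')                     /* ',' separates issue parts */
            break;
        p++;
    }
    while (isspace((unsigned char)*p))
        p++;
    return (*p == '\0') ? 0 : -1;
}
```

For the last example line above, this yields the package "rkive",
volume 22 and two ranges: 122 through 122, and 125 through 127.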

There are two different types of handling with regard to patches. 

	Package     - This type of archiving of patches places the patches
                      in the same directory that the initial source was
                      posted to. This type of archiving is only available
                      to newsgroup archives that are using Archive-Name
                      archiving.
                   
	Historical  - This type of archiving patches is done by sites that 
                      want to place the patches in the volume/issue in 
                      which the patch originally arrived.

rkive recognizes that the Patch-To: line indicates the article is 
a patch.  For Archive-Name archiving which has specified "Package" 
patches archiving in the configuration file, rkive puts the article 
into the directory that contained the initial posting (volume19/rkive). 
For Archive-Name that has not specified Package archiving or for 
Volume/Issue archiving, the article would still be labeled as
volume23/rkive/patch1 or volume23/v23i014 respectively.

rkive also writes a .patchlog file in the BASEDIR for the newsgroup
that is used to track patches to originally posted software. The
.patchlog is going to be used for the "random software downloader :-)"
so that complete software packages (sources and patches) can be requested
from sites that do not use combined Archive-Name and Package archiving.
The format of the .patchlog file is:

#
#	Patch log for comp.sources.whoknows
#
# Path To                    Patch       Package        Initial
# Patchfile             Volume  Issue     Name      Volume  Issue  
#
volume6/bb/patch01          6     86       bb          3    70-73
            or, if using Volume/Issue format:
volume6/v06i86              6     86       bb          3    70-73
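
A program that wants to read the .patchlog could do so roughly as
follows. This is a sketch under assumptions, not code from rkive: it
assumes the columns are separated by blanks or tabs, that no field
contains embedded blanks, and that lines starting with '#' are comments.
The names patchlog_entry and parse_patchlog_line are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

struct patchlog_entry {
    char path[256];           /* path to the patch file */
    int  patch_volume;
    int  patch_issue;
    char package[64];         /* package name */
    int  initial_volume;
    char initial_issues[64];  /* kept as text, e.g. "70-73" */
};

/*
 * Parse one line of a .patchlog file.
 * Returns 1 if an entry was read, 0 for a comment or blank line,
 * and -1 if the line does not have the expected six fields.
 */
int parse_patchlog_line(const char *line, struct patchlog_entry *e)
{
    assert(line != NULL && e != NULL);

    while (*line == ' ' || *line == '\t')
        line++;
    if (*line == '#' || *line == '\n' || *line == '\0')
        return 0;

    if (sscanf(line, "%255s %d %d %63s %d %63s",
               e->path, &e->patch_volume, &e->patch_issue,
               e->package, &e->initial_volume, e->initial_issues) != 6)
        return -1;
    return 1;
}
```

The initial issue field is left as a string here because it may hold a
range such as 70-73 rather than a single number.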

-------------------------
Article Header Reduction:
-------------------------
Articles that are stored just as they arrived on your system are potentially
wasting disk space. Certain rfc822/rfc1036 header lines are of little use
after the article is archived.  If you wish to have the headers "trimmed" 
when the file is archived, ensure that REDUCE_HEADERS is defined. Currently, 
all header lines that are *not* one of:

    From:, Newsgroups:, Subject:, Message-ID:, Approved:, and Date:

will be removed. This can produce a savings of as much as 200 to 500 
bytes per archived article.

See news_arc.c if you wish to add or remove header lines to be kept.
The modifications need to be made to the hdrstokeep table just above the
keep_line() function.
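
The general idea of a keep table can be sketched as below. This is NOT
the hdrstokeep table or keep_line() from news_arc.c; kept_headers and
keep_header_line are hypothetical names used for illustration. Header
names are compared without regard to case, and continuation lines
(lines beginning with white space) are not handled in this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <strings.h>   /* strncasecmp() (POSIX) */

/* Header lines to retain, as listed in the text above. */
static const char *const kept_headers[] = {
    "From:", "Newsgroups:", "Subject:",
    "Message-ID:", "Approved:", "Date:",
};

/* Return 1 if the header line should be kept, 0 if it can be dropped. */
int keep_header_line(const char *line)
{
    size_t i;

    assert(line != NULL);
    for (i = 0; i < sizeof kept_headers / sizeof kept_headers[0]; i++)
        if (strncasecmp(line, kept_headers[i],
                        strlen(kept_headers[i])) == 0)
            return 1;
    return 0;
}
```

In a layout like this, keeping an additional header is a matter of
adding its name, including the trailing colon, to the table.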

---------
Security:
---------
rkive sets the ownership, group and modes on the archived members according
to the information specified in the configuration file. Currently though,
rkive uses the default umask for creating the log and index files.

rkive will not archive files outside of the BASEDIR specified in the 
configuration file so a "prankster" cannot do nasty things to your
system files by having an Archive-name line like:
	Archive-name: ../../../../../../etc/passwd

It will also not overwrite duplicate files. They are stored underneath
the problems directory specified in the configuration file. The admin 
is alerted to the fact and it then becomes a manual cleanup problem.
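
One simple way to perform a check of this kind on an Archive-name is
shown below. It is a sketch and not necessarily how rkive does it;
archive_name_is_safe is a hypothetical name. It is deliberately
conservative: any absolute path and any ".." component is refused, even
one that would not actually leave the BASEDIR.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Return 1 if name is a relative path with no ".." components,
 * 0 otherwise.
 */
int archive_name_is_safe(const char *name)
{
    const char *p = name;

    assert(name != NULL);
    if (*p == '\0' || *p == '/')        /* empty or absolute path */
        return 0;

    while (*p != '\0') {
        size_t len = strcspn(p, "/");   /* length of this component */

        if (len == 2 && p[0] == '.' && p[1] == '.')
            return 0;
        p += len;
        while (*p == '/')               /* skip separators */
            p++;
    }
    return 1;
}
```

With this check, the Archive-name shown above is refused while a name
such as rkive/patch1 is accepted.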

                        -------
                        article 
                        -------

Article allows you to view the article headers in much the same manner
that you use a printf statement.  This was initially done for debugging
purposes but I quickly found that it was extremely useful in dealing
with news articles in general. It works great in shell scripts to view
articles that need to be read.... Also super for perusing the archives
directly and generating indexes to the archives in *many* different 
ways...:-)  If you do not want to use rkive for archiving sources or
newsgroups, do yourself a favor and compile article. It will be worth
the time it takes...

                        --------
                        ckconfig 
                        --------

This program is used by the admin to verify just how rkive will 
interpret the variable specifications in an archive configuration
file. If you have problems, it will bomb out when it encounters
the problem. Not real smart but it does the job..

------------------------------------------------------------------------
This software set was developed under an archiving model similar to
that maintained currently on uunet. It was intended that the archiving
facilities were more of a "site" facility and not an individual's
facility. (That is unless the individual owned the site :-)). I have
even used rkive for maintaining private (many on a single machine)
archives. rkive will accept an rkive.cf file specified on the
command line so it would be possible for an individual to have their own
mini archive directory structure. This is not recommended if the site is
doing archiving since the software will store multiple copies thus wasting
more disk space than it is worth.
------------------------------------------------------------------------
Credits:
--------
I have to give credit where credit is due.

I used the code in header.c of the News 2.11 as the basis of ideas for
dealing with the article headers. The code I have written is not the same
but most of the concepts and some of the flow control resulted from reviewing
how it was "supposed to be done". (rfcs only go so far.. :-)) For that I
thank Rick Adams and the authors of B news for the excellent code to study
from.. :-)

I would also like to thank my beta testers for the headaches of dealing
with me, for forcing different ideas on me at a time when I was "almost"
willing to listen :-) and for the "full redistribution of sources" when
I had a new version. 
------------------------------------------------------------------------

Please read all the directions below before you proceed any 
further, and then follow them carefully.  

                    --------------
                     Installation
                    --------------

This package uses Doug Gwyn's directory access routines posted in
comp.sources.unix/volume9 (with the bug fix as well). You may need
to get a copy if you don't already have one and your system does 
not support POSIX-compatible directory access routines.  rkive was
written using the POSIX directory routines to make the code cleaner
and easier to test. If you want to waste your time moving it to any
other directory reading routines, don't send me your patches. These
are the only patches I will not even consider including in the baseline
sources.  Now with my rudeness aside... :-)

1)  Take the time to format and read the man pages prior to continuing.
		make man | less/pg/more

2)  Review/modify rkive.h to make sure system defines are correct.  

3)  Determine the method for directory creation and edit the Makefile
    accordingly.

4)  make

    This will attempt to make the software in the current directory.

5)  Put rkive, ckconfig, and article into a public directory 
    (normally /usr/local/bin), and put a template of the rkive
    configuration file  (if one does not exist) into an rkive specific
    library directory (normally as /usr/local/lib/rkive/rkive.cf).  Place 
    the man pages in the appropriate man directories for your site.


6)  I have set up an account for the source archives.  This is not really 
    necessary but is a personal preference. rkive needs to be run as 
    root *if* your system does not have mkdir() and you wish to use the
    builtin directory creation, since it needs to use mknod() to create
    directories. It also needs to run as root if the owner/group specified
    in the rkive.cf is different from the user it is being run under.

    ---x--x--x  1 root archive    43048 Apr  9 16:38 /usr/local/bin/rkive
    ---x--x--x  1 src  archive    14836 Apr  9 16:38 /usr/local/bin/article
    ---x--x--x  1 src  archive    27448 Apr  9 16:38 /usr/local/bin/ckconfig
    -r--r--r--  1 src  archive     6173 Apr  9 16:40 /usr/local/lib/rkive.cf

7)  Re-read the manual entry for rkive.1 and rkive.5.

8)  Modify the template rkive configuration file to reflect the local
    archive conditions. ckconfig should be used in order to check the
    information that you have just entered/modified in the rkive.cf file. 

9)  VERY IMPORTANT! If you have a problem, there's someone else out there 
    who either has had or will have the same problem.  Please send all 
    patches, ideas, etc to kent@sparky.imd.sterling.com (or uunet!sparky!kent)
    so that I can continue to improve the functionality and portability of 
    this package. 
