.ND
.cm  This file is set for nroff (make document).  To produce typeset format
.cm  using 'itroff -ms doc.ms' or 'ptroff -ms doc.ms', comment out .pl 10i.
.cm
.pl 10i
.nr PS 11
.nr LL 6.3i
.nr LT 6.3i
.nr PO .8i
.ds LH NNStat\(emInternet Statistics Package 
.ds CH 
.ds RH Release 3.0
.ds LF Braden & DeSchon
.ds RF [Page %]
.LP
.nh

.ce 10
.LG
.LG
NNStat:

Internet Statistics Collection Package

Introduction and User Guide



.NL
Robert T. Braden
Annette L. DeSchon

USC / Information Sciences Institute
Marina del Rey, California

January 1991

.cm .fi
.na
.in +0.4i
.nr LL 5.9i
.ll 5.9i
.SM
.ce 
RELEASE 3.0
.ce 0

.mc |
This document describes Release 3.0 of NNStat, a package of programs
for the distributed collection of Internet traffic statistics.



.ce
SYNOPSIS OF CHANGES IN RELEASE 3.0
.ce 0

.IP *
Support was added for Ultrix and for little-endian machine architectures.
.IP *
The configuration language was extended to include all the features
originally envisioned in the SIGCOMM '88 paper.  In particular, boolean
expressions, \*Qsymmetric if\*U statements,  and \*Qselect\*U statements
are now supported.
.IP *
The algorithm used to compile a 
.B statspy
configuration was improved to produce more efficient execution.
For details, see ISI Report RR-88-207.
.IP *
Configuration error messages now include line numbers, to pinpoint configuration
errors.
.IP *
Two new commands were added to statspy: \*Qinclude\*U allows more convenient
creation of configurations, and \*Qfile\*U allows retrospective diversion of
standard output to a file.
.IP *
The \fIbin-pkt\fP and \fIworking-set\fP object classes have been
extended in several ways, and two new binary classes \fIbin-pkt2\fP and
\fIworking-set2\fP have been added.  See Appendix A for details.
.IP *
An old bug in
.B collect
that caused occasional polls to be missed has been fixed.
.IP *
A number of other minor errors have been found and fixed.  See the
CHANGES file in the release for more details.
.IP *
The Ethernet interface code was reorganized to simplify support for
alternative system interfaces.
.IP *
There were some significant internal reorganizations performed on
statspy.  See the CHANGES file.
.IP *
Many internal cosmetic changes were made.
.LP

.mc  
See Appendix E for a summary of earlier releases.
.nr LL 6.3i
.ll 6.3i
.in 0
.bp


.NH
Introduction
.LP
NNStat is a facility for the
distributed collection of Internet traffic statistics. This facility is
designed to support the requirements of a network administrator for
gathering long-term usage statistics simultaneously at many network
entry points.  Although it is primarily intended for collecting
long-term traffic statistics for administration, management, and
topology engineering, NNStat is sufficiently general to be useful for
some operational problem solving.

Distributed statistics collection has two aspects: (1) acquisition of the
primary data at one or more locations, and (2) collection of all this acquired
data at a single location.

.IP (1)
Distributed Data Acquisition

The raw data must be acquired at a number of network/Internet points
simultaneously.  In the NNStat model, there will be a
.I
statistics acquisition agent
.R
(SAA) process executing in a computer system attached to each
network/Internet node for which data is required.  The SAA machines
could be packet switches, gateways, general-purpose hosts, or hosts
dedicated to the acquisition function.

.IP (2)
Centralized Data Collection

Data (or summaries of data) acquired by the SAA processes must be
transmitted to a central site for analysis, reporting, and long-term
storage.  This central site, the
.I
statistics collection host
.R
(SCH), will run a data collection program to gather the data from the
SAA processes.  In many cases, a single locus for data collection is
sufficient; however, it should be possible to have multiple SCH's
simultaneously gathering data from the same set of acquisition agents.
We may think of a primary SCH that serves as a central repository for
usage data by a particular administration, with perhaps secondary
collection hosts being used intermittently for short-term statistical
studies.
.LP

The principal components of the NNStat package are an SAA program and
an SCH program.  The NNStat design is based upon the common use of
Ethernets for interconnection of networks and Internet regions.  NSFnet
provides an example:

.IP o
Each component of NSFnet above the campus level (i.e., the NSFnet
Backbone and each of the middle-level networks) consists of a set of IP
gateways connected by serial lines.

.IP o
Each gateway is also connected to an Ethernet that is used as the
interconnect medium to one or more lower-level networks.  We refer to
this as an \fIinterconnect Ethernet\fP.
.LP

Figure 1 shows a typical configuration at one of the network nodes.
The gateway G is a packet switch that forms part of the network under
consideration.  G1 and G2 are entrance gateways to the same or
different lower-level networks.

.KS
.cs R 18 
.ss 18
.nf
.in +0.8i

\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ Lower-level Network(s)
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ |\ \ \ \ \ \ \ \ |
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ |\ \ \ \ \ \ \ \ |
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ G1\ \ \ \ \ \ \ G2
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ |\ \ \ \ \ \ \ \ |
\ \ \ \ \ \ \ \ \ Interconnect\ |\ \ \ Ether|net
\ \ \ \ \ \ \ \ |======.======.========.========.====|
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ |\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ |
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ |\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ __|__
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ G\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ | SAA |
\ \ \ \ \ \ \ \ \ \ \ \ \ \ /\ \\e \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ |_____|
\ \ \ \ \ \ \ \ \ \ \ \ \ /\ \ \ \e
\ \ \ \ \ \ \ \ \ \ \ \ /\ \ \ \ \ \e
\ \ \ \ \ \ \ Serial lines to other
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ network nodes


     Figure 1.  Typical Network Node Configuration


.cs R
.fi
.in 0
.KE

Interconnect Ethernets provide convenient and appropriate points for
gathering NSFnet statistics.  They are convenient because an SAA
executing on a host connected to one of these Ethernets can monitor the
traffic in promiscuous mode (see Figure 1).  Thus, we can monitor the
entrance and exit traffic without changing any gateway code.  The
interconnect Ethernets are also appropriate points for administrative
statistics-gathering.  Administrators and traffic planners are
concerned mainly with packets entering and leaving the network; the
fact that traffic between individual network routers cannot be
monitored from the Ethernets is not a serious drawback.

Implementing NNStat in an SAA host rather than in a gateway or packet
switch had a number of advantages.

.IP (1)
Timeliness:  The facilities provided by NNStat were needed quickly for
NSFnet management.  It will be some time before equivalent traffic
measurement standards are developed and implemented by gateway
vendors.

.IP (2)
Generality:  We wanted to incorporate a degree of flexibility and
generality into NNStat that is not currently available in gateways.

.IP (3)
Performance:  Comprehensive statistics gathering requires a non-trivial
amount of CPU time and memory space; it is very undesirable to burden
the current generation of gateways with this additional resource
drain.

.IP (4)
Experimentation:  By implementing this function outside gateways, we
are free to experiment with different approaches; eventual
incorporation of our results into gateways is a reasonable goal.

.IP (5)
Universality:  There may not be a gateway at the point to be monitored;
for example, there might be a link-level bridge.
.LP

The primary task of the SAA is to count the occurrences of packets with
\*Qinteresting\*U configurations of values in their header fields.  In
the NNStat design, what is \*Qinteresting\*U is determined by the SAA
configurations, which can be set or changed dynamically.

Our model of NNStat operation within a particular network is as
follows.  The administration will set up the acquisition agents, one at
each point from which data is desired, configured to collect a basic
set of statistics.  These statistics will be reported to the SCH to be
summarized over sites, time, and perhaps administrative subsets of the
networks.  In addition, management and operational personnel will
dynamically modify the SAA configurations from time to time, to answer
additional statistical questions about the traffic.
.LP

Finally, we should mention some non-goals for the NNStat effort.

.IP o
NNStat does not provide fancy display or analysis programs for
presenting the statistics.  This is potentially a large and complex
problem that is outside the scope of the NNStat effort.

.IP o
NNStat cannot gather statistics for traffic on the serial lines between
IP routers; it can measure only the network entry and exit traffic.
NNStat is intended to complement, not replace, the statistics gathering
facilities built into gateways.  For example, gateways typically count
line errors and dropped packets on each of their physical interfaces,
to monitor and diagnose line problems.  These facilities are vital for
operation and maintenance of the gateways and lines, forming the
\*Qfirst line of defense\*U for problem diagnosis.  However, NNStat is
not generally concerned with short-term operational functions.
.LP

.NH
Overview of NNStat
.LP

The NNStat package, which has been implemented for a 4.2/3BSD system,
includes the following components:

.IP (A)
SAA Program \(em
.B statspy

The statistics acquisition agent program of NNStat is named
.B statspy\fP.
.mc |
.B Statspy
currently supports:
.RS
.IP o
Sun3 and Sun4 workstations under Sun OS releases 3.4,
3.5, 4.0.3, and 4.1; 
.IP o
IBM RT processors running 4.3BSD networking code; and 
.IP o
little-endian architecture machines running Ultrix.
.LP
.RE
The code could be ported to any 4.2/3BSD system
that provides an interface for promiscuous access to the Ethernet.
.mc

Each Ethernet packet that 
.B statspy
observes contains an Ethernet header followed by a sequence of one or
more other protocol headers (e.g., IP, TCP), which reflect the
successive encapsulation implied by protocol layering.  Each protocol
header may be considered to be a string of bits that is logically
divided into substrings called
.I fields.

A particular
.B statspy
process can (and typically will) gather a number of different
statistical measures of the packet traffic simultaneously.  Each of
these measures is gathered by a separate \fIstatistical object\fP, or
simply \fIobject\fP.  The set of objects and the selection of protocol
fields that they monitor is determined by the
.I configuration ,
which can be set or changed while
.B statspy
is executing.

.B Statspy
is controlled by a command language that provides commands for setting
and displaying the configuration and for displaying the statistical
data gathered by its objects.
.B Statspy
commands may be entered from three locations:

.RS
.IP o
From a file, at start-up time.

This is the recommended way to set up the configuration for collecting
long-term statistics, so
.B statspy
will be self-configuring if the SAA host crashes and restarts.

.IP o
Interactively, from the local console controlling statspy.

This allows
.B statspy
to be used as a standalone monitoring tool.

.IP o
Interactively, from a remote system running the 
.B rspy
program (see below).
.RE

Section 3 describes the command language, including the command used to
set or modify the configuration. Appendix C suggests useful
configuration techniques.

If it is executed in foreground,
.B statspy
accepts commands and displays statistics locally. Whether in foreground
or background, it listens for a TCP connection from the remote
collection machine (SCH) or from a remote
.B rspy
program, and processes all commands entered over that TCP connection.
However, the acquisition of new statistical data from the Ethernet
takes highest priority.

Note that
.B statspy
is not expected to record its data on a local disk; permanent data
recording is assumed to take place only at the SCH.  This choice was
made to minimize operational problems at each SAA site.

For more details on the operation and configuration of
.B statspy ,
see Section 3 below.
  
.IP (B)
Remote SAA Control Program \(em  
.B rspy

The
.B rspy
program provides an interactive command interface for controlling a
remote
.B statspy
instance.
.B Rspy
can be used to establish, query, or modify the configuration and to
read and/or clear the statistical objects.  The use of
.B rspy
is described in Section 3.4.

.IP (C)
Centralized Collection Program \(em
.B collect

.B Collect
is the central data collection program of NNStat; it executes on the
SCH to collect data from one or more
.B statspy
instances.
.B Rspy
and
.B collect
use the same remote network interface to statspy, but they are designed
for different tasks:  while
.B rspy
is intended to be used interactively for testing, probing, and running
short-term statistical studies,
.B collect
is intended to be executed as a daemon, collecting and recording
traffic data over a long period of time.

In normal operation,
.B collect
will periodically poll a specified set of SAA's for statistical data
and write the results into cumulative data files.  Note that data is
delivered to
.B collect
only as a result of its polling the SAA's.  An alternative design would
have the SAA's spontaneously report their data periodically to the
SCH.  We chose to use polling for data collection in order to ensure
(approximate) synchronization in gathering statistics from all the
SAA's, while avoiding an \*Qimplosion\*U of reports at the central
site.

The following basic parameters must be defined to run
.B collect\fP:

.RS
.IP *
List of SAA host names or addresses.
.IP *
TCP port for
.B statspy
on each SAA (optional).
.IP  *
The name(s) of objects whose accumulated data are to be retrieved from
each
.B statspy
instance.
.IP *
Polling interval Ti.
.IP *
Checkpoint interval Tc.
.IP  *
Clear (\*Qreset\*U) interval Tr.
.RE

In one data collection cycle,
.B collect
will open a TCP connection to
.B statspy
on each of the listed hosts and retrieve data from (\*Qread\*U) the
specified objects, recording the results in files.  This cycle will be
repeated every Ti minutes, but
.B collect
will save or \*Qcheckpoint\*U the data for later analysis only every Tc
minutes.

The totals returned by each poll are cumulative, unless the objects are
explicitly cleared by command or the SAA (crashes and) restarts.
Therefore, if communication between the SCH and an SAA is lost
temporarily, a later successful poll should return complete data. The
minimum polling interval Ti should be short enough that data lost
because of an SAA restart will be negligible.  Of course, if an SAA is
down for an extended period, there is no way to capture statistics from
that interconnect Ethernet for that period.

.B Statspy
generally keeps 32-bit counters for counting packet events,
.mc |
and 56-bit counters for accumulating byte totals.
.mc
If the
average rate were 1000 packets per second, a 32-bit packet counter could overflow
about once every 7 weeks (2^32 packets at that rate take roughly 50 days).
.B Statspy
makes no special provision for overflow, but instead expects that
.B collect
will be set up to periodically clear all the counters using the Tr
parameter.  Every Tr minutes,
.B collect
will instruct each
.B statspy
to clear its data counters after the current values are retrieved.

Suggested values for the time parameters to 
.B collect
are:

 Ti = 5 minutes
 Tc = 60 minutes
 Tr = 1440 minutes (24 hours).
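
As a rough check on these suggested values, the following arithmetic
assumes a steady 1000 packets per second (the rate used in the overflow
estimate above) and an unsigned 32-bit packet counter; both are
assumptions made only for illustration:

.nf
   polls per checkpoint   = Tc / Ti = 60 / 5      = 12
   checkpoints per clear  = Tr / Tc = 1440 / 60   = 24
   packets between clears = 1000 * 86400 secs     = 86,400,000
.fi

This is about 2 percent of the 32-bit counter range (roughly 4.3
billion), so under these assumptions a daily clear leaves a wide margin.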

.B Collect
will produce a separate data file for each (statistical measure, SAA
host) pair, for all the statistical measures and hosts specified in its
parameters.  Each of these data files will contain the read data for
every checkpoint time, plus the last data recorded before the
.B statspy
was restarted or its object(s) cleared, and will be
cumulative from the time that the
.B collect
program was started.  If the SCH crashes or
.B collect
is restarted for some reason, a new set of data files will be created.

Section 4 explains how to use 
.B collect\fP.


.IP (D)
Data Reduction Programs

The NNStat distribution includes some useful programs and AWK scripts for
processing and summarizing the data files created by the
.B collect
program.  These are described in Section 4.
.LP

.bp
.NH
Statspy
.LP

.NH 2
Using Statspy
.LP

To execute \fBstatspy\fP, issue the following system command:

.nf
                                                     
  \fBstatspy\fP [\fB\-i \fIinterface\fR] [\fB\-p \fIport\fR] [\fB\-h\fP] [\fB\-1\fP] [\fIcommand-file\fP]

.fi
    
The parameters are:

.mc |
.IP \fB\-i\fP
Ethernet interface device name; the default is the (first) Ethernet interface
on the local system.
.mc
.IP \fB\-p\fP
TCP port number on which 
.B statspy
will listen for a connection from
.B collect
or 
.B rspy .
The default is 2222.  

.IP \fB\-h\fP
 
.B Statspy
will write a history of remote commands into the standard output.

.mc |
.IP \fB\-1\fP
("minus one") Output from read operations will be displayed in single-column format.

.mc
.IP \fIcommand-file\fP
This optional parameter is the name of a file containing
commands to be executed when 
.B statspy
starts.
Normally, these will be commands to establish the
initial configuration of objects for gathering data.
If this parameter is omitted, 
.B statspy
will await commands from the local console
or from 
.B rspy 
(executing either on the SAA host or remotely)
to establish the configuration.
.LP
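
For example, a command file might look like the following sketch.  The
addresses are illustrative, and only commands described in Section 3.2
are used:

.nf
   # Allow one management host full access and the rest of
   # network 128.9 read-only access (the first match applies).
   restrict readwrite 128.9.1.51 0xffffffff
   restrict readonly  128.9.0.0  255.255.0.0

   # Declare network 128.9 to be subnetted on the third byte.
   subnet 128.9.0.0 0xffffff00
.fi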

When it starts, 
.B statspy
executes the commands found in \fIcommand-file\fP, if any.  If it has
been executed in foreground, 
.B statspy
then enters an interactive command mode
in which it repeatedly issues a prompt (\*Q>\*U) and awaits command input.
If 
.B statspy
is executed in background, its standard output and standard error
output should be directed to a file to aid diagnosis in case a problem
occurs.
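
For instance, from a Bourne-style shell a background instance might be
started as follows.  The file names startup.cmds and statspy.log are
hypothetical; the options are those listed above:

.nf
   statspy \-p 2222 \-h startup.cmds > statspy.log 2>&1 &
.fi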

.B Statspy
listens on the specified TCP port for a connection from a
remote 
.B collect
or 
.B rspy
program.  It is currently limited to one TCP connection at a time, so
the TCP connection is opened for each sequence of remote commands and
closed again when the responses have been returned.

.NH 2
Statspy Command Language
.LP

The operation and configuration of
.B statspy
are controlled by a simple command language.  Commands to
.B statspy
can be entered from three sources:

.IP (1)
the initial command file (see preceding section);

.IP (2)
interactively from the controlling console (i.e., from
standard input); 

.IP (3)
remotely from a 
.B collect
or 
.B rspy
program.
.LP 

Remote command requests have priority over local commands, while
processing new data from the Ethernet generally preempts either local
or remote command processing.

Commands from any source are free-form and may occupy as many lines as
necessary.  Any text following the \*Q#\*U character and up to the next
newline will be ignored, to allow comments in the command stream.

Various
.B statspy
commands reference objects and protocol fields by name.  The field
names are built into the program (see Figure 2 in Section 3.3), while
object names are assigned by configuration commands.  There are
commands to return complete lists of the names of objects and of fields
(\*Qread ?\*U and \*Qshow ?\*U, respectively).

Commands refer to particular objects by their names.  They can refer to
a set of objects by using a \*Qwildcard\*U matching scheme. An object
specification parameter,  known as an \fIobject spec\fP, may contain
asterisks as wildcard characters to match any number of characters.
For example, the command:

   read *IP*
   
will apply the read operation to all objects whose names include the
string \*QIP\*U, and

   read *
   
will read all objects.
In setting up the configuration, the user should choose a consistent scheme
for assigning object names to increase the
usefulness of this wildcard matching.
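
For example, if every frequency object were given a name ending in
\*Q.freq\*U and every IP-level object a name beginning with \*QIP.\*U
(a hypothetical naming convention), single commands could address each
group:

.nf
   read *.freq       # read every object whose name ends in .freq
   readclear IP.*    # read, then clear, every object named IP.<something>
.fi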

As we will see in the next section, some objects do not themselves gather
data, but instead conditionally select other objects that do.  
Such conditional objects can be left unnamed, since they
will generally not be referenced by a command after they are created. 
Commands differ in how they treat such unnamed objects
(see below). 

We now list all the commands recognized by 
.B statspy\fP.

.IP o
read <object spec>

Displays the data recorded by the object(s) whose names
match <object\ spec>. 
Unnamed objects cannot be the target of a read operation.

.IP o
read ?

Displays a summary, which includes the names of all objects.
Unnamed objects will be included in this summary.

.IP o
clear <object spec>

Sets all objects whose names match <object\ spec> to their initial
states, i.e., clears their statistical accumulation.  
\*QClear *\*U will clear all objects, including
unnamed objects.

.IP o
readclear <object spec>

Executes a read followed by a clear operation, atomically.
\*QReadclear *\*U will clear but not read all unnamed objects.

.IP o
show *

Displays a summary of the current configuration.

.IP o
show ?

Displays a list of the built-in field names.

.IP o
restrict readwrite <address> <mask>
.br
restrict readonly  <address> <mask>

These commands may be used to restrict remote
.B statspy
access, by allowing access from only a specified set of hosts.  The set
is defined by the IP address value <address> and the 32-bit address
mask <mask>.  Here <address> must be expressed in dotted-decimal
notation, while <mask> may be in dotted-decimal or written as a
hexadecimal constant.

The 1 bits in <mask> correspond to significant bits in <address>;
thus, the bits in <address> that correspond to zero bits in <mask> are
"wild".  The \fIrestrict\fP command can be entered either on the local
console or from the initialization file; a \fIrestrict\fP command
cannot be entered remotely.

If no \fIrestrict\fP commands have been executed by
.B statspy\fP,
then full remote access is available from any host.  If any
\fIrestrict\fP command has been executed, however, then remote commands
will be accepted only from a remote host whose IP address matches the
(<address>,<mask>) pair of one of the \fIrestrict\fP commands.  The
commands are examined in the order that they have been executed; the
first match also determines the access mode for the host, either
read/write or read-only.  In read-only mode, only the \fIread\fP and
\fIshow\fP commands are allowed; in read/write mode, all remote
commands are allowed (except \fIrestrict\fP commands).
A mask of 0.0.0.0 will allow all hosts to access with the specified mode,
regardless of the <address> value.

.ne 4
Example:
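   restrict readwrite 128.9.1.51 0xffffffff
   restrict readonly 128.9.0.0  255.255.0.0
   
allows any host on network 128.9 read-only access to
.B statspy
data, but only host 128.9.1.51 can remotely clear the objects or
change the configuration.  The more specific \fIreadwrite\fP command
must come first, because the first matching command determines the
access mode.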

.IP o
subnet <network> <mask>

This command declares that the network specified by <network> is subnetted
with address mask <mask>.  Here <network> is an IP network address expressed in
dotted-decimal notation, and <mask> is a 32-bit address mask expressed
as a hexadecimal constant or in dotted-decimal.

If the specified <network> was defined in a previous
subnet command, then the new <mask> replaces the previous mask, and
the command completes with the user reply "Replaced".  Otherwise,
the new (network,mask) pair is added to the existing list of subnets,
and the command completes with the user reply "Added".

For every IP datagram received by
.B statspy,
the source and destination addresses are compared with each of the
subnetted networks in the table, to define the value of the virtual
subnet fields (see below) as appropriate.  Since these comparisons increase
the CPU load,
.B statspy
may optionally be generated without the
comparison code or virtual fields for subnets by using: "make statspy SUBNET=".
If the \fIsubnet\fP command is issued locally or remotely to an instance of
.B statspy
that has been generated in this manner, the \fIsubnet\fP command will fail
with the message: "Subnets not supported."
   
.IP o
attach { <configuration program> }

Augments the current configuration with the additional statistical object(s)
specified by <configuration program>. The curly braces are required.
The <configuration program> is written using a set of rules 
that we will refer to as the 
.I configuration language, 
although
it is really a sub-language of the command
language; see Section 3.3 for details.

The \fIattach\fP command is atomic \(em if any error is found, 
the current configuration will remain unchanged.
 
.IP o
detach <object spec>

Deletes from the configuration each object whose name matches <object\ spec>.  
This may implicitly delete other objects in order to 
keep the configuration consistent.
\*Qdetach *\*U will detach all objects, including those that have no names.

.IP o
?

Displays a list of the commands.  This command is only available on the local
console.

.IP o
quit

Exits to the operating system (shell) on the SAA host.  This command cannot
be issued across the network.

.IP o
enum { <enum parameters> }

Defines a set of label strings for use in \fIread\fP command displays.  See
Section 3.3.4 for more explanation.
The curly braces are required.

.mc |
.IP o
include <file name>

Replaces the include command with the configuration program text contained
in the specified file.

.IP o
file "<file name>"

The standard output is diverted to the specified file, whose name must be
enclosed in quotation marks.  An empty <file name> (file "") will return
output to standard out.
.mc 
.LP

These commands may be entered either remotely from 
.B rspy
or else locally.
The 
.B collect
program effectively issues the \fIread\fP and \fIreadclear\fP commands.
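
The address test applied by the \fIrestrict\fP command can be
illustrated with a small worked example.  This is a sketch of one
reading of the rule given above (a host matches when it agrees with
<address> in every bit position where <mask> is 1), using illustrative
host addresses:

.nf
   host 128.9.7.25 against (128.9.0.0, 255.255.0.0):
        128.9.7.25 AND 255.255.0.0 = 128.9.0.0       match
   host 128.9.7.25 against (128.9.1.51, 0xffffffff):
        128.9.7.25 AND 0xffffffff  = 128.9.7.25      no match
   host 192.5.8.1  against (128.9.0.0, 255.255.0.0):
        192.5.8.1  AND 255.255.0.0 = 192.5.0.0       no match
.fi

A host that matches none of the (<address>,<mask>) pairs is refused
remote access once any \fIrestrict\fP command has been executed.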

When a \*Qshow ?\*U command is issued to 
.B statspy\fP,
the first line displayed summarizes the overall packet processing since 
.B statspy
was started.  For example:

.nf 
    Acquired 56343 packets in 163 secs=> 
                           345(avg) 755(max) 1250(inst)/sec
.fi

This shows the total Ethernet packets acquired, the elapsed time since
.B statspy
started, the average packets per second, the maximum number of packets
processed in one second, and finally the maximum \*Qinstantaneous\*U
packet rate.  The last is obtained by extrapolating to one second the
maximum number of packets captured in one clock tick (20ms on the Sun
workstation).
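
The figures in this sample line can be checked by hand.  The following
is simple arithmetic on the values shown, assuming the 20ms clock tick
mentioned above:

.nf
   average       = 56343 packets / 163 secs    = about 345.7 (shown as 345)
   instantaneous = 1250 per sec * 0.020 sec    = 25 packets in one tick
.fi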

.NH 2
Configuring Statspy
.LP

We divide the extraction of statistical data from a particular Ethernet
packet into two phases:

.IP (1)
Parse the protocol headers to determine the values of the various
header fields.

Since efficiency is essential and packet header formats do not change
very often, the header formats are compiled into the
.B statspy
code.  Each incoming Ethernet packet is passed to a subroutine that
\*Qknows\*U how to parse all the headers and where to locate the
fields.  To add new protocols or change header formats, it will be
necessary to recompile this packet-parsing subroutine of
.B statspy.

.IP (2)
Analyze the parsed field values and gather the desired statistics.

This phase is performed interpretively, using a set of rules that
comprises the
.B statspy
configuration.
.LP

.NH 3
Fields
.LP

Figure 2 shows a list of the fields that will be extracted by
.B statspy
and made available to the analysis phase.  A particular packet will
define values for only a subset of the possible fields; for example, a
TCP packet will define the TCP source and destination ports but cannot
define UDP ports or an ICMP type field.

As Figure 2 shows, each field is assigned a mnemonic name string, a
size in bytes, and an intrinsic type.  The type is used principally to
choose an appropriate format for displaying the data values from that
field.  Each field is extracted into an integral number of 8-bit
bytes.  Thus, the IP version number (field \*QIP.version\*U) is
actually 4 bits but is extracted (right-justified) by the parser into a
byte.
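
For instance, suppose the first byte of an IP header were 0x65 (an
illustrative value, not taken from a real trace).  Its high-order 4
bits give the version number, which would be delivered as a one-byte
value:

.nf
   header byte 0x65 = 0110 0101 binary
   IP.version       = 0110 binary, right-justified in a byte = 0x06
.fi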

.KF
.nf
.RS

.nf
.ta 1.8iC 3.0i
Field Name	   Length(bytes)	Type 
.ta 1.8iR 3.0i

Ether.src	6	Ethernet Address
Ether.dst	6	Ethernet Address
Ether.type	2	Integer
   
IP.version *	1	Integer
IP.length	2	Integer
IP.option *	1	Integer
IP.TOS	1	Bits
IP.offset *	2	Integer
IP.protocol	1	Integer
     
IP.srchost	4	IP Address                
IP.dsthost	4	IP Address 
IP.srcnet *	4	IP Address                
IP.dstnet *	4	IP Address
.mc |
IP.srcsubn *	4	IP Address                
IP.dstsubn *	4	IP Address  
.mc
       
TCP.srcport	4	Integer               
TCP.dstport	4	Integer               
UDP.srcport	4	Integer               
UDP.dstport	4	Integer
ICMP.type	1	Integer
.mc |
ICMP.code	1	Integer 
.mc
   
.ta 1.8iC 3.0i
packet *	Variable	Bits
length *	4	Integer
 
     Figure 2.  Field Definitions in Packet Parser 


.fi
.KE
This list includes virtual fields whose values are derived from those
actually appearing in the header; these are marked with \*Q*\*U in
Figure 2.  The virtual fields have the following meanings:

.IP (a)
packet

This virtual field contains the binary value of the header sequence.
It is intended for recording particular packet headers for diagnostic
rather than statistical purposes.

.IP (b)
length, IP.length

The \*Qlength\*U field is defined for all packets to be the total length in
bytes exclusive of the Ethernet header.  Field \*QIP.length\*U
is only defined for IP datagrams, but when it is defined it has
the same value as the field \*Qlength\*U.

.IP (c)
IP.version

This virtual field containing the IP version number is extracted by
.B statspy
for analysis \fIonly\fP for a packet with a non-standard IP version
number (i.e., not 4). No later fields (IP, TCP, or UDP) can or will be
extracted from the same packet.

.IP (d)
IP.option

This is the code byte for each IP option field found in the packet, or
zero if there are no options.  Note that a single packet may contain
several options, so this pseudo-field may be multiply defined.

.IP (e)
IP.offset

This virtual field is extracted (we say \*Qdefined\*U) by
.B statspy
only for a packet that is a fragment of a complete IP datagram.  When
it is defined, IP.offset is the reassembly offset of this fragment in
bytes (i.e., 8 times the offset field in the IP header).

Only the first fragment, i.e., the fragment at offset zero, can be
parsed further for higher-level protocol headers (TCP, UDP, or ICMP).
We made the reasonable assumption that these headers will always fit
within the first fragment, i.e., that the first fragment will always be
larger than about 90 bytes (unless it is also the last fragment).

Note that for a fragmented packet the IP.length, IP.option, and IP.TOS
fields are defined for each fragment separately. Thus, IP.length gives
the length of the fragment;
.B statspy
cannot determine the total length of the reassembled IP datagram.

.IP (f)
IP.srcnet, IP.dstnet

.mc |
These are the (Class A, B, C, or D) network numbers derived from the real
IP source and destination address fields, respectively.  These virtual
fields provide a simple and efficient way to develop statistics based
upon networks rather than hosts.  Note: for a class D address, the
network number and the host number are the same.

.mc
.IP (g)
IP.srcsubn, IP.dstsubn*

.FS
*Note: Release 2.4 documentation used the incorrect names: IP.ssubnet, IP.dsubnet
for the subnet
fields.  We regret the confusion caused by
this faulty documentation.
.FE
These are the source and destination subnet numbers, derived
using the address masks found in the subnet table for the corresponding
Class A, B, or C network numbers.  The subnet table is built using the
\fIsubnet\fP configuration command (see above).  If a particular
network does not correspond to an entry in the subnet table, then the
corresponding subnet number virtual field will be the same as the
network number field.

Example: Suppose the following command has been issued:

  subnet 128.9.0.0 0xffffff00
  
and a packet is received with source address 128.9.7.25.  Then:

  IP.srchost = 128.9.7.25
  IP.srcsubn = 128.9.7.0
  IP.srcnet  = 128.9.0.0

As noted earlier, 
.B statspy
may be generated without the subnet virtual fields IP.srcsubn and IP.dstsubn
by using: "make statspy SUBNET=".
.LP
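The derivation of the network and subnet virtual fields can be
illustrated with a short Python sketch.  It is not taken from the
.B statspy
source; the function names and the dictionary used as a subnet table
are hypothetical, and treating every address at or above 224.0.0.0 like
a Class D address is an assumption made only for this example.
.nf

```python
import ipaddress

def classful_netmask(addr):
    """Return the mask that selects the classful network number."""
    top = addr >> 24
    if top < 128:          # Class A: leading bit 0
        return 0xFF000000
    if top < 192:          # Class B: leading bits 10
        return 0xFFFF0000
    if top < 224:          # Class C: leading bits 110
        return 0xFFFFFF00
    return 0xFFFFFFFF      # Class D (assumed for higher values too): net == host

def net_and_subnet(host, subnet_table):
    """Return (network number, subnet number) as dotted-decimal strings."""
    addr = int(ipaddress.IPv4Address(host))
    net = addr & classful_netmask(addr)
    # With no subnet-table entry, the subnet number equals the network number.
    mask = subnet_table.get(net, classful_netmask(addr))
    subn = addr & mask
    return str(ipaddress.IPv4Address(net)), str(ipaddress.IPv4Address(subn))

# Mirrors the command: subnet 128.9.0.0 0xffffff00
table = {int(ipaddress.IPv4Address("128.9.0.0")): 0xFFFFFF00}
print(net_and_subnet("128.9.7.25", table))
```

.fi
For the address 128.9.7.25 this yields network 128.9.0.0 and subnet
128.9.7.0, matching the example above.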

.NH 3
Objects and Invocations
.LP

Statistical analysis of field values is performed by a set of 
.B statspy
entities known as (statistical)
.I objects.
NNStat implements unary and binary objects, i.e., objects that take one
and two input values, respectively.  Each object may have a unique name that
is assigned when the object is defined.

Objects 
are logically independent of fields; objects
simply build and report statistical data structures based on 
(field) values written
into them.  
The analysis phase is essentially a series of calls on object
.I Write
subroutines; in each call, 
a particular field value (or pair of field values) is
passed as a parameter.
These Write subroutine calls are known as 
.I invocations.

For example, the configuration might specify that an object named
\*QProtocol.freq\*U is to be invoked on the field named
\*QIP.protocol\*U; that is, the value of field \*QIP.protocol\*U will be
written into object \*QProtocol.freq\*U.  The configuration may specify
that the same field is to invoke more than one object.  Conversely, the
same object may be invoked on more than one field, to build a composite
statistic. Fields that invoke the same object must be compatible, i.e.,
they must have the same size and type (see Figure 2 for the types).
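.LP
The relationship between fields, objects, and invocations can be
pictured with the following Python sketch.  It is only an illustration
of the idea, not the
.B statspy
implementation: the class FreqRecorder, the function analyze, and the
dictionary representation of a packet are all hypothetical.
.nf

```python
from collections import Counter

class FreqRecorder:
    """Toy recorder: counts each distinct value written into it."""

    def __init__(self, name):
        self.name = name
        self.bins = Counter()

    def write(self, value):
        # One call to write() corresponds to one invocation.
        self.bins[value] += 1

def analyze(packet, invocations):
    """Invoke each object on its field, if the field is defined."""
    for field, obj in invocations:
        if field in packet:
            obj.write(packet[field])

proto = FreqRecorder("Protocol.freq")
invocations = [("IP.protocol", proto)]
for pkt in ({"IP.protocol": 6}, {"IP.protocol": 17}, {"IP.protocol": 6}, {}):
    analyze(pkt, invocations)
print(dict(proto.bins))
```

.fi
The object knows nothing about the field that feeds it; listing the same
object against a second compatible field would simply add more values to
the same table.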

Each object is an instance of a particular object class; all objects of
the same class share the same program modules but each has its private
data structure.  The
.B statspy
object classes generally fall into two categories:  recorders and
filters.

.IP (A)
Recorders

A data recorder object or 
.I recorder
builds some statistical data structure (e.g., a 
frequency distribution table) from the field values with which it is invoked.
.RS

Example:  \fIfreq-all\fP

An object
of class \fIfreq-all\fP builds a frequency distribution table for all
distinct values of the field on which it is invoked.
.RE

Figure 3 shows
an example of the display output resulting from a read operation 
on a \fIfreq-all\fP object named \*Qgwys\*U.  The field values recorded in
this object are 48-bit Ethernet addresses.

.KS
.nf

 OBJECT: gwys Class= freq-all [CreationTime: 11:51:25 11-05-87]
   ReadTime: 11:52:18 11-05-87, 
   ClearTime: 11:51:25 11-05-87 (@\-53 secs)
  Total Count= 492 (+0 orphans)
  #bins= 8  
 [2:7:1:0:8:30]= 219     (45%) @\-1secs 
 [8:0:2:0:49:30]= 127    (26%) @\-1secs 
 [2:60:8c:ee:2:34]= 52   (11%) @\-1secs 
 [24:24:80:9:0:6b]= 44   (8.9%) @\-1secs 
 [8:0:14:10:12:8]= 27    (5.5%) @\-1secs 
 [8:0:2:0:f7:2b]= 20     (4.1%) @\-2secs 
 [aa:0:3:1:5:90]= 2      (0.41%) @\-13secs 
 [2:7:1:0:4:45]= 1       (0.2%) @\-32secs 
 
 
      Figure 3. Example of Read Output

.fi
.KE

Each of the bottom 8 lines displays the contents of one bin:
the value (6 bytes in hex), 
the count, the percentage count,
and the last-update time relative to the current time (\*QReadTime\*U).

Figure 4 shows another display example, the output of a read operation 
on an object named \*Qnets\*U of class \fImatrix-sym-bytes\fP.
This object builds a table of frequencies of pairs of values,
in this case the  IP source and destination networks.  Objects of this class
accumulate not only the packet counts but also the total byte counts for
each bin; the byte count is bracketed in \*Q&...B\*U.
The \*QTotal Count\*U and \*QTotal Bytes\*U values are the sums across
all the bins.

.KS
.nf

OBJECT: nets  Class= matrix-sym-bytes [Created: 17:00:03 11-29-89]
  ReadTime: 08:21:07 11-30-89, 
  ClearTime: 17:00:03 11-29-89 (@\-55264sec)
  Total Count= 430374 (+0 orphans)
  Total Bytes= 76886567B  #bins= 7  Maxchain = 1  SortMoves = 34
[128.9.0.0 : 128.9.0.0]= 400426 &68310714B (93.0%) @\-2sec 
[128.9.0.0 : 128.18.0.0]= 11766 &4011645B  ( 2.7%) @\-49145sec 
[128.9.0.0 : 192.48.219.0]= 7678 &2098103B ( 1.8%) @\-48187sec 
[128.9.0.0 : 128.89.0.0]= 7378 &2332623B   ( 1.7%) @\-52265sec 
[128.9.0.0 : 128.125.0.0]= 3052 &128965B   ( 0.7%) @\-33sec 
[128.9.0.0 : 192.5.18.0]= 49 &2605B        ( <.1%) @\-48315sec 
[128.9.0.0 : 131.179.0.0]= 25 &1912B       ( <.1%) @\-48177sec 
 
 
      Figure 4. Example of Read Output

.fi
.KE

.IP (B)
Filters

Data filter objects or 
.I filters
provide conditional branches in the
configuration.

When invoked with a field value, a filter tests it against some
numerical or set-inclusion criterion, to determine a Boolean (True/False)
value.  This Boolean value is then used by the \fBstatspy\fP interpreter
to select one of two alternative sub-sequences of invocations, where either
of these sub-sequences may be empty.

.RS
Example 1: \fIeqf\fP

An object of class \fIeqf\fP tests a field value for 
equality to a parameter
value that is specified when the object is created.

.mc |
Example 2: \fIsetf\fP

An object of class \fIsetf\fP tests a field value for
equality to one of a set of parameter values, specified when
the object is created.  
.RE

Note that \fIeqf\fP is simply a special
case of \fIsetf\fP, for a set of one member; \fIeqf\fP is included
as a separate class because it is significantly faster in execution.
The \fIsetf\fP class uses a hash lookup to provide efficient matches
against large numbers of parameters.

.mc
By nesting filter invocations in a configuration, 
.B record
invocations can
be conditioned upon an arbitrary Boolean expression over field values.

Figure 5 shows an example of (a fragment of) a pseudo-program,
in flow-chart form. This
sequence of invocations was designed to answer the question: \*QWhat are
the Ethernet addresses of hosts sending or receiving local packets,
i.e., IP packets that have both source and destination IP addresses on
the local network?\*U
.KF
.nf
.ta 0.5i 2.5i
.lc _

	 ____________________________	
	|  Invoke \fIeqf\fP filter object  |
	|     with parm (128.9.0.0)  |
	|     on field \*QIP.srcnet\*U   |
	|____________________________|
.ta 1i 2i
	|	|
	| FALSE	| TRUE
	V	|
        (Null)	|
		|
		V
.ta 0.75i 2.75i
	_____________________________	
	|  Invoke \fIeqf\fP filter object  |
	|     with parm (128.9.0.0)  |
	|     on field \*QIP.dstnet\*U   |
	|____________________________|
.ta 1.25i 2.25i
	|	|
	| FALSE	| TRUE
	V	|
       \ \ \ \ \ \ (Null)	|
		|
		V	
.ta 1.0i 3.0i
	 __________________________	
	| Invoke \fIfreq-all\fP recorder |
	|    object named \*Qgwys\*U   |
	|   on field \*QEther.src\*U   |
	|__________________________|
.ta 2.0i
	|
	|
	V
.ta 1.0i 3.0i
	 __________________________	
	| Invoke \fIfreq-all\fP recorder |
	|    object named \*Qgwys\*U   |
	|   on field \*QEther.dst\*U   |
	|__________________________|
	  
      Figure 5.  Example Pseudo-program

.fi
.KE 
Figure 5 includes two invocations
of an unnamed \fIeqf\fP filter object whose parameter is the value
128.9.0.0 (the IP network number of the local Ethernet).
Thus, the first invocation shown in Figure 5 will return
TRUE if the IP source network number in field \*QIP.srcnet\*U
is 128.9.0.0, and FALSE otherwise.
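.LP
The flow of Figure 5 can also be written out as a small Python sketch.
This is an illustration only, under the assumption that a packet is
represented as a dictionary of its defined fields; the names eqf,
run_figure5, and the sample packets are hypothetical and do not come
from the
.B statspy
source.
.nf

```python
from collections import Counter

LOCAL_NET = "128.9.0.0"

def eqf(param):
    """Build a filter that is True when a field value equals param."""
    return lambda value: value == param

def run_figure5(packet, gwys, is_local=eqf(LOCAL_NET)):
    # Nothing is invoked unless every field used below is defined.
    needed = ("IP.srcnet", "IP.dstnet", "Ether.src", "Ether.dst")
    if not all(f in packet for f in needed):
        return
    if is_local(packet["IP.srcnet"]):        # first eqf filter
        if is_local(packet["IP.dstnet"]):    # second eqf filter
            gwys[packet["Ether.src"]] += 1   # recorder on Ether.src
            gwys[packet["Ether.dst"]] += 1   # same recorder on Ether.dst

gwys = Counter()
packets = [
    {"IP.srcnet": "128.9.0.0", "IP.dstnet": "128.9.0.0",
     "Ether.src": "2:7:1:0:8:30", "Ether.dst": "8:0:2:0:49:30"},
    {"IP.srcnet": "128.9.0.0", "IP.dstnet": "128.18.0.0",
     "Ether.src": "2:7:1:0:8:30", "Ether.dst": "aa:0:3:1:5:90"},
    {"Ether.src": "8:0:2:0:f7:2b", "Ether.dst": "2:7:1:0:8:30"},
]
for p in packets:
    run_figure5(p, gwys)
print(dict(gwys))
```

.fi
Only the first sample packet has both networks local, so only its two
Ethernet addresses are counted; the FALSE branches are empty, as in the
figure.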

.NH 3
Configuration Language
.LP

We begin with an example of the configuration (sub-)language for 
.B statspy.
The configuration of Figure 5 could be created by entering an
\fIattach\fP command with the <configuration program> shown in Figure 6.

.KS
.nf
.RS

  if IP.srcnet is eqf(128.9.0.0)  {
     if IP.dstnet is eqf(128.9.0.0)  {
        record Ether.src in local freq-all;
        record Ether.dst in local;
     }
  }
  
.fi
 Figure 6. Configuration program that compiles into Figure 5.
     
.RE
.KE

Figure 6 illustrates several points about the configuration language:

.IP 1.
The language is free-field with newlines having no meaning.  Hence, we
can indent to illuminate the structure of the program.

.IP 2.
The language includes compound statements and \fBif\fP statements; the latter
correspond to invocations of filter objects.

.IP 3.
The first time a named object occurs, its class (and parameters, if any)
must be specified.  They may be omitted in later references to the same
object. 
When class and/or parameters are included in a later reference, their
values must exactly match the corresponding values specified in the
first occurrence of the same object.

.IP 4.
Parameters, when required, are enclosed in parentheses following the
class name.  If there is more than one parameter, they are listed
separated by commas.

.IP 5.
An unnamed object may be created, by giving only its class (and
parameter, if any).  Filter objects are often left unnamed, since there
is usually no need to read them.

Note: class names are reserved, and may not coincide with object names.
The valid class names are all listed in Appendix A.
.LP

Two more things about the language are not apparent from this example:

.IP 6.
The outer set of braces in Figure 6 is unnecessary.  The syntax of the
configuration language is generally like \*QC\*U.

.IP 7.
Execution of a specific configuration rule is triggered only when the fields upon
which it depends are defined in a packet.

Thus, in Figure 6 it was not necessary to explicitly test that the
packet in question was an IP packet; if it were not an IP packet, then
the IP.srcnet and IP.dstnet fields would not be defined.

Furthermore,
.B statspy
checks and enforces consistency among the fields, so that the
configuration cannot include invocations that are logically
impossible.  An example of such an illegal configuration is:

.nf
    if TCP.srcport is eqf(23)
       if UDP.dstport is eqf(6)
          record Ether.src in Imposs-obj freq-all;
.fi
          
This configuration is illegal because TCP.srcport and UDP.dstport cannot
be defined in the same packet, hence the \*QImposs-obj\*U recorder
would never be invoked.  
.mc |
This conflict will be detected as an error
when the configuration is compiled; the following error message
results: "Config Error: impossible combo of fields: TCP.srcport UDP.dstport".
.mc
.LP

In Release 3.0, boolean expressions were introduced in \*Qif\*U
statements.  Boolean expressions are built using
\*Qand\*U and \*Qor\*U operators.
For example, Figure 7 will compile the same configuration as Figure 6.
 
.KS
.nf
.RS

  if ((IP.srcnet is eqf(128.9.0.0) ) and 
      (IP.dstnet is eqf(128.9.0.0) ))  {
        record Ether.src in local freq-all;
        record Ether.dst in local;
  }
  
  Figure 7. Another Version of Figure 6.
.fi
.RE     
.KE

The parentheses surrounding the parameter, as in: \*Qeqf(...)\*U
are necessary, but all the other parentheses in Figure 7 are optional.
We included them in Figure 7 to illustrate that parentheses can always
be used to avoid ambiguity.

We now list (partial) syntax rules for the configuration language;
Appendix B specifies the full syntax.

.IP (1)
RECORDER OBJECT:

To create a (unary) recorder object that is invoked on a specified field, use
the following statement:

    \fBrecord\fP <field name> \fBin\fP <object name> 
    
                <class> \fB(\fP <parameters> \fB) ;\fP
 
For a binary object, two fields are required:

    \fBrecord\fP  <field name>\fB,\fP <field name> \fBin\fP <object name> 
    
                <class> \fB(\fP <parameters> \fB) ; \fP


In either case, <class> must match the name of a valid recorder class.*
.FS
*This is not strictly true; a \*Qrecord\*U statement may specify a filter object,
although this is rarely useful.
.FE
The current set of object classes and corresponding <parameters> are
defined in Appendix A.

Normally, every recorder object ought to have a unique <object name>,
so that it can be read, cleared, and/or detached independently of other
objects.  However, a recorder object may be created with a null <object
name>.  Such an object can be destroyed (detached) or cleared, but not
read.

Note the semicolons following \fBrecord\fP statements; these are required.

The <parameters> string generally specifies a list of one or more
values separated by commas.  The number and meaning of these values
depend upon the particular class (see Appendix A for details).  The
<parameters> string and the surrounding parentheses may be omitted if
the particular class does not require parameters or if the specified
object has been defined previously with parameters.

If the specified <object name> already exists, a new invocation
specification refers to the same object.  In this case, <class> may be
omitted, but if it is respecified then it must agree with the class of the
existing object.  If <class> is respecified, then
\*Q\ (\ <parameters>\ )\*U may also be respecified, but its values must
agree with the parameters given for the existing object.

.IP (2)
FILTER OBJECTS:

To create a filter object to be invoked by a specified field, use a
conditional statement.  The simplest form is an \fB<if clause>\fP:

.nf
   \fBif\fP <field name> \fBis\fP <object name> 
                             <class> \fB(\fP<params>\fB)\fP
.fi

followed by either:

    <TRUE invocation>

 or:

    <TRUE invocation> \fBelse\fP <FALSE invocation>
.sp

Here <class> must match the name of a valid filter
class.  
The sense of the filter clause may be reversed by specifying \fBisnot\fP
instead of \fBis\fP.

<TRUE invocation> and <FALSE invocation> may themselves be recorder or
filter invocations, or may be arbitrary sublists of invocations,
grouped together inside braces \*Q{\ }\*U.  These sublists may
themselves include filter invocations, and this nesting can go to any
depth.

.mc |
More general conditional clauses may be constructed by combining
conditions of the form:

 <field name> \fBis\fP/\fBisnot\fP <object name> <class> \fB(\fP<parameters>\fB)\fP

using \*Qand\*U and \*Qor\*U operators with optional parentheses. 
See Figure 7 for a simple example.

It should be pointed out that boolean expressions are a notational convenience
rather than a performance improvement.  In the compilation process, any
\*Qand\*U or \*Qor\*U operator is expanded to the equivalent nesting of
simple \fBif\fP statements.  For example, Figure 7 is effectively
converted into Figure 6.

.mc
.mc |
.IP (3)
SYMMETRIC IF STATEMENTS

Protocol header fields containing address-like values frequently occur
as (source, destination) pairs.  For example, there are six
such pairs among the \fBstatspy\fP fields listed in Figure 2:
Ethernet addresses, IP addresses, IP networks, IP subnets,
TCP ports, and UDP ports.  We refer to these as \*Qsymmetric pairs\*U of
fields.

In measuring traffic, we sometimes want to create full-duplex
statistics that combine both directions of the flow.  For example, we
want to lump together TCP data packets and the resulting acknowledgment
packets; both are part of the packet load imposed by those
connections.
This leads to configurations of the following form:

.RS
.KS
.nf
.ta 5.6iR
{ \fBif\fP Fc ...  {
        \fBrecord\fP Fx \fBin\fP...;
        . . .  }
   \fBelse\ if\fP Fc' ... {
         \fBrecord\fP Fx' \fBin\fP...;
        . . .  }
}

    Figure 8: Symmetric Configuration         

.fi
.KE
.RE

Here Fc, Fc', Fx, and Fx' all represent field names, and field
Fc' (Fx') is related to Fc (Fx, respectively) through a field-symmetry
mapping.  That is, either (Fc, Fc') form a symmetric pair, or else they
are identical, and similarly for (Fx, Fx').

To compactly represent such full-duplex configurations, \fBstatspy\fP implements
the \*Qsymmetric if\*U statement.  This is a variant of an
\*Qif\*U statement, with the keyword \*Qif\*U replaced by \*Qsymif\*U. 
The statement:

   \fBsymif\fP <condition> <T-Statement> \fBelse\fP <F-Statement> 

is logically equivalent to the expanded form:

.nf
\ \ \ { \fBif\fP <condition>  <T-Statement> 
\ \ \ \ \fBelse\ if\fP <condition'>  <T-Statement'>
\ \ \ \ \fBelse\fP  <F-Statement> 
\ \ \ }
.fi

where the primes indicate that the field symmetry mapping has been
applied to all fields in the syntactic unit.

Like boolean expressions, the \*Qsymmetric if\*U statement is
a notational convenience rather than a performance improvement.
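
The field-symmetry mapping behind the primed names can be sketched in
Python as follows.  This is a hypothetical illustration, not the
.B statspy
compiler: the table lists the six symmetric pairs named above, the
spelling UDP.srcport is assumed by analogy with the other field names
(Figure 2 is authoritative), and the function name prime is invented
for this example.
.nf

```python
# The six symmetric pairs; exact field spellings are assumptions.
PAIRS = [
    ("Ether.src", "Ether.dst"),
    ("IP.srchost", "IP.dsthost"),
    ("IP.srcnet", "IP.dstnet"),
    ("IP.srcsubn", "IP.dstsubn"),
    ("TCP.srcport", "TCP.dstport"),
    ("UDP.srcport", "UDP.dstport"),
]

SYMMETRIC = {}
for a, b in PAIRS:
    SYMMETRIC[a] = b
    SYMMETRIC[b] = a

def prime(fields):
    """Apply the mapping; a field with no partner maps to itself."""
    return [SYMMETRIC.get(f, f) for f in fields]

print(prime(["TCP.dstport", "IP.srchost", "IP.length"]))
```

.fi
Applying the mapping twice returns the original field list, which is why
the expanded form needs only one primed copy of the condition and the
statement.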

.IP (4)
SELECT STATEMENT

Suppose that it is desired to record the source and destination addresses
of TCP packets in three classes: interactive, file transfer, and email.
The configuration might include a nested conditional statement such
as the following:

.KS
.nf
\fBif\fP TCP.dstport \fBis\fP  port.telnet  \fBsetf(\fP23, 43, 79, 513\fB)\fP 
   \fBrecord\fP IP.srchost\fB,\fP IP.dsthost \fBin\fP Telnet.hosts matrix-sym\fB;\fP
      
\fBelse if\fP TCP.dstport \fBis\fP  port.ftp  \fBsetf(\fP20, 21, 69\fB)\fP 
   \fBrecord\fP IP.srchost\fB,\fP IP.dsthost \fBin\fP ftp.hosts  matrix-sym\fB;\fP
    
\fBelse if\fP TCP.dstport \fBis\fP  port.mail  \fBsetf(\fP25, 103, 104, 119\fB)\fP 
   \fBrecord\fP IP.srchost\fB,\fP IP.dsthost \fBin\fP mail.hosts matrix-sym\fB;\fP
.fi
.KE

The setf objects are filters that have the value \*QTrue\*U if
the field value is equal to one of the listed parameter values,
\*QFalse\*U otherwise.

An alternative way to express this problem is with a \fBselect\fP
statement, which is generically a form of \*Qcase\*U statement.
Here is the same example as a \*Qselect\*U statement:

.KS
.nf
.ta 5.6iR
\fBselect\fP\ \ TCP.dstport\ {           
\ \ \fBcase\fP\ (23, 43, 79, 513):
\ \ \ \ \fBrecord\fP IP.srchost, IP.dsthost \fBin\fP Telnet.hosts matrix-sym;

\ \ \fBcase\fP\ (20, 21, 69):
\ \ \ \ \fBrecord\fP IP.srchost, IP.dsthost \fBin\fP FTP.hosts matrix-sym;

\ \ \fBcase\fP\ (25, 103, 104, 119):
\ \ \ \ \fBrecord\fP IP.srchost, IP.dsthost \fBin\fP Mail.hosts matrix-sym;
}
.fi
.KE

Note that the braces surrounding the list of cases, the
parentheses enclosing lists of values, and the colons are all required.  
When one of the values in a particular case is matched, the statement following
the corresponding colon is executed; this completes execution of the entire
\*Qselect\*U.  Thus, control does not \*Qfall through\*U to the
next case as in the \*QC\*U \fIswitch\fP statement, and therefore no \*Qbreak\*U
statements are needed or allowed in the configuration language.

In addition to the cases, there
can be one \*Qdefault\*U alternative of the form:

       \fBdefault:\fP
             <Statement>
             
The \fBselect\fP statement provides a performance improvement over
the corresponding nested \fBif\fP statements, since \fBselect\fP
is implemented by a single hashed lookup to obtain an index, and
this index selects one case from a vector of cases.              
.LP

.mc
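.LP
The performance argument for the select statement can be illustrated
with a Python sketch of a hashed dispatch table.  This is an assumption
about the general technique, not a description of the actual
.B statspy
data structures; the names CASES, INDEX, and select_case are
hypothetical, and each case is represented only by the name of the
object it would record into.
.nf

```python
CASES = [
    ((23, 43, 79, 513), "Telnet.hosts"),
    ((20, 21, 69), "FTP.hosts"),
    ((25, 103, 104, 119), "Mail.hosts"),
]

# Built once, when the configuration is compiled.
INDEX = {}
for i, (values, _) in enumerate(CASES):
    for v in values:
        INDEX[v] = i

def select_case(port, default=None):
    """One hashed lookup picks the case; there is no fall-through."""
    i = INDEX.get(port)
    if i is None:
        return default          # the optional default alternative
    return CASES[i][1]

print(select_case(21), select_case(25), select_case(80))
```

.fi
The cost of a lookup does not grow with the number of cases, whereas the
equivalent chain of nested if statements tests each filter in turn.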
.NH 3
Enumerations
.LP

Some packet header fields (e.g., the IP protocol number) may be
characterized as \*Qenumerations\*U, meaning that there is a discrete
set of possible values.  It is helpful to humans viewing the output of
a \fIread\fP command to have appropriate mnemonic labels attached to
the values of enumeration fields.  The \fIenum\fP command may be used
to define such mnemonic label strings.

The \fIenum\fP command has only local effect; \fIenum\fP commands taken
from the
.B statspy
configuration file or entered locally on the
.B statspy
console control only the formatting of local read commands.  Similarly,
an \fIenum\fP command entered in
.B rspy
is used locally at 
.B rspy
for formatting
read results; it is not transmitted across the network to 
.B statspy\fP.

The \fIenum\fP command has the form:

     \fBenum {\fP <enum parameters> \fB}\fP
   
The general form of <enum parameters> is:

.DS
<object spec> ( <value> <label>, ... , <value> <label> ),

     ...
     
<object spec> ( <value> <label>, ... , <value> <label> )

.DE

That is, it generally specifies a list of object name specs, 
and for each 
a sub-list of (value, label) pairs. 
 
Here is a possible \fIenum\fP command parameter that defines label strings
for objects attached to the IP protocol field:

.KS
   *IP.proto* (1 \*QICMP\*U, 3 \*QGGP\*U, 6 \*QTCP\*U, 8 \*QEGP\*U, 
     
           12 \*QPUP\*U, 17 \*QUDP\*U,  20 \*QHMP\*U, 21 \*QXNS-IDP\*U, 
       
           27 \*QRDP\*U, 77 \*QND\*U) 
.KE

The string \*Q*IP.proto*\*U is an <object spec>, indicating that this
list of labels will be used in formatting a read operation for any object
whose name contains the embedded string \*QIP.proto\*U.

Note that each sublist is keyed to an <object spec>, not an object name
or a field name.  When the results of reading a specific object are
formatted for display, the <object name> of that object is matched
against each <object spec> that has appeared in any \fIenum\fP command;
the first match causes the corresponding set of (value, label) pairs to
be used.  Careful choice of object names is necessary to take advantage
of this wildcard matching mechanism.
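
The first-match rule can be sketched in Python as shown below.  The
sketch assumes that an <object spec> behaves like a shell-style pattern
in which \*Q*\*U matches any run of characters, and it uses the Python
standard function fnmatchcase to stand in for whatever matching the
real programs perform.  The names ENUMS and label_for are hypothetical.
.nf

```python
from fnmatch import fnmatchcase

# (object spec, {value: label}) in the order the enum commands were entered.
ENUMS = [
    ("*IP.proto*", {1: "ICMP", 6: "TCP", 17: "UDP"}),
    ("*", {6: "SIX"}),
]

def label_for(object_name, value):
    """Return the label for value, using the first matching object spec."""
    for spec, labels in ENUMS:
        if fnmatchcase(object_name, spec):
            # The first matching spec is used, even if it lacks this value.
            return labels.get(value)
    return None

print(label_for("All.IP.proto.freq", 6))
print(label_for("gwys", 6))
```

.fi
Because the first matching spec wins, a catch-all spec such as
\*Q*\*U placed ahead of more specific ones would hide them in this
sketch; this is one reason object names deserve careful choice.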

The <label> elements may be surrounded with quotation marks (\*Q\*U), and
must be if they contain embedded blanks or other special
characters.  A label surrounded with quotation marks may contain any
printable characters except: comma, linefeed ("\en"), or
quotation marks themselves.

The effect of \fIenum\fP commands is cumulative.  There is no command to
delete an enumeration; however, new label definitions will override
previous definitions for the same (object-spec, value) pairing.
.LP

.NH 2
Remote Control of Statspy
.LP

The 
.B rspy
program may be used to enter commands remotely to a running
.B statspy
program.  The command to execute 
.B rspy
is:

.sp                                             
  \fBrspy\fP  [\fB\-p \fIport\fR] [\fB\-h \fIhost\fR] [\fB\-1\fP] [\fIcommand-file\fP] 
    
.B2
.sp        
Here the parameters are:

.IP \fB\-p\fP
TCP port number on which 
.B statspy
is listening.
The default is 2222.  

.IP \fB\-h\fP
The name or dotted-decimal IP address of the 
.B statspy
host.  The default is the local host.

.mc |
.IP \fB\-1\fP
Output from read operations will be displayed in single-column format.

.mc
.IP \fIcommand-file\fP
The optional name of a file containing a script of commands to be executed when
.B rspy
begins.
.LP

.B Rspy
will then prompt for input (\*QRspy>\*U), accepting new commands from
standard input and writing any output to standard output.

The commands to
.B rspy
are those listed in Section 3.2 for
.B statspy\fP,
plus one additional command peculiar to
.B rspy\fP:

.IP o
\fIhost\fP <IP address>

Overrides the \-h parameter to specify the remote host to which
following commands will be directed.  Here <IP address> may be a host
domain name or a dotted-decimal IP address.
Also note that the
.B statspy
command \fIrestrict\fP cannot be entered remotely from
.B rspy\fP.
.LP

Generally, commands entered to
.B rspy
are sent to
.B statspy
on the remote host.  However, the \fI?\fP, \fIenum\fP, \fIhost\fP, and
\fIquit\fP commands are executed locally by \fBrspy\fP.

.bp
.NH 
Data Collection
.LP
   
.NH 2
Using Collect
.LP

.B Collect
is executed as:


.nf

  \fBcollect\fP [\fB\-e \fIenumfile\fR]  [\fB\-h \fIhost1\fR] ... [\fB\-h \fIhostn\fR]   [\fB\-p \fIport\fR]
  
      [\fB\-i \fImin\fR]  [\fB\-c \fImin\fR]  [\fB\-r \fImin\fR]  [\fB\-d\fP|\fB\-dl\fP|\fB\-dx\fP]  \fIobject-spec\fP

.fi

These parameters are:

.nf
.IP \fIobject-spec\fP
The objects from which data is to be collected.  \fIobject-spec\fP may
contain the \*Qwild-card\*U character \*Q*\*U.  \fIobject-spec\fP is a
mandatory parameter, with no default.

.IP \fB\-e\fP
Name of a file containing <enum parameters> (see Section 3.3.4) for
labeling results.  Default is no enum file.

.IP \fB\-p\fP
TCP port to connect to; default is 2222.

.IP \fB\-h\fP
An SAA host from which data is being collected.  A series of \fB\-h\fP
parameters may appear, to define a list of SAA hosts.  Parameter may be
a dotted decimal IP address or a domain name.  Default is the local
host.

.IP \fB\-i\fP
Polling interval Ti in minutes. Default is 0, causing \fBcollect\fP to
run once.

.IP \fB\-c\fP
Checkpoint interval Tc in minutes.  Default is 0, causing only the
latest data to be saved.

.IP \fB\-r\fP
Clear interval Tr in minutes.  This is the interval at which
\fBcollect\fP sends a \fIreadclear\fP command instead of a \fIread\fP command to
the hosts.  Default is zero, causing no clearing to take place.

.IP \fB\-d\fP
Direct trace & log to stdout.
.IP \fB\-dl\fP
Direct trace to stdout, log to files.
.IP \fB\-dx\fP
Direct trace & hex dump to stdout.

Default is no trace, log directed to files.
.LP			

The time parameters used by the
.B collect
program must be entered in minutes; this unit was chosen both for
convenience and to avoid giving the user a false sense of precision.

An example of typical parameters that one might use to run
.B collect
from a \*QC shell\*U is as follows:

   collect '*' \-h 35.1.1.21 \-i 5 \-c 60 \-r 1440 >& errors.log &
.fi

In this example:

.IP o
The polling interval Ti is 5 minutes, the checkpoint interval Tc is 60
minutes, and the clear interval Tr is 1440 minutes (24 hours).

.IP o
The '*' object-spec tells
.B collect
to read and save statistics from all objects.

.IP o
The \-h 35.1.1.21 parameter specifies the host on which
.B statspy
is executing.

.IP o
The \*Q>& errors.log\*U redirects any error reports to the file
\*Qerrors.log\*U.
.IP o
The final \*Q&\*U starts the collection of statistics in background
mode, so that it is unaffected by other use of the shell and logouts.
It can be stopped via use of the \*Qkill\*U command.
.LP

.NH 2
Collect Log Files
.LP

.B Collect
saves statistics in files whose names are formed from the
.B statspy
host name, the object name, and the time that
.B collect
is started.  For example, the statistics file named:

    \*Q35.1.1.21-gwys.1214.1540\*U

was created at 1540 (local time for \fBcollect\fP) on December 14th,
and contains statistics from the object named
\*Qgwys\*U on
.B statspy
host \*Q35.1.1.21\*U.
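
The naming convention can be reproduced with a few lines of Python.
This is a sketch inferred from the single example above, not code from
.B collect\fP;
the function name log_file_name is hypothetical, and the year passed in
the example is arbitrary because the file name does not encode it.
.nf

```python
from datetime import datetime

def log_file_name(host, object_name, started):
    """Build host-object.MMDD.HHMM from the time collect was started."""
    return "{}-{}.{}.{}".format(
        host, object_name, started.strftime("%m%d"), started.strftime("%H%M")
    )

print(log_file_name("35.1.1.21", "gwys", datetime(1987, 12, 14, 15, 40)))
```

.fi
With these arguments the sketch produces the name
35.1.1.21-gwys.1214.1540 shown above.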

Figure 3 shows an example of an individual
.B collect
log entry, which has the same format as a local display of the read
data.  Each log entry contains three timestamps:
.IP o
\*QCreationTime\*U is the time that the
.B statspy
module was started.
.IP o
\*QReadTime\*U is the time associated with the data in the current log entry.
.IP o
\*QClearTime\*U is the last time that this object was cleared.  If the
object has never been explicitly cleared, the ClearTime is the same as
the CreationTime.
.LP

The
.B statspy
module sends these timestamps in universal (UNIX) time, and
.B collect
converts them to its local time when they are formatted and written.

Not all of the data that
.B collect
receives from
.B statspy
are saved permanently in log files.  Data must be saved in two
situations:

.IP (1)
The time between checkpoints has elapsed.

The \*Qcheckpoint\*U parameter is typically used to provide a
statistical breakdown of traffic by time of day, e.g., the number of
packets received during each hour of the day.

.IP (2)
The
.B statspy
object has been cleared.

The
.B statspy
object may have been cleared intentionally by a clear command or
unintentionally by a crash and restart of the SAA machine.  The
.B collect
program polls each
.B statspy
every Ti minutes, which should be short enough to minimize the
statistical loss due to SAA crashes.
.LP

It is also possible that the SCH running
.B collect
will itself crash.  Therefore,
.B collect
always writes new data into the log file, but it may either overwrite
the previous log entry or append to the end of the file, saving the
previous entry.

In order to decide whether a particular entry should be saved,
.B collect
keeps track of the last \*QClearTime\*U received and the next time that
a checkpoint log entry should be saved.  Specifically,
.B collect
will overwrite the previous entry unless:
.IP a.
The previous entry is the very first entry appearing in the log file,
or
.IP b.
The previous entry was received at or after the time that a checkpoint was
required, or
.IP c.
The ClearTime on the current entry is different from the ClearTime on
the previous entry.
.LP
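Rules a through c can be condensed into a single predicate.  The Python
sketch below is a hypothetical restatement of those rules, not the code
used by
.B collect\fP;
the function name, the argument names, and the use of plain numbers
(minutes) for times are assumptions made for illustration.
.nf

```python
def keep_previous(prev_is_first, prev_time, checkpoint_due,
                  prev_clear, cur_clear):
    """Return True if the previous log entry must be kept (append)."""
    if prev_is_first:                  # rule a: very first entry in the file
        return True
    if prev_time >= checkpoint_due:    # rule b: at or after a checkpoint
        return True
    return cur_clear != prev_clear     # rule c: ClearTime has changed

# Times are in minutes since collect started (hypothetical values).
print(keep_previous(False, 55, 60, 0, 0))     # ordinary poll
print(keep_previous(False, 60, 60, 0, 0))     # checkpoint reached
print(keep_previous(False, 65, 120, 0, 62))   # object was cleared
```

.fi
When the predicate is false, the new entry overwrites the previous one,
so that between checkpoints the file holds only the latest reading.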

.NH 2
Data Reduction Programs
.LP

The NNStat package includes several programs to process log files
produced by the  
.B collect
program.  
The 
.B lookupnames
program scans through log files, outputting the
original text of the log file with the appropriate domain name added
following each instance of a host number or a network number.
Shell scripts that invoke AWK programs have also been included.  
These scripts may be installed as command aliases named 
.B count-totals
and 
.B bin-totals\fP.
These commands are intended to provide
part of the data reduction capability needed to produce traffic statistics. 

.NH 3
Lookupnames Program
.LP

The 
.B lookupnames
program may be used to scan log files for embedded host and 
network numbers, map these numbers into corresponding names, and create a
new file with the names inserted immediately after the corresponding numbers.

The following is an example of output produced by 
.B lookupnames\fP:

.KS
  [35.0.0.0 MERIT:35.0.0.0 MERIT]= 112665 (93.9%) @\-0sec
  [35.0.0.0 MERIT:128.116.0.0 USAN]= 387  ( 0.3%) @\-36sec
  [128.116.0.0 USAN:35.0.0.0 MERIT]= 462  ( 0.4%) @\-35sec
  [35.0.0.0 MERIT:128.182.0.0 PSCNET]= 388 ( 0.3%) @\-46sec
  [128.182.0.0 PSCNET:35.0.0.0 MERIT]= 345 ( 0.3%) @\-45sec
.KE


The number-to-name conversion is performed using the Domain Name
System, or if
no matching entry is returned, by a local file of network names.  A standard
hosts.txt file may also be used to supply the network names.  
If no host/network name is found in either database, the string
\*Q(UNKNOWN-HOST)\*U is displayed in place of the name.

The usage is:


  \fBlookupnames\fP [\fB\-n \fIfilename\fR] [\fB\-t \fIseconds\fR] \fIinput-file-list\fP

.B2    
    
The input files, concatenated and augmented with the name strings, are
written to standard output.

The optional command line flags are as follows:

.IP \fB\-n\fP
The name of a networks file. If this parameter is unspecified, the
program looks for a file named: 
\*Qnetworks.txt\*U.  
This file should contain the networks 
portion of a standard \*Qhosts.txt\*U file, for example:

	NET : 128.9.0.0 : ISI-NET :

Alternatively, the full hosts.txt may be used.  The 
.B lookupnames
program scans the networks list for \*QNET\*U entries, until the
beginning of the first \*QGATEWAY\*U entry or the end of the file.

.IP \fB\-t\fP
The timeout time for host name lookups; the default is 5 seconds.
If this timeout expires, the 
.B lookupnames
program checks the
networks-file for a matching entry.  If none is found, the string
\*Q(TIMEOUT)\*U is printed in place of the host/network name.
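.LP
The scan of the networks file described under \fB\-n\fP can be sketched
in Python as follows.  This is an illustration of the stated rule, not
the
.B lookupnames
source: the function name load_networks is hypothetical, the GATEWAY
line in the sample is invented to show where scanning stops, and only
the first three colon-separated fields of each entry are examined.
.nf

```python
def load_networks(lines):
    """Map network number to name from hosts.txt-style NET entries."""
    names = {}
    for line in lines:
        fields = [f.strip() for f in line.split(":")]
        if fields[0] == "GATEWAY":
            break              # NET entries end at the first GATEWAY entry
        if fields[0] == "NET" and len(fields) >= 3:
            names[fields[1]] = fields[2]
    return names

sample = [
    "NET : 128.9.0.0 : ISI-NET :",
    "NET : 35.0.0.0 : MERIT :",
    "GATEWAY : 10.0.0.1 : EXAMPLE-GW :",
    "NET : 10.0.0.0 : NOT-SCANNED :",
]
print(load_networks(sample))
```

.fi
Stopping at the first GATEWAY entry is what allows a full hosts.txt file
to be supplied without reading the much longer host section.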

.NH 3
Count-totals
.LP

The command 
.B count-totals
can be used to summarize total packet counts logged by the 
.B collect
program.
Taking into account any 
.B statspy
restarts or 
.B statspy
totals that were
cleared, it computes both daily totals and a grand total from a given log
file.  It may be invoked on a list of log files, in which case it 
summarizes each file independently.

The command used to invoke 
.B count-totals
is:

    \fBcount-totals [v=1] \fIlogfile\fR ... 

The required \fIlogfile\fP parameter is a list of one or more log files 
produced by 
.B collect
(and no others).
The list may contain the usual wildcard specification(s).
Output is written to a results file named \*Qcount-totals.out\*U,
as well as to the standard output. 
 
An example is:

    count-totals v=1 *IP*

The optional \fBv=1\fP parameter is used to signify that the results should
be \*Qverbose\*U, which in this case means that a line summarizing each update
appears in the results file.  If the \fBv=1\fP parameter is omitted, only
daily totals and a grand total are included.

The following is an example of the format of a verbose results file:

.KS
.nf

  File: 35.1.1.21-IP.lens.1221.1523

  Log created on Tue Dec 22 08:00:25 1987, for host 35.1.1.21.
    Sample interval = 60 min; checkpoint interval = 60 min.
    Object name = 'IP.lens'.
.ta 1.3i 2.3i 3.3i

    Read-Time	Clear-Time	Total-Count	Increment
    ---------	----------	-----------	---------
.ta 2.3iR 2.8iR 3.8iR

    08:01:10 12-22	07:06:22 12-22	19262	0
    09:00:55 12-22		52297	33035
    10:01:04 12-22		81393	29096
    11:00:54 12-22		119954	38561
   ... (etc)
    23:00:54 12-22		468588	12147
    00:00:55 12-23		493492	24904
  Daily total = 474230 (08:01:10 12-22 to 00:00:55 12-23)

    01:00:54 12-23		515588	22096
    02:00:54 12-23		542095	26507
    03:00:54 12-23		566430	24335
    04:00:54 12-23		586511	20081
   ... (etc)
    22:00:57 12-23		1043959	7317
    23:00:55 12-23		1054753	10794
    00:00:58 12-24		1070114	15361
  Daily total = 576622 (00:00:55 12-23 to 00:00:58 12-24)
   ... (etc)
   ... (etc)
  Daily total = 182408 (00:00:58 12-24 to 14:01:17 12-24)

  Grand Total = 1233260 (08:01:10 12-22 to 14:01:17 12-24)
  
.fi
.KE

The following is an example of the corresponding non-verbose
format:

.KS
.nf

  File: 35.1.1.21-IP.lens.1221.1523

  Log created on Tue Dec 22 08:00:25 1987, for host 35.1.1.21.
    Sample interval = 60 min; checkpoint interval = 60 min.
    Object name = 'IP.lens'.

  Daily total = 474230 (08:01:10 12-22 to 00:00:55 12-23)
  Daily total = 576622 (00:00:55 12-23 to 00:00:58 12-24)
  Daily total = 182408 (00:00:58 12-24 to 14:01:17 12-24)

  Grand Total = 1233260 (08:01:10 12-22 to 14:01:17 12-24)

.fi
.KE


In the verbose format, column headings have the following meanings:

.IP o
\*QRead-Time\*U contains the ReadTime
returned from each
.B statspy
response to a query performed by
.B collect .
 
.IP o
\*QClear-Time\*U is filled in for each new 
ClearTime found in the current log file being processed.  A blank
Clear-Time field signifies that the field is unchanged since the previous
entry. 

.IP o
\*QTotal-Count\*U corresponds to the \*QTotalCount\*U on each
response.
  
.IP o
\*QIncrement\*U contains the number of packets counted
between the current response and the previous response.
.LP
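As a rough illustration of how the Increment column relates to
Total-Count, the following Python sketch recomputes increments from a
sequence of responses.  It is not taken from the
.B count-totals
source: the function name, the tuple layout, and the rule that a changed
ClearTime makes the whole new total count as the increment are
assumptions made for this example.
.nf
```python
def increments(entries):
    """Return (read_time, increment) pairs from (read_time, clear_time, total) tuples.

    Assumptions (not taken from the NNStat source): the first entry is a
    baseline with increment 0, and a changed clear_time means the counter
    restarted from zero, so the whole new total counts as the increment.
    """
    result = []
    prev_total = None
    prev_clear = None
    for read_time, clear_time, total in entries:
        if prev_total is None:
            inc = 0                      # baseline, as in the sample output
        elif clear_time != prev_clear:
            inc = total                  # assumed: counter restarted from zero
        else:
            inc = total - prev_total     # normal case: simple difference
        result.append((read_time, inc))
        prev_total, prev_clear = total, clear_time
    return result


sample = [
    ("08:01:10 12-22", "07:06:22 12-22", 19262),
    ("09:00:55 12-22", "07:06:22 12-22", 52297),
    ("10:01:04 12-22", "07:06:22 12-22", 81393),
]
print(increments(sample))
```
.fi
Applied to the first three rows of the verbose example above, the sketch
yields increments of 0, 33035, and 29096, matching the Increment column.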
                           
.NH 3
Bin-totals
.LP

The command 
.B bin-totals
produces a summary showing the total number of packets counted in each of the
corresponding bins appearing in a log file, taking into account any 
.B statspy
restarts or object clears.
It can be invoked on a list of log files
to summarize each independently.

The command to invoke 
.B bin-totals
is:

    \fBbin-totals \fIlogfile\fR ...
    
As before, the parameter is a list of name(s) of one or more log files produced
by the 
.B collect
program, or a wildcard file specification that matches
(only) files produced by the 
.B collect
program.
Output is written to a results file named \*Qbin-totals.out\*U,
as well as to the standard output.
  
The following is an example of the output from the 
.B bin-totals
command:

.KS
.nf

  File: 35.1.1.21-IP.lens.1221.1523

  Log created on Tue Dec 22 08:00:25 1987, for host 35.1.1.21.
    Sample interval = 60 min; checkpoint interval = 60 min.
    Object name = 'IP.lens'.

  Summary Period: 08:01:10 12-22 to 14:01:17 12-24.

    [0-9] total = 0
    [10-19] total = 0
    [20-39] total = 1112
    [40-79] total = 557177
    [80-159] total = 149395
    [160-319] total = 194274
    [320-639] total = 61625
    [640-1279] total = 192769
    [1280-2559] total = 76908
    [2560-5119] total = 0

.fi
.KE

.bp                        
.SH
Appendix A \(em Catalog of Objects
.LP

This Appendix describes the statistical object classes currently implemented
in \fBstatspy\fP.

.SH 2
Recorder Objects
.LP

A recorder object is invoked at its Write() entry point to record values
of a specific field or set of fields.

.IP 1.
Frequency of all Values

      \fBfreq-all\fP 

An object of class \fIfreq-all\fP (abbreviated as \fIFA\fP) builds a
frequency distribution table for a single field.  This table is built
dynamically, with a bin for every distinct value that occurs.

Each time a bin is added or incremented, the current time (in seconds
since Jan 1, 1970) is recorded in the bin.  A read operation returns this
last-update time with the value and count for each bin.  We expect that
these times will be useful in analysis of the data; for example, it
would be possible to extract only recently occurring values.

The list of bins returned by a read operation is sorted into order
of decreasing counts, and within the same count, by last update time.

The implementation of the \fIfreq-all\fP class uses a chained hash
scheme, dynamically allocating memory for bins in \*Qpages\*U of 2K
bytes.  There is a built-in limit of 1024 bins.  In addition to the hash
chain, each bin is chained into a doubly-linked sorted list that is
used to order the read sequence.  Sorting
bins into this list is accomplished using an incremental algorithm whose
CPU time is linear in the total count (and is in fact negligible).
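.IP
As an illustration only, the following Python sketch mimics the
externally visible behavior of a \fIfreq-all\fP object: one bin per
distinct value, a last-update time in each bin, and a read that returns
bins in order of decreasing count.  It does not reproduce the chained
hash or the incrementally sorted list; it simply sorts at read time.
The class name, the method names, the treatment of values beyond the
bin limit, and the choice of most-recent-first as the tie-break are all
assumptions.
.nf
```python
import time

BIN_MAX = 1024  # built-in bin limit quoted in the text


class FreqAll:
    def __init__(self):
        self.bins = {}  # value -> [count, last_update_seconds]

    def write(self, value, now=None):
        now = time.time() if now is None else now
        entry = self.bins.get(value)
        if entry is None:
            if len(self.bins) >= BIN_MAX:
                return False             # assumed: values beyond the limit are dropped
            self.bins[value] = [1, now]
        else:
            entry[0] += 1
            entry[1] = now
        return True

    def read(self):
        # Decreasing count; ties broken by most recent update (an assumption).
        rows = [(v, c, t) for v, (c, t) in self.bins.items()]
        rows.sort(key=lambda r: (-r[1], -r[2]))
        return rows


fa = FreqAll()
for t, v in enumerate(["a", "b", "a", "c"]):
    fa.write(v, now=t)
print(fa.read())
```
.fi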

.IP 2.
Frequency of all Values, with Byte Totals

      \fBfreq-all-bytes\fP 

An object of class \fIfreq-all-bytes\fP (abbreviated as \fIFAB\fP) builds a
frequency distribution table for a single field.  This table is built
dynamically, with a bin for every distinct value that occurs.

The object also accumulates in each bin the total lengths of the corresponding
packets, in bytes.  In all other respects, objects of this class are the
same as objects of class \fBfreq-all\fP.
Note that the packet length is an implicit
parameter; the object is unary.
The lengths that are accumulated are of the entire packet, exclusive of 
the Ethernet header.

.bp
.IP 3.
Frequency of Selected Set of Values

      \fBfreq-only ( <value>, ... <value> ) \fP

An object of class \fIfreq-only\fP (abbreviated as \fIFO\fP) builds a
frequency distribution table for only those values that are included in
the parameter list; values not in the list are counted in a single
\*Qdefault\*U bin.  A read operation on the object displays this
frequency table and the \*Qdefault\*U count.  If the given set of values is
empty, the \*Qdefault\*U count equals the total number of invocations.

The <value> entries may be expressed in a variety of ways.

.RS
.IP o
Decimal integer, limited to 2**31 maximum.

.IP o
Hex integer, using the \*QC\*U notation 0x....

.IP o
IP address, specified as either a dotted-decimal
number or as a domain name.

.IP o
Ethernet Address, specified in \*Qcoloned-hex\*U format:
xx:xx:xx:xx:xx:xx, where each x represents a hex digit.
 
.IP o
A  quoted enumeration label: \*Q<label>\*U.  This implies
the corresponding value, and
provides a way to define field values symbolically.  See
Appendix C for examples.
.RE
.LP

.IP 4.
Frequency of Selected Set of Values, with Byte Totals

      \fBfreq-only-bytes ( <value>, ... <value> ) \fP

An object of class \fIfreq-only-bytes\fP (abbreviated as \fIFOB\fP) builds a
frequency distribution table for only those values that are included in
the parameter list; values not in the list are counted in a single
\*Qdefault\*U bin. 
The object also accumulates in each bin the total lengths of the corresponding
packets, in bytes.  In all other respects, objects of this class are the
same as objects of class \fBfreq-only\fP.
The lengths that are accumulated are of the entire packet, exclusive of 
the Ethernet header.  
.IP 5.
Frequency of all Value Pairs

       \fBmatrix-all\fP 

An object of class \fImatrix-all\fP (abbreviated as \fIMA\fP) builds a
table of frequencies of all pairs of values in two fields.  This table is
built dynamically, with a bin for every distinct value pair that occurs.
The pair of values (a,b) is counted separately from the pair (b,a).  
The list of bins returned by a read operation is sorted into
order of decreasing counts, and within the same count, by last update time.

The internal structure and implementation of this class are the same as
those of the \fIfreq-all\fP class, described above.

If the object name matches an enumeration, the corresponding labels
are used for the first value of each pair.

Note: if a \fImatrix-all\fP object is defined with a non-zero parameter, it
operates as a \fImatrix-sym\fP object (see the following).

.IP 6.
Frequency of all Value Pairs, with Byte Totals

       \fBmatrix-all-bytes\fP 

An object of class \fImatrix-all-bytes\fP (abbreviated as \fIMAB\fP) is
exactly like a \fBmatrix-all\fP object, except that a \fImatrix-all-bytes\fP
object also accumulates in each bin the total lengths of the corresponding
packets, in bytes.  The lengths are of the entire packet, exclusive of 
the Ethernet header.

.bp
.IP 7.
Symmetric Frequency of Value Pairs

       \fBmatrix-sym\fP

An object of class \fImatrix-sym\fP (abbreviated as \fIMS\fP) builds a
table of frequencies of all pairs of values in two fields.  This table is
built dynamically, with a bin for every distinct value pair that occurs.
The list of bins returned by a read operation is sorted in order
of decreasing counts, and within the same count, by last update time.

If the two argument fields have the same length, they are
treated \*Qsymmetrically\*U:  (b,a) and (a,b) are counted in the same bin.
If the lengths differ, \fImatrix-sym\fP operates like \fImatrix-all\fP.

The internal structure and implementation of this class are the same as
those of the \fImatrix-all\fP class, described above.

If the object name matches an enumeration, the corresponding labels
are used for the first value of each pair.
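.IP
The difference between \fImatrix-all\fP and \fImatrix-sym\fP comes down
to how the bin key is formed from the two field values.  The following
Python sketch shows one way to do it; the function name and the use of
byte-wise ordering to pick a canonical order are assumptions, not a
description of the actual implementation.
.nf
```python
def pair_key(a, b, symmetric):
    """Return the bin key for a value pair given as two byte strings."""
    if symmetric and len(a) == len(b):
        # Canonical order merges (a,b) and (b,a) into one bin.
        return (a, b) if a <= b else (b, a)
    # Different lengths, or a matrix-all object: order is significant.
    return (a, b)


x = bytes([128, 9, 0, 1])
y = bytes([10, 0, 0, 2])
print(pair_key(x, y, True) == pair_key(y, x, True))
print(pair_key(x, y, False) == pair_key(y, x, False))
```
.fi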

.IP 8.
Symmetric Frequency of Value Pairs, with Byte Totals

       \fBmatrix-sym-bytes\fP

An object of class \fImatrix-sym-bytes\fP (abbreviated as \fIMSB\fP) is
exactly like a \fBmatrix-sym\fP object, except that a \fImatrix-sym-bytes\fP
object also accumulates in each bin the total lengths of the corresponding
packets, in bytes.  The lengths are of the entire packet, exclusive of 
the Ethernet header.
.bp
.IP 9.
Histogram

      \fBhist ( <scale factor> [, <max bin>] )\fP
    
An object of class \fIhist\fP (abbreviated as \fIHI\fP) 
builds a linear histogram of (unsigned)
integer field values. Each bin of the histogram has the same size,
given by the value <scale factor>. 
The optional second parameter specifies the ordinal number of the
maximum bin that is collected; if it is omitted, 1024 is used.

If <scale factor> is S and
<max bin> is M, a Read operation on the object defined by hist(S, M)
returns the counts:

.KS
    Bin 0:  Count( 0 <= X < S )
    ...
    Bin j:  Count( j*S <= X < (j+1)*S )
    ...
    Bin M:  Count( M*S <= X < (M+1)*S )
.KE
    
plus a count of values that were off-scale, i.e., >= (M+1)*S.
Here \*QCount(\ f(X)\ )\*U means the number of invocations with
value X for which f(X) was true.

The Read operation also reports the average, maximum, and minimum
values observed.
A \fIhist\fP object is restricted to invocation on a field of 4 bytes or less.
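.IP
The bin assignment described above can be sketched in Python as
follows.  The function name and the use of None to signal an off-scale
value are choices made for this illustration only.
.nf
```python
def hist_bin(x, scale, max_bin=1024):
    """Return the linear histogram bin index for x, or None if x is off-scale."""
    if x < 0 or scale <= 0:
        raise ValueError("x must be unsigned and scale must be positive")
    j = x // scale                       # bin j covers j*S <= x < (j+1)*S
    return j if j <= max_bin else None   # off-scale means x >= (M+1)*S


print(hist_bin(0, 100, 3), hist_bin(399, 100, 3), hist_bin(400, 100, 3))
```
.fi
With S = 100 and M = 3, the value 399 lands in bin 3, while 400 is
counted as off-scale.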

.IP 10.
Logarithmic Histogram

      \fBhist-pwr2 (\fP <scale factor> \fB)\fP

An object of class \fIhist-pwr2\fP (abbreviated as \fIP2\fP) 
builds a logarithmic histogram, i.e., one with intervals
increasing as powers of 2.  Specifically, a Read()
operation on a \fIhist-pwr2\fP object returns the following counts:

.KS
    Bin 0:  Count(X < S)
    
    Bin 1:  Count(S <= X < 2*S)
    ...
    Bin j:  Count(S*(2**(j-1)) <= X < S*(2**j) )
.KE
    
where S is the value of the unsigned integer <scale factor>.

A \fIhist-pwr2\fP object also reports the average, maximum, and minimum values
observed.
A \fIhist-pwr2\fP object is restricted to invocation on a field of 4 bytes or less.
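.IP
A compact way to compute the logarithmic bin is sketched below in
Python.  It assumes, consistent with the definitions of Bin 0 and Bin 1
above, that bin j (for j >= 1) covers S*2**(j-1) <= X < S*2**j; the
function name is hypothetical.
.nf
```python
def pwr2_bin(x, scale):
    """Return the power-of-2 histogram bin index for unsigned x and scale S."""
    if x < 0 or scale <= 0:
        raise ValueError("x must be unsigned and scale must be positive")
    q = x // scale
    # q == 0 gives bin 0; otherwise 2**(j-1) <= q < 2**j gives bin j,
    # which is exactly the number of bits needed to represent q.
    return q.bit_length()


for value in (9, 10, 19, 20, 39, 40, 1279):
    print(value, pwr2_bin(value, 10))
```
.fi
With S = 10 this reproduces the bin boundaries 0-9, 10-19, 20-39,
40-79, and so on.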

.IP 11.
Measure Temporal Locality of Reference
  
       \fBworking-set\fP
.mc |
       \fBworking-set2\fP

.mc
An object of class \fIworking-set\fP (abbreviated as \fIWS\fP) measures
the degree of temporal clustering of values of a given field.  This
clustering is known as \*Qlocality of reference\*U, and in the memory
domain leads to the concept of a \fIworking set\fP.
.mc |
However, the \fIworking-set\fP class is misnamed; rather than measuring
the size of the working set, it measures LRU (\*QLeast-Recently Used\*U)
cache hit probabilities.

Suppose that we maintain a list of the n distinct values that have occurred most
recently in the field.  This list will change over time, as new values
that were not in the list replace the oldest (\*Qleast-recently
used\*U) values in the list.  Let C(n) be the number of values in
the observed sequence that were already in the list when they occurred.
If there have been a total of N packets in the sequence, C(n)/N estimates the
probability that the next value is already in the list (cache).
The \fIworking-set\fP object measures the values of C(1), C(2), C(4),
C(8),...  C(4096).

.mc |
A \fIworking-set2\fP object takes two fields (i.e., it is a binary
object), concatenating the two fields into a single value that is used
to build an LRU cache just like a \fIworking-set\fP object.

.mc
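.IP
The counts C(n) can be obtained in a single pass by keeping one list
ordered by recency and noting how deep each value is found.  The
following Python sketch illustrates the idea; it is not the
\fIworking-set\fP implementation, and the function name and the small
default set of cache sizes are assumptions.
.nf
```python
def lru_hit_counts(values, sizes=(1, 2, 4, 8)):
    """Return {n: C(n)}: how often a value was already among the n most recent."""
    recent = []                       # most recently used value first
    hits = {n: 0 for n in sizes}
    for v in values:
        if v in recent:
            depth = recent.index(v)   # 0 means v was the most recent value
            for n in sizes:
                if depth < n:         # v would be present in an LRU list of size n
                    hits[n] += 1
            recent.remove(v)
        recent.insert(0, v)
        del recent[max(sizes):]       # nothing deeper than the largest size matters
    return hits


print(lru_hit_counts(["a", "b", "a", "a", "c", "b"]))
```
.fi
Dividing each count by the number of values processed gives the
estimated hit probability for that list size.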
.IP 12.
Record Sequence of Values in Binary

        \fBbin-pkt (\fP <count> [, <max length>] \fB)\fP
.mc |
        \fBbin-pkt2 (\fP <count> [, <max length>] \fB)\fP

An object of class \fIbin-pkt\fP (abbreviated as \fIBP\fP) builds a
circular buffer of up to <count> entries, containing the most recent values in
a specified field.  If the second parameter is specified, only the first
<max length> bytes of each field are saved.
A read operation on the object displays all values
in this buffer, oldest first.

Although this object may be invoked from any field, it is really
intended for recording the complete headers of
packets.  For this purpose, the virtual field \*Qpacket\*U is defined
as the entire set of packet headers captured from the Ethernet; this
is at most the first 108 bytes of the packet, currently.

A \fBbin-pkt2\fP object takes two fields (i.e., it is a binary
object), concatenating the two fields into a single value that is truncated
if necessary to <max length> bytes and then recorded.
For example, the following program will save a circular buffer of 8-byte
quantities, containing source and destination address pairs:

    record IP.srchost, IP.dsthost in AddrPairs BP2(100);
    
A read operation on one of these objects will behave in a special way
when the \*Qfile\*U command is in effect: the buffer contents will be
recorded as BINARY data.  The format of this data is defined in the
struct bpe_entry in sobjbp.c.

.mc
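.IP
The circular-buffer behavior, including the concatenation and
truncation performed by \fIbin-pkt2\fP, can be sketched in Python as
follows.  The class and method names are hypothetical, and the sketch
says nothing about the binary record format written under the
\*Qfile\*U command.
.nf
```python
from collections import deque


class BinPkt:
    def __init__(self, count, max_length=None):
        self.buf = deque(maxlen=count)   # oldest entry is discarded when full
        self.max_length = max_length

    def write(self, *fields):
        value = b"".join(fields)         # two fields are concatenated (bin-pkt2)
        if self.max_length is not None:
            value = value[:self.max_length]
        self.buf.append(value)

    def read(self):
        return list(self.buf)            # oldest first


bp = BinPkt(2, max_length=6)
bp.write(bytes([1, 1, 1, 1]), bytes([2, 2, 2, 2]))
bp.write(bytes([3, 3, 3, 3]), bytes([4, 4, 4, 4]))
bp.write(bytes([5, 5, 5, 5]), bytes([6, 6, 6, 6]))
print([v.hex() for v in bp.read()])
```
.fi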
.bp
.IP 13.
Variant of \fIfreq-all\fP

	 \fBfreq-all2\fP

An object of class \fIfreq-all2\fP (abbreviated as \fIFA2\fP) performs
the same function as an object of the \fIfreq-all\fP class, except a
\fIfreq-all2\fP object does not sort the list of bins, but rather
displays bins in the order of their first occurrence.  As a result,
\fIfreq-all2\fP objects may use slightly less CPU time (although the
difference appears to be negligible) and always use less memory for
bins (16-20 bytes per bin, compared to 24-28 bytes for \fIfreq-all\fP
objects).

.IP 14.
Variant of \fImatrix-all\fP

	\fBmatrix-all2\fP

An object of class \fImatrix-all2\fP (abbreviated  as \fIMA2\fP)
performs the same function as an object of the \fImatrix-all\fP class,
except a \fImatrix-all2\fP object does not sort the list of bins, but
rather displays bins in the order of their first occurrence.  As a
result, \fImatrix-all2\fP objects may use slightly less CPU time
(although the difference appears to be negligible) and always use less
memory for bins (16-24 bytes per bin, compared to 24-32 bytes for
\fImatrix-all\fP objects).

.bp
.SH 2
Filter Objects
.LP

A filter object tests given field values against some criterion and
returns a Boolean value; the interpreter uses this result to select one
of two alternative sequences of invocations.

A filter object generally has a read-only data structure, but it does
keep two statistical counters: the total number of invocations, and
the number that resulted in a TRUE result.  These two numbers are
returned by a Read operation.

.IP 1.
Filter on range of values

.mc |
       \fBrangef (\fP <Lower>, <Upper> [ ,<mask> ]\fB)\fP 
   
A \fIrangef\fP object (abbreviated as \fIRF\fP)
returns TRUE if the given value X, after ANDing with <mask> if one
is present, falls inside the
specified range:

       L <= X&M <= U
     
Otherwise, it returns FALSE.  
Here L and U are the unsigned integer values corresponding to <Lower>
and <Upper>, respectively, and M is <mask> or all one bits if <mask>
is omitted.
Note that L, U, and <mask> are permitted to be integers or 6-byte 
Ethernet addresses.

.mc
.IP 2.
Filter on equality

       \fBeqf(\fP <value> \fB)\fP
       
An \fIeqf\fP object (abbreviated as \fIEQ\fP)
returns TRUE if the given field value matches the
specified parameter value; otherwise it returns FALSE.
Here <value> may take any of the forms described earlier for
the \fIfreq-only\fP class.

.IP 3.
Filter on selected set of values

       \fBsetf (\fP <value>, ... <value> \fB)\fP 
   
A \fIsetf\fP object (abbreviated as \fISF\fP)
returns TRUE if the given field value matches one of the
values in the parameter list, otherwise it returns FALSE.
Each <value> may take any of the forms described earlier for
the \fIfreq-only\fP class.

Note that \fIsetf\fP with a single value is equivalent to \fIeqf\fP, but is
much less efficient.
.LP
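The common behavior of filter objects (a Boolean test plus two
counters) can be sketched in Python as shown here.  The class names are
hypothetical, only \fIrangef\fP and \fIsetf\fP are modeled, and the
frozenset merely stands in for whatever lookup structure
.B statspy
actually uses.
.nf
```python
class Filter:
    """Base class: counts invocations and TRUE results, as a Read would report."""

    def __init__(self):
        self.invocations = 0
        self.true_count = 0

    def test(self, x):
        raise NotImplementedError

    def __call__(self, x):
        self.invocations += 1
        result = self.test(x)
        if result:
            self.true_count += 1
        return result


class RangeF(Filter):
    def __init__(self, lower, upper, mask=None):
        super().__init__()
        self.lower, self.upper, self.mask = lower, upper, mask

    def test(self, x):
        if self.mask is not None:
            x &= self.mask               # an omitted mask acts like all one bits
        return self.lower <= x <= self.upper


class SetF(Filter):
    def __init__(self, *values):
        super().__init__()
        self.values = frozenset(values)  # hashed membership test

    def test(self, x):
        return x in self.values


telnet_like = SetF(23, 43, 79, 513)
low_byte = RangeF(0, 15, mask=0xFF)
print(telnet_like(23), telnet_like(25), low_byte(0x1203))
print(telnet_like.invocations, telnet_like.true_count)
```
.fi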

.SH 2
Limits of Statistical Objects
.LP

The objects that have been defined have the following limits:

.IP (1)
Max Frequency Counts

All frequency counts, both individual bins and totals across bins, are
maintained as 32-bit unsigned integers, and are therefore limited to
4*10**9 packets.  Removing this restriction would involve major changes
throughout the code.

.IP (2)
Maximum Byte Totals

Byte totals, both in bins and across bins, are maintained in a
multiple-precision format with a 24-bit low-order part and a 32-bit
high-order part.  In practice, it should never be possible to reach
this limit of approximately 10**16 bytes.

.IP (3)
Number of Bins

The object classes that build bins dynamically (\fIfreq-all...\fP,
\fImatrix...\fP, and \fIworking-set\fP) all impose a limit on the total
number of bins.  It is set by a #define BIN_MAX in each source file; it
is currently 1024 everywhere.  This limit is only a reasonableness
check and is arbitrary; to expand it, simply change the source file and
re-make the program.

.IP (4)
Field Sizes

In the current implementation, the frequency distribution classes
\fIfreq-all...\fP, \fIfreq-only...\fP, and \fImatrix...\fP can handle
values of 1 to 8 bytes in length.  This has not been a restriction in
practice, since the largest field that actually occurs (see Figures 1
and 6) is 6 bytes.  Increasing the maximum field size beyond 8
would require significant programming changes.

.IP (5)
Number of Parameters

The maximum number of parameters that may be listed when an object is
first created by an attach command is set by a #define MAXPARMS to
256.  This limit is arbitrary and can be expanded by recompilation.
The only object classes that use extended parameter lists are
\fIfreq-only...\fP and \fIsetf\fP.
.LP
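To make the byte-total format of item (2) concrete, the following
Python sketch adds packet lengths to a two-part total.  It assumes that
the 24-bit part simply carries into the 32-bit part, which would give a
ceiling of 2**56 bytes; the function names are hypothetical and the
actual layout in the source may differ.
.nf
```python
LOW_BITS = 24
LOW_MASK = (1 << LOW_BITS) - 1
HIGH_MASK = (1 << 32) - 1


def add_bytes(high, low, nbytes):
    """Add a packet length to a (high, low) byte total and return the new pair."""
    low += nbytes
    high = (high + (low >> LOW_BITS)) & HIGH_MASK   # carry into the 32-bit part
    return high, low & LOW_MASK


def as_int(high, low):
    """Combine the two parts into a single integer byte count."""
    return (high << LOW_BITS) | low


hi_part, lo_part = 0, LOW_MASK
hi_part, lo_part = add_bytes(hi_part, lo_part, 1500)
print(hi_part, lo_part, as_int(hi_part, lo_part))
```
.fi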

.bp
.SH
Appendix B \(em Syntax of Attach Command
.LP

This Appendix contains a BNF specification of the \fIattach\fP command syntax.

The syntax of Attach parameters has been (deliberately) designed to parallel
the syntax of \*QC\*U statements (that correspond to simple invocations) and
statement-lists (that correspond to lists of invocations).

.nf
  <Attach command> ::= attach { <S-list> }
  
  <S-list> ::= <Statement> | <S-list> <Statement>    

  <Statement> ::= \fBrecord\fP <record-invoke> \fB;\fP |

          <\fBif\fP clause> <Statement> \fBelse\fP <Statement> |
          
          <\fBif\fP clause> <Statement> |
          
          <\fBselect\fP clause> \fB{\fP <case body> \fB}\fP |
                         
         \fB{\fP <S-list> \fB}\fP  |  \fB;\fP
                                    
  <record-invoke> ::= <field name> \fBin\fP <object defn>  |
     
                      <field name>\fB,\fP <field name> \fBin\fP <object defn>   |
                      
                      <field name> <field name> \fBin\fP <object defn> 
     
  <\fBif\fP clause> ::=  \fBif\fP <condition> |  \fBsymif\fP <condition>
  
  
  <condition> ::=   <C-term> | <condition> \fBor\fP <C-term>
  
  <C-term> ::=      <C-primary> | <C-term> \fBand\fP <C-primary>
  
  <C-primary> ::=   <if-invoke> | \fB(\fP <condition> \fB)\fP
  
  <if-invoke> ::=   <field name> \fBis\fP <object defn>  |
 
                    <field name> \fBisnot\fP <object defn>
                            
  
  <\fBselect\fP clause> ::=   \fBselect\fP <field name> <object name> |
  
                              \fBselect\fP <field name>
                              
  <case body> ::=   <case body> <case label> <Statement> | <empty>
  
  <case label> ::=  \fBcase\fP <value> \fB:\fP | \fBcase (\fP <value list> \fB) :\fP |
               
                    \fBdefault\fP <value> \fB:\fP | \fBdefault (\fP <value list> \fB) :\fP
                   
                   
  <object defn> ::=   <object name> <class> <class parm> |
      
                      <class> <class parm> | 
 
                      <object name> 


  <class parm> ::=  <empty> | ( <value list> )  

  <value list> ::= <empty> | <value list> <value>  | 
  
                   <value list> \fB,\fP <value>

  <value> ::= <decimal integer> |
     
               \fB0x\fP<hex number>  |  \fB0X\fP<hex number>  |
                 
               <IP address>  |
                                  
               <Ethernet address> |
                 
               \*Q<label>\*U
                 
  <IP address> ::= 
                 
               <dotted-decimal number> |
                 
               <host domain name>
                 
  <Ethernet address> ::=
                      
                <hex digit>\fB:\fP<hex digit>\fB:\fP<hex digit>\fB:\fP
                     <hex digit>\fB:\fP<hex digit>\fB:\fP<hex digit>
                                         
  <hex number> ::= <hex digit>  |  <hex number><hex digit>
      
  <hex digit> ::= 00 | 01 | ... | fe | ff           
                 
  <field name> ::= <identifier> 
        
  <object name> ::= <identifier>
    
  <identifier> ::= a letter, followed by: any string of
                   letters, digits, or any of the 
                   special characters +-&._

.fi
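.LP
As an informal companion to the <value> rule, the following Python
sketch classifies a token by its lexical form.  The patterns are
simplified assumptions made for illustration (for example, any run of
hex characters is accepted after 0x, and a domain name is taken to be a
letter followed by letters, digits, dots, or hyphens); they are not
derived from the
.B statspy
parser, and host.example.edu is a made-up name.
.nf
```python
import re

PATTERNS = [
    ("ethernet", re.compile("^[0-9a-fA-F]{2}(:[0-9a-fA-F]{2}){5}$")),
    ("hex", re.compile("^0[xX][0-9a-fA-F]+$")),
    ("dotted-decimal", re.compile("^[0-9]+([.][0-9]+){3}$")),
    ("decimal", re.compile("^[0-9]+$")),
    ("label", re.compile('^"[^"]*"$')),
    ("domain-name", re.compile("^[A-Za-z][A-Za-z0-9.-]*$")),
]


def classify(token):
    """Return the name of the first pattern that matches token, or None."""
    for name, pattern in PATTERNS:
        if pattern.match(token):
            return name
    return None


for tok in ["513", "0x1f", "128.9.0.0", "08:00:2b:03:4a:e7",
            '"Telnet"', "host.example.edu"]:
    print(tok, "->", classify(tok))
```
.fi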

.bp
.SH
Appendix C \(em Building Configuration Files
.LP

This Appendix provides some guidelines, suggestions, and examples for building
.B statspy
configurations.

.SH
C.1  Example 1
.LP

Traffic flowing to and from a particular gateway can be selected 
with a filter:

.KS
.nf
attach {
    if Ether.dst is eqf(08:00:2b:03:4a:e7) {
          ####going to gateway
       record ... ;
       record ... ;
       ...
       }
    if Ether.src is eqf(08:00:2b:03:4a:e7) {
          ####coming from gateway
       record ... ;
       record ... ;
       ...
       }
    }
.fi
.KE

where 08:00:2b:03:4a:e7 is the Ethernet address of the gateway.  
(Note the convention for Ethernet addresses: \*Qcoloned-hex\*U).

.SH
C.2  Example 2
.LP

Another filtering approach is to select packets by classes of IP
addresses.  For example, suppose it is known that the list 128.1.0.0,
128.2.0.0, and 128.3.0.0 includes all local networks.  In that case, the
following selects \*Qtransit\*U packets, i.e., packets whose source and
destination are both outside the local administrative area:

.KS
  if IP.srcnet isnot 
              setf(128.1.0.0, 128.2.0.0, 128.3.0.0)
       if IP.dstnet isnot
              setf(128.1.0.0, 128.2.0.0, 128.3.0.0) {
           record ... ;
           record ... ;
              ...
        }
.KE
            
.B Statspy 
has been designed to be efficient even if the list of values
used as parameters to \fIsetf\fP() is very large (say, several hundred
values) \(em a \fIsetf\fP object uses a hash-table. 

.SH
C.3  Example 3
.LP

Suppose the problem is to collect data on all TCP traffic destined
for a particular gateway, broken down by source and destination IP addresses
as well as packet type (Telnet, FTP, etc.).

Here, \*Qpacket type\*U is a little bit hazy, but it is related to the
occurrence of a well-known port number in the TCP source or destination
port.

In principle, there is no reason why
.B statspy
could not provide for three-way distributions, but it does not.  One of
the main reasons (besides distaste for the resulting messiness in the
specifications and code) for not implementing three-way distributions is
disbelief that administrators will want all that data!  Running
.B statspy
for 12 hours on a typical large Ethernet has found packets to/from 100
networks and 500 different IP hosts.  The complete three-way matrix asked
for here may therefore contain 10**6 bins.  It seems unlikely that anyone
will have use for a million numbers, accumulated over days, weeks, and
months!

In fact, it seems doubtful that even the hardiest administrator will
really want to keep complete statistics by (IP source, IP destination)
host pairs; after a few months of looking at 10**5 numbers, he/she will
tire of it and begin to collect data only on source and destination
network, or only for specific subsets of networks.

The 
.B statspy
design includes a number of features to contain the
amount of data it can generate, for example the inclusion of IP network
numbers distinct from host numbers, the conditional (filter)
mechanism, and the \fIsetf\fP() filter described above. 

The recommended approach to using NNStat is as follows: set up some simple,
general overall statistical measures, producing a volume of data that 
can reasonably be scanned.  If some apparent anomalies are observed \(em
e.g., a
particular network seems to be producing more packets than expected \(em
then augment the configuration for 24 hours with specific
objects to analyze exactly those anomalous data.

In any case, the following command sets up a configuration to
provide counts broken down by source address, destination address,
and packet type.

.KS
.nf
 attach {

   if TCP.dstport is  setf(23, 43, 79, 513) 
      record IP.srchost, IP.dsthost in Telnet.hosts matrix-sym;
   else if TCP.srcport is  setf(23, 43, 79, 513)
      record IP.srchost, IP.dsthost in Telnet.hosts;
      
   else if TCP.dstport is  setf(20, 21, 69) 
      record IP.srchost IP.dsthost in ftp.hosts matrix-sym;
   else if TCP.srcport is  setf(20, 21, 69) 
      record IP.srchost IP.dsthost in ftp.hosts;
    
   else if TCP.dstport is  setf(25, 103, 104, 119) 
      record IP.srchost IP.dsthost in mail.hosts matrix-sym;
   else if TCP.srcport is  setf(25, 103, 104, 119)
      record IP.srchost IP.dsthost in mail.hosts;
 }
.KE
.fi

We can avoid replicating the parameter lists to the setf objects by
naming the first occurrence of each case and referencing the same
object in later occurrences, as shown by the following:

.KS
.nf
 attach {

   if TCP.dstport is  port.telnet  setf(23, 43, 79, 513) 
      record IP.srchost, IP.dsthost in Telnet.hosts matrix-sym;
   else if TCP.srcport is  port.telnet
      record IP.srchost, IP.dsthost in Telnet.hosts;
      
   else if TCP.dstport is  port.ftp  setf(20, 21, 69) 
      record IP.srchost IP.dsthost in ftp.hosts matrix-sym;
   else if TCP.srcport is  port.ftp 
      record IP.srchost IP.dsthost in ftp.hosts;
    
   else if TCP.dstport is  port.mail  setf(25, 103, 104, 119) 
      record IP.srchost IP.dsthost in mail.hosts matrix-sym;
   else if TCP.srcport is  port.mail
      record IP.srchost IP.dsthost in mail.hosts;
 }
.KE
.fi

Using symbolic labels defined by an \fIenum\fP command, we can write this as:

.KS
.nf
 enum {
  *port* (20 \*QFTP data\*U, 21 FTP, 23 Telnet, 25 SMTP,
     37 Time, 42 Name, 43 Whois, 53 Domains, 
     69 TFTP, 79 Finger, 103 X.400, 104 \*QX.400-SND\*U,
     109 POP2, 111 sunrpc, 115 SFTP, 119 NetNews, 
     153 SGMP, 512 exec, 513 \*Qrwho|rlogin\*U, 514 shell,
     515 printer, 520 RIP)       
 }

 attach {
   if TCP.dstport is port.telnet 
              setf(\*QTelnet\*U, \*QWhois\*U, \*QFinger\*U, \*Qrwho|rlogin\*U)
        record IP.srchost, IP.dsthost in Telnet.hosts matrix-sym;
        
   else if TCP.srcport is port.telnet 
        record IP.srchost, IP.dsthost in Telnet.hosts;
      
   else if TCP.dstport is port.ftp setf(\*QFTP data\*U, \*QFTP\*U, \*QTFTP\*U) 
        record IP.srchost IP.dsthost in ftp.hosts matrix-sym;
        
   else if TCP.srcport is port.ftp 
        record IP.srchost IP.dsthost in ftp.hosts;
      
   else if TCP.dstport is port.mail 
              setf(\*QSMTP\*U, \*QX.400\*U, \*QX.400-SND\*U, \*QNetNews\*U) 
        record IP.srchost IP.dsthost in mail.hosts matrix-sym;
        
   else if TCP.srcport is port.mail
        record IP.srchost IP.dsthost in mail.hosts;
 }
.KE
.fi

Now, all this needs to be conditional upon packets coming and going
through a specific gateway.  This requires a very redundant configuration
file, but the good news is that the redundancy does not affect either the
CPU time or memory space required for data collection.  
Assuming the \fIenum\fP command of the previous example, the
complete \fIattach\fP command can be written as:

.KS
.nf
 attach {

 if  Ether.src is eqf(08:00:2b:03:4a:e7) { 
   if TCP.dstport is port.telnet 
              setf(\*QTelnet\*U, \*QWhois\*U, \*QFinger\*U, \*Qrwho|rlogin\*U)
        record IP.srchost, IP.dsthost in Telnet.hosts matrix-sym;
        
   else if TCP.srcport is port.telnet               
        record IP.srchost, IP.dsthost in Telnet.hosts;
        
   else if TCP.dstport is port.ftp setf(\*QFTP data\*U, \*QFTP\*U, \*QTFTP\*U) 
        record IP.srchost IP.dsthost in ftp.hosts matrix-sym;
        
   else if TCP.srcport is port.ftp  
        record IP.srchost IP.dsthost in ftp.hosts;
      
   else if TCP.dstport is port.mail 
              setf(\*QSMTP\*U, \*QX.400\*U, \*QX.400-SND\*U, \*QNetNews\*U) 
        record IP.srchost IP.dsthost in mail.hosts matrix-sym;
        
   else if TCP.srcport is port.mail 
        record IP.srchost IP.dsthost in mail.hosts;
 }
 else if  Ether.dst is eqf(08:00:2b:03:4a:e7) { 
   if TCP.dstport is port.telnet 
        record IP.srchost, IP.dsthost in Telnet.hosts matrix-sym;
        
   else if TCP.srcport is port.telnet 
        record IP.srchost, IP.dsthost in Telnet.hosts;
      
   else if TCP.dstport is port.ftp  
        record IP.srchost IP.dsthost in ftp.hosts matrix-sym;
        
   else if TCP.srcport is port.ftp 
        record IP.srchost IP.dsthost in ftp.hosts;
      
   else if TCP.dstport is port.mail 
        record IP.srchost IP.dsthost in mail.hosts matrix-sym;
        
   else if TCP.srcport is port.mail 
        record IP.srchost IP.dsthost in mail.hosts;
  }
 }
.KE
.fi

.mc |
In Release 3.0, there is even more good news.  This example can be
streamlined using \fBsymif\fP and/or \fBselect\fP statements.

First, we can use \fBsymif\fP statements to collapse pairs of the
inner \fBif\fP statements:

.KS
.nf
attach {

 if  Ether.src is eqf(08:00:2b:03:4a:e7) { 
   symif TCP.dstport is port.telnet 
              setf(\*QTelnet\*U, \*QWhois\*U, \*QFinger\*U, \*Qrwho|rlogin\*U)
      record IP.srchost, IP.dsthost in Telnet.hosts matrix-sym;
                
   else symif TCP.dstport is port.ftp 
              setf(\*QFTP data\*U, \*QFTP\*U, \*QTFTP\*U) 
      record IP.srchost IP.dsthost in ftp.hosts matrix-sym;
        
   else symif TCP.dstport is port.mail 
              setf(\*QSMTP\*U, \*QX.400\*U, \*QX.400-SND\*U, \*QNetNews\*U) 
      record IP.srchost IP.dsthost in mail.hosts matrix-sym;
 }
 else if  Ether.dst is eqf(08:00:2b:03:4a:e7) { 
   symif TCP.dstport is port.telnet 
      record IP.srchost, IP.dsthost in Telnet.hosts matrix-sym;
              
   else symif TCP.dstport is port.ftp  
      record IP.srchost IP.dsthost in ftp.hosts matrix-sym;
              
   else symif TCP.dstport is port.mail 
      record IP.srchost IP.dsthost in mail.hosts matrix-sym;
 }
}
.KE
.fi

Now, due to the symmetry of the example, we can use another \fBsymif\fP
statement for the outer alternative:

.KS
.nf
 attach {

 symif  Ether.src is eqf(08:00:2b:03:4a:e7) { 
   symif TCP.dstport is port.telnet 
            setf(\*QTelnet\*U, \*QWhois\*U, \*QFinger\*U, \*Qrwho|rlogin\*U)
      record IP.srchost, IP.dsthost in Telnet.hosts matrix-sym;
                
   else symif TCP.dstport is port.ftp 
            setf(\*QFTP data\*U, \*QFTP\*U, \*QTFTP\*U) 
      record IP.srchost IP.dsthost in ftp.hosts matrix-sym;
        
   else symif TCP.dstport is port.mail 
            setf(\*QSMTP\*U, \*QX.400\*U, \*QX.400-SND\*U, \*QNetNews\*U) 
      record IP.srchost IP.dsthost in mail.hosts matrix-sym;
 }
}
.KE
.fi

Alternatively, \fBselect\fP statements can be used in the inner nesting.
Here is another equivalent program:

.KS
.nf
attach {
 symif  Ether.src is eqf(08:00:2b:03:4a:e7) { 
   select TCP.dstport selectSport {
      case ("Telnet", "Whois", "Finger", "rwho|rlogin"): 
          record IP.srchost, IP.dsthost in Telnet.hosts matrix-sym;
          
      case ("FTP data", "FTP", "TFTP"):
          record IP.srchost IP.dsthost in ftp.hosts matrix-sym;
          
      case ("SMTP", "X.400", "X.400-SND", "NetNews"):
          record IP.srchost IP.dsthost in mail.hosts matrix-sym;
          
      default: 
          select TCP.srcport  selectDport {
              case ("Telnet", "Whois", "Finger", "rwho|rlogin"): 
                 record IP.srchost, IP.dsthost in Telnet.hosts 
                                                           matrix-sym;
          
             case ("FTP data", "FTP", "TFTP"):
                 record IP.srchost IP.dsthost in ftp.hosts matrix-sym;
          
             case ("SMTP", "X.400", "X.400-SND", "NetNews"):
                 record IP.srchost IP.dsthost in mail.hosts matrix-sym;
          }  #end of select TCP.srcport
          
    }  # end of select TCP.dstport 
  }       
}
.fi
.KE
.mc

.bp
.SH
Appendix D \(em Attach Error Messages
.LP

This section lists the error messages that may occur in processing an
attach command.

.IP *
ATTACH error \(em Bad field name: <field name>

The specified string is not the name of any defined field.  The valid
field names can be obtained at any time using "show ?".
.IP *
ATTACH error \(em Class Conflict for: <object name>

Two invocations of the same object specify conflicting class names.
.IP *
ATTACH error \(em Parm list conflict for: <object name>

Two invocations of the same object specify conflicting parameter lists.
.IP *
ATTACH error \(em Unknown class for new object: <object name>

In the first invocation of an object, no class has been specified.
.IP *
ATTACH error \(em Conflicting data type: <object name>

The same object is being invoked on different fields that have different
types and are therefore incompatible.
.IP *
ATTACH error \(em Conflicting field size: <object name>

The same object is being invoked on different fields that have different
lengths and are therefore incompatible.
.IP *
ATTACH error \(em Cannot start with <input text>

Syntax error.
.IP *
ATTACH error \(em Syntax error at <input text>
.IP *
ATTACH error \(em No matching enum for <text>

Unable to find matching enum string for symbolic parameter value.
.IP *
ATTACH error \(em Unknown name: <string>

Unknown host domain name used as a parameter value.
.RE
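.LP
As an illustration of the \*QClass Conflict\*U message, consider the
following sketch.  It is not taken from the release; it reuses the
object names of the preceding example and assumes that class names are
checked across invocations as described above.  The second \fBrecord\fP
statement names a different class for \fIftp.hosts\fP, so it would be
expected to be rejected:
.KS
.nf
attach {
   if TCP.dstport is port.ftp
              setf(\*QFTP data\*U, \*QFTP\*U, \*QTFTP\*U)
      record IP.srchost IP.dsthost in ftp.hosts matrix-sym;

   else if TCP.srcport is port.ftp
      record IP.srchost IP.dsthost in ftp.hosts matrix-all;  # conflicts with matrix-sym
}
.KE
.fi
.LP
Omitting the class in the second invocation, or repeating
\fImatrix-sym\fP there, should avoid the conflict.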

.bp
.SH
Appendix E \(em Summary of Earlier Releases
.LP 

Release 2.4 contained the following important changes:
.IP *
It incorporates the few minor bug fixes reported for Release 2.3.
.IP *
It includes four new frequency distribution object classes, to collect total bytes
as well as a packet count in each bin.  These new objects are:
.RS
.IP o
\fIfreq-only-bytes\fP (FOB)
.IP o
\fIfreq-all-bytes\fP (FAB)
.IP o
\fImatrix-all-bytes\fP (MAB)
.IP o
\fImatrix-sym-bytes\fP (MSB)
.RE

The packet length is an IMPLICIT parameter to these objects; as a
result, their usage is exactly the same as the corresponding objects
\fIfreq-only\fP, \fIfreq-all\fP, \fImatrix-all\fP, and
\fImatrix-sym\fP.
.IP *
In connection with these new objects, a new virtual field named
\*Qlength\*U contains the total packet length exclusive of the Ethernet
header.  For an IP datagram, this will have the same value as
\*QIP.length\*U, which is retained for compatibility.
.IP *
The display format for a \fIfreq-only\fP object has been changed slightly,
to be consistent with that of a \fIfreq-only-bytes\fP object.
.IP *
A mechanism has been added to allow a remote \fIrspy\fP or
\fIcollect\fP to query a \fIstatspy\fP about its version number.  This
scheme allows the introduction of version numbers compatibly with
earlier versions.  Specifically, when rspy opens a new TCP connection,
it first sends a new command VERSION; statspy replies with its version
string.  Earlier versions of statspy will reply with an error, which
identifies their antiquity.  This scheme will allow possible
future changes in the network encoding of remote commands.
.IP *
A \*Qterse\*U mode has been added to \fIcollect\fP, to reduce the volume of 
data collected over a long time period.  The main changes are to suppress
the percentages and to truncate trailing zero bytes in network addresses.
.LP

Release 2.3 (November 6, 1989) contained the following important
changes:
.IP *
Supports Sun 4 (SPARC) hardware, based on code provided by Phil Wood
of the Los Alamos National Laboratory.
.IP *
Provides access controls, heavily based on code developed for the NSFnet backbone
by Dave Katz of Merit.
.IP *
Includes support for running statspy on a PC RT.  This support was developed by
Dave Katz for use in the IBM/Merit NSFnet backbone packet switches.
.IP *
Includes two new virtual fields for subnetted networks, inspired by code
supplied by Alan Stebbens of UCSB.
.IP *
Includes additional 'collect' parameters:  -u (universal time) and
-m (mode), and a new statspy parameter: -s scheduling_priority. These
were provided by Dave Katz.
.IP *
Provides for statspy finding a default Ethernet interface by taking the
first entry in the kernel's interface list.
.LP

