		   Internet Rover 3.0 Discovery Rover

Chapter 1.  Discovery Rover

  One other problem with InetRoverd is that all of the nodes and
tests must be manually configured in the hostfile.  Whenever
humans are required to update configuration files by hand, there is a
non-zero likelihood that the configuration will be incorrect.  Also, we
have seen an exponential growth in network traffic, and a
proportionate growth in the amount of network equipment that requires
monitoring. Combine the two situations and you have the potential for a
configuration mess.

  An additional problem is that alert monitoring shows the side
effects of problems as well as their cause.   For
example, if a network partition occurs, the cause of the outage may be a
broken link, and the effect may be several nodes becoming unreachable.
If you used the PING() test to monitor these nodes, all of the side
effects of the outage would show up as alerts.  Perhaps the link being
down would cause an alert as well, but this "real" problem may be buried
in a series of side-effect alerts.  The desired alert display
contains only the problems that the operators need to work on, and
filters out the side effects.

  The discovery rover solves these problems by only reporting what is
"known" to be a problem.   This is accomplished by letting nodes and
links occupy one of four states: UP, BUSY, Not Reachable (NR) and DOWN.


	********************************************
	*
	*  graphics -- Cannot represent as text.
	*
	********************************************

  Example 1: All nodes responded to the neighbor table poll, each
reporting all links in the UP state.


	********************************************
	*
	*  graphics -- Cannot represent as text.
	*
	********************************************

  Example 2: A link goes down, and the algorithm correctly identifies the
broken component.


	********************************************
	*
	*  graphics -- Cannot represent as text.
	*
	********************************************

  Example 3:  A network partition occurs, and the algorithm correctly
identifies the broken components.


	********************************************
	*
	*  graphics -- Cannot represent as text.
	*
	********************************************

  Example 4: A node that does not respond to queries, but is seen as "UP"
by any adjacent node, is marked "BUSY".
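
  The four states can be illustrated with a minimal sketch.  The rules
below are an assumption pieced together from the examples above, not the
rover's actual algorithm: a node that answers its poll is UP; a silent
node whose link is reported UP by a responding neighbor is BUSY; a silent
node that a responding neighbor reports only over DOWN links is DOWN; and
a silent node that no responding neighbor reports on at all is NR.  The
function name and the shape of the input are hypothetical.

```python
UP, BUSY, NR, DOWN = "UP", "BUSY", "NR", "DOWN"


def classify_nodes(nodes, responses):
    """Assign a state to every node from one round of neighbor-table polls.

    nodes     -- iterable of all known node names
    responses -- {node: {neighbor: link_state}} for the nodes that answered
    (Both structures are illustrative assumptions, not the rover's format.)
    """
    states = {}
    for node in nodes:
        if node in responses:
            # The node answered its own poll.
            states[node] = UP
            continue
        # Collect what responding neighbors say about their link to it.
        reports = [table[node] for table in responses.values() if node in table]
        if UP in reports:
            states[node] = BUSY   # silent, but a neighbor still sees it
        elif reports:
            states[node] = DOWN   # silent, and adjacent to the working network
        else:
            states[node] = NR     # hidden behind some other failure
    return states
```

  Under these assumed rules only the element next to the working part of
the network is marked DOWN; everything behind it is NR and can be
filtered from the alert display.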

  <here expand on the states and include some pictures as examples>

  Luckily, on the NSFNET backbone, SNMP provides easy access to
information that allows applications to "discover" the state and topology
of the backbone.  Using Merit's Fast SNMP API, which we will discuss
later, we are able to discover the state and topology of the backbone
in thirty seconds or less.  

  The discovery code queries each node for a list of neighbors and link
states, and uses this information to maintain a "Network Status
File".   As new nodes are added to the backbone, the discovery rover
automatically adds them to the Network Status File.   As nodes and
links transition between states, the Network Status File is updated to
reflect this.  Two programs use this file: SortStatus and the graphical
rover map.

  SortStatus is invoked after the poller discovers the network, and
adds or removes problems from the PROBLEM.FILE as appropriate.  This
program provides the "glue" between the Discovery Rover and the
PROBLEM.FILE-based display programs.
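
  The add-or-remove step can be sketched as a set comparison.  This is
only an illustration: the function name, the in-memory sets, and the
choice of DOWN and BUSY as the reportable states are assumptions, and the
on-disk layout of the PROBLEM.FILE is deliberately not modeled.

```python
def update_problems(problems, states, reportable=("DOWN", "BUSY")):
    """Return (new_problems, added, cleared) after one discovery pass.

    problems   -- set of element names currently listed as problems
    states     -- {element: state} as discovered in this pass
    reportable -- states treated as real problems (an assumption)
    """
    current = {name for name, state in states.items() if state in reportable}
    added = current - problems
    # Only clear entries for elements this pass actually knows about, so
    # problems recorded by other tests are left alone.
    cleared = {name for name in problems
               if name in states and name not in current}
    new_problems = (problems - cleared) | added
    return new_problems, added, cleared
```

  Elements in the NR state never enter the problem set in this sketch,
which is how the side effects of an outage stay off the display.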

  Many available network technologies do not provide instrumentation
denoting adjacencies.  For these networks, we split the topology
discovery and state determination into separate pieces.  The discovery
code will dump the routing tables to discover adjacencies.  Often this
involves hundreds of queries.   The output of the discovery code will be
a "LINKS" file that contains detailed information about the network
links, including interface addresses and interface numbers.

1.1  The Graphical Rover Display

  The graphical rover map provides a graphical reflection of the
Network Status File.  As nodes and links change state, the
nodes and links on the map change color.  As new nodes and links are
added to the network, these nodes and links are also added to the
backbone map.  New nodes are placed interactively on the map, and
facilities exist for each user to create their own network map.

  Also associated with each graphical rover map is a set of actions.
In their .Xdefaults file, users define what actions should be executed
when they click on a node or a link.  They can define their own menus
of actions for nodes and links.  Function keys can also be
assigned to invoke actions when the mouse is positioned over a node or
a link.  This kind of configurability makes the graphical rover map
extremely powerful.

  Each map is started by specifying the name of the network, where the
network name is a symbolic link to the xmap program.  You can create
the link on Unix systems with a command like:  ln -s xmap nsfnett3  .
The xmap program will determine the network Class from the name under
which it was invoked, and from that determine which user resources to
get from the resource manager.  Assume the network is called nsfnett3,
and the user has a .Xdefaults file like this:

     Nsfnett3*draw.XmPushButton.Translations: #augment \
     <Key>F1:	system( viewisis ) \n\
     <Btn1Down>,<Btn1Up>,<Btn1Down>,<Btn1Up>: system( openping ) \n\
     <Btn3Down>: PostNodeMenu() \n\
     Ctrl<Btn2Down> : MoveIt() 
      
     Nsfnett3*draw.Translations: #override \
     Ctrl<Btn2Up>: PlaceIt() \n\
     <Btn3Down>: PostLinkMenu()
      
     Nsfnett3.nodeMenuItem1: openping
     Nsfnett3.nodeMenuItem2: opentelnet
     Nsfnett3.nodeMenuItem3: opentelnet OUT-OF-BAND
     Nsfnett3.linkMenuItem1: DIagnoseProblem
     Nsfnett3.linkMenuItem2: CheckDSUs

  In the first section of this user's .Xdefaults file, pressing the F1
function key with the pointer over a node will invoke the viewisis
script.  All programs invoked by the graphical display receive
environment variables detailing which node or link was selected, so
programming the graphical display becomes simply a matter of defining
what actions are to occur when the user performs key clicks or other X
actions.

  The second translation in the first section shows how a sequence of
mouse events can be used to trigger a script.  In this example, a mouse
double click triggers an openping script, which opens up a window pinging
the node.

  The last two translations invoke actions that are built into the
application.  The PostNodeMenu() action, which will be discussed later,
provides a user-defined popup menu for network nodes.  The MoveIt()
action allows a node to be selected for interactive placement.

  The programmable graphical display also supports up to 20 menu items
for nodes and links.  The last section of the sample file shows how this
might be configured.  In this example, there are three node menu items
shown to the user when a node is selected with mouse button 3, and two
menu items when a link is clicked with mouse button 3.
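
  The step from program name to resource class described earlier can be
sketched as follows.  The text only says that xmap derives the Class
from the name it was invoked under; capitalizing the first letter is an
assumption that matches the sample file (nsfnett3 and Nsfnett3) and the
usual X convention.  The function name is hypothetical.

```python
import os


def resource_names(argv0):
    """Return (instance_name, class_name) for a program invoked as argv0.

    Assumes the class is the invoked name with its first letter
    capitalized; this is an illustration, not xmap's actual code.
    """
    instance = os.path.basename(argv0)
    class_name = instance[:1].upper() + instance[1:]
    return instance, class_name
```

  With the symbolic link from the example, resource_names("./nsfnett3")
gives the pair ("nsfnett3", "Nsfnett3"), so resources beginning with
Nsfnett3 apply to that map.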

  What we have done administratively is to use an "Execution Environment
script" to ensure that all users get the same map, menus, actions, etc.
This script sets $XENVIRONMENT to a resource file that contains the
resources for that network, and then invokes the xmap application
specifying the correct community name.  Users can always choose to create
their own .Xdefaults file and invoke the program independently of this
effort, but for people just starting, this works out very nicely.  It
also lets individual network "owners" maintain the appearance of their
maps independently of one another.
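
  A minimal sketch of such an Execution Environment script is shown
below.  The function name, the example path, and the extra_args
parameter are hypothetical; in particular, the text does not say how the
community name is passed to xmap, so no specific argument is shown for
it.

```python
import os


def build_launch(network, resource_file, extra_args=(), base_env=None):
    """Return (argv, env) for starting the map for one network.

    network       -- name of the symbolic link to xmap, e.g. "nsfnett3"
    resource_file -- path of the resource file for that network
    extra_args    -- hypothetical: whatever arguments a site passes to xmap
    base_env      -- environment to start from (defaults to os.environ)
    """
    env = dict(os.environ if base_env is None else base_env)
    # Point the X resource manager at the shared per-network resources.
    env["XENVIRONMENT"] = resource_file
    argv = [network] + list(extra_args)
    return argv, env
```

  A wrapper would then call os.execvpe(argv[0], argv, env) with the
returned values, for example after
build_launch("nsfnett3", "/usr/local/lib/rover/nsfnett3.res").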

  Sample network Resource File:


	********************************************
	*
	*  raster -- Cannot represent as text.
	*
	********************************************

1.2  Graphical Rover Display Actions

  We have created scripts that automatically diagnose common network
faults such as circuit problems.  NOC operators can click on a link and
see a graph showing input and output packets over the previous 24
hours, which is especially useful for planning scheduled outages.
Additionally, simple scripts that present historical information
about node and link outages have been written, along with scripts
that access the trouble ticket database for more information.

  The real power of the map lies in being able to see the state of the
network and launch tasks that query the network or locally stored
information.  All of the power and flexibility of the Unix operating
environment is available to script and program writers.  This is an
important difference, as many available commercial packages require users
to write scripts in sometimes arcane proprietary languages, typically
with no process control, programming tools or debugging environment.
Just code up the script or program and add it to the .Xdefaults file
and you are off.

Chapter 2.  MAP File Format

  A MAP file exists for each network, and performs two functions.
First, it maps the UniqueID from the STATUS file to a Name that will
appear on the map.  Second, it provides the x,y location on the screen
for the node.  The nodes can be moved around and placed interactively on
the screen.  The file looks like this:


	********************************************
	*
	*  raster -- Cannot represent as text.
	*
	********************************************

Chapter 3.  LINKS File Format

  Each entry in the LINKS file has the following fields:

     linktype TimeStamp UniqueID IPAddr ifNum ifNum IPAddr UniqueID TimeStamp x y

  Where:

  linktype = use the ifType numbers from RFC 1213 (MIB-II) (use 0
otherwise)

  TimeStamp = The time this side of the link was validated (use 0
otherwise)

  UniqueID = a single queryable address of the box.  For this box,
there must be one and only one UniqueID - this is the name that
describes the box.

  IPAddr = the IPAddress on this UniqueID that terminates this link.

  ifNum = the interface number - this is the interface rover will poll to
determine the state of this link.

  x, y = don't care - this was used for the Circuit Trouble Ticket
Sub-System and is probably not applicable for general use.   See the
CircuitCheck code if you have questions about this and think you may have
the ability to detect circuit transmission and reception problems by
querying your transmission units (CSUs, DSUs, modems, etc.).
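
  The field list above can be read with a short parser such as the
following sketch.  It assumes the eleven fields are separated by
whitespace and that linktype, TimeStamp and ifNum are decimal integers;
the tuple and function names are hypothetical, and x and y are kept as
uninterpreted strings.

```python
from collections import namedtuple

# One end of a link: the four per-side fields from the format above.
LinkEnd = namedtuple("LinkEnd", "timestamp unique_id ip_addr if_num")
Link = namedtuple("Link", "linktype a b x y")


def parse_links_line(line):
    """Parse one LINKS entry into a Link (assumes whitespace separation)."""
    fields = line.split()
    if len(fields) != 11:
        raise ValueError("expected 11 fields, got %d" % len(fields))
    # The second half of the entry mirrors the first half in reverse order.
    (linktype, ts_a, id_a, ip_a, if_a,
     if_b, ip_b, id_b, ts_b, x, y) = fields
    return Link(
        linktype=int(linktype),
        a=LinkEnd(int(ts_a), id_a, ip_a, int(if_a)),
        b=LinkEnd(int(ts_b), id_b, ip_b, int(if_b)),
        x=x,
        y=y,
    )
```

  For a made-up entry such as
"22 0 192.0.2.1 192.0.2.5 3 2 192.0.2.6 192.0.2.2 0 0 0", the parser
returns interface 3 on box 192.0.2.1 as one end and interface 2 on box
192.0.2.2 as the other.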

  Here is an example LINKS file.  John Vollbrecht (Merit) wrote a
script to generate this LINKS file from the Cisco configuration files.


	********************************************
	*
	*  raster -- Cannot represent as text.
	*
	********************************************

Chapter 4.  Dynamic Link Width on the Graphical Display

  The width of the links on the graphical display can vary
dynamically.  It is up to the user to define the metrics for link width and
invoke the routines to create the information necessary for the map
application.  One such program is included in the package and is called
makelinkwidth.  Its output is a linkwidth file (.LW)
placed in $PINGKYDIR, which is automatically detected by the xmap
application.  The LinkWidth Menu presents to the user all of the
available LinkWidth files.
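
  As an illustration of a user-defined metric, the sketch below maps a
measured value (for example, packets per second on a link) to a line
width.  The function name and the thresholds are hypothetical, this is
not how makelinkwidth works, and the layout of the .LW file is not
described here, so the sketch stops short of writing one.

```python
def link_width(value, thresholds=(1000, 10000, 100000)):
    """Map a link metric to a line width in pixels.

    The width is 1 plus the number of thresholds the value has reached,
    so busier links are drawn wider.  The thresholds are made-up examples.
    """
    return 1 + sum(1 for limit in thresholds if value >= limit)
```

  A site would pick thresholds that suit its own traffic levels and the
number of distinct widths it wants on the map.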

