
   ==================================================================
   ===                                                            ===
   ===           GENESIS Distributed Memory Benchmarks            ===
   ===                                                            ===
   ===                           COMMS1                           ===
   ===                                                            ===
   ===                      Nearest Pingpong                      ===
   ===                                                            ===
   ===               Author:   Roger Hockney                      ===
   ===     Department of Electronics and Computer Science         ===
   ===               University of Southampton                    ===
   ===               Southampton SO9 5NH, U.K.                    ===
   ===     fax.:+44-703-593045   e-mail:rwh@uk.ac.soton.ecs       ===
   ===                                                            ===
   ===     Copyright: SNARC, University of Southampton            ===
   ===                                                            ===
   ===          Last update: November 1991; Release: 2.0          ===
   ===                                                            ===
   ==================================================================


1. Description
--------------
This benchmark measures the basic communication properties of a
computer network by performing the 'pingpong' experiment between
a neighbouring pair of nodes.  A message of varying length is sent
to a neighbouring node, and immediately returned after the data
has become available to the receiving user program. Half the time 
for this pingpong exchange is recorded as the time to send a 
message from one node to a neighbour.  This time is fitted by 
least-squares to the straight line relation:

                     tn = (n + nhalf) / rinf                   (1)

where  rinf  = the asymptotic stream rate (Byte/s), and
       nhalf = the message length (Byte) giving half the 
               asymptotic performance

This corresponds to an average performance, r, as a function of
message length, n:
                            rinf
                    r = -------------                          (2)
                        (1 + nhalf/n)

In equation (2), rinf is the asymptotic stream rate which, together
with the value of nhalf, determines the average bandwidth for a given
message length. The value of rinf fitted for short messages may be
high, but that rate will not be achieved in practice because of the
effect of nhalf via equation (2).
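
As an illustration of equations (1) and (2), the following is a
minimal sketch in Python, not the benchmark's own code. Equation (1)
is a straight line in n with slope 1/rinf and intercept nhalf/rinf, so
both parameters follow from an ordinary least-squares line fit. The
function names and the timing values are assumptions made for this
example; the timings are generated from equation (1) and are not
measurements.

```python
def fit_rinf_nhalf(lengths, times):
    """Least-squares fit of t = (n + nhalf) / rinf; returns (rinf, nhalf)."""
    m = len(lengths)
    mean_n = sum(lengths) / m
    mean_t = sum(times) / m
    # Slope and intercept of the straight line t = slope * n + intercept.
    sxy = sum((n - mean_n) * (t - mean_t) for n, t in zip(lengths, times))
    sxx = sum((n - mean_n) ** 2 for n in lengths)
    slope = sxy / sxx
    intercept = mean_t - slope * mean_n
    rinf = 1.0 / slope          # asymptotic stream rate, Byte/s
    nhalf = intercept / slope   # half-performance message length, Byte
    return rinf, nhalf


def average_rate(n, rinf, nhalf):
    """Equation (2): average performance for a message of n Byte."""
    return rinf / (1.0 + nhalf / n)


# Synthetic timings generated from equation (1); NOT measured data.
true_rinf, true_nhalf = 2.0e6, 500.0
lengths = [100, 200, 400, 800, 1600, 3200]
times = [(n + true_nhalf) / true_rinf for n in lengths]

rinf, nhalf = fit_rinf_nhalf(lengths, times)
```

Evaluating average_rate(nhalf, rinf, nhalf) gives rinf/2, which is the
sense in which nhalf is the message length giving half the asymptotic
performance.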

The benchmark has been deliberately kept simple by restricting the
test to neighbouring nodes and asynchronous communication. This is
the most favourable case and gives a lower bound on the time for
the communication of a message. Asynchronous, here, means that a send
returns to the calling program when the user data array being sent 
may be safely reused.  This, however, may be before the message has
been received by the receiving node.  The receiving node program
stops (i.e. blocks) until the data is available for use by the user's
program.
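
The pingpong exchange described above can be sketched as follows. This
is only an illustration in Python using a local socket pair and a
thread in place of the second node; it is not the PARMACS code of the
benchmark, and the names recv_exact, echo_node and pingpong_time are
hypothetical. The send returns once the data has been handed over, and
the receive blocks until the whole message is available.

```python
import socket
import threading
import time


def recv_exact(sock, nbytes):
    """Block until exactly nbytes have been received."""
    chunks = []
    remaining = nbytes
    while remaining > 0:
        chunk = sock.recv(min(remaining, 65536))
        if not chunk:
            raise ConnectionError("peer closed the connection")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def echo_node(sock, nbytes, repeats):
    """Neighbour node: wait for each message, then return it at once."""
    for _ in range(repeats):
        sock.sendall(recv_exact(sock, nbytes))


def pingpong_time(nbytes, repeats=100):
    """Return half the mean round-trip time (s) for an nbytes message."""
    a, b = socket.socketpair()
    node = threading.Thread(target=echo_node, args=(b, nbytes, repeats))
    node.start()
    message = bytes(nbytes)
    start = time.perf_counter()
    for _ in range(repeats):
        a.sendall(message)      # returns once the data is handed over
        recv_exact(a, nbytes)   # blocks until the echo is available
    elapsed = time.perf_counter() - start
    node.join()
    a.close()
    b.close()
    return elapsed / (2 * repeats)
```

Calling pingpong_time for a range of message lengths gives the
(n, tn) pairs to which equation (1) is fitted. Averaging over several
repeats is a choice made in this sketch to reduce timer noise.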


2. Operating Instructions
-------------------------
Many message-passing computers have different timing for short and
long messages. This benchmark assumes by default that the break between 
the two types of message occurs between 99 and 100 Byte. If this is
not the case, then the user has to enter the actual value during the
benchmark initialisation.
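
The following sketch shows one way such a break could be used. It
assumes, as the default break suggests but this text does not state,
that equation (1) is fitted separately to the short and the long
messages. The function names, the parameter first_long and the timing
values are assumptions made for this example; the timings are
synthetic, not measured.

```python
def fit_line(points):
    """Least-squares (rinf, nhalf) for a list of (length, time) pairs."""
    m = len(points)
    mean_n = sum(n for n, _ in points) / m
    mean_t = sum(t for _, t in points) / m
    sxy = sum((n - mean_n) * (t - mean_t) for n, t in points)
    sxx = sum((n - mean_n) ** 2 for n, _ in points)
    slope = sxy / sxx
    intercept = mean_t - slope * mean_n
    return 1.0 / slope, intercept / slope


def fit_short_and_long(points, first_long=100):
    """Fit messages below first_long Byte and the rest separately."""
    short = [(n, t) for n, t in points if n < first_long]
    long_ = [(n, t) for n, t in points if n >= first_long]
    return fit_line(short), fit_line(long_)


# Synthetic timings with different parameters on each side of the break.
points = [(n, (n + 50.0) / 1.0e6) for n in (10, 20, 40, 80, 99)]
points += [(n, (n + 400.0) / 2.0e6) for n in (100, 200, 400, 800, 1600)]

(short_rinf, short_nhalf), (long_rinf, long_nhalf) = fit_short_and_long(points)
```

With the default first_long=100, messages of up to 99 Byte are treated
as short and messages of 100 Byte or more as long.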

To expand the PARMACS macros, and to compile and link the code with
the appropriate libraries, enter the directory d77 and type:     make

On some systems it may be necessary to allocate the appropriate
resources before running the benchmark, e.g. on the iPSC/860, to
reserve a cube of 2 processors, type:    getcube -t2

To run the benchmark executable, type:    host

This will automatically load both host and node programs. The progress
of the benchmark execution can be monitored via the standard output, 
whilst a permanent copy of the results is written to a file called
'result'. If the run is successful and a permanent record is required,
the file 'result' should be copied to another file before the next run
overwrites it.

The whole test takes about 2 1/2 minutes on the Intel iPSC/860.


3. Modifications
----------------
The program may easily be modified to perform the pingpong test
between any pair of nodes, in order to study how the time varies with
separation within the network. Change the size of TORUS requested in
host.u, and the node process ids in the program node.u.
