.\" @(#)debugging	1.2	12/4/92
.EQ
delim off
.EN
.H1 "Debugging \*(PT and Extensions Within Pigi"
.pp
.Ir debugging
The extensibility of \*(PT can introduce problems.
Code that you add may be defective (few people write perfect code
every time), or may interact with \*(PT in unexpected ways.
These problems most frequently manifest themselves as a \*(PT crash,
where the \*(PT kernel aborts, creating a core file.
.Ir "core dump"
.Ir crashes
.pp
Because pigiRpc and VEM are separate Unix\** processes,
.Ir pigiRpc
VEM keeps running when pigiRpc aborts with a fatal error.
Your VEM schematic is unharmed and can be safely saved.
VEM gives a cryptic error message something like:
.Ir "RPC Error"
.Ir "crashes"
.(c
RPC Error: server: application exited without calling RPCExit
Closing Application /home/ohm1/users/messer/ptolemy/lib/pigiRpcShell on host foucault.berkeley.edu
Elapsed time is 1538 seconds
.)c
The message
.(c
segmentation fault (core dumped)
.)c
.Ir "core dumped"
.Ir "segmentation fault"
will appear in the window from which you started pigi.
The first line of the VEM message might alternatively read
.(c
RPC Error: fread of long failed
.)c
.Ir "fread of long failed"
VEM is trying to tell you that it is unable to get data from the
link to the \*(PT kernel.
In either case,
the crash leaves a large file called ``core'' in your home directory.
This file is useful for finding the problem.
.H2 "A Quick Scan of the Stack"
.pp
.Ir "stack"
Assuming you are using Gnu tools, and assuming the pigiRpc executable
that you are using is in your path, go to your home directory
and type:
.(c
gdb pigiRpc
.)c
Once the core file is loaded, the symbolic debugger (gdb) can show
the state of the stack at the point where the program failed.
The most recently called function
might give you a clue about the cause of the problem.
.Ir Gnu
.Ir gdb
.Ir "symbolic debugger"
Here is a typical session:
.(c
foucault.berkeley.edu 54: gdb pigiRpc
GDB is free software and you are welcome to distribute copies of it
 under certain conditions; type "show copying" to see the conditions.
There is absolutely no warranty for GDB; type "show warranty" for details.
GDB 4.7, Copyright 1992 Free Software Foundation, Inc...
(no debugging symbols found)...
(gdb) core core
Core was generated by `pigiRpc'.
Program terminated with signal 11, Segmentation fault.
#0  0x2eebb4 in end ()
(gdb) where
#0  0x2eebb4 in end ()
#1  0x1c774c in SimControl::doPreActions ()
#2  0x1c8538 in Star::run ()
#3  0x16ee74 in DataFlowStar::run ()
#4  0x16d888 in SDFScheduler::runOnce ()
#5  0x16d7c4 in SDFScheduler::run ()
#6  0x1cb4c0 in Target::run ()
#7  0x1cd4e4 in Runnable::run ()
#8  0x1bd810 in InterpUniverse::run ()
#9  0x197774 in KcRun ()
#10 0x18edac in RpcRun ()
#11 0x18eeb8 in RpcReRun ()
#12 0x177634 in RPCApplicationProcessEvents ()
#13 0x17675c in RPCMain ()
#14 0x2300 in main ()
(gdb) 
.)c
The ``where'' command shows the state of the stack
at the time of the crash.
Scanning this list we can recognize that the crash occurred during
the execution of a star.
Unfortunately, unless you are running a version of pigiRpc with
the debug symbols loaded, it will be difficult to tell much more
from this.
.H2 "More Extensive Debugging"
.pp
To do more extensive debugging, you need to create or find
a version of pigiRpc with debug symbols, called pigiRpc.debug.
See the previous section for instructions on how to do this.
Once you have this, and have set the PIGIRPC environment variable
to point to it, run pigi as follows:
.(c
pigi -debug
.)c
An extra window running gdb appears.
(If this fails, then gdb is probably not installed at your site
or is not in your path.)
Type ``cont'' to continue past the initial breakpoint.
Now, if you can replicate the situation that created the crash,
you will be able to get more information about what happened.
Here is a sample of interaction with the debugger through
the gdb window:
.(c
GDB is free software and you are welcome to distribute copies of it
 under certain conditions; type "show copying" to see the conditions.
There is absolutely no warranty for GDB; type "show warranty" for details.
GDB 4.7, Copyright 1992 Free Software Foundation, Inc...
Breakpoint 1 at 0x22b0: file pigiMain.cc, line 45.

Breakpoint 1, main (argc=10, argv=0xf7fff77c) at pigiMain.cc:45
45              pigiFilename = argv[0];
(gdb) cont
Continuing.
.)c
At this point, you are running \*(PT.  Use it in the usual
way to replicate your problem.  When you succeed, you will get a
message something like:
.(c
Program received signal 11, Segmentation fault
0x166d8c in end ()
(gdb) 
.)c
At this point you can again examine the stack.
This time, however, there will be more information.
For the same example as above:
.(c
(gdb) where
#0  0x166d8c in end ()
#1  0xa34ec in SimControl::doPreActions (which=0x2671c0)
    at ../../src/kernel/SimControl.h:89
#2  0xa42d8 in Star::run (this=0x2671c0) at ../../src/kernel/Star.cc:68
#3  0x5632c in DataFlowStar::run (this=0x2671c0)
    at ../../../../src/domains/sdf/kernel/SDFStar.cc:81
#4  0x54d40 in SDFScheduler::runOnce (this=0x2681d8)
    at ../../../../src/domains/sdf/kernel/SDFScheduler.cc:119
#5  0x54c7c in SDFScheduler::run (this=0x2681d8)
    at ../../../../src/domains/sdf/kernel/SDFScheduler.cc:98
#6  0xa7260 in Target::run (this=0x266f00) at ../../src/kernel/Target.cc:125
#7  0xa8eb4 in Runnable::run (this=0x266c50) at ../../src/kernel/Universe.h:69
#8  0x9c1f8 in InterpUniverse::run (this=0x266bf8)
    at ../../src/kernel/InterpUniverse.h:64
#9  0x7eb2c in KcRun (n=10) at ../../src/pigilib/kernelCalls.cc:458
#10 0x75e80 in Run (facetPtr=0xf7fff4c8) at ../../src/pigilib/exec.c:164
#11 0x761c0 in RpcRun (spot=0xf7fff5e4, cmdList=0x164158, userOptionWord=0)
    at ../../src/pigilib/exec.c:258
#12 0x5eaec in RPCApplicationProcessEvents ()
#13 0x5dc14 in RPCMain ()
#14 0x2300 in main (argc=10, argv=0xf7fff77c) at pigiMain.cc:50
(gdb) 
.)c
This particular stack trace is a little strange at the ``bottom''
(gdb calls the lower-numbered frames the bottom even though they are at
the top of the list):
frame #0 is reported only as end(),
because the trace was generated by invoking a dynamically linked star,
and the symbol information is not complete.
However, you can still find out quite a bit.
Notice that gdb now reports the source file and line number
for each method being called.  The file names are all relative to
the directory in which the corresponding object file normally
resides.  The \*(PT files can all be found in some subdirectory
of ~ptolemy/src.
.pp
You can get help from gdb by typing ``help''.
Suppose you first wish to find out which star is being run when
the crash occurs.
The following sequence moves up the stack until it reaches the ``run'' call
of a Star:
.(c
(gdb) up
#1  0xa34ec in SimControl::doPreActions (which=0x2671c0)
    at ../../src/kernel/SimControl.h:89
89                              : !haltRequested();
(gdb) up
#2  0xa42d8 in Star::run (this=0x2671c0) at ../../src/kernel/Star.cc:68
68              go();
.)c
At this point, you can see that line 68 of the
file ~ptolemy/src/kernel/Star.cc reads
.(c
                go();
.)c
Odds are pretty good that the problem is in the go() method
of the star.
You can find out to which star this method belongs as follows:
.(c
(gdb) print *this
$1 = {<Block> = {<NamedObj> = {nm = 0x267268 "BadStar1", prnt = 0x266bf8, 
      myDescriptor = 0x166ca0 "Causes a core dump deliberately.", 
      _vptr$FOO = 0x166e08}, ports = {<NamedObjList> = {<SequentialList> = {
          lastNode = 0x267258, dimen = 1}, }, }, 
    states = {<NamedObjList> = {<SequentialList> = {lastNode = 0x0, 
          dimen = 0}, }, }, 
    multiports = {<NamedObjList> = {<SequentialList> = {lastNode = 0x0, 
          dimen = 0}, }, }}, targetPtr = 0x266f00, indexValue = -1, 
  inStateFlag = 1}
(gdb)
.)c
This tells you that a star with name (nm) "BadStar1" and
descriptor "Causes a core dump deliberately." is being invoked.
This particular star has the following erroneous
go() method:
.(c
        go {
                char* p = 0;
                *p = 'c';
        }
.)c
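.pp
The go() method above fails because it writes through a null pointer.
As a minimal sketch of one way to guard against this kind of error,
the store can be wrapped in a check.
The function storeChar below is a hypothetical helper invented for
this illustration; it is not part of \*(PT, and a real star would
probably report the error rather than silently skip the store.
.(c
// Hypothetical helper, not part of Ptolemy: stores c through p
// only when p is non-null, and reports whether the store happened.
bool storeChar(char* p, char c) {
    if (p == 0) {
        return false;   // the case that crashes in the go() method above
    }
    *p = c;
    return true;
}
.)c
With this helper, storeChar(0, 'c') returns false instead of faulting,
while a call with a valid address performs the store and returns true.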
.pp
More elaborate debugging requires that the symbols for the star
be included.  The easiest way to do this is to build a version
of pigiRpc.debug that includes your star already linked into the system.
Then repeat the above procedure.  The bottom of the stack will then
show much more complete information about what is occurring.
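.pp
To practise the gdb commands described above without a full \*(PT
build, a small standalone program can imitate the situation.
The following is only a sketch under stated assumptions:
the classes MiniStar and BadMiniStar and the function runStar are
hypothetical names invented here, and they merely mimic the
Star::run() to go() call seen in the stack trace.
.(c
// Hypothetical stand-in for a star; not a Ptolemy class.
class MiniStar {
public:
    explicit MiniStar(const char* n) : nm(n) {}
    virtual ~MiniStar() {}
    // Plays the role of the Star::run() frame: it only calls go().
    void run() { go(); }
    const char* name() const { return nm; }
protected:
    virtual void go() = 0;
private:
    const char* nm;     // visible in "print *this", like nm in the text
};

// A star whose go() writes through a pointer without checking it.
class BadMiniStar : public MiniStar {
public:
    BadMiniStar(const char* n, char* t) : MiniStar(n), target(t) {}
protected:
    void go() { *target = 'c'; }
private:
    char* target;
};

// Runs the star once.  With crash == true the pointer is null and
// go() is expected to fault, which is the case to examine under gdb.
char runStar(bool crash) {
    char cell = 0;
    BadMiniStar star("BadStar1", crash ? 0 : &cell);
    star.run();
    return cell;
}
.)c
Calling runStar(true) from a main() of your own, in a program compiled
with debug symbols (usually the -g option) and without optimization,
should typically produce a segmentation fault inside go().
The ``where'', ``up'', and ``print *this'' commands can then be tried on
the resulting core file or on the running program.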
.EQ
delim $$
.EN
.\" Local variables:
.\" mode: nroff
.\" End:
