\"				-*-Text-*-
\" $Locker:  $
\" $Revision: 1.0 $
\" $Log:	gmat.ms,v $
\" Revision 1.0  87/09/14  13:04:20  seager
\" Initial Release
\" 
\" $Header: gmat.ms,v 1.0 87/09/14 13:04:20 seager Rel $
\" To print this puppy out use:
\" eqn gmat.ms | tbl | TROFF -ms
\" where TROFF is your version of device independent troff (psroff, ditroff)
.nr PS 12
.nr VS 14
.ds CH
.ds RF UCID-21348
.ds CF -%-
.ds LF ISCR-87-2
.EQ
delim @@
.EN
.RP
.TL
Graphical Multiprocessing Analysis Tool (GMAT)
.AU
Mark K. Seager
.AI
Computing and Mathematics Research Division
Lawrence Livermore National Laboratory
Livermore, CA 94550
.AU
Susan Campbell \" Dept. of Mathematics, Carnegie Mellon University, Pittsburgh, PA 15213
.AU
Scott Sikora \" Dept. of Electrical Engineering & Computer Science, Massachusetts Institute of Technology, Cambridge, MA 02139
.AI
Institute for Scientific Computing Research
Lawrence Livermore National Laboratory
Livermore, CA 94550
.AU
Robert Strout
.AU
Mary Zosel
.AI
User System Division
Lawrence Livermore National Laboratory
Livermore, CA 94550
.AB
The design and debugging of parallel programs is a difficult task due
to the complex synchronization and data scoping issues involved.
To aid the programmer in parallel code development we have developed
two methodologies for the graphical display of execution of
parallel codes.
The Graphical Multiprocessing Analysis Tools (GMAT) consist of
\fBstategraph\fR, which represents an inheritance tree of task states,
and \fBtimeline\fR, which represents tasks as a flowing sequence of events.
Information about the code can be displayed as the application runs
(dynamic mode) or played back with time under user control (static mode).
This document discusses the design and user interface issues
involved in developing the parallel application display GMAT family.
Also, we present an introductory user's guide for both tools. 
.AE
.SH
Keywords
.LP
Parallel Processing, Debugging, Graphical User Interfaces,
Multitasking, Microtasking
.bp
.XS 3
1. Introduction
.XA 4
2. Design Goals for GMAT
.XA 5
3. \fBStategraph\fR
.XA 12
4. \fBTimeline\fR
.XA 20
5. GMAT \fBStategraph\fR User's Guide
.XA 23
6. GMAT \fBTimeline\fR User's Guide
.XA 27
7. The Cray Compatibility Library
.XA 40
8. Conclusions
.XA 41
9. Acknowledgements
.XA 41
10. Bibliography
.XA 42
11. Figures
.XE
.PX
.bp
.NH 1
Introduction
.PP
Writing and debugging multiprocessing codes can be a difficult process.
A successful program must control multiple processors bound to
multiple tasks cooperating on a single user job (or program).
Each task can execute and report to subsequent tasks only if it has
all relevant information from previous tasks.  
This results in complicated inter-task synchronization (on shared
memory multiprocessors) or communication strategies (on distributed
memory multiprocessors).
To debug even a relatively simple program, the programmer must have
access to information detailing how tasks are managed, how the data is
partitioned (shared vs. private), and how and when tasks are synchronized.
We will focus primarily on shared memory machines, but all of our ideas
easily extend to the distributed memory machines.
.PP
In the following we make a distinction between tasks, processes and
processors.
Processors are the physical CPU's. 
Processes are the operating system scheduled units of computation run
on behalf of a user job.
Tasks are portions of work within the user code.
Tasks are implemented as separate threads of execution within a set of
processes and are scheduled by the multiprocessing library.
For example, a job may have three processes running on two physical
CPU's supporting five user tasks.
.PP
Currently, at the Lawrence Livermore National Laboratory (LLNL),
multiprocessing capabilities exist on various machines, including
four-processor Cray X-MP's and an eight-processor Alliant FX/8.
Through specifically designed multiprocessing features of the Networking
Livermore TimeSharing operating System (NLTSS) on the X-MP's and the
UNIX operating system on the FX/8, multiprocessing can be utilized by
the applications/systems programmer via subroutine calls to
multitasking libraries or via (preprocessed) compiler directives
(microtasking).
Another motivating factor for this work is to have source code
compatibility between the Cray and Alliant environments.
A Cray compatibility library (libCRAY.a) was written for the Alliant
which duplicates the LLNL multitasking library (NSYSLIB) for Cray's as
closely as possible. 
In both the NLTSS/Cray X-MP and UNIX/Alliant FX/8 environments micro-
and multi-tasking can coexist in the same job.
Built into these libraries is the ability to log all multiprocessing
events (e.g., lock, event and barrier usage and task creation,
completion and destruction). 
The previous capability for analyzing this log resided in an NLTSS
utility: Multiprocessing Analysis Tool (MAT) [1].
From a trace file generated by NSYSLIB, MAT generates a detailed human
readable (ASCII text) file displaying the execution of the
multiprocessing code.   
The MAT output file contains a static postprocessing trace of
all multiprocessing events.
The multiprocessing event data that is generated for MAT and GMAT is
patterned after a similar tool designed by Cray Research, Inc. called
MTDUMP.
.PP
From studying these MAT traces it became apparent that a graphical
display of the mountain of data might be a preferable presentation.
Concurrent with the development of MAT was an effort at Argonne
National Laboratory (ANL) to give a graphical display of the data
dependence graph arising from the use of the Schedule multiprocessing
library in an application code [2]. 
Born of these two sources, the Graphical Multiprocessing Analysis Tool
(GMAT) project was begun. 
GMAT provides a graphical display on a Sun Workstation (running UNIX
and SunView) of library generated trace files of programs run on
either a Cray X/MP (NSYSLIB) or the Alliant FX/8 (libCRAY.a). 
.NH 2
Brief Description of Argonne Schedule Tracing Facility
.PP
The Schedule library by Jack Dongarra and Danny Sorensen at ANL is
a set of Fortran callable subroutines designed to allow users to
write \fIportable\fR multiprocessing applications [2].
The approach of Schedule is to have the user define the elements of
work to be accomplished (called \*Qtasks\*U: not to be confused with
tasks as defined above) and to delineate the data dependencies between these
\*QTasks.\*U
Schedule then schedules these \*QTasks\*U for execution using
whatever the local operating system can offer in the way of
multiprocessing and inter-processor synchronization.
This data driven approach to multiprocessing is very interesting, but
is quite different from the control flow approach of the Cray
multitasking/microtasking libraries.
Within Schedule there is the capability to log library events (calls
to the library) and a trace file is created at the end of a run.
.PP
The Argonne schedule tracing facility reads in this trace file and creates
a graphical representation of the data dependencies of the tasks
involved in the execution of the multiprocessing code. 
This trace file differs greatly from the trace file produced by
NSYSLIB/libCRAY.a.
Whereas the NSYSLIB/libCRAY.a generated trace file is a listing of all
logged multiprocessing events, the first portion of the Argonne trace
file must specify the static structure of the data-dependency graph.  
For each task, the number of children, the number of parents, and the
task numbers of the parents are specified. 
Based on this information, the entire graph structure is predetermined
and can be drawn.
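.PP
As a rough illustration of why such a header is enough to draw the
graph before any events are replayed, the following Python sketch
rebuilds the parent-to-child edges from per-task records.
The record layout (task number, number of children, list of parent
task numbers) and the name \fIbuild_graph\fR are assumptions made for
this illustration; the actual Argonne trace file format is not
reproduced here.
.DS L
```python
def build_graph(records):
    """Return {task: [children]} from (task, n_children, parents) records."""
    children = {task: [] for task, _, _ in records}
    for task, _, parents in records:
        for parent in parents:
            children[parent].append(task)
    # Cross-check the declared child counts against the derived edges.
    for task, n_children, _ in records:
        if len(children[task]) != n_children:
            raise ValueError("child count mismatch for task %d" % task)
    return children


# Hypothetical four-task example: task 4 depends on tasks 2 and 3.
records = [(1, 2, []), (2, 1, [1]), (3, 1, [1]), (4, 0, [2, 3])]
graph = build_graph(records)
print(graph)
```
.DE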
.PP
All tasks are initially drawn as hollow circles and are filled in as
execution continues.  Each task progresses through four execution
states ranging from \*Qnot yet executed\*U to \*Qexecution completed.\*U
Features of the tool include four panel sliders to indicate the number
of events processed and the number of active processes, and to allow
the setting of timing speed and event speed.  Panel buttons allow
loading a trace file, starting, stopping, and quitting the tool,
redrawing the graph, drawing a subtree, displaying a histogram of
active processes versus events processed, calculating a critical path,
and adjusting the size of the display.  Directory and trace file choices
are made via menu-selection.
.PP
In order to become intimately familiar with the Argonne approach and
determine its potential we took the code for the schedule trace
facility and enhanced it.
Modifications included numerous bug fixes, the addition of a text
subwindow for the display of messages, and a conversion of the SunView
tool from polling to interrupt-driven notification (see [3] for the
definition of terms). 
This conversion greatly decreased the CPU load during execution.
This version of the Schedule trace facility is now being distributed
by ANL via NETLIB [6].
.PP
GMAT differs greatly from the Argonne schedule tracing facility for
two reasons: the trace file formats differ, and the small number of
states that the Argonne facility displays is too limited to convey all
of the multiprocessing states that NSYSLIB/libCRAY.a logs.
We were not able to use much of the ANL tool in the GMAT effort.
Two routines were retained: the menu and file choice handler.
.NH 1
Design Goals of GMAT
.PP
The primary design goal of the GMAT family was to create a set of
standard SunView tools that graphically display, and help the user
analyze, a library generated execution trace file of a parallel
application. 
Implicit in this are four requirements:
1) the ability to turn tracing on and off (selective use);
2) a multiprocessing library that contains all the code required
for tracing and still gives users the ability to log their own events
(ease of data generation);
3) the ability to watch the graphics interface while the code is running
and to play back a previous execution trace over and over again in
order to study portions very carefully (dynamic and static modes);
4) the use of only what was available from Sun Microsystems for
creating a SunView application (\*Qoff-the-shelf parts\*U).
Note that there are fundamentally three aspects to accomplishing this
goal: distributed computation, graphical display and control.
The distributed computation is needed because the parallel application is
running on a remote multiprocessing host and is accomplished via
communication sockets in the Alliant (UNIX 4.2BSD)/Sun (UNIX 4.2BSD)
environment and LINCS (Livermore Interactive Network Communication
System) interprocess communication protocols in the Cray (NLTSS)/Sun
(UNIX/LINCS 4.2BSD) environment. 
In the display area of GMAT it is apparent that there are two quite
different types of information from a trace file that one would like
to view: task genealogy and synchronization.
Since inter-task synchronization is only really necessary as a result
of concurrent execution, the notion of time should be a fundamental
aspect of the graphical display of synchronization data.
On the other hand, a time line approach is not conducive to displaying
task lineage because the fork/join operations are generally separated
by enough time so that getting all of it on a single screen is
impossible.
Hence, we developed two separate tools and highlighted each aspect
separately.
Most of the control features of the two tools are similar, and hence
duplicate work and user training are reduced.
.PP
\fBStategraph\fR represents each task by a node in an ancestry tree,
and changes the node representation as events are processed that affect the
state of the task.
Lines are drawn connecting parent tasks to the children
that they spawn, thereby indicating the lineage of the tasks.  
One limitation of the state graph is that only events that affect the
state of each task are shown graphically.  
Other nonstate events (such as Not Waiting on a Lock) are only
available via the information windows for each task.  
.PP
\fBTimeline\fR represents each task by a line, and posts nodes along the
line as events occur.  Thus, events are shown graphically and one
can watch the execution of a task as the time line grows downward.
\fBTimeline\fR also provides a sense of time, as the events are processed on
a time interval basis, rather than on an event basis as in \fBstategraph\fR.  One
limitation of \fBtimeline\fR is that little past history is kept.  Since all
events are posted, the time line grows rapidly, and one loses past
information such as a task's parent or children.  
\fBStategraph\fR and \fBtimeline\fR provide two unique ways of looking
at the NSYSLIB/libCRAY.a library generated trace information, each
with its own advantages and drawbacks.  
Let us now consider each method in detail.
.NH 1
Stategraph
.NH 2
Description and Design Considerations
.PP
The GMAT \fBstategraph\fR tool is a graphical display of the task
genealogy tree of a multiprocessing application.
Each node in the tree is a task.
Spawned tasks appear below the \*Qparent\*U task and are connected with a line.
As a task changes state (e.g., from unbound to bound to a process, or
from running to waiting) the displayed node changes shape and/or polarity
(see Figures 1 and 2).
There is no sense of time in the state graph, only changes in state,
delimited by significant events in the trace file (see Section 3.6 for a
list of these significant events).
The graphical representations for different states were chosen so that
both multitasking and microtasking events can be displayed.
Thus, at any given moment, the graph will reflect the state of all
tasks created so far by the application.
The rate at which the screen updates occur is controlled by the user to 
facilitate the detailed viewing of important sections and the speedy
presentation of unimportant ones.
The presentation of the state graph and the associated control
functions to the user is accomplished via four main windows:
the panel subwindow, the text subwindow, the overview canvas, and the
primary canvas.
Each of these windows is discussed in detail in the following.
.NH 3
Panel Subwindow
.PP
The panel is the main user control interface (upper left hand window
in Figure 2).
The Sun Visual/Integrated Environment for Workstations (SunView)
contains all the necessary components to build an interactive user
control interface.
These SunView components allow users to:  1) move around within the
directory structure and choose trace files for viewing (PANEL_CHOICE
items); 2) control the speed of the event display (PANEL_SLIDER item);
3) view (in a numerical fashion) the number of active tasks and
processes and the number of events processed so far in the trace file
(PANEL_MESSAGE items);
4) select computations for the \fBstategraph\fR tool to perform (e.g., load
trace files, step through a trace file, bring up the key or histogram
windows, etc.) (PANEL_BUTTON items).
See reference [3] for a complete description of the SunView
environment.
The speed control slider expresses the event display speed as a
percentage of the fastest possible speed. 
The default value of the slider is set at 75% and can easily be
changed by clicking with the left mouse button in the slider at the
desired value (current value is displayed in brackets).
When one of the panel buttons is clicked with the left mouse
button, control is passed from the SunView *Notifier* to the function
declared as that button's notify procedure.
Notify procedures are also used for the aforementioned SunView
PANEL_CHOICE and SunView PANEL_SLIDER items.
The code in any notify procedure is executed until a \fIwindow_return\fR
statement is encountered: at that point control returns to the SunView
*Notifier*.
.PP
Representation of a state graph may require the display of many
different symbols (graphical images).
To aid the user in interpreting these symbols a key (cheat-sheet) of
all the \fBstategraph\fR graphical images is available as a pop-up window
via the \*QKey\*U PANEL_BUTTON item (see Figure 1).
.PP
If one part of the inheritance tree is of particular interest to the
user, it can be analyzed in greater detail via the \*QSubtree\*U
PANEL_BUTTON item.
The user selects a task (node in the tree) to concentrate on.
The \*QSubtree\*U item activates a pop-up window to separately display
the ancestry subtree rooted at the selected node.
As long as the subtree is present, the main state graph and the
subtree graph are updated contemporaneously.
.PP
The \*QHistogram\*U PANEL_BUTTON activates a pop-up subwindow that
plots the number of active processes as a function of event. 
This gives a history mechanism to the \fBstategraph\fR that is otherwise
lacking.  Clearly process utilization is of the foremost importance in
a timesharing multiprocessing environment.  Other information could be
plotted as well (number of active tasks, tasks waiting on something,
etc.).
.NH 3
Text Subwindow
.PP
The text subwindow is used to display messages (middle left hand
window in Figure 2).  
The user has the ability to insert notes into the text subwindow at
any position, and can review previous messages via the scroll bar.
New messages from the tool will always be inserted at the end of the
window, so the user need not be concerned regarding the scroll position.
Unlike the panel, the user does not preempt control from the
SunView *Notifier* while the mouse cursor is in the text subwindow. 
.NH 3
Overview Canvas Window
.PP
In order to enable the user to gain a feeling for the overall
composition of the emerging graph, an overview canvas is provided
(upper right hand corner of Figure 2).
This canvas uses tiny squares to indicate the positions of the
various nodes of the graph.
In order to fit all of the nodes in a possibly very large state graph
on the overview canvas (without scrolling), the squares must be very
small: only 5 pixels by 5 pixels.
No attempt was made to connect the squares to indicate dependencies,
as the lines would merely confuse the overview. 
A hollow square indicates that the task is bound to a process and a
solid square indicates a task that is not bound.  Process icons are
placed on the first row and appear in the overview as vertical lines
(one line for unbound, two for bound).  The lines are meant to provide
the user with a reference point so he can scroll back to the process
icons.  
.PP
A dotted box in the overview canvas represents the portion of
the entire primary canvas that is being viewed through the primary
canvas window.
When the left mouse button is depressed while the cursor is in the
overview canvas the view box is centered at the cursor location 
(with accommodations made if it is too close to an edge) and the view
in the primary canvas window is adjusted accordingly.
This implements a two dimensional scrolling capability within
\fBstategraph\fR.
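.PP
The centering rule with its edge accommodation amounts to a clamp on
each axis.
The Python sketch below is only an illustration, not the tool's
code: it assumes coordinates are measured in node-sized cells from the
upper left corner of the canvas, and it borrows the canvas size (56
columns by 75 rows) and the approximate view size (9 by 9) quoted in
the drawing algorithm section.
The names \fIcenter_view\fR and \fIscroll_to\fR are hypothetical.
.DS L
```python
def center_view(cursor, view_size, canvas_size):
    """Return the origin of the view box along one axis."""
    origin = cursor - view_size // 2
    # Keep the whole view box inside the canvas near either edge.
    return max(0, min(origin, canvas_size - view_size))


def scroll_to(x, y, view=(9, 9), canvas=(56, 75)):
    """Return the (column, row) of the view box's upper left corner."""
    return (center_view(x, view[0], canvas[0]),
            center_view(y, view[1], canvas[1]))


print(scroll_to(28, 30))   # click well inside the canvas
print(scroll_to(1, 1))     # click near the upper left corner
print(scroll_to(55, 74))   # click near the lower right corner
```
.DE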
.PP
Our original intent was to allow users to utilize the canvas
scroll bars to position their view of the overall canvas as they saw
fit, with the overview of the graph available as a visual aid, only.  
In order to enable user scrolling via the scroll bars, control must
reside with the *Notifier*.  
Once a button is selected from the panel (e.g., the \*QGo\*U button),
control is passed to the appropriate notify procedure, thereby
removing control from the *Notifier* and disabling any scrolling
capabilities.
All scrolling actions (clicking in the scroll bar, dragging, etc.)
would be buffered and executed when control returned to the
*Notifier*.  
This would no doubt be a source of frustration for the user, as he
would see nothing happen as he tried to use the scroll bars, and then
all of his actions would be executed.
One way to limit this time lag would be to return control to the
*Notifier* frequently. 
Returning control to the *Notifier* after each piece of trace file
data is received was judged undesirable, as speed was a major concern.  
We therefore decided to use the overview to allow the user to scroll
and make the scroll bars in the primary canvas invisible (they were
still required to implement scrolling).  
It should be noted here that when the tool is processing trace file
events in \*QGo\*U (non-stop) mode a scrolling command terminates the
event processing so that the scroll can be accomplished.
To resume trace file event processing the user must activate the \*QGo\*U
or \*QStep\*U control PANEL_BUTTONs.
.NH 3
Primary Canvas
.PP
The primary canvas contains a full-size drawing of the complete graph
(bottom window in Figure 2).
Only one section of the primary canvas may be viewed through the
window at a time: the section indicated by a dotted box in the overview.
As each significant action is processed the nodes of the graph and
the process icons are updated in the overview canvas window (other
windows are also updated, as appropriate).  
As new tasks are created, space in the primary canvas must be
allocated for their display.
Because the structure of the ancestry graph in a trace file is not known
in advance, the drawing algorithm and its associated data structures
pose some interesting problems, which are discussed in detail in the
next section.
The processes are represented by icons: a different icon for each state
that a process may be in (idle, suspended, reacquired, bound, and
unbound).
Process icons are displayed on the first row of the primary canvas.  
When a process is bound to a task the number of the task will appear
next to the icon. 
.PP
As mentioned above, a key of all the symbols is available via the
\*QKey\*U PANEL_BUTTON in the panel subwindow.
Another set of user aids is the node information pop-up windows.
The user moves the cursor over a node in the graph.
When the right mouse button is depressed, the node information pop-up
window appears, and when the right mouse button is released, it disappears.
The pop-up window contains useful state information about the task
associated with the node in the graph: task number, task number of the
parent task, the last five events associated with this task, what
event or lock the task is waiting on, and other details.
Figure 2 shows a pop-up window for task 2, which is waiting for task 5
to complete.
.NH 2
Ancestry Graph Drawing Algorithm
.PP
The ancestry graph in a multiprocessing job can be arbitrarily complex
(unlike the situation in a binary tree).
Any task can spawn any number of tasks.
Therefore, the chief aim of the drawing algorithm is to present the
next task in the best possible position, with consideration placed on
the size of the graph that can be seen at one time.
.PP
To start, the primary canvas was divided into 75 rows and 56 columns,
each the size of one task node (including borders). 
The graph is referenced with the top, center position as (0,0); with
room to move in both the positive and negative direction across, but
only the positive direction down.
The user's view of the primary canvas is approximately 9 rows long by
9 columns wide. 
.PP
There are three primary considerations on where to place the next
node: 1) where the node is in the ancestry graph (main node, son,
grandson, etc.); 2) keep as much of the graph visible at one time; 3)
try to avoid the situation where node connection lines overlap.
With these considerations in mind the concept of sibling offset is
introduced. 
Initially, each node that lies in the exact center of the graph has a
sibling offset, which specifies the distance across the canvas to place
the first child.  
All other children positions are calculated from this offset.
For example, the first task that is placed has a sibling offset of eight.
That is, its first child will be placed one row down and eight columns
to the right.  If that first task were to have another child, the next
screen position is determined from its last child position.  In this
case it would be placed one row down and eight positions to the left.
From then on, each subsequent child will be placed two columns inward;
6 columns left and right, 4 columns left and right and 2 columns left
and right.  After all the alternating children have been placed, the
algorithm continues by reducing the constraint by one and alternating
again; 7 columns left and right, 5 columns left and right, 3 columns
left and right and finally 1 column left and right.  
Once the two passes have been performed, empty spaces are filled
incrementally; 9 columns left and right, 10 columns left and right, 11
columns left and right and so on.
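.PP
The placement order for the children of a center node can be
summarized as a sequence of signed column offsets.
The Python sketch below is an illustration under stated assumptions,
not the tool's implementation: positive values mean columns to the
right, each pair is taken right first and then left (the text gives
this order only for the first pair), the same pattern is assumed to
apply to sibling offsets other than eight, and the handling of
occupied positions is left out.
The name \fIchild_offsets\fR is hypothetical.
.DS L
```python
from itertools import count, islice


def child_offsets(sibling_offset=8):
    """Yield signed column offsets for successive children of a center node."""
    # Pass 1 starts at the sibling offset and steps two columns inward.
    # Pass 2 starts one column closer and again steps inward by two.
    for start in (sibling_offset, sibling_offset - 1):
        for distance in range(start, 0, -2):
            yield distance       # to the right of the parent
            yield -distance      # to the left of the parent
    # Afterwards, fill outward beyond the original offset.
    for distance in count(sibling_offset + 1):
        yield distance
        yield -distance


first = list(islice(child_offsets(8), 20))
print(first)
```
.DE
.LP
For a sibling offset of eight the magnitudes run 8, 6, 4, 2, then
7, 5, 3, 1, and then 9, 10, 11 and so on, matching the order described
above.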
.PP
This procedure is followed for each center task node that has
children.  The constraints are decremented for each level in the
graph.  For example, the child center node of the root will have a
constraint of seven.  Its child center node will have a six and so
forth.  This ensures that the graph will have some inside room to
spawn tasks.
.PP
The task nodes that are not in the center are handled in a different
manner. Considering the size that a graph may achieve, the portion
that can be viewed at one time is relatively small.  To exploit the
depth and breadth of the entire drawing canvas, the graph must
inherently proceed down and out.  And this is exactly how the children
for the non-center task nodes are handled.  When one of these tasks
has a child, it places a child directly beneath it.  The next child
will be placed to the outside of the first, the third to the outside
of the second and so on.  So if a task located in column -6 had a
child, it would be located in the same column one row down.  The next
child would be one row down but in column -7.  A task node in column 5
would place its children in columns 5,6,7... subsequently.
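.PP
The \*Qdown and out\*U rule for non-center nodes can be written as a
one-line computation.
The Python sketch below is illustrative only; the function name and
the zero-based child index are assumptions, and collisions with
already occupied positions are ignored.
.DS L
```python
def noncenter_child_position(parent_col, parent_row, child_index):
    """Return (column, row) of the child_index-th child, counting from 0."""
    if parent_col == 0:
        raise ValueError("center nodes use the sibling offset rule instead")
    direction = 1 if parent_col > 0 else -1    # away from center column 0
    return (parent_col + direction * child_index, parent_row + 1)


# The two examples from the text: parents in column -6 and column 5.
print([noncenter_child_position(-6, 3, i) for i in range(2)])
print([noncenter_child_position(5, 2, i) for i in range(3)])
```
.DE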
.PP
It must be noted that the placing of a task's child is dependent on
where its last child was placed.  
This is very similar to how the algorithm handles occupied
positions.
If the space designated for a parent's child is already occupied by
another, unrelated task, the algorithm checks the next determined
space as if the unrelated task were its own.
Thus if a non-center task is
designated to place a child in column 5, but this place is taken, it
immediately looks to column 6, column 7 and so on, until a free space is
found.  Similarly, a center task might look at column 2, -2, 5, -5, until
an open row and column are found.
.PP
While this algorithm succeeds in keeping the graph constrained to the
screen, it tends to pack nodes rather tightly.  
If three non-center nodes right next to each other suddenly each
spawned five children, everyone would attempt to go down and out in
the same way, packing the area underneath the parents.  
If, after a few generations, one of the original parents
had a late child, there would be little room, and space would be
found in a relatively unobvious place, causing confusing heritage
lines.  
Of course, in static mode, one could read in the entire trace file and
determine the graph structure a priori, but this approach suffers from
significant start-up overhead (the I/O time) for large trace files.
.NH 2
Internal Task and Process Data Representation
.PP
The ancestral tree is stored internally as an array of pointers to
\fBstategraph\fR node structures (recall that nodes in the state graph
represent tasks).
This allows for the dynamic allocation of structures as they are
needed and facilitates the quick traversal of the state graph.
Each node structure contains the (x,y) coordinate of the
left-hand upper corner of the position where the node is displayed
(used to help determine where its children should be placed), the
(x,y) coordinate of the left-hand upper corner of the graphical
representation of the node in the subtree canvas, a number indicating
the state of the node based on the action number of the posted event,
the process number of the process that is working on this task (NULL
if the task is not bound to a process), the task identification number
of the task's parent task (used for subtrees and critical paths), a
field for auxiliary information (such as what a waiting task is
waiting for), an integer indicating the microtasking state of a task (-2 if not a
slave, -1 if a master, 0 if a slave waiting for work, else the task
number of the master the slave is working for), an integer indicating
the number of stored actions that have affected the task (a maximum of
five), a pointer to the first of a linked list of action structures
(to keep track of the last five logged events for each task), a
pointer to the last of this linked list of action structures, and a
pointer to the first of a linked list of structures containing the
task numbers of the children of that node (for subtree).
Thus, a task's parent is easily found by using the parent id field of
the node structure as an index into the array of pointers to structures.
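.PP
To make the field list above easier to scan, the following Python
sketch mirrors the node structure as a data class.
It is not the tool's declaration: all field names are invented for
this illustration, and a bounded deque stands in for the linked list
of action structures together with its count and its first and last
pointers.
.DS L
```python
from collections import deque
from dataclasses import dataclass, field
from typing import Deque, List, Optional, Tuple

NOT_SLAVE, MASTER, SLAVE_IDLE = -2, -1, 0    # microtasking codes from the text


@dataclass
class TaskNode:
    position: Tuple[int, int]                   # upper left corner, main canvas
    subtree_position: Tuple[int, int] = (0, 0)  # upper left corner, subtree canvas
    state: int = 0                              # derived from the posted action number
    process: Optional[int] = None               # None when not bound to a process
    parent: Optional[int] = None                # parent task id, an index into the table
    aux: Optional[int] = None                   # e.g. what a waiting task waits for
    microtask: int = NOT_SLAVE                  # or the master's task number
    recent: Deque[int] = field(default_factory=lambda: deque(maxlen=5))
    children: List[int] = field(default_factory=list)


nodes: List[Optional[TaskNode]] = [None] * 8    # table indexed by task number
nodes[1] = TaskNode(position=(0, 0))
nodes[2] = TaskNode(position=(8, 1), parent=1)
nodes[1].children.append(2)
for action in (129, 132, 160, 166, 133, 130):
    nodes[2].recent.append(action)              # only the last five are kept

parent_of_2 = nodes[nodes[2].parent]            # parent lookup by index
print(list(nodes[2].recent))
```
.DE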
.PP
Process information is stored in an array of process structures.  
The amount of storage required for the process structure is small and
the number of processes (as opposed to tasks) is small, so we can
preallocate all the storage necessary.
Hence the complication of an additional pointer dereference (as in
the task information structure) is not needed.
.NH 2
Significant Actions
.PP
Since the GMAT \fBstategraph\fR displays the state of each task, only those
events which affect the state of a task are considered significant.
As an event is read in from the trace file (in static mode) the event
counter is incremented and if the action is significant the display is
also adjusted.
We have included an option for the user to define up to five
additional events that should be considered significant by \fBstategraph\fR
and hence displayed.
This enables users to customize the GMAT graph by having their user
defined actions displayed or by displaying actions that GMAT would
otherwise ignore. 
GMAT significant action numbers are listed in Section 3.6.
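.PP
The filtering described here can be sketched in a few lines of Python.
This is an illustration only: the built-in set is copied from the
table in Section 3.6, while the function name \fIreplay\fR, its
arguments, and the choice to return the displayed actions as a list
are assumptions.
.DS L
```python
BUILTIN_SIGNIFICANT = frozenset(
    [128, 129, 130, 131, 132, 133, 145, 146, 147, 150,
     160, 161, 162, 166, 167, 168, 181, 183,
     256, 257, 258, 259, 262, 263, 264, 266, 267])
MAX_USER_ACTIONS = 5


def replay(actions, user_actions=()):
    """Return (events processed, actions that would update the display)."""
    if len(user_actions) > MAX_USER_ACTIONS:
        raise ValueError("at most five user-selected actions are allowed")
    significant = BUILTIN_SIGNIFICANT | set(user_actions)
    events = 0
    shown = []
    for action in actions:
        events += 1                  # every event advances the counter
        if action in significant:
            shown.append(action)     # only these would adjust the display
    return events, shown


trace = [128, 129, 42, 163, 132]     # hypothetical trace; 42 is a user event
print(replay(trace))
print(replay(trace, user_actions=[42]))
```
.DE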
.NH 2
Extensions to Stategraph
.PP
The following is a list of possible additions to the GMAT \fBstategraph\fR
tool we envisage for the near future.
.IP \(->
Dynamic Mode - Complete the implementation of dynamic mode in GMAT
\fBstategraph\fR which will enable a user to watch a graphical state graph 
display of his code as it runs on the Cray X/MP or the Alliant FX/8.
The hooks are in place in GMAT for this addition.  
The only additional work needed for dynamic mode is to actually
implement the communication protocol in NSYSLIB on the Cray and in
libCRAY.a on the Alliant.
.IP \(->
Delete Recovered Tasks - A button or choice item on the panel that would
allow the user to turn the deletion of recovered tasks (tasks that
have been terminated and \*Qrecovered\*U by the library) on and off.  
This feature is advantageous when the user knows that the
multiprocessing job will create a large number of tasks and he wishes 
to concentrate on the active tasks.
This would serve to reduce clutter on the primary canvas by
allowing the recycling of canvas space.  
.IP \(->
Intermittent Logging - Version 1.0 of GMAT \fBstategraph\fR does not permit
intermittent logging.
Intermittent logging arises when the multiprocessing user application
turns logging on and off many times during the course of a run or when
the internal circular buffer storage for trace information is too
small to hold all the events and wraps over upon itself.
Another feature of intermittent logging is the ability of the tool to
jump ahead in the trace to a specific location (e.g., a task creation,
an action number, or an event number).
.IP \(->
Clock Based Mode - Clock Based Mode is the basis for the GMAT
\fBtimeline\fR, and can be introduced to the \fBstategraph\fR as well via 
simulated waiting between events.  
See the time warp section below for a discussion of the concept of
time in \fBtimeline\fR and for further details on simulated waiting.
.IP \(->
Redraw - If the ancestral tree becomes confusing due to complicated
task creation hierarchies, redrawing the state graph would (due to the
extra information available at redraw time) allow the state graph to
be redrawn in a format that is easier to understand.
.bp
.NH 2
Significant Action Numbers - GMAT Stategraph
.PP
The following is a list of the NSYSLIB/libCRAY.a action numbers that
the GMAT state graph tool recognizes.
User defined events, 1 through 127, are also recognized.
See [1] for a complete listing of the NSYSLIB/libCRAY.a action numbers.
The third column of the table gives a terse description of what GMAT
does upon encountering each recognizable NSYSLIB/libCRAY.a action number.
.nr PS 10
.nr VS 12
.PP	\" Don't MUCK with this .PP.  It forces the above .nr's to work.
.TS
center tab(/) box;
cB s s
c c c
n l l.
Significant Action Numbers - GMAT Stategraph
Number/Library Action/GMAT Action
=
128/And There Was Light/Draw Task 1 and Process 0 and bind them
129/Start Task/T{
Draw task, update its parent's list of children,
connect parent to child
T}
130/Complete Task/T{
Unbind the task from the process it was bound to (do not show as
complete, since waiting tasks are not informed the task is complete
until recovery)
T}
131/Recover Task/Draw as recovered 
132/Bind Task to Process/Update task node and process icon
133/Unbind Task/Update task node and process icon
_
145/Acquire new Process/Draw process icon
146/Idle a Process/Update process icon
147/Suspend a Process/Update process icon
150/Reacquire Process/Update process icon
_
160/Wait on a Task/Update task node
161/Wait on a Lock/Update task node
162/Wait on an Event/Update task node
_
166/Resume after Task Wait/Update task node
167/Resume after Lock Wait/Update task node
168/Resume after Event Wait/Update task node
_
181/Wait on a Barrier/Update task node
183/Resume after Barrier Wait/Update task node
_
256/Wait for Guard N/Update task node
257/Resume after Guard N/Update task node
258/Wait for Guard/Update task node
259/Resume after Guard/Update task node
262/Master Waits for Fray/Update task node
263/Master Resumes after Fray/Update task node
264/Wait for Control Structure/Update task node
266/Resume after Control Structure/Update task node
267/Slave waits for work/Update task node
.TE
.nr PS 12
.nr VS 14
.NH 1
Timeline
.NH 2
Description and Design Considerations
.PP
GMAT \fBtimeline\fR differs greatly from GMAT \fBstategraph\fR.  
In \fBstategraph\fR, each node represents a task, and at any given
moment, the graph is a \*Qsnap-shot\*U of the multiprocessing job's
current state.
In an attempt to represent synchronization within a multiprocessing 
program, we feel that a completely different presentation of the
trace file data is necessary.
.PP
Time is a fundamental aspect of synchronization and hence the \fBtimeline\fR
tool incorporates time as its central focus.
As tasks are created they begin to have a \*Qtime line\*U (an existence,
with state, in time).
As time marches on, the time line of a task is drawn downward (see
Figures 3 and 4).
Events are shown as icons superimposed on the time line.
The graphical representation of a task's time line changes as the
task changes from a \*Qrunning\*U state to a \*Qwaiting\*U state, etc.
Tasks are connected via a horizontal line to the tasks that they spawn.
At any given moment the time line displayed will give the history of
the multiprocessing job.
.PP
In static mode (play back of a previously generated trace file) the
notion of time is obtained from the event time stamps in the trace file.  
As \fBtimeline\fR processes events it reads the next event and takes the
difference between the next event's time stamp and the current event's
time stamp.
Time stamps are given in cycles and measure real time, as opposed to
elapsed CPU time.
The tool then \*Qwaits\*U that amount of time before updating the display.
Waits is in quotes because we have introduced the concept of
\*Qsimulated waiting\*U or \*Qwarped time\*U in \fBtimeline\fR.
Intervals between events can range from many seconds to many
minutes. 
Usually, one is not interested in viewing an unchanging screen for
that length of time. 
So, if some number of seconds passes we speed up the \*Qclock\*U by a
factor of ten and continue to wait.
This process of waiting, speeding up and waiting continues until the
interval between events passes and then the \*Qclock\*U is reset.
On the other hand, events can come in such quick succession that the
display cannot keep up.
In this situation the \*Qclock\*U is slowed down by a factor of ten
and the tool continues reading events.
If \fBtimeline\fR still can't keep up, the \*Qclock\*U is further slowed down.
The \*Qclock\*U is reset when \fBtimeline\fR can keep up with the events.
Visual clues (which will be discussed further below) are given to the
user that time has been warped so he can take this into account when
interpreting the display.
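.PP
The speed-up half of simulated waiting can be sketched as follows.
This is a minimal illustration in Python, not code from the tool.
The text says only that \*Qsome number of seconds\*U must pass before
the clock is sped up, so the threshold \fIpatience\fR and the function
name \fIwarped_wait\fR are assumptions made for the example.
.nf
.sp
```python
def warped_wait(interval, patience=2.0, factor=10.0):
    """Return (wall_seconds, final_speed) for one gap between events.

    interval: trace time between two events, in seconds.
    patience: wall-clock seconds to wait before speeding up (assumed value).
    factor:   speed-up applied each time patience runs out (ten in the text).
    """
    remaining = interval
    speed = 1.0
    wall = 0.0
    # Wait, speed up, and wait again until the rest of the gap fits
    # inside one patience period at the current speed.
    while remaining > speed * patience:
        remaining -= speed * patience
        wall += patience
        speed *= factor
    wall += remaining / speed
    return wall, speed


print(warped_wait(222.0))
```
.sp
.fi
With these assumed values a 222 second gap is crossed in 6 seconds of
wall-clock time, ending with the clock running one hundred times
faster than real time; the clock would then be reset for the next event.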
.PP
The window layout is relatively simple (see Figure 4).  
In the upper left hand corner, there is a panel that controls files,
processing speed, loading, running and informative subwindows.  
Its layout and usage are very similar to the \fBstategraph\fR control panel.
In the upper right hand corner there is a text subwindow where all
error messages and directions are printed.  
The rest of the screen is taken up by two canvases, where most of the
graphical action takes place.  
On the right hand edge is a small canvas, roughly 100 pixels wide, where all
the process icons are placed.  This is a \*Qhidden\*U canvas: the edges
overlap and there is no way to distinguish it from its neighboring
larger primary canvas.  
The process icons were put in their own separate canvas, so they would
not scroll off the screen with the time line graph. 
Finally, the rest of the screen is taken up by the large primary
canvas, where the actual graph is drawn.  
The entire canvas is much longer than the portion shown, so vertical
scroll bars have been installed to allow the user to reach any part of
the canvas desired.
The panel subwindow will be described in detail in the next section.
.NH 3
Panel Subwindow
.PP
The panel subwindow is the main user control interface (upper left
hand window in Figure 4) and is built up of the same elements of the
SunView environment as the GMAT \fBstategraph\fR tool, but includes a few
additions.  
The primary addition is the time warp icon (center of the control
panel).
Its function is to graphically display the current time warp (see the
Time Warp section, below). 
Another addition to the control panel is the address table
PANEL_CHOICE item.
.PP
Since the user seldom needs to know the full addresses of every lock,
barrier, or event that may be encountered, each has been associated
with a small integer value in the display (the first lock, say, is 0
and each lock assignment event increments the counter by one).
If at any time the correlation between these integer values and the
actual addresses is needed, a table can be brought up by depressing
and holding the middle mouse button in the canvas subwindow.
The type of table that appears is controlled with the address table
PANEL_CHOICE item, which allows the user to select among a table of
event, lock, or barrier addresses. 
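.PP
The address-to-integer association described above can be sketched as
a small table with one counter per kind of object.
This is an illustrative Python sketch; the class name
\fIAddressTable\fR and its methods are hypothetical, and the choice to
give a re-assigned address its original number is an assumption, since
the text does not say what happens in that case.
.nf
.sp
```python
class AddressTable:
    """Map full addresses to small display integers, one counter per kind."""

    def __init__(self):
        self.ids = {"lock": {}, "event": {}, "barrier": {}}

    def assign(self, kind, address):
        """Return the display integer for address, numbering from 0."""
        table = self.ids[kind]
        if address not in table:
            table[address] = len(table)
        return table[address]

    def rows(self, kind):
        """Rows for the pop-up table: (display integer, address) pairs."""
        return sorted((n, addr) for addr, n in self.ids[kind].items())


table = AddressTable()
table.assign("lock", 0x7F10)
table.assign("lock", 0x7F20)
print(table.rows("lock"))
```
.sp
.fi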
.PP
In the event that not all of the elements fit into the table, a
scrolling feature is implemented.  Bringing up the window and moving
the mouse into its confines will allow the user to scroll forwards
and backwards throughout the table.  This was implemented to allow a
large number of elements to be entered, without having to dynamically
resize the window.
.PP
Since a user is often not interested in every section of the
trace file, another addition to the control panel for the GMAT \fBtimeline\fR
is the \*QJump\*U PANEL_BUTTON.
The user specifies whether to jump forward to an event count
or to the next event originating from a specific task or to the next
event occurring with a specific action number.
The jump itself is performed rather simply:
the events are read and the data structures updated, but the screen
is not redrawn.
When the appropriate variable matches the user specified one, the jump
stops and prints out the current event. 
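.PP
The jump logic can be sketched as a loop that updates task state
without drawing.
This Python sketch is illustrative only: the record layout (a task
number paired with an action number) and the function name \fIjump\fR
are assumptions, not the trace file format or the tool's own code.
.nf
.sp
```python
def jump(events, position, tasks, count=None, task=None, action=None):
    """Read events from position without drawing; return the stop index.

    events: list of (task_id, action_number) pairs (hypothetical layout).
    tasks:  dict holding the last action seen for each task.
    One of count, task or action names the stopping condition.
    Returns None if the end of the trace is reached first.
    """
    for index in range(position, len(events)):
        task_id, action_number = events[index]
        tasks[task_id] = action_number   # update data structures, not the screen
        reached_count = count is not None and index + 1 >= count
        if reached_count or task_id == task or action_number == action:
            return index                 # the tool would print this event
    return None


trace = [(1, 128), (1, 129), (2, 129), (1, 160), (2, 130)]
state = {}
print(jump(trace, 0, state, task=2), state)
```
.sp
.fi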
.PP
In the GMAT \fBtimeline\fR control panel the \*QHistogram\*U PANEL_BUTTON has a
slightly different function from that in the GMAT \fBstategraph\fR.
The \fBtimeline\fR histogram gives a plot of the number of active processes
as a function of time (as opposed to event).
If the user changes the clock interval at any time during execution,
the histogram will continue with the new clock interval.  
However, a condensing problem arises when hundreds of clock
intervals of repeated information are encountered (see the Time Warp
section).  
Thus, within a blank interval, if the number of clock intervals is
greater than a hundred, two vertical lines are drawn, effectively
breaking the histogram.  
This is to signify to the user that a large segment of
time was repeated at that point, with no change in the number of
active processes.  However, there is no graphical representation to
show the length of the interval.  Nevertheless, the number of clock
cycles since the beginning of program execution is displayed
periodically in the bottom row of the histogram.  Thus, by looking at
the differences in executed clock cycles before and after a gap, the
user can get a good estimate of the length of a particular gap.  
.NH 3
Graphical Representation of Tasks
.PP
GMAT \fBtimeline\fR represents tasks on a \*Qtime line.\*U  Each task is
constantly updated as a function of time, progressing down the screen.
Thus, instead of a task being represented as a constantly changing
graph node, it is represented as a column of events.
.PP
The actions of a task have been divided into three separate graphical
representations: posted events, user defined events, and
wait/processor states.  
Posted events are flags that a task reports, as logged in the trace file.  
They may or may not affect a task's wait/processor state, but their
main purpose is to report a significant event has just occurred.
Posted events are represented in \fBtimeline\fR as a box with an appropriate
identifying string.  
Often the posted event will report another piece of auxiliary
information, a relevant integer.
For example, the \*Qwaiting on a task\*U posted event will contain the
number of the task on which it is waiting.
This shows up in the box as an integer next to the characters.
.PP
The next type of graphical representation is a user defined event.
While posted events appear as square boxes, user defined events are
drawn as boxes with rounded corners.
Since there is no way to print out a mnemonic string for an action
number defined by the user, user defined boxes are labeled with the
action number.
.PP
While a task will occasionally post an event, it is represented most
often in its wait state.  At all times, a task is either ready to run
or it is waiting.  This is represented in the \fBtimeline\fR as a vertical
line, solid if the task is capable of running, and dotted if the task
is waiting.  Between posted events, a task is updated with its wait
state.  The effect is a column of irregularly spaced boxes
connected with either dotted or solid lines.
.PP
The tasks also have a processor state: bound or unbound to a process.
Each time a task changes processor state its time line (column in the
graphical display) is modified.
This is achieved by outlining the wait state with two parallel lines
whenever a task is bound to a processor.
Whenever a task has no processor bound to it, the lines are not
present.
.PP
Thus, there are six different ways that a column can be updated.  If
the task has just posted an event, that event is represented by a box
with a mnemonic string and occasionally an integer.  However, if the
event is user defined, only the event number appears in a rounded
box.  If the task is able to run but not bound, it is updated with a
solid line.  If the task is idling and not bound, the column is
updated with a dotted line.  If the task is running (i.e. able to run
and bound) the column will contain three parallel solid lines.  If the
task is idle, but bound to a process (as sometimes occurs), it is
represented by two parallel solid lines surrounding a dotted line (see
Figure 3).
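.PP
The six cases above reduce to a small decision function.
The following Python sketch is an illustration only; the function name
\fIcolumn_segment\fR, its parameters, and the strings it returns are
hypothetical stand-ins for the drawing routines of the tool.
.nf
.sp
```python
def column_segment(posted=None, user_defined=False, runnable=True, bound=False):
    """Name the graphic used for the next update of a task column."""
    if posted is not None:
        # An event takes precedence: user defined events get rounded boxes.
        return "rounded box" if user_defined else "square box"
    center = "solid" if runnable else "dotted"
    if bound:
        # Two parallel solid lines outline the wait-state line.
        return "solid|" + center + "|solid"
    return center


for runnable in (True, False):
    for bound in (False, True):
        print(runnable, bound, column_segment(runnable=runnable, bound=bound))
```
.sp
.fi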
.NH 3
Graphical Representations of Processes
.PP
The process states are displayed in the minor canvas directly to the
right of the actual time line.
The processes are icons, slightly smaller than regular Sun icons, that
change according to events in the trace file that change the process
state. 
A process can be either bound, idle or suspended.  
The difference between the latter two is slight but important.  
A process that is suspended is no longer associated with the user's
particular job, and is off performing work for someone else or the
system.
On the other hand, an idle process is expensive: CPU
time is still being charged to the user, but no actual work is being
done.  
For this reason, processor idle states are highlighted with a
characteristic gray.  
Bound processes, which are actually performing program work, are
highlighted in black. 
.NH 2
Time Warp
.PP
One of the major features of \fBtimeline\fR is the ability to represent
real time during the execution of a multiprocessing job.  
Following the traditional view of a time line, time should progress
linearly, while events are outlined at the time that they occur.  
In deciding upon the best way to show real time, several
problems had to be addressed.  
First, there was the fact that the computer would operate with a clock
cycle in nanoseconds, so the time line would have to progress by
\*Qblocks\*U of time, rather than individual clock cycles.  
In each block, there was the possibility that none, one, or many
events could occur.
Also, there must be a way to deal with blank intervals of time where
no events would occur. 
Simply updating the time line every interval would quickly become
tedious, since the possibility of having hundreds or millions of blank
blocks was high.  
The final decision was a variation on the traditional time line concept.
.PP
Basically, the screen is updated on a combination event/clock interval
basis.   
Starting at the first event read, the time line \*Qclock\*U is set to zero.
A time stamp, giving the number of clock cycles since the
start of the file, is displayed on the left edge of the screen, and 
a horizontal line is drawn across the entire primary canvas. 
.PP
The screen is then updated, as described above, for all the events
occurring in a set clock interval.
The clock interval is set by the user in the control panel in order to
allow tuning of the granularity of inspection.   
If the user is interested in exactly how time was spent during
execution, a small clock interval should be set.
Conversely, to get a rough idea of where large blocks of time were
spent, a larger clock interval should be set. 
.PP
After all the events have been processed in the clock interval,
another time stamp is drawn, and the events for the next clock
interval are drawn.  
In this way, each clock interval is assigned as much primary canvas
space as required.  
.PP
The next data representation problem arises when there are blank
intervals: long periods of time during which no events occur.  
The first impulse is just to print a time stamp and continue the graph
for a row.  
This brings up the situation where many blank intervals are drawn in
succession.
The user can quickly get lost wading through screen after screen of
repeated graphical display.  
To avoid this, blank intervals are handled with a compact graphical
representation.  
A series of one or more blank intervals is given one row across the
screen.
For the first blank interval, a normal time stamp is drawn, and a thin
vertical bar is drawn in the row.  
For each successive blank interval, the time is displayed next to
the initial time stamp, and an additional thin vertical bar is drawn.
.PP
Since there may be hundreds, thousands or millions of blank intervals,
a speedup factor is worked into this scheme.  
In \*QGo\*U execution, after every one hundred blank intervals, the size
of the interval is increased by a factor of ten.  
This effectively warps the internal \*Qclock\*U of the \fBtimeline\fR tool.
A visual clue to the user is displayed graphically by placing a blank
space between the updating vertical bars, thus changing the pattern of
the bar.  
If another one hundred blank intervals pass with this new update
interval, the update interval is again increased by a factor of ten
and another space is inserted between update bars.
Thus, by the change of the vertical bar pattern, the user can tell a
speedup is occurring.  
The bars wrap around the row and continue drawing through the previous
patterns.  
This continues until the end of the blank time is reached.  
With this logarithmic expansion, the end of long intervals can be very
quickly reached, while still conveying a sense of time progression.
The time warp technique also leaves behind a visual reminder of the
long computation that is very compact and easy to integrate into the
overall graphical display of the multiprocessing job execution.
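.PP
The effect of the logarithmic expansion can be seen by counting the
vertical bars needed to cross a blank period.
The Python sketch below is an illustration under the rule stated above
(a tenfold increase after every one hundred blank intervals); the
function name \fIblank_bars\fR and its parameters are hypothetical.
.nf
.sp
```python
def blank_bars(blank_cycles, interval, per_level=100, factor=10):
    """Return (bars, final_interval) needed to cross a blank period.

    blank_cycles: length of the period with no events, in clock cycles.
    interval:     user selected clock interval, in clock cycles.
    """
    bars = 0
    covered = 0
    while covered < blank_cycles:
        covered += interval
        bars += 1
        if bars % per_level == 0:
            interval *= factor      # warp the clock after each hundred bars
    return bars, interval


print(blank_bars(1000000, 100))
```
.sp
.fi
With a clock interval of 100 cycles, a blank period of one million
cycles is crossed with 289 bars, where an unwarped display would need
10,000 rows.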
.PP
Another time warp visual clue is the panel \*Qspeedometer\*U.
This was implemented to give the user a sense of the time warp as the
screen is being updated.
A logarithmic dial with a range from @10 sup 7@ (slowdown) to @10 sup {-7}@
(speedup) is displayed. 
For example, if the dial displayed a @10 sup 5@ reading, it would illustrate
that \fBtimeline\fR would take @10 sup 5@ seconds to process one second of program
execution.
.EQ
Time~Warp ~=~ Log~Factor * { Update~Time } over 
{ Cycles~since~last~Update * Clock~Cycle }
.EN
Where: @ Update~Time~=~0.1 @ seconds, which is the time it takes the
Sun to update the screen and @ Clock~Cycle~=~8.5 @ (Cray) or @ 65 @
(Alliant) nanoseconds.
The clock cycle is determined in the Panel by the user: a PANEL_CHOICE
item that toggles between the Cray X/MP and the Alliant.
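.PP
As a worked example of the formula, the Python sketch below computes
the ratio of display time to program time and the nearest power of ten
for the dial.
It is an illustration only.
The text does not define the Log Factor term, so the sketch treats it
as a plain multiplier defaulting to one, which is an assumption; the
names \fItime_warp\fR and \fIdial_exponent\fR are hypothetical.
.nf
.sp
```python
import math

CLOCK_CYCLE = {"cray": 8.5e-9, "alliant": 65e-9}   # seconds per cycle, from the text
UPDATE_TIME = 0.1                                   # seconds per screen update


def time_warp(cycles_since_update, machine="cray", log_factor=1.0):
    """Seconds of display time spent per second of program time."""
    program_seconds = cycles_since_update * CLOCK_CYCLE[machine]
    return log_factor * UPDATE_TIME / program_seconds


def dial_exponent(warp):
    """Nearest power of ten, clamped to the range of the dial."""
    return max(-7, min(7, round(math.log10(warp))))


warp = time_warp(1000, "alliant")
print(warp, dial_exponent(warp))
```
.sp
.fi
Under these assumptions, 1000 Alliant cycles between updates represent
65 microseconds of program time, so a 0.1 second screen update is a
slowdown of roughly 1500 and the dial would sit near @10 sup 3@.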
.NH 2
Internal Screen and Task Data Representations
.PP
There are two important data structures that are utilized for
displaying information on the screen.  
The first is the screen data structure, which delineates which task(s)
occupy which column(s) on the primary canvas.  
The second structure describes the tasks themselves and has six
fields: the x coordinate of the task on the screen, the process state,
the wait state, the posted event, an auxiliary integer field, and a
parent task ID. 
.PP
The procedure for updating these structures is relatively simple.   
First, the icon representing the event number of the event is drawn.
Then, according to the specific event the screen and task structures
are updated.  
In most cases, the posted event is just a flag, and does not have any
effect on the state of the task.  
In this case, the posted event field is filled with the appropriate
string, and after drawing, the field is cleared.  
In some cases, the posted event will change either the wait state or
the process state of the task.
If the wait state is changing, the posted event field is filled and the wait
state is updated.  
The posted event will then be drawn and cleared, but the wait state is
permanently affected.  
The same is true for a change in process state, except that no event
is posted.
An additional special case is when a task start event is processed.
Then a new task structure must be allocated and initialized.
For the screen structure the next open column from the center,
checking both left and right sides, is assigned to the new task.  
.PP
If there is no open column on the screen, the x coordinate field is
set to a value larger than the actual canvas.  
This presents no problem in drawing, since only valid columns are
examined for tasks, not valid tasks examined for columns.  
The same number of columns is examined
at drawing time whether there are four active tasks or four hundred.
Thus, a task that does not fit on the screen is forever lost from
graphical representation.  The structure will continue to get updated
normally, whenever an event occurs with the task's identification, but
it will never be examined when it is time to be drawn.  If a task does
not fit on the screen when it is created, it will never appear.
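.PP
The task structure and the center-out column search can be sketched as
follows.
This Python sketch is illustrative, not the tool's own data layout:
the names \fITask\fR, \fIassign_column\fR and \fIOFF_SCREEN\fR are
hypothetical, the function returns a column index rather than a pixel
coordinate, and trying the left side before the right is an assumption
(the text says only that both sides are checked).
.nf
.sp
```python
class Task:
    """The six fields named in the text."""

    def __init__(self, x, parent):
        self.x = x                      # x coordinate of the task column
        self.process_state = "unbound"
        self.wait_state = "ready"
        self.posted_event = None
        self.aux = None                 # auxiliary integer field
        self.parent = parent            # parent task ID


OFF_SCREEN = 10**6   # hypothetical value larger than any canvas


def assign_column(columns, task_id):
    """Claim the open column nearest the center of the screen.

    columns: list holding a task ID, or None, for every screen column.
    Returns the column claimed, or OFF_SCREEN when every column is taken.
    """
    center = len(columns) // 2
    for offset in range(len(columns)):
        for col in (center - offset, center + offset):
            if 0 <= col < len(columns) and columns[col] is None:
                columns[col] = task_id
                return col
    return OFF_SCREEN


columns = [None] * 5
print([assign_column(columns, t) for t in range(1, 7)])
```
.sp
.fi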
.NH 2
Intermittent Logging
.PP
One of the features of \fBtimeline\fR is the ability to handle trace files
that do not start at the beginning of a program execution.  
As mentioned in the \fBstategraph\fR discussion, this situation can arise
from the application turning event logging off and then back on or if
the internal circular buffer wraps upon itself.
It is easy to handle intermittent logging in \fBtimeline\fR if one assumes
that when a discontinuity in the trace file is encountered the screen
is cleared and we restart the display from total ignorance of task state.
This corresponds to throwing out all information about the tasks we
have accumulated.
Then when an event is read for a task, as much information as possible is
derived and stored.
The task is assigned a column, and all possible fields are filled.  
Thus, after a few events are read in, the structure is completely
updated and task time lines appear correctly on the screen.
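.PP
The restart-from-ignorance policy can be sketched in a few lines.
In this illustrative Python sketch a gap in the trace is marked by
None and each record carries only a task number and a wait state; both
conventions, and the function name \fIreplay\fR, are assumptions made
for the example rather than properties of the trace file.
.nf
.sp
```python
def replay(records):
    """Rebuild task state from a trace that may contain gaps.

    records: (task_id, wait_state) pairs, or None where logging was
             interrupted (hypothetical marker).
    Returns the final task table and the number of screen clears.
    """
    tasks = {}
    clears = 0
    for record in records:
        if record is None:
            tasks = {}              # discontinuity: forget everything
            clears += 1
            continue
        task_id, wait_state = record
        tasks[task_id] = wait_state  # recreate the task from this event
    return tasks, clears


print(replay([(1, "ready"), (2, "waiting"), None, (2, "ready")]))
```
.sp
.fi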
.NH 2
Memory and Speed Considerations 
.PP
In order to keep a reasonable amount of time line information
available to the user for post run analysis (scroll back), the primary
canvas has a length of 10,000 pixels.
However, such a canvas requires large amounts of memory.  
Thus, certain aspects of the program become much slower (due to page
faults) when they are dealing with the entire canvas on a Sun
Workstation with limited memory (four megabytes).  
For example, when loading a file an initialization routine is called
which clears the entire primary canvas.  
This one routine takes up almost all of the time spent in the loading
procedure.
.PP
Also, such large memory demands can make the program run noticeably
slower, on machines with limited memory, when doing such tasks as
bringing up the window and initializing the program.  
However, a large canvas allows the user to have a long history of
events.  
Currently, when the canvas becomes filled the entire canvas is cleared
and drawing starts again at the top.  
With a large canvas, more events can be drawn without having to
clear the canvas, and the user can scroll farther back to previous
events.  
We think that the page fault overhead is a tolerable price to pay for
the expanded history view.
.NH 2
Extensions to Timeline
.PP
The following is a list of possible additions to the GMAT time line
analysis tool we envisage for the near future.
.IP \(->
Dynamic Mode - Complete the implementation of dynamic mode in GMAT
\fBtimeline\fR which will enable a user to watch a graphical
representation of a program's synchronization during execution.
The hooks are in place in GMAT for this addition.  
The only additional work needed for dynamic mode is to actually
implement the communication protocol in NSYSLIB on the Cray and the
Cray compatibility library on the Alliant.
.IP \(->
Printing The Canvas - To allow a permanent history of a trace file,
one of the nice features that could be implemented would be a print
option.  This would allow the user to print a canvas before having it
cleared for the next series of events.  All of the code has been
implemented to dump the canvas to a Sun raster file.
.IP \(->
Jumping Backwards in File - Although jumping forward in a file is very
useful, one could envision the need to go back and review portions of
the time line, as it develops, several times.  One currently has the
ability to scroll back and view the time line that is on the primary
canvas.  
.IP \(->
Controlling Update Speed - Currently, in \*QGo\*U execution, the screen
is updated at a fixed rate and there is no way to control how fast
events are drawn.  
Thus, there are two speeds: one event at a time or as fast as possible.  
A valuable addition would be to allow the time before updates to be
user controlled, and allow the user to proceed as fast or as slow as
desired. 
.bp
.NH 2
Significant Action Numbers - GMAT Timeline
.PP
The following is a list of the NSYSLIB/libCRAY.a action numbers that
the GMAT \fBtimeline\fR tool recognizes. 
User defined events, 1 through 127, are also recognized.
See [1] for a complete listing of the NSYSLIB/libCRAY.a action numbers.
The third column of the table gives a terse description of what GMAT
does upon encountering each recognizable NSYSLIB/libCRAY.a action number.
.nr PS 8
.nr VS 10
.PP	\" Don't MUCK with this .PP.  It forces the above .nr's to work.
.TS
center tab(/) box;
cB s s
c c c
n l l.
Significant Action Numbers - GMAT Timeline
Number/Library Action/GMAT Action
=
128/And There Was Light/Post Light event, bind Task 1 to Process 0
129/Start Task/Post Start event, connect parent to child
130/Complete Task/T{
Post Complete event, unbind the task from the 
process it was bound to - put in wait state. 
T}
131/Recover Task/Post Recover event
132/Bind Task to Process/Update task line and process icon
133/Unbind Task/Update task line and process icon
_
145/Acquire new Process/Draw process icon
146/Idle a Process/Update process icon
147/Suspend a Process/Update process icon
150/Reacquire Process/Update process icon
_
160/Wait on a Task/Post event, put in wait state
161/Wait on a Lock/Post event, put in wait state
162/Wait on an Event/Post event, put in wait state
_
163/No Wait on a Task/Post event
164/No Wait on a Lock/Post event
165/No Wait on an Event/Post event
_
166/Resume after Task Wait/Post event, put in ready state
167/Resume after Lock Wait/Post event, put in ready state
168/Resume after Event Wait/Post event, put in ready state
_
169/Clear Event/Post event
170/Post Event/Post event
_
171/Assign Lock/Post event
172/Assign Event/Post event
173/Release Lock/Post event
174/Release Event/Post event
_
175/Clear Lock/Post event
176/Test a Task/Post event
177/Test a Lock/Post event
178/Test an Event/Post event
_
179/Assign Barrier/Post event
180/Release Barrier/Post event
_
181/Wait on a Barrier/Post event, put in wait state
182/No Wait on Barrier/Post event
183/Resume after Barrier Wait/Post event, put in ready state
.TE
.TS
center tab(/) box;
cB s s
c c c
n l l.
Significant Action Numbers - GMAT Timeline (Continued)
Number/Library Action/GMAT Action
=
256/Wait for Guard N/Post event, put in wait state
257/Resume after Guard N/Post event, put in ready state
258/Wait for Guard/Post event, put in wait state
259/Resume after Guard/Post event, put in ready state
260/Initialize Slaves/Post event
261/Master Starts Fray/Post event
262/Master Waits for Fray/Post event, put in wait state
263/Master Resumes after Fray/Post event, put in ready state
264/Wait for Control Structure/Post event, put in wait state
265/No Wait for Control Structure/Post event
266/Resume after Control Structure/Post event, put in ready state
267/Slave waits for work/Post event, put in wait state
268/Slave Starts Fray/Post event
269/Slave Exits/Post event
.TE
.nr PS 12
.nr VS 14
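.PP
The wait-state effects listed in the tables above can be collected
into two sets for lookup.
This Python sketch is illustrative; the set contents are read from the
tables, while the function name \fIclassify\fR and its return strings
are hypothetical.
Everything that only posts an event or updates a process icon falls
under \*Qother\*U.
.nf
.sp
```python
# Action numbers that put a task in the wait state, per the tables.
WAIT_ACTIONS = {130, 160, 161, 162, 181, 256, 258, 262, 264, 267}
# Action numbers that put a task back in the ready state.
READY_ACTIONS = {166, 167, 168, 183, 257, 259, 263, 266}


def classify(action):
    """Return the effect of an action number on a task time line."""
    if 1 <= action <= 127:
        return "user"       # user defined event, drawn in a rounded box
    if action in WAIT_ACTIONS:
        return "wait"
    if action in READY_ACTIONS:
        return "ready"
    return "other"


print([classify(a) for a in (42, 160, 166, 171)])
```
.sp
.fi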
.NH 1
GMAT Stategraph User's Guide
.NH 2
Introduction
.PP
GMAT \fBstategraph\fR is a tool designed to allow a user to see a graphical
display of a multiprocessing program as tasks are created,
synchronized, and destroyed. 
To use this tool, you must have access to a Sun workstation, and
an NSYSLIB/libCRAY.a generated trace file created under NLTSS on the Cray
X-MP or under Unix on the Alliant FX/8.
For information on how to create a trace file for use with this tool,
please see the NSYSLIB document [4] or the section below on the Cray
Compatibility Library (libCRAY.a).
.NH 2
Running Stategraph
.PP
To use the GMAT \fBstategraph\fR analysis tool in static mode merely type:
.nf
.sp
		% \fIstategraph s\fR
.sp
.fi
at the Unix shell command line.
The \fIs\fR argument to \fBstategraph\fR signifies static mode.
The user will be prompted as to whether or not additional significant
events are to be defined.
The user has the option of defining up to five action numbers which
he feels are significant. 
These can be either standard MAT action numbers that GMAT does not
treat as significant, or user-defined action numbers. 
A \fIy\fR response enables you to enter up to five numbers.
To terminate before entering all five events enter a \fI0\fR or
\fI-1\fR for an event number.
Once you have made your decision regarding user-defined significant
action numbers, the tool will begin setting up the windows.  
A window will be brought up covering most of the screen (a border on
the right-hand side has been left open to allow the viewing of
performance meters). 
The main window of the tool is divided into four sections (see Figure
2).  
These sections are the control panel (containing a slider, choice
items and buttons), the text subwindow (containing various messages),
the overview canvas (containing an overview of the entire graph), and
the primary canvas (the canvas on which the ancestry graph will be
drawn in detail).
.NH 2
The Control Panel
.PP
The panel subwindow is \fBstategraph\fR's main user control interface and
contains several features: 
.IP \(->
Directory: The user has the ability to step through various
directories to locate the desired trace file.  One can place the cursor
over the end of the directory string and press the right mouse button.
A menu listing of other directories will appear.  To change to one of
these directories, merely use the cursor to highlight the directory
and release the right mouse button. One of the directories in the
listing will have a check next to it: most probably the \*Q.\*U
directory.  
Depressing and releasing the left mouse button while the cursor is
positioned on the directory string will cause a change to 
the checked directory.  The user should remember, however, that while
directory changes are supported to assist in locating trace files,
traipsing about one's directory structure could lead to undesirable
results.
.IP \(->
Trace file:  When files exist in the current directory that begin with
the letters \*Qtrace\*U, the first of these will appear in the trace file
string.  Positioning the cursor on the string and depressing the right
mouse button will cause a menu of trace files to appear.  Files can be
selected from the menu by highlighting the file name and releasing the
right mouse button.  Depressing and releasing the left mouse button
will cause the next file in the menu to be selected.
.IP \(->
Event Speed: This slider controls the speed at which events are
processed while in GMAT Static Mode and \*QGo\*U has been chosen.
The speed control slider expresses the event display speed as a
percentage of the fastest possible speed. 
The default value of the slider is set at 75% and can easily be
changed by clicking with the left mouse button in the slider at the
desired value (current value is displayed in brackets, see Figure 2).
.IP \(->
Events Processed:  Keeps track of the number of logged events that
have been processed so far.  
Note that this is an event in the sense that each entry in an
NSYSLIB/libCRAY.a trace file is an event.
Both significant (state changing events) and non-significant events
are counted. 
.IP \(->
Active Processes:  Keeps track of the number of processes that have
been created by the system on the application's behalf.
.IP \(->
Number of Tasks: Keeps track of the number of tasks that have been
created during the program execution.
.PP
The ovals in the panel act as buttons.
All of the buttons on the panel are activated by clicking the left
mouse button within the boundaries of the button.
The button will remain gray as long as the action started by the
button continues. 
.IP \(->
Load: Initialize and reset \fBstategraph\fR for another trace file.
Once a trace file has been chosen it must be loaded in before state
graph analysis can be performed.
.IP \(->
Step: Process one event from the trace file.  
The user will see the events processed increment.  
The screen is updated only when a significant action is encountered.  
.IP \(->
Go: Process events from the trace file consecutively without stopping.
The screen and counters in the control panel are updated
appropriately.  The only way the event processing is halted is for
you to hit the left mouse button while the mouse cursor is in the
control panel or the end of the trace file is reached.
.IP \(->
Step: Process one single event as in \*QGo\*U mode.
.IP \(->
Key: Displays a key of the various symbols used by the GMAT
\fBstategraph\fR SunView tool (see Figure 1).
The key will remain visible until the Key button is hit again.	
.IP \(->
Subtree: Draws a subtree of the current node and all of its
descendants in a smaller pop-up subwindow.  
This subwindow can be moved and resized by placing the mouse cursor on
the window edge (the mouse cursor turns into a bull's-eye) and using the
SunView window manipulation mouse button sequences.
A current node must be previously selected by clicking the left
mouse button while the mouse cursor is on any node in the state graph in
the primary canvas.  To remove the subtree, select the \*QSubtree\*U
button again.
.IP \(->
Histogram: Causes a pop-up subwindow to be created which displays a
plot of the number of active processes as a function of action number.
This subwindow can also be moved and resized.
To remove the histogram, select the \*QHistogram\*U button again.
.IP \(->
Quit: This will completely exit the \fBstategraph\fR tool, first asking for
confirmation. 
.PP
Beneath the panel is the text subwindow.  
It is in this window that messages from \fBstategraph\fR to the user will
be displayed.  
The text subwindow has a scroll bar, so you may scroll back to review
past messages. 
By moving the text cursor to any point in the text subwindow you may
insert text (as a note or marker).
Any further messages by \fBstategraph\fR will cause the text cursor to be
repositioned at the end of the current text in the text subwindow.
.PP
The overview canvas is to the right of the panel and the text
subwindow.
In this canvas you will see a reduced-size but complete picture
of the entire graph. 
This overview is provided to help you locate specific portions of
the detailed graph, which is drawn in the primary canvas but is only
partially visible there.
This canvas is also used to indicate scrolling on the primary
canvas.
A dotted box is used to indicate which portion of the overview 
is currently viewable through the primary canvas window.  
By positioning the cursor in the overview canvas and clicking the left
mouse button, the view in the primary canvas will be adjusted to
center the view box about that point without moving the box outside
of the overview canvas.
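.PP
The centering rule above can be sketched as a clamp on each axis.
This is a minimal Python illustration, not the \fBstategraph\fR
source; the function name, its arguments and the pixel sizes are
all assumptions made for the example.
.nf
.ft CO
```python
def center_view(click, view, canvas):
    """Return the top-left corner of the view box after a click.

    click is (x, y); view and canvas are (width, height), all in
    overview-canvas pixels. Every name here is hypothetical.
    """
    origin = []
    for c, v, size in zip(click, view, canvas):
        start = c - v // 2                     # center the box on the click
        start = max(0, min(start, size - v))   # keep the box inside the canvas
        origin.append(start)
    return tuple(origin)


print(center_view((100, 60), (40, 30), (200, 120)))   # box fits: (80, 45)
print(center_view((5, 118), (40, 30), (200, 120)))    # clamped:  (0, 90)
```
.ft
.fi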
.NH 2
The Stategraph
.PP
The ancestry state graph itself consists of nodes connected together
with lines.
Each node represents a task, and the connections indicate genealogy.
The graph grows downward and outward, starting at the top and center
of the primary canvas. 
Each node is identified by its unique task number.  
When a process is bound to a task, a reverse-video square surrounds
the node.  
The nodes have different shades to represent the
various states of a task.  
A list of these symbols is available via the \*QKey\*U PANEL_BUTTON item
and is reproduced in Figure 1.
.PP
Additional information regarding each task may be obtained by
stopping the processing of events and placing the cursor over the 
node corresponding to the task and depressing the right mouse button.
A pop-up subwindow displays various information appropriate to the state
of the task at that event.
Some of the information that is displayed is as follows:
task ID; task ID of the parent; if the task is in a wait
state, the task ID of the task it is waiting for or the address of the
event, lock, or barrier it is waiting on; microtasking information if
appropriate; and the action numbers of the last five logged events
that affected that task.  
This information window will remain visible until the right mouse
button is released.
.PP
The first row of the primary canvas is used to display the process
icons.  
These icons indicate the states of the processes assigned to
the execution of your multiprocessing job.  
When a process is bound to a task, the task number will be indicated
next to the process and the process icon will have a \*QB\*U on it.  
The process icons are indicated in the overview canvas as little
squares, so that you may scroll back to them, if desired.
.NH 1
GMAT Timeline User's Guide
.NH 2
Introduction
.PP
\fBTimeline\fR is a graphical multiprocessing analysis tool that assists you
in the delicate art of debugging and analyzing time dependent problems
in a multiprocessing application. 
The \fBtimeline\fR tool is designed to allow you to graphically see a
multitasking program as tasks/processes are created and synchronized.
\fBTimeline\fR has the advantage of allowing fine detail in the inspection of
a multiprocessing code, while accurately representing a sense of real time.  
To use this tool, you need access to a Sun workstation, and
a NSYSLIB/libCRAY.a library generated trace file created under NLTSS
on the Cray X-MP or under Unix on the Alliant FX/8.
For information on how to create a MAT trace file for use with this tool,
please see the NSYSLIB document [4] or the section below on the Cray
compatibility library (libCRAY.a).
.NH 2
Running Timeline
.PP
To use the GMAT time line analysis tool merely type:
.nf
.sp
		% \fItimeline\fP
.sp
.fi
at the Unix shell command line.
\fBTimeline\fR then creates the necessary windowing environment flush with
the upper left hand corner of the screen.  
The main window contains three subwindows: panel, text, and canvas.
.NH 2
The Control Panel
.PP
The control panel is \fBtimeline\fR's main user control interface and
contains many of the same features of the \fBstategraph\fR control panel.
Some of the additional features of the \fBtimeline\fR control panel
include:
.IP \(->
Clock Interval: (Default: 10^4) This is a PANEL_CHOICE item that
allows you to select a desired clock interval.  This number is
the number of clock cycles that must pass before a new
time stamp is drawn.
Clock intervals are explained in greater detail in the Clock
Intervals section below.
.IP \(->
Machine Type: (Default: Cray X/MP) This should be set to the machine
type that the trace file you wish to analyze was created on.
.IP \(->
Address Table: (Default: Locks) Since the user seldom needs to know
the full address of every lock, barrier, or event that may
be displayed, each address is associated with a small integer value in the
display.  If at any time the correlation between those integer values
and the actual addresses is needed, a table can be brought up by
depressing and holding the middle mouse button in the canvas
subwindow. 
.PP
The above three items are user options that are controlled with a cycle
protocol (PANEL_CHOICE item). 
All the choice items are controlled in one of two ways.
First, to get a pop-up menu of all the selections, place the cursor on
the item and depress and hold the right mouse button.
The desired selection can then be chosen by highlighting the choice
and releasing the button.  The selections can also be stepped through
by clicking the cycle item once with the left mouse button.
.PP
The ovals in the panel act as buttons.  
All of the buttons on the panel are activated by clicking the left
mouse button within the boundaries of the button.
The button will remain gray as long as the action started by the
button continues. 
.IP \(->
Load: Initialize and reset \fBtimeline\fR for another trace file.
Once a trace file has been chosen it must be loaded in before time
line analysis can be performed on it.
.IP \(->
Go: Process events from the trace file consecutively without stopping.
The screen and counters in the control panel are updated
appropriately.  Event processing halts only when you click the left
mouse button while the mouse cursor is in the control panel or when
the end of the trace file is reached.
.IP \(->
Step: Will process one event or clock interval.  
.IP \(->
Key: Displays a key of the various symbols that are used by the GMAT
\fBtimeline\fR SunView tool (see Figure 3).
The key will remain visible until the Key button is hit again.	
.IP \(->
Histogram: Causes a pop-up subwindow to be created which displays a
plot of the number of active processes as a function of time interval.
To remove the histogram, select the \*QHistogram\*U button again.
If you change the clock interval at any time during execution, the
histogram will continue with the new clock interval.  
Within a long interval (more than 100 clock intervals) with no changes
in the display (see the Clock Intervals section, below), two vertical
lines are drawn, effectively breaking the plot.  
This signifies that a large segment of time passed at that point
with no change in the process state.  
Also, the number of clock cycles since the beginning
of program execution is displayed periodically in the bottom row of the
histogram.  Thus the histogram gives an accurate idea of the efficiency
of process utilization at a glance.
.IP \(->
Jump: Selection of this button brings up a jump forward pop-up panel. 
The panel allows the user to jump forward in a file in one of three ways.
First, the user specifies the type of jumping.  A cycle item allows
the selection of jumping by a particular action number (refer to the
MAT User's Guide), to the next event dealing with a particular task,
or to jump to a specified Events Processed count.  The actual data is
typed in the panel, next to the \*QJump To\*U item.  All typing will be
automatically directed towards this spot, as long as the mouse is in
the confines of the pop-up panel.  Finally, the procedure can be
executed with the current data or canceled using one of the two
buttons. 
.IP \(->
Quit: This will completely exit the \fBtimeline\fR tool, first asking for
confirmation. 
.PP
On the right hand side of the control panel (see Figure 4) are three
counters.
These counters keep track of the number of active processes and tasks
as a function of time and display the number of trace file events
processed.
.PP
The last panel item is a time warp speedometer (center of the control
panel, see Figure 4), which reports, as a power of ten, the ratio of
display time to the actual time that the program took to execute.  In short,
it displays the number of real time seconds it takes to display
one second of (wall clock) time during program execution.  
This number is computed by taking a known time that the tool takes to
update the screen and comparing it against the number of program clock
cycles that have elapsed since the last screen update.  
For example, a dial reading of 10^5 indicates that \fBtimeline\fR would
take 10^5 seconds to process one (wall clock) second of program
execution.  
The speedometer has a range of 10^7 (slowdown) to 10^-7 (speedup) real
time seconds for each program second; however, the speedometer will
seldom display a 10^-7 speedup with a trace file.
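.PP
The speedometer reading can be illustrated with a short calculation.
The sketch below is written in Python and is only a model of the
description above, not the \fBtimeline\fR source.
The function name, the 0.095 second screen update time and the
9.5 nanosecond clock period are assumptions chosen for the example.
.nf
.ft CO
```python
import math


def warp_exponent(update_seconds, cycles_elapsed, clock_period):
    """Power of ten for the dial: display seconds per program second.

    update_seconds  time the tool needs for one screen update (assumed known)
    cycles_elapsed  program clock cycles since the last screen update
    clock_period    seconds per clock cycle of the traced machine (assumed)
    """
    program_seconds = cycles_elapsed * clock_period
    exponent = round(math.log10(update_seconds / program_seconds))
    return max(-7, min(7, exponent))    # the dial spans 10^-7 to 10^7


# One update per clock interval of 10^4 cycles.
print(warp_exponent(0.095, 10**4, 9.5e-9))
```
.ft
.fi
.PP
With these assumed figures the ratio is about one thousand, so the
dial would read 10^3.
A caller would have to guard against zero elapsed cycles, which this
sketch does not do.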
.NH 2
The Time Line
.PP
The canvas subwindow occupies the lower two-thirds of the window.  
This is where all the graphical information is displayed.  
The right-hand edge of the canvas is reserved for the process icons,
while the rest of the canvas is reserved for the actual time line.
Each task, as it is created, is given its own column in the time
line, and as time progresses the column is extended down the screen.  
The graphical display is updated at the end of every time interval.
.PP
The actions of a task have been divided into three separate graphical
representations: posted events, user defined events, and
wait/processor state.  
Posted events are flags that a task reports, as logged in a
NSYSLIB/libCRAY.a trace file.
They may or may not affect a task's wait state,
but their main purpose is to report that a significant event has just
occurred.  Posted events are represented in \fBtimeline\fR as a box with the
appropriate text.  Often, the text will demand a piece of auxiliary
information.  For example, the \*Qwaiting on a task\*U posted event will
contain the number of the task on which it is waiting.  This shows up
in the box as an integer next to the identifying string.
.PP
While posted events appear as square boxes, user defined events appear
as boxes with rounded corners.  Since there is no way to print out a
mnemonic string, User Defined boxes appear with just the action number.
.PP
While a task will occasionally post an event, it is represented most
often in its wait state.  At all times, a task is either ready to run
or it is waiting.  This is represented in the time line as a vertical
line, solid if the task is capable of running, and dotted if the task
is waiting.  Between posted events, a task is updated with its wait
state.  The effect is a column of irregularly spaced boxes
connected with either dotted or solid lines.
.PP
The tasks also have a processor state: bound or unbound to a process.
Each time a task changes processor state its time line (column in the
graphical display) is modified.
This is achieved by outlining the wait state with two parallel lines
whenever a task is bound to a processor.
Whenever a task has no processor bound to it, the lines are not
present.
.PP
Thus, there are six different ways that a column can be updated.  If
the task has just posted an event, that event is represented by a
square box; a user defined event appears in its own rounded box.  If the
task is able to run but not bound, the column is updated with a solid
line.  If the
task is idling and not bound, the column is updated with a dotted
line.  If the task is running, i.e. able to run and bound, the column
will contain three parallel solid lines.  If the task is idle, but
bound to a processor (as sometimes occurs), it is represented by two
parallel solid lines surrounding a dotted line.
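.PP
The six cases can be summarized as a small decision function.
The following Python sketch is illustrative only; the function name,
its arguments and the returned strings are hypothetical and do not
come from the \fBtimeline\fR source.
.nf
.ft CO
```python
def column_glyph(event=None, runnable=True, bound=False):
    """Pick one of the six column updates for a task.

    event is None, "posted" or "user"; runnable and bound describe the
    wait state and the processor state of the task.
    """
    if event == "posted":
        return "square box"
    if event == "user":
        return "rounded box"
    center = "solid line" if runnable else "dotted line"
    if bound:
        # Two parallel lines outline the wait-state line while bound.
        return "two solid lines around a " + center
    return center


for runnable in (True, False):
    for bound in (True, False):
        print(runnable, bound, column_glyph(None, runnable, bound))
```
.ft
.fi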
.PP
The process icons appear in the right hand edge of the canvas. 
These are simply icons that correspond to the states of the processes
during the program.
A program trace file will always start with a bound process zero, and
gradually more processes will appear, each with a different icon.
The display will never show more icons than available processes for
the particular machine, as declared in the Machine Type control panel
PANEL_CHOICE item.
The process icons are slightly smaller than regular Sun icons.
A process can be either bound, idle or suspended.  
The difference between the last two is slight but important.  
A process that is suspended is no longer associated with the user's
particular job, and may be off performing work for someone else, or doing
system work. 
An idle process, on the other hand, is expensive: CPU
time is still being charged to the user, but no actual work is being
done.  
For this reason, processor idle states are highlighted with a
characteristic gray.  
Bound processes, which are actually performing program work, are
highlighted in black.
.NH 2
Clock Intervals
.PP
One of the advantages \fBtimeline\fR offers is the ability to report
relative clock time.  
This is achieved graphically with clock intervals.
When the first event is obtained from the trace file, time is set to
zero.  
Then a time stamp is drawn on the screen.  
The time stamp appears as the number of clock cycles (this is a real
time measurement) elapsed since the first time stamp in the 
trace file, followed by a horizontal line across the screen.  
Then, according to the clock interval set by the user in the panel,
tasks are updated until the start of the next clock interval.  
At the next clock interval, another time stamp is drawn, and tasks are
updated until the start of the next clock interval.  
Since it is possible for any number of events to occur in a clock
interval, clock intervals are not evenly spaced down the screen.
There may be a hundred events in one clock interval, while
only one in another.  To provide complete graphical representation,
each clock interval is provided with as much space as required.
.PP
It often occurs that there are one or more clock intervals where
nothing happens.  Instead of just drawing a time stamp and updating the
screen, a more compact graphical representation is utilized.  
A series of one or more blank intervals is given one row across the screen.
For the first blank interval, a normal time stamp is drawn, and a small
vertical bar is drawn in the row.  For each successive blank interval,
the time is displayed next to the initial time stamp, and an additional
vertical bar is drawn.
.PP
Since there may be hundreds, thousands or millions of blank intervals,
a speedup factor is also worked into this scheme.  
In \*QGo\*U execution, after every one hundred blank intervals, the
size of the interval is increased by a factor of ten.  
This is displayed graphically by placing a blank space between the
updating vertical bars, changing the pattern of the bar.  
If another one hundred blank intervals pass
with this new update interval, the update interval is again increased
by a factor of ten and another space is inserted between update bars.
The bars wrap around the row and continue drawing through the previous
patterns.  Thus, by the speed of the change of the pattern, the user
can tell graphically a speedup is occurring. This continues until the
end of the blank time is reached and the clock interval is reset to
its original value.  
It is interesting to note how the
speedometer changes during this time.  After the clock interval is
increased by a factor of ten, \fBtimeline\fR changes its
representation of real time: it takes one tenth of the display time
to display one second.
Thus, during a long blank interval, as the update clock interval increases,
the speedometer will gradually display a smaller and smaller real time
slowing factor for every program second.
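.PP
The effect of the speedup on a long blank stretch can be estimated
with a short simulation.
The Python sketch below assumes that the factor of ten is applied
after each run of one hundred update bars at the current update
interval, as described above; the function and variable names are
hypothetical, and the reset to the original clock interval at the end
of the blank time is not modeled.
.nf
.ft CO
```python
def blank_run(blank_intervals, run_length=100, growth=10):
    """Count update bars drawn for a run of blank base clock intervals.

    Returns (bars_drawn, final_multiplier), where the multiplier is the
    update interval in units of the clock interval set in the panel.
    """
    remaining = blank_intervals
    multiplier = 1
    bars = 0
    bars_at_this_size = 0
    while remaining > 0:
        if bars_at_this_size == run_length:
            multiplier *= growth      # speed up after 100 blank updates
            bars_at_this_size = 0
        remaining -= multiplier       # one bar covers multiplier intervals
        bars += 1
        bars_at_this_size += 1
    return bars, multiplier


for n in (50, 150, 1100, 1000000):
    print(n, blank_run(n))
```
.ft
.fi
.PP
Under these assumptions one million blank intervals are covered by
489 update bars, the last of them drawn with an update interval
10^4 times the original.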
.PP
Since the clock interval can be changed at any time by the user, it
can be extensively used to control the granularity of inspection.
With larger clock intervals, time stamps become less frequent, as more
events occur within an interval.  With smaller clock intervals,
time stamps become more frequent, and a close inspection of time spent
can be undertaken.  This, coupled with the time warp speedometer can
give the user an accurate feel of clock time during examination of a
trace file.
.NH 1
The Cray Compatibility Library
.PP
Alliant's shared memory multiprocessing is not at all compatible with
the Cray Research, Inc. (CRI) definitions [5] or LLNL definitions [4].  
The CRI/LLNL multitasking model is based on subroutine level
parallelism (tasks are created to execute a subroutine and terminate 
when the return statement is encountered) and shared memory is
accomplished automatically via Fortran common blocks.
That is to say, the Fortran compiler and loader automatically put
all variables declared in common blocks in shared memory.
No user intervention is necessary (other than correctly scoping the
variables).
In order for each task to have a \*Qprivate common block\*U, CRI
introduced the Fortran extension \fItask common\fR.
.KF
.nf
.ft CO
.sp
.ps 8
.vs 10
.ce
vecaddcom.f
      subroutine vecaddcom
c			Load in the library BEGINNING book end common block
      include '/usr/local/include/tskcombgn.h'
c			Load in the user common block(s)
      include 'vecadd.h'
c			Load in the library ENDING book end common block
      include '/usr/local/include/tskcomend.h'
      end
.ps 12
.vs 14
.sp
.ft
.ce 2
Example 1: Setting up shared memory
Dummy Fortran routine to define the shared memory region using \fIcommon\fR. 
.sp
.fi
.KE
.KF
.nf
.sp
.ft CO
.ps 8
.vs 10
.ce
vecadd.h
      parameter (LENBUF=16000)
      common /logbuf/ logbufr(LENBUF), my_tskarry(3)
      common /xy/ x(1024), y(1024), z(1024)
.ps 12
.vs 14
.sp
.ft
.ce 3
Example 2: Common blocks for the vecadd program
Include file with the definitions of the common blocks for the vecadd program.
Note that the user must declare shared memory for the circular trace buffer.
.sp
.fi
.KE
.KF
.nf
.sp
.ft CO
.ps 8
.vs 10
.ce
vecadd.f
      program vecadd
cvd$g noconcur
      include 'vecadd.h'
      external pvecadd
c         Set up the multiprocessing environment.
      CALL LGENABLE
      CALL LGOPEN( logbufr, LENBUF, 'trace.vecadd', -1 )
      CALL LGON
c         Initialize the data
      x(1:1024) = 3.14159E22
      y(1:1024) = 2.71828E11
c         Do the fork.
      my_tskarry(1) = 3
      my_tskarry(3) = 2
      CALL TSKSTART( my_tskarry(1), pvecadd, 1024, my_tskarry(3), 2 )
      call pvecadd( 1024, 1, 2 )
c         Do the Join.
      CALL TSKWAIT( my_tskarry(1) )
c         Check the answers.
      do 10 i = 1, 1024
         if( z(i).ne.x(i)+y(i) ) then
            write(6,*) ' VECADD: error i,x,y,z=',i,x(i),y(i),z(i)
         endif
 10   continue
c         Terminate the multiprocessing event logging.
      CALL LGOFF
      CALL LGCLOSE
      call exit( 0 )
      end
      subroutine pvecadd( lenvec, iproc, nproc)
      include 'vecadd.h'
c
      do 10 i = iproc, lenvec, nproc
         z(i) = x(i) + y(i)
 10   continue
      return
      end
.ps 12
.vs 14
.sp
.ft
.ce 3
Example 3: Multitasking job for the Alliant FX/8.
This job adds two vectors together in parallel with
two tasks.  The MAT/GMAT logging is also utilized.
.sp
.fi
.KE
.KF
.nf
.sp
.ft CO
.ps 8
.vs 10
.ce
makefile
.sp
#	Vector addition example
FFLAGS=-O -g
LDFLAGS=-lCRAY -lcommon
INCDIR=/usr/local/include/
vecaddcom.o : vecaddcom.f vecadd.h $(INCDIR)tskcombgn.h $(INCDIR)tskcomend.h
	$(FC) ${FFLAGS} -c vecaddcom.f
vecadd.o : vecadd.f vecadd.h
	$(FC) ${FFLAGS} -c vecadd.f
vecadd : vecaddcom.o vecadd.o
	$(FC) ${FFLAGS} vecaddcom.o vecadd.o -o vecadd ${LDFLAGS}
.ps 12
.vs 14
.sp
.ft
.ce 4
Example 4: Makefile for the vector addition example
The vecaddcom.o depends on the source, the user common block(s) and
the library common blocks.  It is \fBvery\fR important to have
vecaddcom.o be the first binary on the load line.
.sp
.fi
.KE
.PP
Alliant follows the standard Unix model for multiprogramming: fork
independent jobs (job as defined in the introduction).
Microtasking and vectorization are automatically supported by the
Fortran compiler.
The Unix system routine, \fIfork\fR,  creates a separate virtual memory
\*Qcore image\*U and starts another process running at the return from
the fork call.   
Two major ramifications of this model are: 1) multitasking at the
subroutine boundary is not directly supported and 2) special steps
have to be taken by the programmer in order to have processes share
some of their virtual memory space.
The latter is accomplished on a virtual memory page by page basis via
a \fIcreate_shared_region\fR system call.
This call requires that all variables that should be in shared regions
be grouped together and that the groups begin and end on page boundaries.
Also, the Alliant symbolic debugger \fIdbx\fR does not function
properly with processes that fork themselves.
.PP
Even with these difficulties a Cray model of multitasking can be
emulated to a high degree (if you're willing to give up the debugger).
It is fairly straightforward to write a Fortran subroutine that
results in subroutine parallelism (\fItskstart\fR).
Getting around the shared memory problem is a bit harder.
Supporting \fItask common\fR cannot be accomplished without
modifications to the Fortran compiler and loader (which is outside the
scope of a compatibility library).
This is where we have to give up any hope of 100% application source
code compatibility between the Cray library (NSYSLIB) and the Alliant
compatibility library (libCRAY.a).  
.PP
In order to share memory the Fortran programmer must put all of the
common blocks (that should be shared) in a dummy subroutine (see
Examples 1 and 2). 
Also the programmer must include two common blocks from the library
in the dummy routine (subroutine vecaddcom, in Example 1) that act as
book marks and give the Cray compatibility library the ability to
compute the boundaries of the shared region and the number of pages to
be shared.
.PP
One can achieve the functionality of \fItask common\fR via common
blocks that are not declared in the dummy Fortran routine between the
library book marks, although one loses source code compatibility.
.PP
To complete this example we turn to the \*Qdriver\*U routine source
code (Example 3) and the UNIX makefile (Example 4).
Note that the routine to be executed in parallel (\fIpvecadd\fR) is
declared external in \fIprogram vecadd\fR and that the
\fItskstart\fR/\fItskwait\fR pair constitutes a fork/join operation.
The MAT/GMAT trace mechanism is activated via the \fIlgenable\fR,
\fIlgopen\fR and \fIlgon\fR sequence of calls.
The \fIlgoff\fR and \fIlgclose\fR calls terminate the logging and write
the trace out to the user specified trace file,
\fItrace.vecadd\fR in the example.
The important thing to note about the makefile is that \fIvecaddcom.o\fR
appears as the first object module on the load line.
The Cray compatibility library itself is installed in the directory
\fI/usr/local/lib\fR (a place the loader will search) and hence can be
accessed via the \fI-lCRAY\fR loader option.
The library also requires the Alliant supplied \fIcommon\fR library
and hence the \fI-lcommon\fR loader option is also needed. 
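.PP
The fork/join structure of Example 3 can be mimicked in a few lines
of Python.
This is an analogy only: a Python thread stands in for the forked
Unix process, module-level lists stand in for the shared common
blocks, and none of it reflects how libCRAY.a is implemented.
The data values are simplified for the illustration.
.nf
.ft CO
```python
import threading

LENVEC = 1024
x = [3.0] * LENVEC
y = [2.0] * LENVEC
z = [0.0] * LENVEC


def pvecadd(lenvec, iproc, nproc):
    # Same cyclic split as "do 10 i = iproc, lenvec, nproc", shifted to
    # Python's 0-based indexing.
    for i in range(iproc - 1, lenvec, nproc):
        z[i] = x[i] + y[i]


# Fork (tskstart): the second task takes elements 2, 4, 6, ...
task = threading.Thread(target=pvecadd, args=(LENVEC, 2, 2))
task.start()
# The parent does its own share, elements 1, 3, 5, ...
pvecadd(LENVEC, 1, 2)
# Join (tskwait): wait for the forked task before checking the answers.
task.join()

print(all(z[i] == x[i] + y[i] for i in range(LENVEC)))
```
.ft
.fi
.PP
Because the two index sets are disjoint, the tasks never write the
same element of z and no lock is needed.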
.PP
In the implementation of the Cray compatibility library 
all of the CRI defined routines [5] are supported as well as a few of
the LLNL/NSYSLIB extensions [4]. 
In particular the MAT logging extensions (\fIlgclose\fR,
\fIlgdsable\fR, \fIlgenable\fR, \fIlgentry\fR, \fIlgoff\fR, \fIlgon\fR
and \fIlgopen\fR) have been added to libCRAY.a.  
Also the barrier routines (\fIbarasgn\fR, \fIbarrel\fR and
\fIbarsync\fR) have been added, as they are sorely lacking in the CRI
implementation. 
The following entries list the routines in the Cray compatibility
library, their parameters and a short description of each routine.
.SH 
NAME
.br
evasgn \- Initialize an Event synchronization variable
.SH 
SYNOPSIS
.br
\fBcall evasgn( event [,value] )\fR
.SH 
ARGUMENTS
.br
\fBevent\fR
.RS
Integer variable (in shared memory) to be used as an event.
The user must not modify an event variable after the call to
\fIevasgn\fR until that variable is released by a call to \fIevrel\fR.
.RE
\fBvalue\fR
.RS
This variable is ignored by libCRAY.a and is maintained for
compatibility purposes only.
.RE
.SH 
DESCRIPTION
.PP
\fIEvasgn\fR identifies an integer variable that the program intends to
use as an event.
The \fIevasgn\fR subroutine must be called for each event variable
before its use with any other event routines.
An event is given the initial state of \*Qcleared\*U.
An event remains cleared until it is posted with \fIevpost\fR.
The first call assigns the event; further calls are ignored.
.SH 
NAME
.br
evclear \- Clear an Event so that it may be posted again
.SH 
SYNOPSIS
.br
\fBcall evclear( event )\fR
.SH 
ARGUMENTS
.br
\fBevent\fR
.RS
Integer variable (in shared memory) used as an event.
.RE
.SH 
DESCRIPTION
.PP
\fIEvclear\fR clears an event and returns control to the calling task.
The result of this is that tasks subsequently performing \fIevwait\fR
calls must wait.
If the result of an \fIevpost\fR is not cleared, the posted condition
remains outstanding.
When a single event post is required (a simple signal) call
\fIevclear\fR immediately after \fIevwait\fR to indicate that the
posting of the event has been detected.
.SH 
NAME
.br
evpost \- Post an Event
.SH 
SYNOPSIS
.br
\fBcall evpost( event )\fR
.SH 
ARGUMENTS
.br
\fBevent\fR
.RS
Integer variable (in shared memory) used as an event.
.RE
.SH 
DESCRIPTION
.PP
\fIEvpost\fR posts an event and returns control to the calling task.
Posting an event has the side effect that other tasks waiting on the
event (via a call to \fIevwait\fR) resume execution.
Posting a posted event has no effect (posts are not queued) and should
be avoided.
.SH 
NAME
.br
evrel \- Release an Event synchronization variable
.SH 
SYNOPSIS
.br
\fBcall evrel( event )\fR
.SH 
ARGUMENTS
.br
\fBevent\fR
.RS
Integer variable (in shared memory) used as an event.
.RE
.SH 
DESCRIPTION
.PP
\fIEvrel\fR releases an integer variable assigned as an event.
If a task is waiting for the event, an error results.
This subroutine is useful in detecting erroneous uses of an event
outside the region of the program planned for it.
The event variable can be reused following another call to \fIevasgn\fR.
.SH 
NAME
.br
evtest \- Test the state of an Event
.SH 
SYNOPSIS
.br
\fBlogical evtest, istat\fR
.br
\fBistat = evtest( event )\fR
.SH 
ARGUMENTS
.br
\fBevent\fR
.RS
Integer variable (in shared memory) used as an event.
.RE
.SH 
RETURNS
.br
\fBistat\fR
.RS
A Fortran logical .TRUE. if the indicated event is posted.
A Fortran logical .FALSE. if the indicated event has never been posted
or is cleared.
The event variable's state is unaffected by this call.
.RE
.SH 
DESCRIPTION
.PP
This routine tests the state of the indicated event.
If the event has never been posted or has been cleared the routine
returns a Fortran .FALSE. to the calling task.
If the event is posted the routine returns a Fortran .TRUE. to the
calling task.
.PP
Care should be taken when using this routine in a multiprocessing job.
The value returned may be invalidated shortly after, or just before it
is returned.
That is to say the returned value is valid at the time the
\fIevtest\fR checks the appropriate data structure,
however since concurrent execution is possible, the status
may have changed by the time the routine returns the status.
.SH 
NAME
.br
evwait \- Wait for an Event to be posted
.SH 
SYNOPSIS
.br
\fBcall evwait( event )\fR
.SH 
ARGUMENTS
.br
\fBevent\fR
.RS
Integer variable (in shared memory) used as an event.
.RE
.SH 
DESCRIPTION
.PP
\fIEvwait\fR waits until the indicated event is posted.  
If the event is already posted, the task continues execution without
waiting.
\fIEvwait\fR does not change the state of an event.
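.PP
The event semantics described in the entries above closely resemble
those of an event object in a threads library.
The Python sketch below uses the standard threading.Event class as a
stand-in to show the post, wait and clear behavior; it is an analogy
under that assumption, not a description of the libCRAY.a
implementation, and the names waiter and log are hypothetical.
.nf
.ft CO
```python
import threading

event = threading.Event()        # cleared, like an event after evasgn
log = []


def waiter(name):
    event.wait()                 # evwait: returns once the event is posted
    log.append((name, event.is_set()))   # waiting did not change the state
    event.clear()                # evclear: acknowledge the post


# Case 1: posted twice before anyone waits; posts are not queued.
event.set()
event.set()
first = threading.Thread(target=waiter, args=("first",))
first.start()
first.join()
log.append(("after first", event.is_set()))

# Case 2: the waiter blocks (if it gets there first) until the post.
second = threading.Thread(target=waiter, args=("second",))
second.start()
event.set()                      # evpost
second.join()
log.append(("after second", event.is_set()))

print(log)
```
.ft
.fi
.PP
In case 1 a single clear removes both posts, which is why posting an
already posted event should be avoided.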
.SH 
NAME
.br
lockasgn \- Initialize a Lock synchronization variable
.SH 
SYNOPSIS
.br
\fBcall lockasgn( lock [,value] )\fR
.SH 
ARGUMENTS
.br
\fBlock\fR
.RS
Integer variable (in shared memory) to be used as a lock.
The user must not modify a lock variable after the call to
\fIlockasgn\fR until that variable is released by a call to \fIlockrel\fR. 
.RE
\fBvalue\fR
.RS
This variable is ignored by libCRAY.a and is maintained for
compatibility purposes only.
.RE
.SH 
DESCRIPTION
.PP
\fIlockasgn\fR identifies an integer variable that the program intends to
use as a lock.
The \fIlockasgn\fR subroutine must be called for each lock variable
before its use with any other lock routines.
A lock is given the initial state of \*Qcleared\*U.
The first call assigns the lock; further calls are ignored.
.SH 
NAME
.br
lockoff \- Clears a Lock
.SH 
SYNOPSIS
.br
\fBcall lockoff( lock )\fR
.SH 
ARGUMENTS
.br
\fBlock\fR
.RS
Integer variable (in shared memory) to be used as a lock.
.RE
.SH 
DESCRIPTION
.PP
\fIlockoff\fR clears the lock and returns control to the calling task.
Clearing the lock may allow another task to resume execution, but this
is transparent to the task calling \fIlockoff\fR.
.SH 
NAME
.br
lockon \- Sets a Lock
.SH 
SYNOPSIS
.br
\fBcall lockon( lock )\fR
.SH 
ARGUMENTS
.br
\fBlock\fR
.RS
Integer variable (in shared memory) to be used as a lock.
.RE
.SH 
DESCRIPTION
.PP
\fIlockon\fR sets the lock and returns control to the calling task.
If the lock is already set, the task spin waits until the lock is
cleared by another task.
In either case, the task sets the lock when it next resumes execution
of user code.
This means placing \fIlockon\fR before a critical region ensures that
the code in that region is executed only when the task has unique
access to the lock.
.SH 
NAME
.br
lockrel \- Release a Lock synchronization variable
.SH 
SYNOPSIS
.br
\fBcall lockrel( lock )\fR
.SH 
ARGUMENTS
.br
\fBlock\fR
.RS
Integer variable (in shared memory) used as a lock.
.RE
.SH 
DESCRIPTION
.PP
\fIlockrel\fR releases the integer variable that the program
identified as a lock.
If a task is waiting on the lock, or if the lock is set, an error
results.
This subroutine is useful for detecting errors that arise from a task
waiting on a lock that is never cleared.
The lock variable can be reused following another call to \fIlockasgn\fR.
.SH 
NAME
.br
locktest \- Unconditionally set a Lock and return its original state.
.SH 
SYNOPSIS
.br
\fBlogical locktest, istat\fR
.br
\fBistat = locktest( lock )\fR
.SH 
ARGUMENTS
.br
\fBlock\fR
.RS
Integer variable (in shared memory) used as a lock.
.RE
.SH 
RETURNS
.br
\fBistat\fR
.RS
A Fortran logical .TRUE. if the indicated lock was originally set.
A Fortran logical .FALSE. if the indicated lock was originally in the
clear (unlocked) state.
The lock variable's state is always set to locked upon return.
.RE
.SH 
DESCRIPTION
.PP
This routine tests if a lock is in the locked (set) state.
\fILocktest\fR acts the same as \fIlockon\fR, except the task never
waits.
A task using \fIlocktest\fR must always look at the return value
before continuing.
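.PP
The following sketch shows one way the return value might be used.
The names \fItrywrk\fR and \fIicount\fR are hypothetical, and the lock
is assumed to be an assigned lock variable in shared memory.
Note that a .TRUE. result means another task already holds the lock,
so the caller must not clear it.
.DS
.ft CO
.ps 8
.vs 10
      subroutine trywrk( lock, icount )
      integer lock, icount
      logical locktest, istat
c     never waits: sets the lock and reports its previous state
      istat = locktest( lock )
      if ( .not. istat ) then
c        lock was clear, so this task now holds it
         icount = icount + 1
         call lockoff( lock )
      endif
c     if istat is true the lock belongs to another task:
c     skip the critical region and do not call lockoff
      return
      end
.ps
.vs
.ft
.DE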
.SH 
NAME
.br
tskstart \- initiate a multiprocessing task at the subroutine level
.SH 
SYNOPSIS
.br
\fBexternal taskhead\fR
.br
\fBcall tskstart( taskarry, taskhead [, list] )\fR
.br
\fBinteger taskarry(\&3)\fR
.SH 
ARGUMENTS
.br
\fBtaskarry\fR
.RS
Task control array for this task (should be in shared memory).
Word 1 must be set to the integer length of this array and must be
either 2 or 3 (3 in the example above).  Word 3, if used, may be set
to any value by the user.
On return, word 2 will contain a unique task identifier that may be
used to perform other functions on the created task.
The unique identifier must be maintained if any subsequent interaction
with the task is desired.
.RE
\fBtaskhead\fR
.RS
External entry point at which task execution begins.
This name must be declared \fIexternal\fR in the routine making the
call to \fItskstart\fR.
.RE
\fBlist\fR
.RS
Optional list of (at most 20) arguments passed to the new task.
This list must correspond to the argument list of the external entry
point \fBtaskhead\fR.
.RE
.SH 
DESCRIPTION
.PP
This routine initiates a UNIX process via a call to \fIfork\fR and
associates a task with that process by initializing the necessary data
structures (i.e., the task descriptor).
Upon return from this routine, the requested task will be ready to
execute in the UNIX process.  
There is no implicit execution order between the calling task and the
tasks it creates.
Subsequent references to a specific task are made utilizing the task's
task control array, \fItaskarry\fR.
.PP
Shared memory is set up on the first call that a user job makes to
\fItskstart\fR via a call to \fIcreate_shared_region\fR.  
The calling program must have been linked with a dummy Fortran
subroutine that declares all common blocks that are to be shared.
This routine must also include the Cray compatibility library book
mark common blocks contained in the include files in
\fI/usr/local/include/tskcombgn.h\fR and
\fI/usr/local/include/tskcomend.h\fR. 
The order of these includes is important:
\fItskcombgn.h\fR comes first, then the user common blocks, and then
\fItskcomend.h\fR (see the example below).
There must be no executable statements in the dummy routine.
On the load line, the dummy routine's object file must be the first
file in the object file list.
A typical dummy routine looks like the following:
.DS
.ft CO
.ps 8
.vs 10
subroutine dummy
include '/usr/local/include/tskcombgn.h'
include 'mycommons.h'
include '/usr/local/include/tskcomend.h'
end
.ps
.vs
.ft
.DE
.RE
where mycommons.h is an include file which declares all of the common
blocks that should be in shared memory.
.SH 
NAME
.br
tsktest \- Test for the existence of a task
.SH 
SYNOPSIS
.br
\fBlogical tsktest, istat\fR
.br
\fBistat = tsktest( taskarry )\fR
.SH 
ARGUMENTS
.br
\fBtaskarry\fR
.RS
Task control array for the task to be inspected.
Word 2 of this array is the unique identifier for the task (as
returned by \fItskstart\fR).
.RE
.SH 
RETURNS
.br
\fBistat\fR
.RS
A Fortran logical .TRUE. if the indicated task exists.
A Fortran logical .FALSE. if the indicated task has completed
execution.
.RE
.SH 
DESCRIPTION
.PP
This routine tests for the existence of the task indicated by
\fBtaskarry\fR.
If the indicated task is complete, does not exist, or never existed, the
routine will return .FALSE. to the calling task.
If the indicated task exists and has not completed, the routine
returns .TRUE. to the calling task.
.PP
Care should be taken when using this routine in a multiprocessing job.
The returned value is valid only at the instant \fItsktest\fR checks
the appropriate data structure.
Because tasks execute concurrently, the status of the indicated task
may have changed by the time the routine returns to the caller.
.SH 
NAME
.br
tsktune \- Dummy routine for compatibility with NSYSLIB
.SH 
SYNOPSIS
.br
\fBcall tsktune( keyword, value, ... )\fR
.SH 
ARGUMENTS
.br
\fBkeyword\fR
.RS
This argument is ignored.  
\fITsktune\fR has purely source code compatibility purposes on the
Alliant FX/8 system.
.RE
\fBvalue\fR
.RS
This argument is ignored.  
\fITsktune\fR has purely source code compatibility purposes on the
Alliant FX/8 system.
.RE
.SH 
DESCRIPTION
.PP
This routine has purely source code compatibility purposes on the
Alliant FX/8 system.
It performs no other useful function.
.SH 
NAME
.br
tskvalue \- Retrieve the calling task's user value
.SH 
SYNOPSIS
.br
\fBcall tskvalue( tvalue )\fR
.br
\fBinteger tvalue\fR
.SH 
ARGUMENTS
.br
\fBtvalue\fR
.RS
Integer variable in which the routine returns the value that the user
supplied in word 3 of the task control array when the calling task was
created.
A zero is returned if the task control array length was less than 3 or
if the calling task is the ROOT task (the first task in a
multiprocessing job).
.RE
.SH 
DESCRIPTION
.PP
This routine returns the value provided by the user to the library in
word 3 of the task control array when the calling task was created via
\fItskstart\fR.
If the task array length was less than 3 or if the calling task is the
ROOT task (the first task in a multiprocessing job), a zero value is
returned.
.PP
This can be a useful mechanism as the value is passed to the task as a
value rather than a pointer to the word containing the value.
The value is maintained by the task from the time it is created until
it terminates.
Multiple calls to this routine will always yield the same return value.
.SH 
NAME
.br
tskwait \- Wait for the indicated task to complete
.SH 
SYNOPSIS
.br
\fBcall tskwait( taskarry )\fR
.br
\fBinteger taskarry(3)\fR
.SH 
ARGUMENTS
.br
\fBtaskarry\fR
.RS
Integer task control array (in shared memory) for the task to be
waited upon.
Word 2 of this array is the unique identifier for the task (as
returned by \fItskstart\fR).
.RE
.SH 
DESCRIPTION
.PP
This routine causes the calling task to wait for completion of the
task indicated by \fBtaskarry\fR.
If the indicated task is complete, does not exist, or never existed,
the routine returns to the calling task.
If the indicated task exists and has not completed, the calling task
spin waits.
The calling task will be released when the indicated task completes,
at which time this routine returns to the calling task.
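.PP
The task routines described so far can be combined as in the
following sketch.
The names \fItskdem\fR, \fIworker\fR and the common block
\fI/tskblk/\fR are hypothetical; the common block is assumed to be
declared in the include file used by the dummy routine so that the
task control array is in shared memory.
.DS
.ft CO
.ps 8
.vs 10
      program tskdem
      external worker
      integer itask(3)
      common /tskblk/ itask
c     word 1: length of the control array; word 3: user value
      itask(1) = 3
      itask(3) = 7
      call tskstart( itask, worker )
c     word 2 now holds the task identifier; wait for worker to finish
      call tskwait( itask )
      end

      subroutine worker
      integer ival
c     ival receives word 3 of the control array (7 in this sketch)
      call tskvalue( ival )
      return
      end
.ps
.vs
.ft
.DE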
.SH 
NAME
.br
barasgn \- Initialize a Barrier synchronization variable
.SH 
SYNOPSIS
.br
\fBcall barasgn( barrier , bar_count )\fR
.br
\fBinteger barrier, bar_count\fR
.SH 
ARGUMENTS
.br
\fBbarrier\fR
.RS
Integer variable (in shared memory) to be used as a barrier.
The user must not modify a barrier variable after the call to
\fIbarasgn\fR until that variable is released by a call to \fIbarrel\fR.
.RE
\fBbar_count\fR
.RS
The integer number of tasks that must invoke the barrier
synchronization routine with this barrier variable in order for it to
change to an \*Qopen\*U state.
.RE
.SH 
DESCRIPTION
.PP
\fIBarasgn\fR identifies an integer variable that the program intends to
use as a barrier.
The \fIbarasgn\fR subroutine must be called for each barrier variable
before its use with any other barrier routines.
A barrier is given the initial state of \*Qclosed\*U.
A barrier remains \*Qclosed\*U until its count is met, i.e., it has been
invoked by as many tasks as its defined count indicates.
.SH 
NAME
.br
barrel \- Release a Barrier synchronization variable
.SH 
SYNOPSIS
.br
\fBcall barrel( barrier )\fR
.br
\fBinteger barrier\fR
.SH 
ARGUMENTS
.br
\fBbarrier\fR
.RS
Integer variable (in shared memory) to be used as a barrier.
.RE
.SH 
DESCRIPTION
.PP
\fIBarrel\fR releases the identifier assigned to the barrier.
If a task is waiting for passage through the barrier, an error
results.
This routine is useful to detect errors that can arise when a task is
waiting on a barrier whose count will never be met.
The barrier variable can be reused following another call to
\fIbarasgn\fR.
.SH 
NAME
.br
barsync \- Synchronize on the indicated barrier.
.SH 
SYNOPSIS
.br
\fBcall barsync( barrier )\fR
.br
\fBinteger barrier\fR
.SH 
ARGUMENTS
.br
\fBbarrier\fR
.RS
Integer variable (in shared memory) used as a barrier.
.RE
.SH 
DESCRIPTION
.PP
\fIBarsync\fR registers a task as waiting for passage through the
barrier.
If the barrier is \*Qclosed\*U (i.e., the barrier's count has not yet been
met), the calling task spin waits until the barrier is opened.
Only the task that satisfies the barrier's count does not
spin (this is considered to be the last task).
When the last task invokes \fIbarsync\fR, it and the spinning tasks are
allowed passage through the barrier and the barrier state is reset to
\*Qclosed\*U.
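.PP
A minimal sketch of a two phase computation separated by a barrier
follows.
The names \fIbarset\fR, \fIphases\fR and \fIibar\fR, and the count of 4
tasks, are hypothetical; \fIibar\fR is assumed to be in shared memory,
and \fIbarset\fR is assumed to be called once before the worker tasks
are started.
.DS
.ft CO
.ps 8
.vs 10
      subroutine barset( ibar )
      integer ibar
c     the barrier opens when 4 tasks have called barsync
      call barasgn( ibar, 4 )
      return
      end

      subroutine phases( ibar )
      integer ibar
c     ... first phase of the computation goes here ...
      call barsync( ibar )
c     all 4 tasks have now finished the first phase
c     ... second phase of the computation goes here ...
      return
      end
.ps
.vs
.ft
.DE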
.SH 
NAME
.br
lgclose \- Close the MAT/GMAT history logging file
.SH 
SYNOPSIS
.br
\fBcall lgclose\fR
.SH 
DESCRIPTION
.PP
This routine closes the file for the history buffer and deallocates
the buffer.
This routine must be called before the termination of the
multiprocessing program if the information is to make its way from the
internal shared memory circular buffer to the output file.
.SH 
NAME
.br
lgdsable \- Globally disable the MAT/GMAT logging routines
.SH 
SYNOPSIS
.br
\fBcall lgdsable\fR
.SH 
DESCRIPTION
.PP
This routine disables the logging mechanism.
If logging has been disabled, only the call to \fIlgenable\fR has any effect.
.SH 
NAME
.br
lgenable \- Globally enable the MAT/GMAT logging routines
.SH 
SYNOPSIS
.br
\fBcall lgenable\fR
.SH 
DESCRIPTION
.PP
This routine enables the logging mechanism.
If logging is not enabled, calls to other logging routines have no effect.
.SH 
NAME
.br
lgentry \- Log an event in the MAT/GMAT history buffer
.SH 
SYNOPSIS
.br
\fBcall lgentry( nevent, info, length )\fR
.br
\fBinteger nevent, info(length), length\fR
.SH 
ARGUMENTS
.br
\fBnevent\fR
.RS
The integer event number for the entry.
This may be a number in the range 0 to 127 of the user's choice.
.RE
\fBinfo\fR
.RS
The integer array containing information to be logged with the event.
The array may be from 0 to 13 words long.
.RE
\fBlength\fR
.RS
The integer length of the info array (word length).
.RE
.SH 
DESCRIPTION
.PP
This routine logs events in the shared memory internal circular trace
buffer.
Events 0 through 127 are reserved for use by the user.
The association of an event number to the user event is arbitrary and
is left to the user (be creative!).
Event specific information may be logged with the standard event
information.
The standard event information (gathered by the \fIlgentry\fR routine)
includes: the event number, the process 
number, the task identifier and a real time clock stamp.
.SH 
NAME
.br
lgoff \- Globally disable the logging of user events
.SH 
SYNOPSIS
.br
\fBcall lgoff\fR
.SH 
DESCRIPTION
.PP
This routine turns off the logging of user events until the next call
to \fIlgon\fR.
.SH 
NAME
.br
lgon \- Globally enable the logging of user events
.SH 
SYNOPSIS
.br
\fBcall lgon\fR
.SH 
DESCRIPTION
.PP
This routine turns on the logging of user events.  This routine is
part of the initialization for the MAT/GMAT tracing mechanism.
.SH 
NAME
.br
lgopen \- Open the MAT/GMAT history buffer with terminal file storage.
.SH 
SYNOPSIS
.br
\fBcall lgopen( ibufr, length, nfile, icreate )\fR
.br
\fBinteger ibufr(length), length, icreate\fR
.br
\fBcharacter* nfile\fR
.SH 
ARGUMENTS
.br
\fBibufr\fR
.RS
An integer array for the circular buffer (declared by the user in
shared memory) to store the logged information.
.RE
\fBlength\fR
.RS
The integer length (in words) of the user supplied (shared memory)
buffer.
.RE
\fBnfile\fR
.RS
Character string holding the pathname of a file to dump out the trace
buffer information when \fIlgclose\fR is called.
.RE
\fBicreate\fR
.RS
An integer flag which signals the file creation policy.  Icreate = -1
will destroy an existing file.  Icreate = 0 will cause an error
message and program termination if the file exists.
.RE
.SH 
DESCRIPTION
.PP
This routine sets up the internal buffers for the event logging
routines.  If a pathname is supplied, the internal history buffer will
be dumped out to disk when the call to \fIlgclose\fR is made.  No disk
accesses are made during the logging process (this minimizes the
overhead).
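.PP
The logging routines might be used together as in the following
sketch.
The program name \fIlogdem\fR, the common block \fI/trcblk/\fR, the
buffer length, the file name and the event number are all
hypothetical, and the order of the set up calls is an assumption based
on the descriptions above rather than a documented requirement.
.DS
.ft CO
.ps 8
.vs 10
      program logdem
      integer ibufr(4096), info(2)
c     the trace buffer is assumed to be in a shared common block
      common /trcblk/ ibufr
      call lgenable
c     icreate = -1 replaces any existing trace file
      call lgopen( ibufr, 4096, 'gmat.trace', -1 )
      call lgon
c     log user event 5 with two words of event specific data
      info(1) = 1
      info(2) = 42
      call lgentry( 5, info, 2 )
c     write the buffer to the file before the program ends
      call lgclose
      end
.ps
.vs
.ft
.DE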
.NH 1
Conclusions
.PP
We have discussed here a set of tools (and supporting libraries) for
the graphical analysis of multitasking and microtasking applications.
These tools allow people to view trace files generated (with only
minor user intervention) by applications running on either the Alliant
FX/8 running UNIX 4.2 or the Cray X-MPs running NLTSS.
Two types of displays are given: state graph (inheritance tree with
nodes displaying the state of tasks) and time line (time based history
of tasks).
.PP
It is our experience that tracking down synchronization and control
bugs in multi/microtasking applications using \fBstategraph\fR and
\fBtimeline\fR is easier than conventional methods.
Although graphical multiprocessing analysis (GMAT) is only in its
infancy, it is clear that such an approach holds great promise for
improving programmer productivity in the harsh multiprocessing
environments.
.NH 1
Acknowledgements
.PP
We would like to thank Jack Dongarra and Danny Sorensen of the Argonne
National Laboratory for the sources to the Schedule display tool.
It was from this tool that we began our development.
Though only two routines from that tool remain in this source, we are
very grateful to Dongarra and Sorensen for their support.
The tool served as an example from which we could learn how to use
the SunView windowing system, which in turn greatly accelerated our
development effort.
.PP
The multiprocessing event data that is generated for MAT and GMAT is
patterned after a similar tool designed by Cray Research, Inc. and
called MTDUMP.  The format of the event data, library routines to 
collect the data on the Livermore Computer Center's (LCC) X-MP
multiprocessing machines (NLTSS system), and the MAT tool that
produces an ASCII display of the data were designed and implemented
by Bruce Kelly and Robert Strout.
The microtasking mechanism supported at LCC was modified to produce
trace information by Nancy Werner.
.NH 1
Bibliography
.XP
[1] Kelly, Bruce, \*QMAT Multitasking Analysis Tool,\*U LCSD-347,
Lawrence Livermore National Laboratory, PO Box 808, Livermore, CA 94550.
.XP
[2] Dongarra, Jack and Sorensen, Danny, \*QSchedule: Tools for
Developing and Analyzing Parallel Fortran Programs,\*U
ANL/MCS-TM-86, Argonne National Laboratory, Math and Computer Sciences
Division, 9700 South Cass Ave., Argonne, IL 60439.
.XP
[3] Sun Microsystems, \*QSunView Programmer's Reference Manual,\*U
Part No. 800-1345-02, Sun Microsystems, Inc., 2550 Garcia Ave.,
Mountain View, CA 94043.
.XP
[4] Crispin, Kent and Strout, Robert, \*QNSYSLIB Library Reference
Manual,\*U LCSD-912, Lawrence Livermore National Laboratory, PO Box
808, Livermore, CA 94550. 
.XP
[5] Cray Research, \*QMultitasking User Guide,\*U SN-0222, Cray
Research, Inc., 1440 Northland Dr., Mendota Heights, MN 55120.
.XP
[6] Dongarra, Jack and Grosse, Eric, \*QDistribution of Mathematical
Software Via Electronic Mail,\*U ANL/MCS-TM-48, Argonne National
Laboratory, Math and Computer Sciences Division, 9700 South Cass
Ave., Argonne, IL 60439.
.bp
.NH 1
Figures
.NH 2
List of Stategraph Symbols 
.PP
.sp 7.0i
.ce
Figure 1: List of Stategraph Symbols
.PP
The above figure illustrates the various symbols that are used in the
GMAT \fBstategraph\fR.  A display of these symbols is available from within
the program via the \*QKey\*U button in the control panel.
.bp
.NH 2
Stategraph Graphical User Interface
.PP
.sp 7.0i
.ce
Figure 2:  \fBStategraph\fR Graphical User Interface
.PP
The above figure shows the GMAT \fBstategraph\fR in Static Mode.  Note the
four control areas: the panel, the text subwindow, the overview
canvas, and the primary canvas.  The node information window was
brought up by positioning the cursor on node 2 and depressing the
right mouse button.
.bp
.NH 2
List of Timeline Symbols
.sp 7.0i
.ce
Figure 3: List of Timeline Symbols
.PP
This figure illustrates the symbols encountered in the \fBtimeline\fR tool.
The first section demonstrates the connecting lines between posted
event boxes, the second section outlines all of the posted event
boxes, and the third section contains all of the process icons.
.bp
.NH 2
Timeline Graphical User Interface
.sp 7.0i
.ce
Figure 4: Timeline Graphical User Interface
.PP
This figure illustrates a typical \fBtimeline\fR run.  
The upper left hand corner is the panel, with all of the user defined
controls. 
The upper right hand corner contains the text subwindow, where status
and error messages appear.
The rest of the screen is taken up by the actual graph.
Each column of the screen represents one task.
Note the connecting lines illustrating the task's state, the
posted event boxes with identifying strings, and the process icons at
the right hand edge of the graph.
