https://www.multicians.org/thvv/hellandizing.html

Software Engineering Stories

1998

Hellandizing

I had the privilege of working with Pat Helland at Tandem Computers
in the 1980s. His intelligence, creativity, good humor, and
ebullience were a rare gift to his colleagues. (Pat had done great
work at BTI Computer Systems, now largely forgotten.)

In 1984, Pat invented a technique to help test server programs. We
called it "Hellandizing" the program under test. It involves
placing test points at all significant locations in a program. When
the program reaches a test point, it consults a master debug-mode
switch to see if testing is enabled, and just continues if not. If
testing is turned on, the test point checks a control vector for
instructions. Each test point has an entry in the vector, which
instructs the program to perform one of a set of actions:

  * Continue
  * Abort the program
  * Reset the test point and abort the program
  * Crash the CPU
  * Reset the test point and crash the CPU
  * Sleep for a short time
  * Wait for a signal
  * Invoke the debugger
  * Enter a trace message in the debug log
  * Count occurrences of the test point

and so on. An interface program allows the programmer to examine and
change the test vector from outside the program, by means depending
on the program environment; for processes operating in privileged
mode, low-level message facilities were used.
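
The mechanism described above can be sketched in a few lines of
Python. This is only an illustration under stated assumptions, not
Pat's implementation: the names PointTable, Action, AbortAtTestPoint,
and serve are hypothetical, an exception stands in for aborting the
program, only a subset of the actions is modeled, and the control
vector is changed in-process rather than by an external interface
program.

```python
import enum
import time


class Action(enum.Enum):
    """A subset of the actions a control-vector entry can request."""
    CONTINUE = enum.auto()
    ABORT = enum.auto()
    RESET_AND_ABORT = enum.auto()
    SLEEP = enum.auto()
    TRACE = enum.auto()
    COUNT = enum.auto()


class AbortAtTestPoint(Exception):
    """Stands in for aborting the program (an assumption of this sketch)."""


class PointTable:
    def __init__(self, names):
        self.enabled = False                              # master debug-mode switch
        self.vector = {n: Action.CONTINUE for n in names}  # control vector
        self.counts = {n: 0 for n in names}
        self.trace = []                                   # stands in for the debug log

    def set(self, name, action):
        """What an interface program would do from outside the process."""
        if name not in self.vector:
            raise KeyError(name)
        self.vector[name] = action

    def hit(self, name):
        """Called by the program each time it reaches a test point."""
        if not self.enabled:
            return                      # testing disabled: just continue
        action = self.vector[name]
        if action is Action.COUNT:
            self.counts[name] += 1
        elif action is Action.TRACE:
            self.trace.append(name)
        elif action is Action.SLEEP:
            time.sleep(0.01)
        elif action is Action.RESET_AND_ABORT:
            self.vector[name] = Action.CONTINUE   # reset, then abort
            raise AbortAtTestPoint(name)
        elif action is Action.ABORT:
            raise AbortAtTestPoint(name)
        # Action.CONTINUE falls through and the program carries on


def serve(table):
    """A hypothetical server step with two test points."""
    table.hit("before_write")
    table.hit("before_reply")
    return "done"
```

With the master switch off, serve runs straight through whatever the
vector says. With it on, an ABORT entry stops the program at that
point on every run, while RESET_AND_ABORT stops it once and lets the
next run continue.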

The particular program Pat was testing was a fault-tolerant server
consisting of two processes running on different CPUs. As the primary
process executed, it sent checkpoint messages to the backup process,
so that the backup process could take over if the primary process or
its CPU died. (Tandem called these "NonStop" programs.) Checkpoint
calls were placed just before every call by which a process affected
the rest of the world, whether by file I/O or by sending a message to
another process.

Pat placed test points at each checkpoint. For a program that doesn't
have a backup, the principle is the same: place a test point just
before every operation that reveals the internal state of the
program.
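
One way to apply this placement rule mechanically, sketched here in
Python, is to wrap each externally visible operation so that its test
point always runs first. Everything named below (hit_point,
reveals_state, write_record, send_reply, handle) is hypothetical, and
the test point only records that it was reached.

```python
import functools

visited = []   # hypothetical record of test points reached, in order
outbox = []    # stands in for messages sent to other processes


def hit_point(name):
    """Minimal stand-in for a test point: it only records the visit."""
    visited.append(name)


def reveals_state(name):
    """Decorator: run the named test point just before the operation."""
    def wrap(operation):
        @functools.wraps(operation)
        def guarded(*args, **kwargs):
            hit_point(name)
            return operation(*args, **kwargs)
        return guarded
    return wrap


@reveals_state("before_write")
def write_record(store, key, value):
    store[key] = value              # visible outside: stands in for file I/O


@reveals_state("before_reply")
def send_reply(text):
    outbox.append(text)             # visible outside: a message to a client


def handle(store, key, value):
    """A hypothetical request handler built from guarded operations."""
    write_record(store, key, value)
    send_reply("ok " + key)
```

Because the wrapper is attached where the operation is defined, a new
call to write_record or send_reply cannot forget its test point.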

Hellandizing makes disciplined testing possible for server programs
even though their state is concealed, their logic complex, and their
timing dependent on interaction with other processes. The test points
identify the places where the program interacts with the outside
world and give the tester a handle on each one. We used to test
NonStop programs by "walking" a breakpoint down the code of a program
100 locations at a time and forcing a takeover at each breakpoint.
But why 100? Forcing a takeover at each test point ensures that every
distinguishable state of the process has been tried. To test timing
dependencies, we can set the control vector to stop or delay the
program at each critical point.
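
A toy model of forcing a takeover at every test point might look like
the following Python sketch. It is not how NonStop process pairs were
implemented: the two "processes" are successive function calls, the
checkpoint is a shared dictionary, and the names STEPS, run_process,
and takeover_test are invented for the illustration.

```python
class PrimaryDied(Exception):
    """Stands in for the primary process or its CPU failing."""


STEPS = ["debit", "credit", "reply"]   # hypothetical externally visible steps


def apply_step(step, world):
    """Perform one step that affects the rest of the world."""
    if step == "debit":
        world["a"] -= 10
    elif step == "credit":
        world["b"] += 10
    else:
        world["replies"].append("ok")


def run_process(world, backup_state, die_at=None):
    """Run from the last checkpoint; die at test point `die_at` if given."""
    for i in range(backup_state["next"], len(STEPS)):
        backup_state["next"] = i        # checkpoint: tell the backup what is next
        if die_at == STEPS[i]:          # test point placed at the checkpoint
            raise PrimaryDied(STEPS[i])
        apply_step(STEPS[i], world)
    backup_state["next"] = len(STEPS)   # nothing left to do


def takeover_test(die_at):
    """Kill the primary at one test point and let the backup finish."""
    world = {"a": 100, "b": 0, "replies": []}
    backup_state = {"next": 0}
    try:
        run_process(world, backup_state, die_at=die_at)   # primary
    except PrimaryDied:
        run_process(world, backup_state)                  # backup takes over
    return world


# One forced takeover per test point, instead of one per 100 locations.
results = {point: takeover_test(point) for point in STEPS}
```

In this model every entry in results should show the same final state
as a run with no failure, which is the property a takeover test is
checking.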

In a system of interacting processes, the ability to have a process
under test wait for a signal from a utility program invoked from a
test driver means that every critical race can be opened up, and all
important sequences of events can be reproducibly tried. The test
script can simulate intervening events or cause external failures
before allowing the program under test to continue. We can use this
approach to answer questions like "what if the communications line
failed before we reached this point in the code?"
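
The "wait for a signal" action can be sketched with Python threads
standing in for processes. This is an assumption-laden illustration:
WaitPoint, server, and run_scenario are hypothetical names, the
communications line is a dictionary flag, and threading.Event plays
the part of the signal sent by the utility program.

```python
import threading


class WaitPoint:
    """A test point whose action is 'wait for a signal'."""

    def __init__(self):
        self.reached = threading.Event()   # set when the program arrives
        self.resume = threading.Event()    # set by the test driver

    def hit(self):
        self.reached.set()
        self.resume.wait(timeout=5)        # park here until signalled


def server(line, point, outcome):
    """Program under test: one test point, then one send on the line."""
    point.hit()
    outcome.append("sent" if line["up"] else "send failed")


def run_scenario(fail_line_first):
    """Test script: hold the server at the test point, then release it."""
    line = {"up": True}
    outcome = []
    point = WaitPoint()
    worker = threading.Thread(target=server, args=(line, point, outcome))
    worker.start()
    point.reached.wait(timeout=5)   # the server is now parked at the point
    if fail_line_first:
        line["up"] = False          # simulate the line failing right here
    point.resume.set()              # signal the server to continue
    worker.join(timeout=5)
    return outcome
```

The ordering no longer depends on scheduler luck: the line failure is
injected only after the server has reported reaching the test point
and before it is allowed to continue.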

Test points make it possible to design systematic test libraries
which explore every combination of object state and operation,
provided that the test designers gain a deep understanding of the
code. Thorough analysis of the program being tested is still
necessary to enumerate the cases that must be tested and to
understand which cases each test covers.
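
Once the analysis has produced the lists, the bookkeeping is easy to
automate. The Python sketch below builds the full case matrix and
reports what has not yet been run. The states, operations, and
actions are made-up placeholders, since the real lists can only come
from studying the program under test.

```python
import itertools

# Hypothetical lists; in practice they come from analysis of the code.
STATES = ["absent", "closed", "open"]
OPERATIONS = ["create", "open", "write", "close"]
ACTIONS = ["continue", "abort"]          # control-vector setting to use


def build_matrix(states, operations, actions):
    """Every combination of object state, operation, and test action."""
    return list(itertools.product(states, operations, actions))


def uncovered(matrix, executed):
    """Cases in the matrix that no test has exercised yet."""
    done = set(executed)
    return [case for case in matrix if case not in done]


matrix = build_matrix(STATES, OPERATIONS, ACTIONS)
remaining = uncovered(matrix, [("closed", "open", "abort")])
```

Here the matrix holds 3 x 4 x 2 = 24 cases, and one executed case
leaves 23 to go.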

Copyright (c) 1998 by Tom Van Vleck
