Newsgroups: comp.unix.wizards
Path: utzoo!henry
From: henry@utzoo.uucp (Henry Spencer)
Subject: Re: Ultrix tape job is unkillable!
Message-ID: <1988Dec18.023931.28730@utzoo.uucp>
Organization: U of Toronto Zoology
References: <476@larry.UUCP> <43200057@uicsrd.csrd.uiuc.edu>
Date: Sun, 18 Dec 88 02:39:31 GMT

In article <43200057@uicsrd.csrd.uiuc.edu> kai@uicsrd.csrd.uiuc.edu writes:
>This problem is not specific to Ultrix.  I've found the exact same thing
>occurs on VAX BSD unix, Sequent Dynix, and Alliant Concentrix.  The only
>thing that seems to work is a reboot.  I warned everyone here that
>interrupting a tape job by putting the drive offline is a tremendous
>mistake.

Hardware permitting, it is sometimes possible to break a system out of
this sort of hang by taking the drive offline, spacing the tape forward,
initiating a rewind, and then putting the drive online before the rewind
finishes.  On at least some drives, this generates the interrupt that the
driver is waiting for (although the driver may detect enough anomalies in
how this happened to complain).  Of course, if your drive won't let you
do this without software cooperation, you're sorta stuck...

Many, many device drivers unfortunately don't observe a general rule of
robustness:  unless there is a legitimate reason for a device operation to
take an unbounded time to complete (e.g. a read from a terminal), drivers
should *never* sleep waiting for a device without setting a timeout.
This applies to *all* devices, since hardware failures should be handled
more gracefully than by just hanging, but devices that can wander off into
limbo due to human intervention are particularly important cases.
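
To make the rule concrete, here is a rough user-space sketch of a bounded
wait.  It is not code from any real driver:  a pipe stands in for the
device, writing a byte to it plays the part of the completion interrupt,
and the names fake_dev, fake_dev_init, fake_dev_interrupt, and
fake_dev_wait are all made up for illustration.  A real driver would use
whatever timeout facility its own kernel provides.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <unistd.h>

/* Hypothetical stand-in for a device: a pipe whose read end becomes
 * readable when the "interrupt" arrives. */
struct fake_dev {
    int fds[2];     /* fds[0] = read end, fds[1] = write end */
};

/* Set up the fake device.  Returns 0 on success, -1 on failure. */
int fake_dev_init(struct fake_dev *d)
{
    return pipe(d->fds);
}

/* Simulate the completion interrupt the waiter is hoping for. */
int fake_dev_interrupt(struct fake_dev *d)
{
    char c = 1;
    return write(d->fds[1], &c, 1) == 1 ? 0 : -1;
}

/* Wait for the device, but never longer than timeout_ms milliseconds.
 * Returns 0 if the operation completed, ETIMEDOUT if the device never
 * answered, or another errno value if the wait itself failed. */
int fake_dev_wait(struct fake_dev *d, int timeout_ms)
{
    struct pollfd p = { .fd = d->fds[0], .events = POLLIN };
    char c;
    int n;

    /* A negative timeout would make poll() wait forever, which is
     * exactly the unbounded sleep the rule forbids. */
    assert(timeout_ms >= 0);

    n = poll(&p, 1, timeout_ms);
    if (n == 0)
        return ETIMEDOUT;       /* device wandered off into limbo */
    if (n < 0)
        return errno;           /* the wait itself failed */

    /* Consume the completion so the next wait starts clean. */
    return read(d->fds[0], &c, 1) == 1 ? 0 : EIO;
}
```

With this, calling fake_dev_wait(&d, 50) on a device that never interrupts
comes back with ETIMEDOUT after about 50 ms instead of hanging, and the
caller can report an error and clean up.  The point is only that the
caller always gets control back; how to recover from the timeout is a
separate question for each device.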
-- 
"God willing, we will return." |     Henry Spencer at U of Toronto Zoology
-Eugene Cernan, the Moon, 1972 | uunet!attcan!utzoo!henry henry@zoo.toronto.edu
