Subj : Re: Deadlock theorem
To   : comp.programming.threads
From : David Hopwood
Date : Thu May 05 2005 02:33 am

David Schwartz wrote:
> "Uenal Mutlu" <520001085531-0001@t-online.de> wrote:
> 
>>>Your hierarchy does not allow you to relock o2 while you still hold 
>>>the o1 lock. So this is not legal.
> 
>>No, Sir. This scenario I'm using so often that it is "natural" for me.
[...]
>     Your theorem states:
> 
>    c) If there are multiple threads then the set of objects to lock within
>       a thread must not contain the same object more than once.
>       (Unlocking all in the set and starting over again (either the same
>       or a different set of objects), either in the same block or in a
>       different block, is ok.)
> 
>     This prohibits locking o2 again while the o1 lock is still held. In this 
> case, only part of the set has been unlocked, not all of it.

Reading between the lines, what Uenal may have meant is:

      c) If there are multiple threads then the set of objects to lock
         within a *block* (method, function, etc.) must not contain the
         same object more than once.

This is safe/correct (assuming that blocks are properly nested, i.e. in
a language without continuations), but almost useless: it allows unlocks
within a block to be reordered relative to conventional hierarchical
locking, but within a block there is little or no reason to reorder
unlocks.

>>I'm currently extending it for detecting violations of the theorem (ie. an
>>automatic deadlock detector conditionally compiled only in debug build.
>>IMO it even could be used in release build since the overhead seems not 
>>that much).
> 
>     It's very expensive to do this in a release build, but it's *great* for 
> a debug build.

Automatic deadlock detection is not expensive.

First let's consider the case of global deadlock, where *all* threads
of a program are waiting on locks. There is no significant performance
overhead involved in detecting this state (i.e. the state when the
scheduler has no runnable threads and no thread is blocked for another
reason, such as I/O). The complexity cost of doing so is also very
small. So there is no excuse at all for concurrent systems not to provide
this level of deadlock detection for all builds.

Detecting mutual deadlock between some subset of threads, as opposed to
global deadlock (i.e. where it would be impossible for the threads that
are still running to resolve the deadlock) is a bit more complicated.
However, it need not be more expensive, because of the following
observation:

Except in some hard real-time systems, it's usually sufficient to
detect deadlock after some delay -- for example a second. So it is
possible to invoke a deadlock detection algorithm only when some lock
has already been held for this delay. The algorithm might have overhead
proportional to the number of threads, but since it should never be
invoked in a properly running program (that holds each lock only for
a short time), and is in the worst case invoked at most once per
second, this is quite acceptable.

In real-time and safety-critical systems, it's also useful to detect
when a deadline is going to be missed (i.e. write the program to meet
deadlines, but also trigger logging or a safe shutdown if a bug or
unexpected hardware delay would cause a missed deadline). This can
be combined with the scheme above, so that the log will show whether
the missed deadline was due to a deadlock and what threads were
involved.

(There is a possible further refinement that involves putting threads
into categories that don't usually block on the same locks. It still
detects any deadlocks after a delay, but also detects deadlocking of
all threads within a category immediately.)

-- 
David Hopwood <david.nospam.hopwood@blueyonder.co.uk>

.