From dillon@apollo.backplane.com  Mon Feb 15 11:43:59 1999
Received: from apollo.backplane.com (apollo.backplane.com [209.157.86.2])
          by hub.freebsd.org (8.8.8/8.8.8) with ESMTP id LAA20791
          for <FreeBSD-gnats-submit@freebsd.org>; Mon, 15 Feb 1999 11:43:58 -0800 (PST)
          (envelope-from dillon@apollo.backplane.com)
Received: (from dillon@localhost)
	by apollo.backplane.com (8.9.3/8.9.1) id LAA18818;
	Mon, 15 Feb 1999 11:43:57 -0800 (PST)
	(envelope-from dillon)
Message-Id: <199902151943.LAA18818@apollo.backplane.com>
Date: Mon, 15 Feb 1999 11:43:57 -0800 (PST)
From: Matthew Dillon <dillon@apollo.backplane.com>
Reply-To: dillon@apollo.backplane.com
To: FreeBSD-gnats-submit@freebsd.org
Subject: inode vs exec_map interlock
X-Send-Pr-Version: 3.2

>Number:         10107
>Category:       kern
>Synopsis:       interlock situation with exec_map and a program binary inode
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    remko
>State:          closed
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Mon Feb 15 11:50:01 PST 1999
>Closed-Date:    Thu Nov 16 08:31:14 GMT 2006
>Last-Modified:  Thu Nov 16 08:31:14 GMT 2006
>Originator:     Matthew Dillon
>Release:        FreeBSD 4.0-CURRENT i386
>Organization:
none
>Environment:

	Heavily loaded test machine artificially limited to 16MB of main
	memory, NFS swap, running a buildworld -j10.

>Description:

    I found an interesting interlock situation between what I believe to
    be a program binary's inode and the exec_map.  The machine locked up
    trying to exec new programs.

    This was running a make -j10 buildworld on a machine with 16MB of RAM
    configured, while testing my new VM system.  I don't think the lockup is
    due to my VM system, though.  It took 7 hours of extremely heavy
    paging before the machine locked up.

    When I broke the machine out into DDB and did a ps, all of the cc's
    were stuck in 'inode' wait, while a single ld program was stuck in
    'thrd_sleep'.

    I tracked 'thrd_sleep' down to a vm_map lock and the map down to
    the exec_map.  I tracked down the inode lock to the 'cc' program binary.
    The inode had one shared lock and 7 waiters.  The exec_map appears to
    own one shared lock with 6 waiters (but most of the waiters are due to
    me trying to run other programs before breaking into DDB).

    I am guessing that there is a lock-order problem between exec_map and
    a program inode, where one process locks exec_map followed by the program
    inode, and another locks the program inode followed by exec_map.  But
    I'm not familiar with that section of the code, so I would appreciate
    any help.
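
    The suspected ordering problem can be illustrated with a small
    user-space sketch.  This is not kernel code and is not taken from the
    exec path.  The lock names and the two acquisition orders are
    assumptions that simply restate the guess above.  The sketch records
    which lock is taken while another is held and reports a cycle, which
    is the condition under which two processes can block each other
    forever.

```python
from collections import defaultdict


def build_order_graph(paths):
    """Build a lock-order graph.

    paths: list of lock-name sequences, one per code path, in the order
    that path acquires its locks (hypothetical data, not kernel traces).
    An edge held -> wanted means 'wanted' is taken while 'held' is held.
    """
    graph = defaultdict(set)
    for path in paths:
        for i, held in enumerate(path):
            for wanted in path[i + 1:]:
                graph[held].add(wanted)
    return graph


def find_cycle(graph):
    """Return one lock-order cycle as a list of lock names, or None."""
    WHITE, GREY, BLACK = 0, 1, 2
    color = defaultdict(int)   # every node starts WHITE (unvisited)
    stack = []                 # locks on the current depth-first path

    def visit(node):
        color[node] = GREY
        stack.append(node)
        for nxt in sorted(graph.get(node, ())):
            if color[nxt] == GREY:
                # Back edge: nxt is already on the current path.
                return stack[stack.index(nxt):] + [nxt]
            if color[nxt] == WHITE:
                found = visit(nxt)
                if found:
                    return found
        stack.pop()
        color[node] = BLACK
        return None

    for node in sorted(graph):
        if color[node] == WHITE:
            found = visit(node)
            if found:
                return found
    return None


# Assumed orders from the description: one path takes exec_map and then
# the binary's inode, the other takes the inode and then exec_map.
suspected = [["exec_map", "cc_inode"], ["cc_inode", "exec_map"]]
# Same locks, but every path agrees on a single order.
consistent = [["exec_map", "cc_inode"], ["exec_map", "cc_inode"]]

print(find_cycle(build_order_graph(suspected)))
print(find_cycle(build_order_graph(consistent)))
```

    With the two opposite orders the sketch reports a cycle through both
    locks.  With a single agreed order it reports none, which is why
    enforcing one acquisition order is the usual remedy for this class of
    deadlock.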

>How-To-Repeat:

    The problem was found by running a make -j10 buildworld on a machine
    artificially limited to 16MB of main memory, with NFS swap.  The problem
    occurred approximately 7 hours into the buildworld, so it is presumably
    difficult to reproduce and reflects a small race window somewhere.

>Fix:
	
	Unknown as yet.

>Release-Note:
>Audit-Trail:
Responsible-Changed-From-To: freebsd-bugs->dillon 
Responsible-Changed-By: johan 
Responsible-Changed-When: Thu Aug 10 23:43:57 PDT 2000 
Responsible-Changed-Why:  
Let Matt handle his own PRs. 

http://www.freebsd.org/cgi/query-pr.cgi?pr=10107 
Responsible-Changed-From-To: dillon->freebsd-bugs 
Responsible-Changed-By: keramida 
Responsible-Changed-When: Sat Feb 22 18:13:46 PST 2003 
Responsible-Changed-Why:  
Back to the free pool. 

http://www.freebsd.org/cgi/query-pr.cgi?pr=10107 
State-Changed-From-To: open->feedback 
State-Changed-By: remko 
State-Changed-When: Sun Nov 12 08:34:14 UTC 2006 
State-Changed-Why:  
Hello Matthew, 

Can you tell me whether this has been solved in the meantime? 


http://www.freebsd.org/cgi/query-pr.cgi?pr=10107 
Responsible-Changed-From-To: freebsd-bugs->remko 
Responsible-Changed-By: remko 
Responsible-Changed-When: Sun Nov 12 08:35:33 UTC 2006 
Responsible-Changed-Why:  
I'll take it. 

http://www.freebsd.org/cgi/query-pr.cgi?pr=10107 
State-Changed-From-To: feedback->closed 
State-Changed-By: remko 
State-Changed-When: Thu Nov 16 08:31:10 UTC 2006 
State-Changed-Why:  
Matthew reports that this problem is likely to be fixed already after 6 
years. 

http://www.freebsd.org/cgi/query-pr.cgi?pr=10107 
>Unformatted:
