From matsunaw@ja2.so-net.ne.jp  Mon May 15 11:15:12 2000
Return-Path: <matsunaw@ja2.so-net.ne.jp>
Received: from mgate03.so-net.ne.jp (mgate03.so-net.ne.jp [210.139.254.150])
	by hub.freebsd.org (Postfix) with ESMTP id 1F58A37B89D
	for <FreeBSD-gnats-submit@freebsd.org>; Mon, 15 May 2000 11:15:02 -0700 (PDT)
	(envelope-from matsunaw@ja2.so-net.ne.jp)
Received: from mail.ja2.so-net.ne.jp (mail.ja2.so-net.ne.jp [210.139.254.26])
	by mgate03.so-net.ne.jp (8.8.8+3.0Wbeta9/3.6W00042420) with ESMTP id DAA01988
	for <FreeBSD-gnats-submit@freebsd.org>; Tue, 16 May 2000 03:14:58 +0900 (JST)
Received: from tsuneko.local (p8bc4af.nigtcc01.ap.so-net.ne.jp [210.139.196.175])
	by mail.ja2.so-net.ne.jp (8.8.8/3.7W99081617) with ESMTP id DAA17242
	for <FreeBSD-gnats-submit@freebsd.org>; Tue, 16 May 2000 03:14:52 +0900 (JST)
Message-Id: <200005151400.XAA00613@tsuneko.local>
Date: Mon, 15 May 2000 23:00:33 +0900 (JST)
From: matsunaw@ja2.so-net.ne.jp
Reply-To: matsunaw@ja2.so-net.ne.jp
To: FreeBSD-gnats-submit@freebsd.org
Subject: /var/run as mount_mfs w/ soft-updates
X-Send-Pr-Version: 3.2

>Number:         18572
>Category:       kern
>Synopsis:       /var/run as mount_mfs w/ soft-updates
>Confidential:   no
>Severity:       non-critical
>Priority:       low
>Responsible:    freebsd-bugs
>State:          closed
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Mon May 15 11:20:00 PDT 2000
>Closed-Date:    Tue May 29 11:23:19 PDT 2001
>Last-Modified:  Tue May 29 11:23:28 PDT 2001
>Originator:     Hitoshi Matsunawa
>Release:        FreeBSD 4.0-RELEASE i386
>Organization:
none
>Environment:

[HW]
CPU: Pentium II/Pentium II Xeon/Celeron (400.91-MHz 686-class CPU)
real memory  = 134152192 (131008K bytes)
ad0: 9671MB <IBM-DTTA-351010> [19650/16/63] at ata0-master using UDMA33

[Running daemons]
    0  ??  DLs    0:00.00  (swapper)
    1  ??  ILs    0:00.01 /sbin/init --
    2  ??  DL     0:00.01  (pagedaemon)
    3  ??  DL     0:00.00  (vmdaemon)
    4  ??  DL     0:00.01  (bufdaemon)
    5  ??  DL     0:00.35  (syncer)
   27  ??  ILs    0:00.01 mfs -s 8192 /dev/ad0s3b /var/run (mount_mfs)
   29  ??  ILs    0:00.02 mfs -s 65536 /dev/ad0s3b /tmp (mount_mfs)
   35  ??  Is     0:00.00 adjkerntz -i
   81  ??  Ss     0:00.05 syslogd
   84  ??  Is     0:00.07 named -q
  105  ??  Is     0:00.01 inetd -lwW
  107  ??  Is     0:00.03 cron
  125  ??  I<s    0:00.00 apmd

[Mounted file systems]
/dev/ad0s2a on / (ufs, local,...)
/dev/ad0s2e on /var (ufs, local, soft-updates,...)
/dev/ad0s3f on /var/data (ufs, local, soft-updates,...)
mfs:27 on /var/run (mfs, asynchronous, local)
mfs:29 on /tmp (mfs, asynchronous, local)
/dev/ad0s2f on /usr (ufs, local, soft-updates,...)
/dev/ad0s3h on /usr/local/src (ufs, local, soft-updates,...)

>Description:

If the system is halted immediately after some number of dirty buffers
have been created (they may need to be spread across several file
systems), it fails to shut down.

With ``sysctl -w debug.busyprt=1'', the following messages are reported
on the console:

---->
vflush: busy vnode: 0xc8770e00: type VREG, usecount 1, writecount 0, refcount 1, flag (VOBJBUF)
	tag VT_UFS, ino 4, on dev MFS0(253,0)

[in the non-deadlocked case, two more busy vnodes (of type VSOCK) are reported]

message to wait for ``bufdaemon''
message to wait for ``syncer''

syncing disks... 2 2 2 2 2 2 ....

1. dev:MFS0, flags: 00100000	blkno: 48, lblkno: 48
2. dev:MFS0, flags: 01020024	blkno: 64, lblkno: 64
<----

This seems to be caused by a ``rundown'' race between the mfs process
and the other daemons that use "/var/run", which results in a deadlock.
Kernel DDB shows named waiting on "biord" while mfs and syslogd wait on
"inode".

If soft-updates is disabled, no problems occur (in my case).  But
soft-updates may only make the problem more likely to happen.


I tried several ways to figure out what is going on and/or to find a
work-around:

(1) in mfs_start() (of src/ufs/mfs/mfs_vfsops.c), ignore signals except
    SIGKILL.
(2) in mfs_start(), execute dounmount() in another execution vehicle
    (using kthread_create()).
(3) in biowait() (of src/kern/vfs_bio.c), add a time-out parameter to
    the tsleep() call and return EINTR if it times out.

All of the above seem to solve the problem for me, but:

(1) causes user-visible changes.
(2) kthread_create() does not seem appropriate for this kind of purpose.
(3) was tried just out of curiosity and to confirm kernel DDB's picture
    of "who's waiting on where".

There may be more choices and trade-offs.
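
The idea behind work-around (3), a wait that gives up after a time-out
instead of sleeping forever, can be sketched in user space.  This is
only an illustration, not the kernel change: it uses POSIX threads
rather than tsleep(), and struct fake_buf, fake_buf_init(),
fake_biodone() and fake_biowait_timeout() are hypothetical names made
up for this sketch.  Returning EINTR on time-out follows the
description above.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical user-space stand-in for a buffer with pending I/O. */
struct fake_buf {
	pthread_mutex_t lock;
	pthread_cond_t  done_cv;
	bool            done;	/* plays the role of "I/O finished" */
};

void
fake_buf_init(struct fake_buf *bp)
{
	int rc;

	rc = pthread_mutex_init(&bp->lock, NULL);
	assert(rc == 0);
	rc = pthread_cond_init(&bp->done_cv, NULL);
	assert(rc == 0);
	(void)rc;
	bp->done = false;
}

/* Mark the I/O as finished and wake up any waiter. */
void
fake_biodone(struct fake_buf *bp)
{
	pthread_mutex_lock(&bp->lock);
	bp->done = true;
	pthread_cond_broadcast(&bp->done_cv);
	pthread_mutex_unlock(&bp->lock);
}

/*
 * Wait for the I/O to finish, but for at most timeout_ms milliseconds.
 * Returns 0 if the I/O finished and EINTR if the wait timed out (the
 * error code is the one proposed in the report, chosen for this sketch).
 */
int
fake_biowait_timeout(struct fake_buf *bp, long timeout_ms)
{
	struct timespec deadline;
	int rc = 0, error;

	/* Build an absolute deadline; condvars use CLOCK_REALTIME by default. */
	clock_gettime(CLOCK_REALTIME, &deadline);
	deadline.tv_sec += timeout_ms / 1000;
	deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
	if (deadline.tv_nsec >= 1000000000L) {
		deadline.tv_sec++;
		deadline.tv_nsec -= 1000000000L;
	}

	pthread_mutex_lock(&bp->lock);
	/* Loop to absorb spurious wake-ups; stop on time-out or error. */
	while (!bp->done && rc == 0)
		rc = pthread_cond_timedwait(&bp->done_cv, &bp->lock, &deadline);
	error = bp->done ? 0 : EINTR;
	pthread_mutex_unlock(&bp->lock);
	return (error);
}
```

A waiter that is never woken gets EINTR back after the time-out and can
release whatever it holds, instead of blocking the shutdown.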

>How-To-Repeat:

It depends on timing. In my case:

  % cd $INDEX_of_NAMAZU_as_WEB_SEARCH_ENGINE
  % rm *
  % sync; sync; sync
  % su
  # cd /usr/src/sys/compile/$IDENT
  # mv kernel /kernel.00
  % mknmz -a ~/src/linux-2.3.51/Documentation/
  % super halt

where LANG=ja_JP.EUC, and the perl used (mknmz is written in perl) is
the bundled one.

>Fix:

I only found a few work-arounds (described in [>Description:]).

I think the fundamental problem is that once the mfs process enters
dounmount(), mfs I/O is no longer served for other processes (only for
the mfs process itself). I could not find out how this is related to
soft-updates. It may be all about timing, but it may not.
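
The suspected deadlock, and why running the unmount in a separate
execution vehicle avoids it, can be modelled in user space.  This is a
rough sketch under my own assumptions, not kernel code: POSIX threads
stand in for processes, a mutex stands in for the "inode" lock, and
struct model, server_main(), unmount_main() and run_model() are
hypothetical names.  The caller plays a client (like named) that holds
the lock while waiting for its I/O.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

struct model {
	pthread_mutex_t q_lock;		/* protects pending, served, stop */
	pthread_cond_t  q_cv;
	int             pending;	/* queued I/O requests not yet served */
	int             served;		/* requests served so far */
	bool            stop;		/* set once the unmount is done */
	pthread_mutex_t inode_lock;	/* stand-in for the "inode" lock */
};

/* I/O server: keeps serving requests until the unmount has finished. */
static void *
server_main(void *arg)
{
	struct model *m = arg;

	pthread_mutex_lock(&m->q_lock);
	for (;;) {
		while (m->pending == 0 && !m->stop)
			pthread_cond_wait(&m->q_cv, &m->q_lock);
		if (m->pending == 0)
			break;		/* stop requested, queue drained */
		m->pending--;
		m->served++;
		pthread_cond_broadcast(&m->q_cv);
	}
	pthread_mutex_unlock(&m->q_lock);
	return (NULL);
}

/* Unmounter: needs the "inode" lock, then tells the server to stop. */
static void *
unmount_main(void *arg)
{
	struct model *m = arg;

	pthread_mutex_lock(&m->inode_lock);
	pthread_mutex_unlock(&m->inode_lock);
	pthread_mutex_lock(&m->q_lock);
	m->stop = true;
	pthread_cond_broadcast(&m->q_cv);
	pthread_mutex_unlock(&m->q_lock);
	return (NULL);
}

/* Returns the number of requests served (1 when nothing deadlocks). */
int
run_model(void)
{
	struct model m;
	pthread_t server, unmounter;
	int rc;

	pthread_mutex_init(&m.q_lock, NULL);
	pthread_cond_init(&m.q_cv, NULL);
	pthread_mutex_init(&m.inode_lock, NULL);
	m.pending = 1;			/* the client's one queued request */
	m.served = 0;
	m.stop = false;

	/* Client takes the lock before its I/O has been served. */
	pthread_mutex_lock(&m.inode_lock);

	rc = pthread_create(&unmounter, NULL, unmount_main, &m);
	assert(rc == 0);
	rc = pthread_create(&server, NULL, server_main, &m);
	assert(rc == 0);
	(void)rc;

	/* Client waits for its I/O, then drops the lock. */
	pthread_mutex_lock(&m.q_lock);
	while (m.pending > 0)
		pthread_cond_wait(&m.q_cv, &m.q_lock);
	pthread_mutex_unlock(&m.q_lock);
	pthread_mutex_unlock(&m.inode_lock);

	pthread_join(unmounter, NULL);
	pthread_join(server, NULL);
	return (m.served);
}
```

If server_main() itself tried to take inode_lock before serving the
queue, the client and the server would wait on each other forever,
which is the situation I believe DDB shows.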

>Release-Note:
>Audit-Trail:
State-Changed-From-To: open->closed 
State-Changed-By: phk 
State-Changed-When: Tue May 29 11:23:19 PDT 2001 
State-Changed-Why:  
MFS is deprecated 


http://www.FreeBSD.org/cgi/query-pr.cgi?pr=18572 
>Unformatted:
