From nobody@FreeBSD.org  Thu Dec 24 14:05:20 2009
Return-Path: <nobody@FreeBSD.org>
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id E71AA106566B
	for <freebsd-gnats-submit@FreeBSD.org>; Thu, 24 Dec 2009 14:05:20 +0000 (UTC)
	(envelope-from nobody@FreeBSD.org)
Received: from www.freebsd.org (www.freebsd.org [IPv6:2001:4f8:fff6::21])
	by mx1.freebsd.org (Postfix) with ESMTP id D73F28FC19
	for <freebsd-gnats-submit@FreeBSD.org>; Thu, 24 Dec 2009 14:05:20 +0000 (UTC)
Received: from www.freebsd.org (localhost [127.0.0.1])
	by www.freebsd.org (8.14.3/8.14.3) with ESMTP id nBOE5KiF080774
	for <freebsd-gnats-submit@FreeBSD.org>; Thu, 24 Dec 2009 14:05:20 GMT
	(envelope-from nobody@www.freebsd.org)
Received: (from nobody@localhost)
	by www.freebsd.org (8.14.3/8.14.3/Submit) id nBOE5KLH080773;
	Thu, 24 Dec 2009 14:05:20 GMT
	(envelope-from nobody)
Message-Id: <200912241405.nBOE5KLH080773@www.freebsd.org>
Date: Thu, 24 Dec 2009 14:05:20 GMT
From: David Naylor <naylor.b.david@gmail.com>
To: freebsd-gnats-submit@FreeBSD.org
Subject: [lor] ufs/unionfs(/ufs)
X-Send-Pr-Version: www-3.1
X-GNATS-Notify:

>Number:         141950
>Category:       kern
>Synopsis:       [unionfs] [lor] ufs/unionfs/ufs Lock order reversal
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    freebsd-fs
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Thu Dec 24 14:10:01 UTC 2009
>Closed-Date:    
>Last-Modified:  Wed Jul 03 00:59:35 UTC 2013
>Originator:     David Naylor
>Release:        FreeBSD 9-Current
>Organization:
Private
>Environment:
FreeBSD  9.0-CURRENT FreeBSD 9.0-CURRENT #0: Sat Dec 19 11:03:35 SAST 2009     root@dragon.dg:/tmp/i386/usr/src/sys/GENERIC  i386
>Description:
I have observed the following two LORs while running a script (see attached) that makes extensive use of unionfs. The first LOR was produced with all mounts using atime, while the second was produced with noatime.

Some time after the LORs appear, the system freezes: nothing responds and a hard reset is required.

(The LORs were hand-copied from screenshots; I still have the PNGs if needed.)

(with unionfs+atime)
lock order reversal:
 1st 0xc4f49058 unionfs (unionfs) @ /usr/src/sys/modules/unionfs/../../fs/unionfs/union_subr.c:356
 2nd 0xc4f49168 ufs (ufs) @ /usr/src/sys/kern/vfs_subr.c:2188
KDB: stack backtrace:
db_trace_self_wrapper(c0c97944,c43d4814,c08cfc95,c08c094b,c0c9a84d,...) at db_trace_self_wrapper+0x26
kdb_backtrace(c08c094b,c0c9a84d,c4530158,c452fc10,c43d4870,...) at kdb_backtrace+0x29
_witness_debugger(c0c9a84d,c4c49168,c0c8d094,c452fc10,c0ca1a95,...) at _witness_debugger+0x25
witness_checkorder(c4f49168,9,c0ca1a95,88c,0,...) at witness_checkorder+0x839
__lockmgr_args(c4f49168,80100,c4f49188,0,0,...) at __lockmgr_args+0x824
ffs_lock(c43d4994,c082fa3b,c0ca1a95,80100,c4f49110,...) at ffs_lock+0x8a
VOP_LOCK1_APV(c0d9ff20,c43d4994,c087d663,c0dba860,c4f49110,...) at VOP_LOCK1_APV+0xb5
_vn_lock(c4f49110,80100,c0ca1a95,88c,4,...) at _vn_lock+0x5e
vrele(c4f49110,0,c5aeb208,c4,0,...) at vrele+0x137
unionfs_noderem(c4f49000,c4bd4240,c43d4a54,c0bd9135,c43d4a74,...) at unionfs_noderem+0x1e5
unionfs_reclaim(c43d4a75,1,0,c4f49000,c43d4a98,...) at unionfs_reclaim+0x1b
VOP_RECLAIM_APV(c6aec7e0,c43d4a74,0,0,c4f49078,...) at VOP_RECLAIM_APV+0xa5
vgonel(c4f49078,0,c0ca1a95,9c5,c43d4afc,...) at vgonel+0x1a4
vrecycle(c4f49000,c4bd4240,c43d4ae4,c0db9215,c43d4afc,...) at vrecycle+0x4a
unionfs_inactive(c43d4afc,c4f49078,c4f49000,c4f49078,c43d4b14,...) at unionfs_inactive+0x28
VOP_INACTIVE_APV(c5aec7e0,c43d4afc,c0ca1a95,924,c0dba820,...) at VOP_INACTIVE_APV+0xa5
vinactive(c5aec7e0,c43d4b30,c0ca1a95,8aa,c1869380,...) at vinactive+0x8e
vput(c4f49000,ffffffdf,c43d4c00,c43d4b6c,0,...) at vput+0x1cd
kern_mkdirat(c4db4240,ffffff9c,80513e0,0,1c0,...) at kern_mkdirat+0x25a
kern_mkdir(c4db4240,80513e0,0,1c0,c43d4d2c,...) at kern_mkdir+0x2e
mkdir(c4db4240,c43d4cf8,8,c0cac248,c0d7f300,...) at mkdir+0x29
syscall(c43d4d38) at syscall+0x2a3
Xint0x80_syscall() at Xint0x80_syscall+0x20
--- syscall (136, FreeBSD ELF32, mkdir), eip = 0x281814f3, esp = 0xbfbfe9cc, ebp = 0xbfbfea58 ---

(with unionfs+noatime)
lock order reversal:
 1st 0xc49288d8 ufs (ufs) @ /usr/src/sys/modules/unionfs/../../fs/unionfs/union_vnops.c:1821
 2nd 0xc501e5a8 unionfs (unionfs) @ /usr/src/sys/modules/unionfs/../../fs/unionfs/union_subr.c:356
 3rd 0xc501e8d8 ufs (ufs) @ /usr/src/sys/kern/vfs_subr.c:2188
KDB: stack backtrace:
db_trace_self_wrapper(c0c97944,c43b588c,c08cfc95,c08c094b,c0c9a866,...) at db_trace_self_wrapper+0x26
kdb_backtrace(c08c094b,c0c9a866,c4530158,c452fc78,c43d58e8,...) at kdb_backtrace+0x29
_witness_debugger(c0c9a866,c501e8d8,c0c8d094,c452fc78,c0ca1a95,...) at _witness_debugger+0x25
witness_checkorder(c501e8d8,9,c0ca1a95,88c,0,...) at witness_checkorder+0x839
__lockmgr_args(c501e8d8,80100,c501e8f8,0,0,...) at __lockmgr_args+0x824
ffs_lock(c43b5a0c,c0c92c86,c0ca1a95,80100,c501e880,...) at ffs_lock+0x8a
VOP_LOCK1_APV(c0d9ff20,c43b5a0c,c087d663,c0dba860,c501e880,...) at VOP_LOCK1_APV+0xb5
_vn_lock(c501e880,80100,c0ca1a95,88c,c0dba1f1,...) at _vn_lock+0x5e
vrele(c501e880,c43b5a0c,c501e5c8,0,0,...) at vrele+0x137
unionfs_noderem(c501e550,c489ad80,c43d5acc,c0bd9135,c43d5aec,...) at unionfs_noderem+0x1e5
unionfs_reclaim(c43d5aec,c501e550,c489ad80,c501e550,c43d5b10,...) at unionfs_reclaim+0x1b
VOP_RECLAIM_APV(c579d7e0,c43d4aec,c0ca1a95,a0d,c501e5c8,...) at VOP_RECLAIM_APV+0xa5
vgonel(c501e5c8,0,c0ca1a95,9c5,c43d5bc0,...) at vgonel+0x1a4
vgone(c501e550,c43b5bc0,c0ca1a95,9a3,0,...) at vgone+0x39
vflush(c48bf000,1,0,c489ad80,c51913c0,...) at vflush+0x4ba
unionfs_unmount(c48bf000,80000000,c0ca128f,4f9,80,...) at unionfs_unmount+0x51
dounmount(c48bf000,80000000,c489ad80,47e,8,...) at dounmount+0x46d
unmount(c489ad80,c43b5cf8,8,c489ad80,c0d7e688,...) at unmount+0x2ff
syscall(c43d5d38) at syscall+0x2a3
Xint0x80_syscall() at Xint0x80_syscall+0x20
--- syscall (136, FreeBSD ELF32, unmount), eip = 0x280da13f, esp = 0xbfbfe53c, ebp = 0xbfbfe608 ---
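
As a reading aid, the lock-ordering lines of a witness report like the ones above can be reduced to ordinal, lock name, and source location with a short awk filter. This is a minimal sketch and not part of the original submission: the function name `lor_pairs` is hypothetical, and it assumes each lock line has the form `<ordinal> <address> <name> (<type>) @ <path>:<line>`.

```sh
#!/bin/sh
# lor_pairs (hypothetical helper): read a witness LOR report on stdin or
# from the named files and print one "ordinal lockname file:line" row per
# lock line. All other lines (backtrace frames, headers) are ignored.
lor_pairs() {
  awk '
    # A lock line starts with an ordinal (1st, 2nd, 3rd, ...) followed by
    # the lock address in hex.
    $1 ~ /^[0-9]+(st|nd|rd|th)$/ && $2 ~ /^0x/ {
      # The last field is the source path; keep only its final component.
      n = split($NF, parts, "/")
      print $1, $3, parts[n]
    }
  ' "$@"
}
```

Saving the console output to a file and running `lor_pairs` on it makes it easier to compare the acquisition order across the reports.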
>How-To-Repeat:
Run the attached script. It attempts to build Xorg from ports, using unionfs mounts over $LOCALBASE to keep the build environment clean.
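
The layering the script relies on can be sketched in isolation as follows. This is a dry-run illustration only: it prints the mount(8) and umount(8) commands rather than running them. The function name `plan_layers` and the example directory names (`target-pkg`, `dep-a`, `dep-b`) are hypothetical placeholders; the mount options mirror the ones the script passes in its noatime configuration.

```sh
#!/bin/sh
# Dry-run sketch of the unionfs layering used by the attached script.
# plan_layers is a hypothetical helper; it only prints commands.
BUILDDIR=${BUILDDIR:-/build}
LOCALBASE=${LOCALBASE:-/usr/local}

# usage: plan_layers TARGET_PKG [DEP_PKG ...]
plan_layers() {
  target=$1; shift
  count=0
  # Each dependency directory is stacked read-only on top of LOCALBASE.
  for dep in "$@"
  do
    echo "mount -t unionfs -r -o noatime $BUILDDIR/$dep $LOCALBASE"
    count=$((count + 1))
  done
  # The package being built forms the writable top layer.
  echo "mount -t unionfs -o noatime $BUILDDIR/$target $LOCALBASE"
  count=$((count + 1))
  # Every layer shares one mount point, so tear-down is one umount per layer.
  while [ "$count" -gt 0 ]
  do
    echo "umount $LOCALBASE"
    count=$((count - 1))
  done
}

# Placeholder names, for illustration only.
plan_layers target-pkg dep-a dep-b
```

Because all layers are mounted on the same directory, a port with many dependencies produces a deep unionfs stack, and each build step mounts and unmounts that whole stack again.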
>Fix:


Patch attached with submission follows:

BUILDDIR=/build
LOCALBASE=/usr/local
PORTSDIR=/usr/ports
PKGDIR=$BUILDDIR/packages

set -e

mkdir -p $BUILDDIR $LOCALBASE $PKGDIR

port2name() {

  echo $1 | sed 's|[/.-]|_|g'

}

port2pkg() {

  local pkg_name
  local port

  port=$1; shift
  eval pkg_name=PKG$(port2name $port)
  eval pkg=\$$pkg_name
  if [ -z "$pkg" ]
  then
    pkg=$(make -C $port -V PKGNAME)
    eval $pkg_name=$pkg
  fi

}

depends() {

  local depend=
  local depends_name=
  local _deps=
  local name=
  local port=

  port=$1

  eval depends_name=DEPEND$(port2name $port)
  eval deps=\"\$$depends_name\"

  if [ -z "$deps" ]
  then
    echo "Getting dependencies for $port" >&2

    depend_list="$(make -C $port -V BUILD_DEPENDS -V LIB_DEPENDS -V RUN_DEPENDS)"
    for depend in $depend_list
    do
      name=$(echo $depend | cut -f 2 -d ':')
      depends $name
      _deps="$_deps $deps $name"
    done

    deps=$(for depend in $_deps
    do
      echo $depend
    done | sort -u)

    depends_name=$depends_name
    eval $depends_name=\"$deps \"
  fi

}

build() {

  local _deps
  local dep
  local port=
  local pkg=

  port=$1

  echo "Building port $port..."

  depends $port
  echo $deps
  _deps="$deps"
  echo $_deps
  for dep in $_deps
  do
    port2pkg $dep
    if [ ! -d $BUILDDIR/$pkg ]
    then
      if ! build $dep
      then
        echo "Port $port failed due to dependency $dep"
        return 255
      fi
    fi
    echo $_deps
  done

  for pkg in $_deps
  do
    port2pkg $pkg
    mount -t unionfs -r -o noatime $BUILDDIR/$pkg $LOCALBASE
  done
  port2pkg $port
  mkdir -p $BUILDDIR/$pkg
  mount -t unionfs -o noatime $BUILDDIR/$pkg $LOCALBASE

  set +e
  trap "true" INT TERM EXIT
  make -C $port build install package clean -DNO_DEPENDS -DBATCH PACKAGES=$PKGDIR
  status=$?
  trap - INT TERM EXIT
  set -e

  umount $LOCALBASE
  for pkg in $(echo $_deps | sort -r)
  do
    port2pkg $pkg
    umount $LOCALBASE
  done

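  if [ $status -ne 0 ]
  then
    echo "Port $port failed to build"
    # The umount loop above reuses $pkg, so look up this port's package again
    # before removing its directory.
    port2pkg $port
    rm -rf $BUILDDIR/$pkg || (chflags -R 0 $BUILDDIR/$pkg; rm -rf $BUILDDIR/$pkg)
  fi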

  return $status

}

build /usr/ports/x11/xorg


>Release-Note:
>Audit-Trail:
Responsible-Changed-From-To: freebsd-bugs->freebsd-fs 
Responsible-Changed-By: linimon 
Responsible-Changed-When: Thu Dec 24 15:17:40 UTC 2009 
Responsible-Changed-Why:  
Over to maintainer(s). 

http://www.freebsd.org/cgi/query-pr.cgi?pr=141950 

From: David Naylor <naylor.b.david@gmail.com>
To: bug-followup@freebsd.org
Cc:  
Subject: Re: kern/141950: [unionfs] [lor] ufs/unionfs/ufs Lock order reversal
Date: Wed, 10 Feb 2010 20:44:58 +0200

 A recent -current kernel has produced a different backtrace.  I am still
 experiencing io freezes involving the unionfs mounts.
 
 lock order reversal:
  1st 0xffffff017423bbd8 unionfs (unionfs) @ /usr/src/sys/modules/unionfs/../../fs/unionfs/union_subr.c:356
  2nd 0xffffff0113c2a9f8 ufs (ufs) @ /usr/src/sys/kern/vfs_subr.c:2204
 KDB: stack backtrace:
 db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
 _witness_debugger() at _witness_debugger+0x49
 witness_checkorder() at witness_checkorder+0x7ea
 __lockmgr_args() at __lockmgr_args+0xd43
 ffs_lock() at ffs_lock+0x8c
 VOP_LOCK1_APV() at VOP_LOCK1_APV+0x9b
 _vn_lock() at _vn_lock+0x50
 vputx() at vputx+0x285
 unionfs_noderem() at unionfs_noderem+0x1c4
 unionfs_reclaim() at unionfs_reclaim+0x11
 vgonel() at vgonel+0xf6
 vrecycle() at vrecycle+0x58
 unionfs_inactive() at unionfs_inactive+0x20
 vinactive() at vinactive+0x6b
 vputx() at vputx+0x267
 kern_mkdirat() at kern_mkdirat+0x2e7
 syscall() at syscall+0x102
 Xfast_syscall() at Xfast_syscall+0xe1
 --- syscall (136, FreeBSD ELF64, mkdir), rip = 0x80083f8dc, rsp = 0x7fffffffe778, rbp = 0x800a3a180 ---
Responsible-Changed-From-To: freebsd-fs->daichi 
Responsible-Changed-By: linimon 
Responsible-Changed-When: Sat Apr 17 06:25:09 UTC 2010 
Responsible-Changed-Why:  
Over to maintainer. 

http://www.freebsd.org/cgi/query-pr.cgi?pr=141950 

From: Glenn Chambers <gchamber@bright.net>
To: bug-followup@FreeBSD.org, naylor.b.david@gmail.com
Cc:  
Subject: Re: kern/141950: [unionfs] [lor] ufs/unionfs/ufs Lock order
 reversal
Date: Sun, 14 Aug 2011 19:51:40 -0400

 I'm seeing the same issue without unionfs being involved.
 
 If I'm parsing the console traces correctly, I have two instances to
 report:
 
 1st 0xd3606ee0 bufwait (bufwait) @ /usr/src/sys/kern/vfs_bio.c:2658
 2nd 0xc43f5400 dirhash (dirhash) @ /usr/src/sys/ufs/ufs/ufs_dirhash.c:284
 
 1st 0xc484f7c8 ufs (ufs) @ /usr/src/sys/kern/vfs_subr.c:2134
 2nd 0xd3606ee0 bufwait (bufwait) @ /usr/src/sys/ufs/ffs/ffs_vnops.c:261
 3rd 0xc48ee058 ufs (ufs) @ /usr/src/sys/kern/vfs_subr.c:2134
 
 The errors occurred while running 'portsnap fetch' and 'portsnap extract'.
 
 uname -a:
 
 FreeBSD toucan 9.0-BETA1 FreeBSD 9.0-BETA1 #0 Thu Jul 28 16:34:16 UTC 2011 root@obrian.cse.buffalo.edu:/usr/obj/usr/src/sys/GENERIC i386
 
 
State-Changed-From-To: open->open 
State-Changed-By: linimon 
State-Changed-When: Wed Jul 3 00:50:32 UTC 2013 
State-Changed-Why:  
commit bit has been taken in for safekeeping. 


Responsible-Changed-From-To: daichi->freebsd-fs 
Responsible-Changed-By: linimon 
Responsible-Changed-When: Wed Jul 3 00:50:32 UTC 2013 
Responsible-Changed-Why:  

http://www.freebsd.org/cgi/query-pr.cgi?pr=141950 
>Unformatted:
