From nobody@FreeBSD.org  Thu Nov  4 15:56:49 2010
Return-Path: <nobody@FreeBSD.org>
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 7537C1065673
	for <freebsd-gnats-submit@FreeBSD.org>; Thu,  4 Nov 2010 15:56:49 +0000 (UTC)
	(envelope-from nobody@FreeBSD.org)
Received: from www.freebsd.org (www.freebsd.org [IPv6:2001:4f8:fff6::21])
	by mx1.freebsd.org (Postfix) with ESMTP id 495008FC12
	for <freebsd-gnats-submit@FreeBSD.org>; Thu,  4 Nov 2010 15:56:49 +0000 (UTC)
Received: from www.freebsd.org (localhost [127.0.0.1])
	by www.freebsd.org (8.14.3/8.14.3) with ESMTP id oA4Fumrv029171
	for <freebsd-gnats-submit@FreeBSD.org>; Thu, 4 Nov 2010 15:56:48 GMT
	(envelope-from nobody@www.freebsd.org)
Received: (from nobody@localhost)
	by www.freebsd.org (8.14.3/8.14.3/Submit) id oA4FumJL029170;
	Thu, 4 Nov 2010 15:56:48 GMT
	(envelope-from nobody)
Message-Id: <201011041556.oA4FumJL029170@www.freebsd.org>
Date: Thu, 4 Nov 2010 15:56:48 GMT
From: Andreas Longwitz <longwitz@incore.de>
To: freebsd-gnats-submit@FreeBSD.org
Subject: FreeBSD RELENG_6 server freezes during create of a snapshot on a disk with mpt
X-Send-Pr-Version: www-3.1
X-GNATS-Notify:

>Number:         151941
>Category:       kern
>Synopsis:       [mpt] [hang] FreeBSD RELENG_6 server freezes during create of a snapshot on a disk with mpt
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    freebsd-bugs
>State:          closed
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Thu Nov 04 16:00:19 UTC 2010
>Closed-Date:    Sun Mar 06 07:34:56 UTC 2011
>Last-Modified:  Sun Mar 06 07:34:56 UTC 2011
>Originator:     Andreas Longwitz
>Release:        RELENG_6
>Organization:
Data Service Stockelsdorf
>Environment:
FreeBSD dssbkp1.incore 6.4-STABLE FreeBSD 6.4-STABLE #2: Wed Nov  3 18:18:31 CET 2010     root@dssbkp1.incore:/usr/src/sys/i386/compile/SERVER  i386
>Description:
A current FreeBSD RELENG_6 system hangs when running the command
      mount -u -o snapshot /prod/.snap/fscktest prod
where prod is a 1 TB partition on a SCSI disk /dev/da0p1 connected
to mpt. The machine is semi-dead: all user processes are sleeping,
all CPUs are idle, and only ping and ddb still work. Giant is the only
lock shown by ps in ddb. The trace of the mount process causing the
problem looks like this:

Tracing command mount pid 7871 tid 100190 td 0xd1070480
sched_switch(d1070480,0,1) at sched_switch+0x14b
mi_switch(1,0,d1070480,f3408348,c03e6d10,...) at mi_switch+0x1ba
sleepq_switch(c0631cc4) at sleepq_switch+0x87
sleepq_wait(c0631cc4,0,d1070480,4,0,...) at sleepq_wait+0x5c
msleep(c0631cc4,c0631ce0,50,c05b02c2,0) at msleep+0x269
getnewbuf(0,0,4000,4000) at getnewbuf+0x6ce
getblk(d08de440,4cb7f440,0,4000,0,...) at getblk+0x360
breadn(d08de440,4cb7f440,0,4000,0,...) at breadn+0x31
bread(d08de440,4cb7f440,0,4000,0,f34084a8) at bread+0x20
ffs_alloccg(d134cad4,d5c,132dfce8,0,4000) at ffs_alloccg+0x13d
ffs_hashalloc(d134cad4,d5c,132dfce8,0,4000,...) at ffs_hashalloc+0x28
ffs_alloc(d134cad4,281ec6f,0,132dfce8,0,4000,d051d400,f34085e8,d134cad4,281ec6f,0,463,e8cae000) at ffs_alloc+0x20d
ffs_balloc_ufs2(d12dd880,7b1bc000,a0,4000,d051d400,0,f34087d8) at ffs_balloc_ufs2+0x16fc
ffs_snapshot(d099a2bc,d130e8a0,d130e8a0,d0982600,d08de440,...) at ffs_snapshot+0x89b
ffs_mount(d099a2bc,d1070480,10201000,0,d0520a80,...) at ffs_mount+0x991
vfs_domount(d1070480,d1125750,d0f8a250,11010000,d1125330) at vfs_domount+0x728
vfs_donmount(d1070480,11010000,f3408c04) at vfs_donmount+0x415
kernel_mount(d06599c0,11010000,804e040,0,fffffffe,...) at kernel_mount+0x38
ffs_cmount(d06599c0,bf7fdec0,11010000,d1070480,c05f84e0,...) at ffs_cmount+0x5d
mount(d1070480,f3408d04) at mount+0x18e
syscall(3b,3b,3b,804af21,bf7fe974,...) at syscall+0x2bf
Xint0x80_syscall() at Xint0x80_syscall+0x1f
--- syscall (21, FreeBSD ELF32, mount), eip = 0x880bfbcb, esp = 0xbf7fde9c, ebp = 0xbf7fdf38 ---

The problem occurs on both i386 and amd64 servers. Creating snapshots on 300 GB disks connected to an amr controller works without any problems.
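
To make the call chain in a ddb trace like the one above easier to read, the frames can be reduced to function names and offsets. The following is only a minimal sketch: parse_frames, FRAME_RE and SAMPLE are hypothetical names introduced for illustration, and the sample holds just four frames copied from the trace.

```python
import re

# Matches one ddb frame such as "getnewbuf(0,0,4000,4000) at getnewbuf+0x6ce".
# Assumption: each frame sits on a single line, as ddb prints it.
FRAME_RE = re.compile(
    r"^(?P<func>\w+)\(.*\)\s+at\s+(?P=func)\+0x(?P<off>[0-9a-f]+)\s*$"
)


def parse_frames(trace):
    """Return (function, offset) pairs, innermost frame first."""
    frames = []
    for line in trace.splitlines():
        m = FRAME_RE.match(line.strip())
        if m:
            frames.append((m.group("func"), int(m.group("off"), 16)))
    return frames


# Four frames copied from the trace in this report.
SAMPLE = """\
msleep(c0631cc4,c0631ce0,50,c05b02c2,0) at msleep+0x269
getnewbuf(0,0,4000,4000) at getnewbuf+0x6ce
getblk(d08de440,4cb7f440,0,4000,0,...) at getblk+0x360
ffs_snapshot(d099a2bc,d130e8a0,d130e8a0,d0982600,d08de440,...) at ffs_snapshot+0x89b
"""

frames = parse_frames(SAMPLE)
# Join the function names, innermost first, to show who is waiting on whom.
chain = " <- ".join(name for name, _ in frames)
print(chain)
```

Read this way, the trace shows the mount process sleeping in msleep, called from getnewbuf, on the path that starts in ffs_snapshot.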
>How-To-Repeat:
see above
>Fix:
The problem is caused by the update of ffs_balloc.c from 1.50.2.2 to 1.50.2.3 (SVN rev 196973 on 2009-09-08 14:19:14; MFC r180758).
If I revert this change in the kernel source, the problem disappears.

>Release-Note:
>Audit-Trail:

From: Remko Lodder <remko@elvandar.org>
To: Andreas Longwitz <longwitz@incore.de>
Cc: freebsd-gnats-submit@FreeBSD.org
Subject: Re: kern/151941: FreeBSD RELENG_6 server freezes during create of a snapshot on a disk with mpt
Date: Thu, 4 Nov 2010 17:20:18 +0100

 Hello Andreas,
 
 Can you please confirm that this also occurs on later versions? 6 is
 near end of life, and given the timeframe I doubt it would be feasible
 to look at this just for 6.x. BUT if this also occurs on later versions,
 then it has more weight in the scale and is viable for investigation.
 
 Thanks
 Remko
 
 -- 
 /"\   Best regards,                        | remko@FreeBSD.org
 \ /   Remko Lodder                      |
 X    http://www.evilcoder.org/    | Quis custodiet ipsos custodes
 / \   ASCII Ribbon Campaign    | Against HTML Mail and News
 
 
 
 

From: Andreas Longwitz <longwitz@incore.de>
To: Remko Lodder <remko@elvandar.org>
Cc: freebsd-gnats-submit@FreeBSD.org
Subject: Re: kern/151941: FreeBSD RELENG_6 server freezes during create of
 a snapshot on a disk with mpt
Date: Mon, 08 Nov 2010 12:25:21 +0100

 Hello Remko
 
 > Can you please confirm that this also occurs on later versions? 6 is near end of life, and given the timeframe I doubt it would be feasible to look at this
 > just for 6.x. BUT if this also occurs on later versions, then it has more weight in the scale and is viable for investigation.
 > 
 > Thanks
 > Remko
 
 Booting the same server with 8.1 RELEASE does not show the reported
 problem; everything works.
 I agree with your '6 is near end of life' argument, but 6 STABLE should
 remain stable. Another example is SVN rev 210932: in RELENG_6, the
 update of busdma_machdep.c from 1.74.2.6 to 1.74.2.7 breaks the
 (probably buggy) de interface driver, so outgoing TCP messages that do
 not fit in one packet no longer work.
 
 Thanks
 
 -- 
 Dr. Andreas Longwitz
 
 Data Service GmbH
 Beethovenstr. 2A
 23617 Stockelsdorf
 Amtsgericht Lübeck, HRB 318 BS
 Geschäftsführer: Wilfried Paepcke, Dr. Andreas Longwitz, Josef Flatau
 
State-Changed-From-To: open->closed 
State-Changed-By: jh 
State-Changed-When: Sun Mar 6 07:32:27 UTC 2011 
State-Changed-Why:  
RELENG_6 is no longer supported, sorry. The problem is not reproducible 
on 8.1 according to submitter. 

http://www.freebsd.org/cgi/query-pr.cgi?pr=151941 
>Unformatted:
