From nobody@FreeBSD.org  Mon Apr  9 14:55:17 2007
Return-Path: <nobody@FreeBSD.org>
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
	by hub.freebsd.org (Postfix) with ESMTP id 690C516A400
	for <freebsd-gnats-submit@FreeBSD.org>; Mon,  9 Apr 2007 14:55:17 +0000 (UTC)
	(envelope-from nobody@FreeBSD.org)
Received: from www.freebsd.org (www.freebsd.org [69.147.83.33])
	by mx1.freebsd.org (Postfix) with ESMTP id 5BDA313C44B
	for <freebsd-gnats-submit@FreeBSD.org>; Mon,  9 Apr 2007 14:55:17 +0000 (UTC)
	(envelope-from nobody@FreeBSD.org)
Received: from www.freebsd.org (localhost [127.0.0.1])
	by www.freebsd.org (8.13.1/8.13.1) with ESMTP id l39EtHqr019276
	for <freebsd-gnats-submit@FreeBSD.org>; Mon, 9 Apr 2007 14:55:17 GMT
	(envelope-from nobody@www.freebsd.org)
Received: (from nobody@localhost)
	by www.freebsd.org (8.13.1/8.13.1/Submit) id l39EoFdn012939;
	Mon, 9 Apr 2007 14:50:15 GMT
	(envelope-from nobody)
Message-Id: <200704091450.l39EoFdn012939@www.freebsd.org>
Date: Mon, 9 Apr 2007 14:50:15 GMT
From: riton<bla@gericos.com>
To: freebsd-gnats-submit@FreeBSD.org
Subject: panic when nfsd running
X-Send-Pr-Version: www-3.0

>Number:         111413
>Category:       kern
>Synopsis:       [nfs] panic when nfsd running
>Confidential:   no
>Severity:       serious
>Priority:       low
>Responsible:    freebsd-bugs
>State:          closed
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Mon Apr 09 15:00:08 GMT 2007
>Closed-Date:    Sun Apr 15 01:45:27 GMT 2007
>Last-Modified:  Sun Apr 15 01:50:02 GMT 2007
>Originator:     riton
>Release:        6.2-STABLE
>Organization:
>Environment:
>Description:
When nfsd is running, here is the type of panic I have (not always the same):

FreeBSD/i386 (nfs01) (ttyd0)

login: Memory modified after free 0xcc0b0a00(508) val=dead80de @ 0xcc0b0b7c
panic: Most recently used by vnodemarker

KDB: enter: panic
[thread pid 26 tid 100018 ]
Stopped at      kdb_enter+0x2c: leave
db> where
Tracing pid 26 tid 100018 td 0xc9aef780
kdb_enter(c0409c73,100,1fc,cc0b0bfc,cc0b0a00,...) at kdb_enter+0x2c
panic(c041c7a0,c041288d,c041c771,cc0b0a00,1fc,...) at panic+0x10a
mtrash_ctor(cc0b0a00,200,0,102) at mtrash_ctor+0x5f
uma_zalloc_arg(c0855a00,0,102) at uma_zalloc_arg+0x26f
malloc(110,c0440b00,102,c9d5faa4,0,...) at malloc+0x6a
__mnt_vnode_first(e873fc54,c9d5fa60,c9d5faa4,0,c041319c,b39) at __mnt_vnode_first+0x78
vfs_msync(c9d5fa60,2,c9d5fa60,e873fcf0,c9e37cc0,...) at vfs_msync+0x2d
sync_fsync(e873fcf0,c9e37d80,e873fd0c,c0303794,c0441240,...) at sync_fsync+0x145
VOP_FSYNC_APV(c0441240,e873fcf0,8,461a2b13,c9e37d3c,...) at VOP_FSYNC_APV+0x56
sched_sync(0,e873fd38,0,c03033ac,0,...) at sched_sync+0x3e8
fork_exit(c03033ac,0,e873fd38) at fork_exit+0x7d
fork_trampoline() at fork_trampoline+0x8
--- trap 0x1, eip = 0, esp = 0xe873fd6c, ebp = 0 ---
db> show alllocks
Process 26 (syncer) thread 0xc9aef780 (100018)
exclusive sleep mutex Giant r = 0 (0xc045d6a0) locked @ kern/vfs_subr.c:1630
db> ps
  pid  ppid  pgrp   uid   state   wmesg     wchan    cmd
 3954     1  3954     0  Ss+     ttyin    0xc9bdb010 getty
  700     1   700     0  Ss+     ttyin    0xc9bbdc10 getty
  699     1   699     0  Ss+     ttyin    0xc9b99810 getty
  698     1   698     0  Ss+     ttyin    0xc9bab010 getty
  697     1   697     0  Ss+     ttyin    0xc9bac810 getty
  696     1   696     0  Ss+     ttyin    0xc9b91c10 getty
  695     1   695     0  Ss+     ttyin    0xc9bacc10 getty
  694     1   694     0  Ss+     ttyin    0xc9b98810 getty
  693     1   693     0  Ss+     ttyin    0xc9bab410 getty
  652     1   652     0  Ss      nanslp   0xc045df8c cron
  646     1   646    25  Ss      pause    0xc9d3a894 sendmail
  642     1   642     0  Ss      select   0xc04a8ae4 sendmail
  636     1   636     0  Ss      select   0xc04a8ae4 sshd
  578     1   576     0  S       select   0xc04a8ae4 snmpd
  567   561   561     0  S       nfslockd 0xc04ac628 rpc.lockd
  561     1   561     0  Ss      select   0xc04a8ae4 rpc.lockd
  556     1   556     0  Ss      select   0xc04a8ae4 rpc.statd
  551   546   546     0  S       -        0xc9f81400 nfsd
  550   546   546     0  S       -        0xc9f81600 nfsd
  549   546   546     0  S       -        0xc9f81800 nfsd
  548   546   546     0  S       -        0xc9f81a00 nfsd
  546     1   546     0  Ss      accept   0xc9e68892 nfsd
  538     1   538     0  Ss      select   0xc04a8ae4 mountd
  498     1   498     0  Ss      select   0xc04a8ae4 rpcbind
  473     1   473     0  Ss      select   0xc04a8ae4 syslogd
  421     1   421     0  Ss      select   0xc04a8ae4 devd
  119     1   119     0  Ss      pause    0xc9b93cc4 adjkerntz
   29     0     0     0  SL      -        0xea467cfc [schedcpu]
   28     0     0     0  SL      sdflush  0xc04b1974 [softdepflush]
   27     0     0     0  SL      vlruwt   0xc9b4e648 [vnlru]
   26     0     0     0  RL      CPU 0               [syncer]
   25     0     0     0  SL      psleep   0xc04a9068 [bufdaemon]
    9     0     0     0  SL      pgzero   0xc04b28e4 [pagezero]
    8     0     0     0  SL      psleep   0xc04b2434 [vmdaemon]
    7     0     0     0  SL      psleep   0xc04b23f0 [pagedaemon]
   24     0     0     0  WL                          [swi0: sio]
   23     0     0     0  WL                          [irq1: atkbd0]
   22     0     0     0  WL                          [irq19: atapci1]
   21     0     0     0  WL                          [irq15: ata1]
   20     0     0     0  WL                          [irq14: ata0]
   19     0     0     0  WL                          [irq17: em1]
   18     0     0     0  WL                          [irq16: em0]
    6     0     0     0  SL      -        0xc9b6f600 [kqueue taskq]
   17     0     0     0  WL                          [swi6: task queue]
   16     0     0     0  WL                          [swi6: Giant taskq]
    5     0     0     0  SL      -        0xc9b2d680 [thread taskq]
   15     0     0     0  WL                          [swi5: +]
   14     0     0     0  SL      -        0xc045aa20 [yarrow]
    4     0     0     0  SL      -        0xc045b428 [g_down]
    3     0     0     0  SL      -        0xc045b424 [g_up]
    2     0     0     0  SL      -        0xc045b41c [g_event]
   13     0     0     0  WL                          [swi1: net]
   12     0     0     0  WL                          [swi3: vm]
   11     0     0     0  RL                          [swi4: clock sio]
   10     0     0     0  RL                          [idle]
    1     0     1     0  SLs     wait     0xc9af2000 [init]
    0     0     0     0  WLs                         [swapper]
db>

When nfsd is not running, the server never crashes. When it is running, the uptime is never more than 2 days.

The server is new and not in production.





>How-To-Repeat:
Reboot the box, and simply wait for the crash.
>Fix:
Workaround only: do not start nfsd at boot (not very useful for an NFS server).
>Release-Note:
>Audit-Trail:

From: Kris Kennaway <kris@obsecurity.org>
To: riton <bla@gericos.com>
Cc: freebsd-gnats-submit@FreeBSD.org
Subject: Re: kern/111413: panic when nfsd running
Date: Mon, 9 Apr 2007 23:48:41 -0400

 On Mon, Apr 09, 2007 at 02:50:15PM +0000, riton wrote:
 
 > When nfsd is running, here is the type of panic I have (not always the same):
 > 
 > FreeBSD/i386 (nfs01) (ttyd0)
 > 
 > login: Memory modified after free 0xcc0b0a00(508) val=dead80de @ 0xcc0b0b7c
 > panic: Most recently used by vnodemarker
 
 You will need to configure DEBUG_MEMGUARD to watch this malloc type.
 Unfortunately this requires a tiny change to the source so it is not
 completely trivial, but see memguard(9).
 
 Once you have done this, you will hopefully get a different panic when
 the memory is first accessed after it was freed, and we can proceed
 from there.
 
 Kris
State-Changed-From-To: open->feedback 
State-Changed-By: remko 
State-Changed-When: Tue Apr 10 05:11:07 UTC 2007 
State-Changed-Why:  
Kris asked the submitter to do additional things to get to a point where we 
can start debugging, reflect that state. 

http://www.freebsd.org/cgi/query-pr.cgi?pr=111413 

From: <bla@gericos.com>
To: "'Kris Kennaway'" <kris@obsecurity.org>
Cc: <freebsd-gnats-submit@FreeBSD.org>
Subject: RE: kern/111413: panic when nfsd running
Date: Sun, 15 Apr 2007 03:00:45 +0200

 Hi Kris,
 
 The NFS server is now completely down. I cannot use it anymore: it crashes
 in a different way each time, at boot, before any process is launched.
 I'm certain it's a hardware problem. I will check it next week.
 
 You can close this case.
 
 Thanks.
 
 -----Original Message-----
 From: Kris Kennaway [mailto:kris@obsecurity.org]
 Sent: Tuesday, April 10, 2007 05:49
 To: riton
 Cc: freebsd-gnats-submit@FreeBSD.org
 Subject: Re: kern/111413: panic when nfsd running
 
 On Mon, Apr 09, 2007 at 02:50:15PM +0000, riton wrote:
 
 > When nfsd is running, here is the type of panic I have (not always the same):
 >
 > FreeBSD/i386 (nfs01) (ttyd0)
 >
 > login: Memory modified after free 0xcc0b0a00(508) val=dead80de @ 0xcc0b0b7c
 > panic: Most recently used by vnodemarker
 
 You will need to configure DEBUG_MEMGUARD to watch this malloc type.
 Unfortunately this requires a tiny change to the source so it is not
 completely trivial, but see memguard(9).
 
 Once you have done this, you will hopefully get a different panic when
 the memory is first accessed after it was freed, and we can proceed
 from there.
 
 Kris
 
State-Changed-From-To: feedback->closed 
State-Changed-By: kris 
State-Changed-When: Sun Apr 15 01:44:57 UTC 2007 
State-Changed-Why:  
Submitter reports the problem appears to be due to broken hardware. 

http://www.freebsd.org/cgi/query-pr.cgi?pr=111413 

From: Kris Kennaway <kris@obsecurity.org>
To: bla@gericos.com
Cc: 'Kris Kennaway' <kris@obsecurity.org>, freebsd-gnats-submit@FreeBSD.org
Subject: Re: kern/111413: panic when nfsd running
Date: Sat, 14 Apr 2007 21:45:53 -0400

 On Sun, Apr 15, 2007 at 03:00:45AM +0200, bla@gericos.com wrote:
 > Hi Kris,
 > 
 > The NFS server is totally KO. I cannot use it anymore, the server always
 > crashes differently before launch any process at boot time.
 > I'm certain it's a hardware problem. I will check it next week.
 > 
 > You can close this case.
 
 Thanks, good luck sorting out the hardware.
 
 Kris
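
Note on the panic message: the numbers in the "Memory modified after free"
line can be checked by hand. The sketch below is only an illustration, not
part of the original exchange. It assumes that the pattern written into
freed items is 0xdeadc0de (the junk value used by the UMA trash debugging
code; treat this as an assumption for the sources in question). Under that
assumption the corrupted word differs from the pattern in a single bit,
which is consistent with, but does not prove, the hardware explanation the
submitter arrived at.

```python
# Values copied from the panic line in this report.
ITEM_ADDR = 0xCC0B0A00   # start of the freed item
BAD_ADDR = 0xCC0B0B7C    # address where the unexpected value was found
BAD_VAL = 0xDEAD80DE     # value actually found there

# Assumption: fill pattern written into freed memory by the trash code.
JUNK = 0xDEADC0DE

offset = BAD_ADDR - ITEM_ADDR          # byte offset of the bad word in the item
diff = BAD_VAL ^ JUNK                  # bits that differ from the fill pattern
flipped_bits = bin(diff).count("1")    # number of differing bits

print(f"offset into item: {offset} bytes (0x{offset:x})")
print(f"differing bits:   0x{diff:08x} ({flipped_bits} bit(s))")
```

A software use-after-free usually overwrites whole bytes or words, so a
one-bit difference is a hint worth following up with a memory test before
spending time on DEBUG_MEMGUARD.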
>Unformatted:
