From nobody@FreeBSD.org  Thu Oct 19 14:19:10 2006
Return-Path: <nobody@FreeBSD.org>
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id A86BE16A407
	for <freebsd-gnats-submit@FreeBSD.org>; Thu, 19 Oct 2006 14:19:10 +0000 (UTC)
	(envelope-from nobody@FreeBSD.org)
Received: from www.freebsd.org (www.freebsd.org [216.136.204.117])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 9021343D5E
	for <freebsd-gnats-submit@FreeBSD.org>; Thu, 19 Oct 2006 14:18:52 +0000 (GMT)
	(envelope-from nobody@FreeBSD.org)
Received: from www.freebsd.org (localhost [127.0.0.1])
	by www.freebsd.org (8.13.1/8.13.1) with ESMTP id k9JEIjls098055
	for <freebsd-gnats-submit@FreeBSD.org>; Thu, 19 Oct 2006 14:18:45 GMT
	(envelope-from nobody@www.freebsd.org)
Received: (from nobody@localhost)
	by www.freebsd.org (8.13.1/8.13.1/Submit) id k9JEIjxS098047;
	Thu, 19 Oct 2006 14:18:45 GMT
	(envelope-from nobody)
Message-Id: <200610191418.k9JEIjxS098047@www.freebsd.org>
Date: Thu, 19 Oct 2006 14:18:45 GMT
From: Mark Kamichoff<prox@prolixium.com>
To: freebsd-gnats-submit@FreeBSD.org
Subject: panic w/zebra
X-Send-Pr-Version: www-3.0

>Number:         104569
>Category:       kern
>Synopsis:       panic w/zebra
>Confidential:   no
>Severity:       serious
>Priority:       low
>Responsible:    glebius
>State:          closed
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Thu Oct 19 14:20:19 GMT 2006
>Closed-Date:    Mon Apr 11 11:17:59 UTC 2011
>Last-Modified:  Mon Apr 11 11:17:59 UTC 2011
>Originator:     Mark Kamichoff
>Release:        FreeBSD 6.2-PRERELEASE
>Organization:
>Environment:
FreeBSD starfire.prolixium.com 6.2-PRERELEASE FreeBSD 6.2-PRERELEASE #2: Thu Oct 12 17:54:27 EDT 2006     root@starfire.prolixium.com:/usr/obj/usr/src/sys/STARFIRE  i386
>Description:
Over the past few weeks, I've observed several unexpected reboots, which
appear to be kernel panics related to zebra.

This machine is a router, and terminates several OpenVPN tunnels as well as
IPv6-in-IPv4 tunnels.  I'm using Quagga to run OSPFv2 and OSPFv3.

This problem first appeared in 6.1-RELEASE (the box was stable with 6.0), so
I recently upgraded to 6-STABLE, but the problem persists.

The crash follows:

Fatal trap 12: page fault while in kernel mode
fault virtual address	= 0x78
fault code		= supervisor read, page not present
instruction pointer	= 0x20:0xc058c693
stack pointer	        = 0x28:0xdea99a64
frame pointer	        = 0x28:0xdea99a68
code segment		= base 0x0, limit 0xfffff, type 0x1b
			= DPL 0, pres 1, def32 1, gran 1
processor eflags	= resume, IOPL = 0
current process		= 1256 (zebra)
trap number		= 12
panic: page fault

System info:

FreeBSD 6.2-PRERELEASE (cvs of 6-STABLE from Sept 20, iirc)
quagga-0.99.4_2

I have included the kgdb output, as well as other information about the
system, here:

http://www.prolixium.com/share/txt/freebsd/zebra/

I can allow the pf.conf to be viewed, if absolutely needed.  It's fairly plain.

Please let me know what else I can provide!

Thanks.

- Mark
>How-To-Repeat:
Unknown.  This happens randomly every couple of days.
>Fix:
Unknown.
>Release-Note:
>Audit-Trail:

From: Mark Kamichoff <prox@prolixium.com>
To: bug-followup@FreeBSD.org
Cc:  
Subject: Re: kern/104569: panic w/zebra
Date: Mon, 11 Dec 2006 08:23:33 -0500

 Is there anything else I can provide that will help track this down?
 It is still happening:
 
 Unread portion of the kernel message buffer:
 kernel trap 12 with interrupts disabled
 
 
 Fatal trap 12: page fault while in kernel mode
 fault virtual address   = 0x78
 fault code              = supervisor read, page not present
 instruction pointer     = 0x20:0xc0554bcb
 stack pointer           = 0x28:0xdea8ea64
 frame pointer           = 0x28:0xdea8ea68
 code segment            = base 0x0, limit 0xfffff, type 0x1b
                         = DPL 0, pres 1, def32 1, gran 1
 processor eflags        = resume, IOPL = 0
 current process         = 1548 (zebra)
 trap number             = 12
 panic: page fault
 Uptime: 2d5h52m33s
 Dumping 510 MB (2 chunks)
   chunk 0: 1MB (159 pages) ... ok
   chunk 1: 510MB (130544 pages) 494 478 462 446 430 414 398 382 366 350 334 318 302 286 270 254 238 222 206 190 174 158 142 126 110 94 78 62 46 30 14
 
 #0  doadump () at pcpu.h:165
 165             __asm __volatile("movl %%fs:0,%0" : "=r" (td));
 (kgdb) bt
 #0  doadump () at pcpu.h:165
 #1  0xc052f46e in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:409
 #2  0xc052f778 in panic (fmt=0xc0709d51 "%s") at /usr/src/sys/kern/kern_shutdown.c:565
 #3  0xc06e5d2d in trap_fatal (frame=0xdea8ea24, eva=0) at /usr/src/sys/i386/i386/trap.c:837
 #4  0xc06e5445 in trap (frame=
       {tf_fs = -629538808, tf_es = -1066139608, tf_ds = 40, tf_edi = -1015486796, tf_esi = -1014488704, tf_ebp = -559355288, tf_isp = -559355312, tf_ebx = -1015492032, tf_edx = -1014488704, tf_ecx = 4, tf_eax = 4, tf_trapno = 12, tf_err = 0, tf_eip = -1068151861, tf_cs = 32, tf_eflags = 65543, tf_esp = -1014488704, tf_ss = -559355252}) at /usr/src/sys/i386/i386/trap.c:270
 #5  0xc06d27ca in calltrap () at /usr/src/sys/i386/i386/exception.s:139
 #6  0xc0554bcb in turnstile_setowner (ts=0xc378d240, owner=0x4)
     at /usr/src/sys/kern/subr_turnstile.c:432
 #7  0xc0554ef7 in turnstile_wait (lock=0xc38c6504, owner=0x4)
     at /usr/src/sys/kern/subr_turnstile.c:591
 #8  0xc0524ddb in _mtx_lock_sleep (m=0xc38c6504, tid=3280478592, opts=0, file=0x0, line=0)
     at /usr/src/sys/kern/kern_mutex.c:579
 #9  0xc05bcb44 in rtrequest1 (req=2, info=0xdea8eb24, ret_nrt=0xdea8eb10)
     at /usr/src/sys/net/route.c:703
 #10 0xc05be7e5 in route_output (m=0xc55fa800, so=0xc3553164) at /usr/src/sys/net/rtsock.c:391
 #11 0xc05bbb12 in raw_usend (so=0x4, flags=0, m=0xc3882180, nam=0x0, control=0x4, 
     td=0xc3882180) at /usr/src/sys/net/raw_usrreq.c:263
 #12 0xc05be457 in rts_send (so=0x4, flags=4, m=0x4, nam=0x4, control=0x4, td=0x4)
     at /usr/src/sys/net/rtsock.c:269
 #13 0xc057136c in sosend (so=0xc3553164, addr=0x0, uio=0xdea8ecb0, top=0xc55fa800, 
     control=0x0, flags=0, td=0xc3882180) at /usr/src/sys/kern/uipc_socket.c:836
 #14 0xc055d2b8 in soo_write (fp=0x4, uio=0xdea8ecb0, active_cred=0xc33d2c00, flags=0, 
     td=0xc3882180) at /usr/src/sys/kern/sys_socket.c:118
 #15 0xc05569e0 in dofilewrite (td=0xc3882180, fd=4, fp=0xc37b0c18, auio=0xdea8ecb0, offset=Unhandled dwarf expression opcode 0x93
 )
     at file.h:252
 #16 0xc0556817 in kern_writev (td=0xc3882180, fd=6, auio=0x4)
     at /usr/src/sys/kern/sys_generic.c:402
 #17 0xc05566e9 in write (td=0x4, uap=0x4) at /usr/src/sys/kern/sys_generic.c:326
 #18 0xc06e60e3 in syscall (frame=
       {tf_fs = 672006203, tf_es = 672006203, tf_ds = -1078001605, tf_edi = -1077941792, tf_esi = -1077942328, tf_ebp = -1077941864, tf_isp = -559354524, tf_ebx = 20, tf_edx = -1077942496, tf_ecx = 0, tf_eax = 4, tf_trapno = 0, tf_err = 2, tf_eip = 673045383, tf_cs = 51, tf_eflags = 514, tf_esp = -1077942516, tf_ss = 59}) at /usr/src/sys/i386/i386/trap.c:983
 #19 0xc06d281f in Xint0x80_syscall () at /usr/src/sys/i386/i386/exception.s:200
 #20 0x00000033 in ?? ()
 Previous frame inner to this frame (corrupt stack?)
 (kgdb) 
 
 Thanks!
 
 - Mark
 
 -- 
 Mark Kamichoff
 prox@prolixium.com
 http://prolixium.com/
 Rensselaer Polytechnic Institute, Class of 2004
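
A reading note on the trace above: kgdb prints the trap-frame registers in frame #4 as signed 32-bit integers, which hides the fact that several of them are the same kernel pointers that appear elsewhere in the backtrace. The minimal Python sketch below converts them back to hex. The helper name to_u32 is hypothetical and not part of kgdb. The last lines compute the distance between the faulting address and the bogus owner pointer; reading that distance as a field offset inside the owner structure is an assumption, not something this PR confirms.

```python
def to_u32(value):
    """Reinterpret a signed 32-bit integer as its unsigned bit pattern."""
    return value & 0xFFFFFFFF

# Register values copied from frame #4 (trap) of the backtrace above.
trap_frame = {
    "tf_eip": -1068151861,
    "tf_ebx": -1015492032,
    "tf_esi": -1014488704,
    "tf_eax": 4,
}

for name, value in trap_frame.items():
    print(f"{name} = {to_u32(value):#010x}")

# Fault address and owner argument as reported in the panic and in frame #6.
fault_address = 0x78
owner = 0x4

# Assumption: the fault came from reading a field at (owner + offset).
offset = fault_address - owner
print(f"offset = {offset:#x}")
```

Decoded this way, tf_eip is 0xc0554bcb (the instruction pointer in the panic message and the address of frame #6), tf_ebx is 0xc378d240 (the ts argument of turnstile_setowner), tf_esi is 0xc3882180 (the td value in frames #13 and #14), and tf_eax is 0x4 (the owner argument).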

From: Mark Kamichoff <prox@prolixium.com>
To: bug-followup@FreeBSD.org
Cc:  
Subject: Re: kern/104569: panic w/zebra
Date: Sun, 17 Dec 2006 21:20:09 -0500

 Just a note, this looks similar to a problem back from 4.6.2-RELEASE:
 
 http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/42030
 
 Perhaps something in 6.X removed the "band-aid"?
 
 - Mark
 
 -- 
 Mark Kamichoff
 prox@prolixium.com
 http://prolixium.com/
 Rensselaer Polytechnic Institute, Class of 2004
 

From: Mark Kamichoff <prox@prolixium.com>
To: bug-followup@FreeBSD.org
Cc:  
Subject: Re: kern/104569: panic w/zebra
Date: Sun, 14 Jan 2007 18:25:24 -0500

 This is *almost* reproducible, now.  If I cause a number of OSPFv2/v3
 adjacencies to time out (by unplugging cables, etc.), FreeBSD has a
 50-60% chance of panicking on the spot.  I assume this is due to the
 zebra process changing around routes.
 
 - Mark
 
 -- 
 Mark Kamichoff
 prox@prolixium.com
 http://prolixium.com/
 Rensselaer Polytechnic Institute, Class of 2004

From: Mark Kamichoff <prox@prolixium.com>
To: bug-followup@FreeBSD.org
Cc:  
Subject: Re: kern/104569: panic w/zebra
Date: Fri, 26 Jan 2007 09:21:36 -0500

 Can this please be looked at?  It is still happening with 6.2-STABLE as
 of Jan 24th.  I suspect this affects any FreeBSD 6.x-based router
 running Quagga/IPv6, so perhaps the PR priority should be increased.
 
 Thanks.
 
 - Mark
 
 -- 
 Mark Kamichoff
 prox@prolixium.com
 http://prolixium.com/
 Rensselaer Polytechnic Institute, Class of 2004

From: Mark Kamichoff <prox@prolixium.com>
To: bug-followup@FreeBSD.org
Cc:  
Subject: Re: kern/104569: panic w/zebra
Date: Tue, 6 Mar 2007 13:14:39 -0500

 This happens with zebra-0.95a, too, same thing:
 
 Fatal trap 12: page fault while in kernel mode
 fault virtual address   = 0x78
 fault code              = supervisor read, page not present
 instruction pointer     = 0x20:0xc0555579
 stack pointer           = 0x28:0xdea4ba64
 frame pointer           = 0x28:0xdea4ba68
 code segment            = base 0x0, limit 0xfffff, type 0x1b
                         = DPL 0, pres 1, def32 1, gran 1
 processor eflags        = resume, IOPL = 0
 current process         = 9454 (zebra)
 trap number             = 12
 panic: page fault
 
 - Mark
 
 -- 
 Mark Kamichoff
 prox@prolixium.com
 http://prolixium.com/
 Rensselaer Polytechnic Institute, Class of 2004
 

From: Kris Kennaway <kris@obsecurity.org>
To: Mark Kamichoff <prox@prolixium.com>
Cc: freebsd-gnats-submit@FreeBSD.org
Subject: Re: kern/104569: panic w/zebra
Date: Sun, 11 Mar 2007 05:27:09 -0400

 Please obtain a debugging traceback as documented in the Developers'
 Handbook chapter on kernel debugging.  This information is required to
 proceed further.
 
 Kris

From: Mark Kamichoff <prox@prolixium.com>
To: Kris Kennaway <kris@obsecurity.org>
Cc: freebsd-gnats-submit@FreeBSD.org
Subject: Re: kern/104569: panic w/zebra
Date: Sun, 11 Mar 2007 08:30:17 -0400

 I have included several tracebacks in the PR.  Here is the most current
 one:
 
 Fatal trap 12: page fault while in kernel mode
 fault virtual address   = 0x78
 fault code              = supervisor read, page not present
 instruction pointer     = 0x20:0xc0554bcb
 stack pointer           = 0x28:0xdea8ea64
 frame pointer           = 0x28:0xdea8ea68
 code segment            = base 0x0, limit 0xfffff, type 0x1b
                         = DPL 0, pres 1, def32 1, gran 1
 processor eflags        = resume, IOPL = 0
 current process         = 1548 (zebra)
 trap number             = 12
 panic: page fault
 Uptime: 2d5h52m33s
 Dumping 510 MB (2 chunks)
   chunk 0: 1MB (159 pages) ... ok
   chunk 1: 510MB (130544 pages) 494 478 462 446 430 414 398 382 366 350 334 318 302 286 270 254 238 222 206 190 174 158 142 126 110 94 78 62 46 30 14
 
 #0  doadump () at pcpu.h:165
 165             __asm __volatile("movl %%fs:0,%0" : "=r" (td));
 (kgdb) bt
 #0  doadump () at pcpu.h:165
 #1  0xc052f46e in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:409
 #2  0xc052f778 in panic (fmt=0xc0709d51 "%s") at /usr/src/sys/kern/kern_shutdown.c:565
 #3  0xc06e5d2d in trap_fatal (frame=0xdea8ea24, eva=0) at /usr/src/sys/i386/i386/trap.c:837
 #4  0xc06e5445 in trap (frame=
       {tf_fs = -629538808, tf_es = -1066139608, tf_ds = 40, tf_edi = -1015486796, tf_esi = -1014488704, tf_ebp = -559355288, tf_isp = -559355312, tf_ebx = -1015492032, tf_edx = -1014488704, tf_ecx = 4, tf_eax = 4, tf_trapno = 12, tf_err = 0, tf_eip = -1068151861, tf_cs = 32, tf_eflags = 65543, tf_esp = -1014488704, tf_ss = -559355252}) at /usr/src/sys/i386/i386/trap.c:270
 #5  0xc06d27ca in calltrap () at /usr/src/sys/i386/i386/exception.s:139
 #6  0xc0554bcb in turnstile_setowner (ts=0xc378d240, owner=0x4)
     at /usr/src/sys/kern/subr_turnstile.c:432
 #7  0xc0554ef7 in turnstile_wait (lock=0xc38c6504, owner=0x4)
     at /usr/src/sys/kern/subr_turnstile.c:591
 #8  0xc0524ddb in _mtx_lock_sleep (m=0xc38c6504, tid=3280478592, opts=0, file=0x0, line=0)
     at /usr/src/sys/kern/kern_mutex.c:579
 #9  0xc05bcb44 in rtrequest1 (req=2, info=0xdea8eb24, ret_nrt=0xdea8eb10)
     at /usr/src/sys/net/route.c:703
 #10 0xc05be7e5 in route_output (m=0xc55fa800, so=0xc3553164) at /usr/src/sys/net/rtsock.c:391
 #11 0xc05bbb12 in raw_usend (so=0x4, flags=0, m=0xc3882180, nam=0x0, control=0x4,
     td=0xc3882180) at /usr/src/sys/net/raw_usrreq.c:263
 #12 0xc05be457 in rts_send (so=0x4, flags=4, m=0x4, nam=0x4, control=0x4, td=0x4)
     at /usr/src/sys/net/rtsock.c:269
 #13 0xc057136c in sosend (so=0xc3553164, addr=0x0, uio=0xdea8ecb0, top=0xc55fa800,
     control=0x0, flags=0, td=0xc3882180) at /usr/src/sys/kern/uipc_socket.c:836
 #14 0xc055d2b8 in soo_write (fp=0x4, uio=0xdea8ecb0, active_cred=0xc33d2c00, flags=0,
     td=0xc3882180) at /usr/src/sys/kern/sys_socket.c:118
 #15 0xc05569e0 in dofilewrite (td=0xc3882180, fd=4, fp=0xc37b0c18, auio=0xdea8ecb0, offset=Unhandled dwarf expression opcode 0x93
 )
     at file.h:252
 #16 0xc0556817 in kern_writev (td=0xc3882180, fd=6, auio=0x4)
     at /usr/src/sys/kern/sys_generic.c:402
 #17 0xc05566e9 in write (td=0x4, uap=0x4) at /usr/src/sys/kern/sys_generic.c:326
 #18 0xc06e60e3 in syscall (frame=
       {tf_fs = 672006203, tf_es = 672006203, tf_ds = -1078001605, tf_edi = -1077941792, tf_esi = -1077942328, tf_ebp = -1077941864, tf_isp = -559354524, tf_ebx = 20, tf_edx = -1077942496, tf_ecx = 0, tf_eax = 4, tf_trapno = 0, tf_err = 2, tf_eip = 673045383, tf_cs = 51, tf_eflags = 514, tf_esp = -1077942516, tf_ss = 59}) at /usr/src/sys/i386/i386/trap.c:983
 #19 0xc06d281f in Xint0x80_syscall () at /usr/src/sys/i386/i386/exception.s:200
 #20 0x00000033 in ?? ()
 Previous frame inner to this frame (corrupt stack?)
 (kgdb)
 
 - Mark
 
 -- 
 Mark Kamichoff
 prox@prolixium.com
 http://prolixium.com/
 Rensselaer Polytechnic Institute, Class of 2004
 
Responsible-Changed-From-To: freebsd-bugs->glebius 
Responsible-Changed-By: glebius 
Responsible-Changed-When: Wed Jun 6 16:38:42 UTC 2007 
Responsible-Changed-Why:  
I can't promise to fix this, but I will try to look at it. However, a more precise recipe for reproducing the panic would be appreciated. 

http://www.freebsd.org/cgi/query-pr.cgi?pr=104569 
State-Changed-From-To: open->closed 
State-Changed-By: glebius 
State-Changed-When: Mon Apr 11 11:16:14 UTC 2011 
State-Changed-Why:  
There have been a number of fixes to the routing code since then. I 
believe this issue is fixed. 

Since the submitter did not post any further follow-ups, I am moving 
this PR to the closed state. However, the submitter can reopen it. 

http://www.freebsd.org/cgi/query-pr.cgi?pr=104569 
>Unformatted:
