From nobody@FreeBSD.org  Fri Aug 13 04:55:36 2004
Return-Path: <nobody@FreeBSD.org>
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id BFCCD16A4CE
	for <freebsd-gnats-submit@FreeBSD.org>; Fri, 13 Aug 2004 04:55:36 +0000 (GMT)
Received: from www.freebsd.org (www.freebsd.org [216.136.204.117])
	by mx1.FreeBSD.org (Postfix) with ESMTP id A6C4243D46
	for <freebsd-gnats-submit@FreeBSD.org>; Fri, 13 Aug 2004 04:55:36 +0000 (GMT)
	(envelope-from nobody@FreeBSD.org)
Received: from www.freebsd.org (localhost [127.0.0.1])
	by www.freebsd.org (8.12.11/8.12.11) with ESMTP id i7D4taJb019789
	for <freebsd-gnats-submit@FreeBSD.org>; Fri, 13 Aug 2004 04:55:36 GMT
	(envelope-from nobody@www.freebsd.org)
Received: (from nobody@localhost)
	by www.freebsd.org (8.12.11/8.12.11/Submit) id i7D4tac1019788;
	Fri, 13 Aug 2004 04:55:36 GMT
	(envelope-from nobody)
Message-Id: <200408130455.i7D4tac1019788@www.freebsd.org>
Date: Fri, 13 Aug 2004 04:55:36 GMT
From: Sangwoo Shim <ssw@neo.redjade.org>
To: freebsd-gnats-submit@FreeBSD.org
Subject: Panic in nd6_slowtimo (pflog related?)
X-Send-Pr-Version: www-2.3

>Number:         70393
>Category:       kern
>Synopsis:       [panic] in nd6_slowtimo (pflog related?)
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    mlaier
>State:          closed
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Fri Aug 13 05:00:43 GMT 2004
>Closed-Date:    Mon Sep 20 15:27:55 GMT 2004
>Last-Modified:  Mon Sep 20 15:27:55 GMT 2004
>Originator:     Sangwoo Shim
>Release:        5-current
>Organization:
>Environment:
FreeBSD ssw 5.2-CURRENT FreeBSD 5.2-CURRENT #1: Thu Aug 12 07:08:05 KST 2004     root@ssw:/usr/obj/usr/src/sys/SSW-SMP  i386
>Description:
      I have recently been getting this panic once or twice a day.
pflog appears to be the culprit: pflog0's if_afdata contains nothing but
NULL pointers. I couldn't reproduce the panic with pf.ko unloaded.
"options INET6" is in the kernel configuration.
The machine is SMP. If you need more information, please let me know.
I'm using FreeBSD-current of Aug 12.

panic messages:
---
Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 01
fault virtual address   = 0x8
fault code              = supervisor read, page not present
instruction pointer     = 0x8:0xc056ec72
stack pointer           = 0x10:0xd53efcb8
frame pointer           = 0x10:0xd53efcc4
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 37 (swi5: clock sio)
Dumping 511 MB
 16 32 48 64 80 96 112 128 144 160 176 192 208 224 240 256 272 288 304 320 336 352 368 384 400 416 432 448 464 480 496
---
#0  doadump () at pcpu.h:159
159     pcpu.h: No such file or directory.
        in pcpu.h
doadump () at pcpu.h:159
159     in pcpu.h
(kgdb) bt
#0  doadump () at pcpu.h:159
#1  0xc043b83a in db_fncall (dummy1=0, dummy2=0, dummy3=-717292800,
    dummy4=0xd53efae8 "\034\xfb\xbe\xd5\xa2) at /usr/src/sys/ddb/db_command.c:531
#2  0xc043b648 in db_command (last_cmdp=0xc069cea4, cmd_table=0x0,
    aux_cmd_tablep=0xc066cc44, aux_cmd_tablep_end=0xc066cc48)
    at /usr/src/sys/ddb/db_command.c:349
#3  0xc043b710 in db_command_loop () at /usr/src/sys/ddb/db_command.c:455
#4  0xc043d289 in db_trap (type=12, code=0) at /usr/src/sys/ddb/db_main.c:221
#5  0xc04d9020 in kdb_trap (type=12, code=0, tf=0xd53efc78)
    at /usr/src/sys/kern/subr_kdb.c:401
#6  0xc062795d in trap_fatal (frame=0xd53efc78, eva=8)
    at /usr/src/sys/i386/i386/trap.c:807
#7  0xc06276bb in trap_pfault (frame=0xd53efc78, usermode=0, eva=8)
    at /usr/src/sys/i386/i386/trap.c:730
#8  0xc06272d1 in trap (frame=
      {tf_fs = -1045626856, tf_es = -717357040, tf_ds = -717357040, tf_edi = -1045585920, tf_esi = -1045508608, tf_ebp = -717292348, tf_isp = -717292380, tf_ebx = 23040, tf_edx = 1474, tf_ecx = -1066723816, tf_eax = 0, tf_trapno = 12, tf_err = 0, tf_eip = -1068045198, tf_cs = 8, tf_eflags = 66182, tf_esp = 6, tf_ss = 4}) at /usr/src/sys/i386/i386/trap.c:417
#9  0xc0615b1a in calltrap () at /usr/src/sys/i386/i386/exception.s:140
#10 0xc1ad0018 in ?? ()
#11 0xd53e0010 in ?? ()
#12 0xd53e0010 in ?? ()
#13 0xc1ada000 in ?? ()
#14 0xc1aece00 in ?? ()
#15 0xd53efcc4 in ?? ()
#16 0xd53efca4 in ?? ()
#17 0x00005a00 in ?? ()
#18 0x000005c2 in ?? ()
#19 0xc06b1618 in arc4_sbox ()
#20 0x00000000 in ?? ()
#21 0x0000000c in ?? ()
#22 0x00000000 in ?? ()
#23 0xc056ec72 in nd6_slowtimo (ignored_arg=0x0)
    at /usr/src/sys/netinet6/nd6.c:1800
#24 0xc04cd05b in softclock (dummy=0x0) at /usr/src/sys/kern/kern_timeout.c:259
#25 0xc04ab6bd in ithread_loop (arg=0xc1977c00)
    at /usr/src/sys/kern/kern_intr.c:546
#26 0xc04aa7fd in fork_exit (callout=0xc04ab564 <ithread_loop>,
    arg=0xc1977c00, frame=0xd53efd48) at /usr/src/sys/kern/kern_fork.c:819
#27 0xc0615b7c in fork_trampoline () at /usr/src/sys/i386/i386/exception.s:209
(kgdb) up 23
#23 0xc056ec72 in nd6_slowtimo (ignored_arg=0x0)
    at /usr/src/sys/netinet6/nd6.c:1800
1800                    nd6if = ND_IFINFO(ifp);
(kgdb) l
1795
1796            callout_reset(&nd6_slowtimo_ch, ND6_SLOWTIMER_INTERVAL * hz,
1797                nd6_slowtimo, NULL);
1798            IFNET_RLOCK();
1799            for (ifp = TAILQ_FIRST(&ifnet); ifp; ifp = TAILQ_NEXT(ifp, if_list)) {
1800                    nd6if = ND_IFINFO(ifp);
1801                    if (nd6if->basereachable && /* already initialized */
1802                        (nd6if->recalctm -= ND6_SLOWTIMER_INTERVAL) <= 0) {
1803                            /*
1804                             * Since reachable time rarely changes by router
(kgdb) p *ifp
$1 = {if_softc = 0xc1ada000, if_link = {tqe_next = 0xc1ae1800,
    tqe_prev = 0xc1adb004},
  if_xname = "pflog0\000\000\000\000\000\000\000\000\000",
  if_dname = 0xc077ee0d "pflog", if_dunit = 0, if_addrhead = {
    tqh_first = 0xc1ae3e00, tqh_last = 0xc1ae3e60}, if_klist = {
    slh_first = 0x0}, if_pcount = 0, if_carp = 0x0, if_bpf = 0x0,
  if_index = 4, if_timer = 0, if_nvlans = 0, if_flags = 0,
  if_capabilities = 0, if_capenable = 0, if_linkmib = 0x0, if_linkmiblen = 0,
  if_data = {ifi_type = 246 '\366', ifi_physical = 0 '\0', ifi_addrlen = 0 '\0',
    ifi_hdrlen = 48 '0', ifi_link_state = 0 '\0', ifi_recvquota = 0 '\0',
    ifi_xmitquota = 0 '\0', ifi_mtu = 33208, ifi_metric = 0, ifi_baudrate = 0,
    ifi_ipackets = 0, ifi_ierrors = 0, ifi_opackets = 0, ifi_oerrors = 0,
    ifi_collisions = 0, ifi_ibytes = 0, ifi_obytes = 0, ifi_imcasts = 0,
    ifi_omcasts = 0, ifi_iqdrops = 0, ifi_noproto = 0, ifi_hwassist = 0,
    ifi_unused = 0, ifi_lastchange = {tv_sec = 1, tv_usec = 10464}},
  if_multiaddrs = {tqh_first = 0x0, tqh_last = 0xc1ada0a8}, if_amcount = 0,
  if_output = 0xc077d738, if_input = 0, if_start = 0xc077d69c,
  if_ioctl = 0xc077d760, if_watchdog = 0, if_init = 0, if_resolvemulti = 0,
  if_snd = {ifq_head = 0x0, ifq_tail = 0x0, ifq_len = 0, ifq_maxlen = 50,
    ifq_drops = 0, ifq_mtx = {mtx_object = {lo_class = 0xc067db3c,
        lo_name = 0xc1ada00c "pflog0", lo_type = 0xc0657e7d "if send queue",
        lo_flags = 196608, lo_list = {tqe_next = 0x0, tqe_prev = 0x0},
        lo_witness = 0x0}, mtx_lock = 4, mtx_recurse = 0}, ifq_drv_head = 0x0,
    ifq_drv_tail = 0x0, ifq_drv_len = 0, ifq_drv_maxlen = 0, altq_type = 0,
    altq_flags = 0, altq_disc = 0x0, altq_ifp = 0xc1ada000, altq_enqueue = 0,
    altq_dequeue = 0, altq_request = 0, altq_clfier = 0x0, altq_classify = 0,
    altq_tbr = 0x0, altq_cdnr = 0x0}, if_broadcastaddr = 0x0, lltables = 0x0,
  if_label = 0x0, if_prefixhead = {tqh_first = 0x0, tqh_last = 0xc1ada150},
  if_afdata = {0x0 <repeats 37 times>}, if_afdata_initialized = 1,
  if_afdata_mtx = {mtx_object = {lo_class = 0xc067db3c,
      lo_name = 0xc0657e6d "if_afdata", lo_type = 0xc0657e6d "if_afdata",
      lo_flags = 196608, lo_list = {tqe_next = 0x0, tqe_prev = 0x0},
      lo_witness = 0x0}, mtx_lock = 4, mtx_recurse = 0}, if_starttask = {
    ta_link = {stqe_next = 0x0}, ta_pending = 0, ta_priority = 0,
    ta_func = 0xc0527fb4 <if_start_deferred>, ta_context = 0xc1ada000}}  
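
A note on why the fault address is 0x8: ND_IFINFO(ifp) reads the nd_ifinfo
member through if_afdata[AF_INET6], which is NULL for pflog0 in the dump
above. The sketch below is a userland model, not kernel code. It assumes that
struct in6_ifextra begins with three pointer members (in6_ifstat,
icmp6_ifstat, nd_ifinfo) and that pointers are 4 bytes wide on i386. The
names in6_ifextra_model, ptr32_t and nd_ifinfo_read_addr are made up for
illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for a 32-bit kernel pointer (assumption: i386, 4-byte pointers). */
typedef uint32_t ptr32_t;

/*
 * Assumed layout of struct in6_ifextra, with every pointer member
 * replaced by a 4-byte stand-in so the offsets match i386.
 */
struct in6_ifextra_model {
	ptr32_t in6_ifstat;	/* offset 0 */
	ptr32_t icmp6_ifstat;	/* offset 4 */
	ptr32_t nd_ifinfo;	/* offset 8: the member ND_IFINFO() reads */
	ptr32_t scope6_id;	/* offset 12 */
};

/*
 * Address touched when ->nd_ifinfo is read through a given
 * if_afdata[AF_INET6] value.
 */
static uint32_t
nd_ifinfo_read_addr(uint32_t afdata_base)
{
	return afdata_base +
	    (uint32_t)offsetof(struct in6_ifextra_model, nd_ifinfo);
}
```

Under those assumptions nd_ifinfo_read_addr(0) evaluates to 0x8, which matches
the "fault virtual address" line in the panic message.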

>How-To-Repeat:
      On an SMP machine (I'm not sure SMP is required, but my other machines,
which are non-SMP, don't exhibit the problem), kldload pf at boot time. You
should have "options INET6" in the kernel configuration. Wait about an hour
and you will encounter the panic.
>Fix:
      
>Release-Note:
>Audit-Trail:
Responsible-Changed-From-To: freebsd-bugs->mlaier 
Responsible-Changed-By: mlaier 
Responsible-Changed-When: Fri Aug 13 22:56:25 GMT 2004 
Responsible-Changed-Why:  
pf* summons mlaier ;) ... I'll take this one, thanks. 

http://www.freebsd.org/cgi/query-pr.cgi?pr=70393 

From: sebastian ssmoller <sebastian.ssmoller@gmx.net>
To: freebsd-gnats-submit@FreeBSD.org, ssw@neo.redjade.org
Cc:  
Subject: Re: kern/70393: Panic in nd6_slowtimo (pflog related?)
Date: Sat, 14 Aug 2004 09:34:33 +0200

 This problem exists on single-CPU systems too.
 
 have a look at:
 http://lists.freebsd.org/pipermail/freebsd-current/2004-January/019524.html
 
 the system is:
 
 CPU: AMD Duron(tm) (800.03-MHz 686-class CPU)
   Origin = "AuthenticAMD"  Id = 0x631  Stepping = 1
  
 Features=0x183f9ff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,SEP,MTRR,PGE,MCA,
 CMOV,PAT,PSE36,MMX,FXSR>  AMD Features=0xc0440000<RSVD,AMIE,DSP,3DNow!>
 real memory  = 536805376 (511 MB)
 avail memory = 511860736 (488 MB)
 Pentium Pro MTRR support enabled
     ACPI-0277: *** Warning: Invalid checksum in table [FACP] (8e, sum a5
 is not zero)
 
 seb

From: Max Laier <max@love2party.net>
To: freebsd-gnats-submit@FreeBSD.org, ssw@neo.redjade.org
Cc: yongari@freebsd.org
Subject: Re: kern/70393: Panic in nd6_slowtimo (pflog related?)
Date: Mon, 16 Aug 2004 22:02:06 +0200

 [ Some off-band communication for the record ]
 
 Tests from yongari@ suggest that this happens only in connection with loading 
 via loader.conf.
 
 Can you verify/falsify this? For normal operation kldload from rc.d/pf is
 suitable.
 
 It seems that both built-in and kldload'ed work, while loader.conf fails. This 
 might concern all pseudo interface modules. Tests with gif/faith or the like 
 must be conducted.
 
 -- 
 	Max

From: Max Laier <max@love2party.net>
To: freebsd-gnats-submit@FreeBSD.org, ssw@neo.redjade.org
Cc: yongari@freebsd.org
Subject: Re: kern/70393: Panic in nd6_slowtimo (pflog related?)
Date: Sun, 22 Aug 2004 23:16:38 +0200

 --Boundary-00=_30QKB9smYDHOnby
 Content-Type: text/plain;
   charset="us-ascii"
 Content-Transfer-Encoding: 7bit
 Content-Disposition: inline
 
 Can you try the attached diff, please? It seems to fix the issue for me and
 at the same time allows if_loop to move back to SI_SUB_PSEUDO (where it
 really belongs).
 
 I've put the patch to: 
  http://people.freebsd.org/~mlaier/fix_if_attachdomain1.diff
 for easier access.
 
 Thanks.
 
 -- 
  Max
 
 --Boundary-00=_30QKB9smYDHOnby
 Content-Type: text/x-diff;
   charset="us-ascii";
   name="fix_if_attachdomain1.diff"
 Content-Transfer-Encoding: 7bit
 Content-Disposition: attachment;
 	filename="fix_if_attachdomain1.diff"
 
 Index: if.c
 ===================================================================
 RCS file: /usr/store/mlaier/fcvs/src/sys/net/if.c,v
 retrieving revision 1.198
 diff -u -r1.198 if.c
 --- if.c	6 Aug 2004 09:08:33 -0000	1.198
 +++ if.c	22 Aug 2004 17:14:24 -0000
 @@ -480,9 +480,16 @@
  	s = splnet();
  
  	/*
 -	 * Since dp->dom_ifattach calls malloc() with M_WAITOK, we
 +	 * XXX: Since dp->dom_ifattach calls malloc() with M_WAITOK, we
  	 * cannot lock ifp->if_afdata initialization, entirely.
 +	 *
 +	 * If there are no domains do not set if_afdata_initialized to allow
 +	 * initialization later (via SI_SUB_PROTO_IFATTACHDOMAIN).
  	 */
 +	if (domains == NULL) {
 +		splx(s);
 +		return;
 +	}
  	if (IF_AFDATA_TRYLOCK(ifp) == 0) {
  		splx(s);
  		return;
 
 --Boundary-00=_30QKB9smYDHOnby--
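
The ordering problem the patch targets can be sketched in userland. This is a
simplified model under stated assumptions, not the kernel code: it assumes the
attach routine marks the interface as initialized before walking the domain
list and returns early on any later call. All model_* names are hypothetical.
Note that a follow-up in this PR reports that the patch alone did not stop the
panic, so the model shows the intent of the change and nothing more.

```c
#include <stdbool.h>
#include <stddef.h>

#define MODEL_AF_MAX	4	/* hypothetical; the real array is larger */
#define MODEL_AF_INET6	1	/* hypothetical slot index */

struct model_ifnet;

/* Simplified stand-in for struct domain. */
struct model_domain {
	int family;
	void *(*ifattach)(struct model_ifnet *);
	struct model_domain *next;
};

/* Simplified stand-in for struct ifnet. */
struct model_ifnet {
	void *afdata[MODEL_AF_MAX];
	bool afdata_initialized;
};

/* NULL until the protocol domains have registered. */
static struct model_domain *model_domains = NULL;

/*
 * Models if_attachdomain1(). With guard == false it behaves like the
 * unpatched code; with guard == true it returns early while no domain
 * exists, as the patch does.
 */
static void
model_attachdomain1(struct model_ifnet *ifp, bool guard)
{
	struct model_domain *dp;

	if (guard && model_domains == NULL)
		return;		/* leave the flag clear so a later pass can run */
	if (ifp->afdata_initialized)
		return;		/* a later pass stops here */
	ifp->afdata_initialized = true;
	for (dp = model_domains; dp != NULL; dp = dp->next)
		if (dp->ifattach != NULL)
			ifp->afdata[dp->family] = dp->ifattach(ifp);
}

static int model_nd_state;	/* dummy per-interface IPv6 data */

static void *
model_in6_ifattach(struct model_ifnet *ifp)
{
	(void)ifp;
	return &model_nd_state;
}

static struct model_domain model_inet6 = {
	MODEL_AF_INET6, model_in6_ifattach, NULL
};

/*
 * Attach an interface before any domain exists, register the domain,
 * then run the second pass. Returns true if the IPv6 slot got filled.
 */
static bool
model_early_attach_ok(bool guard)
{
	struct model_ifnet ifp = { { NULL }, false };

	model_domains = NULL;
	model_attachdomain1(&ifp, guard);	/* early attach, no domains yet */
	model_domains = &model_inet6;		/* domains come up afterwards */
	model_attachdomain1(&ifp, guard);	/* later attach pass */
	return ifp.afdata[MODEL_AF_INET6] != NULL;
}
```

In this model model_early_attach_ok(false) leaves the IPv6 slot NULL with the
initialized flag set, which is the state seen in the pflog0 dump
(if_afdata all NULL, if_afdata_initialized = 1), while
model_early_attach_ok(true) fills the slot on the second pass.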
State-Changed-From-To: open->feedback 
State-Changed-By: mlaier 
State-Changed-When: Sun Aug 22 21:23:36 GMT 2004 
State-Changed-Why:  
Patch needs testing ... 

http://www.freebsd.org/cgi/query-pr.cgi?pr=70393 

From: Sangwoo Shim <ssw@neo.redjade.org>
To: Max Laier <max@love2party.net>
Cc: freebsd-gnats-submit@FreeBSD.org, ssw@neo.redjade.org
Subject: Re: kern/70393: Panic in nd6_slowtimo (pflog related?)
Date: Sat, 4 Sep 2004 16:06:40 +0900

 Unfortunately, the patch did not stop the panic on my machine.
 I'm using 6.0-CURRENT of Aug 31.
 
 
 ssw:~ $ ident /usr/src/sys/net/if.c
 /usr/src/sys/net/if.c:
      $FreeBSD: src/sys/net/if.c,v 1.201 2004/08/30 06:29:26 brooks Exp $
 ssw:~ $ uname -a
 FreeBSD somewhere 6.0-CURRENT FreeBSD 6.0-CURRENT #1: Tue Aug 31 21:31:54 KST 2004     root@somewhere:/usr/obj/usr/src/sys/SSW-SMP  i386
 ssw:~ $ cd /usr/src/sys/net
 ssw:/usr/src/sys/net $ diff -u if.c.orig if.c
 --- if.c.orig   Tue Aug 31 02:40:01 2004
 +++ if.c        Tue Aug 31 21:15:14 2004
 @@ -483,9 +483,16 @@
         s = splnet();
  
         /*
 -        * Since dp->dom_ifattach calls malloc() with M_WAITOK, we
 +        * XXX: Since dp->dom_ifattach calls malloc() with M_WAITOK, we
          * cannot lock ifp->if_afdata initialization, entirely.
 +        *
 +        * If there are no domains do not set if_afdata_initialized to allow
 +        * initialization later (via SI_SUB_PROTO_IFATTACHDOMAIN).
          */
 +       if (domains == NULL) {
 +               splx(s);
 +               return;
 +       }
         if (IF_AFDATA_TRYLOCK(ifp) == 0) {
                 splx(s);
                 return;
 ssw:/usr/src/sys/net $ 
 
 
 
 On Sun, Aug 22, 2004 at 11:16:38PM +0200, Max Laier wrote:
 > Can you try attached diff, please? It seems to fix the issue for me and at the 
 > same time allows to move if_loop to SI_SUB_PSEUDO again (where it should be 
 > really).
 > 
 > I've put the patch to: 
 >  http://people.freebsd.org/~mlaier/fix_if_attachdomain1.diff
 > for easier access.
 > 
 > Thanks.
 > 
 > -- 
 >  Max
 
 > Index: if.c
 > ===================================================================
 > RCS file: /usr/store/mlaier/fcvs/src/sys/net/if.c,v
 > retrieving revision 1.198
 > diff -u -r1.198 if.c
 > --- if.c	6 Aug 2004 09:08:33 -0000	1.198
 > +++ if.c	22 Aug 2004 17:14:24 -0000
 > @@ -480,9 +480,16 @@
 >  	s = splnet();
 >  
 >  	/*
 > -	 * Since dp->dom_ifattach calls malloc() with M_WAITOK, we
 > +	 * XXX: Since dp->dom_ifattach calls malloc() with M_WAITOK, we
 >  	 * cannot lock ifp->if_afdata initialization, entirely.
 > +	 *
 > +	 * If there are no domains do not set if_afdata_initialized to allow
 > +	 * initialization later (via SI_SUB_PROTO_IFATTACHDOMAIN).
 >  	 */
 > +	if (domains == NULL) {
 > +		splx(s);
 > +		return;
 > +	}
 >  	if (IF_AFDATA_TRYLOCK(ifp) == 0) {
 >  		splx(s);
 >  		return;
 
State-Changed-From-To: feedback->patched 
State-Changed-By: mlaier 
State-Changed-When: Tue Sep 14 03:13:12 GMT 2004 
State-Changed-Why:  
A temporary workaround has been committed. Please check CURRENT and report 
problems, thanks. 

http://www.freebsd.org/cgi/query-pr.cgi?pr=70393 
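
The PR does not say what the workaround changed. Purely as an illustration of
one defensive pattern for this class of bug (not necessarily what was
committed), a timer loop can skip interfaces whose IPv6 data was never
attached instead of dereferencing it. The sketch is a userland model; every
model_* name is hypothetical, and the real loop does more work per interface.

```c
#include <stddef.h>

#define MODEL_AF_MAX	4	/* hypothetical; the real array is larger */
#define MODEL_AF_INET6	1	/* hypothetical slot index */

/* Simplified stand-in for the per-interface ND state. */
struct model_nd_ifinfo {
	int basereachable;
	int recalctm;
};

/* Simplified stand-in for the per-interface IPv6 data. */
struct model_in6_ifextra {
	struct model_nd_ifinfo *nd_ifinfo;
};

/* Simplified stand-in for struct ifnet, linked as a plain list. */
struct model_ifnet {
	void *afdata[MODEL_AF_MAX];
	struct model_ifnet *next;
};

/*
 * Walk the interface list, but skip any interface whose IPv6 slot is
 * still NULL instead of dereferencing it. Returns the number of
 * interfaces whose timer was updated.
 */
static int
model_slowtimo(struct model_ifnet *head, int interval)
{
	struct model_ifnet *ifp;
	struct model_in6_ifextra *ext;
	int touched = 0;

	for (ifp = head; ifp != NULL; ifp = ifp->next) {
		ext = ifp->afdata[MODEL_AF_INET6];
		if (ext == NULL || ext->nd_ifinfo == NULL)
			continue;	/* no IPv6 data attached; nothing to age */
		ext->nd_ifinfo->recalctm -= interval;
		touched++;
	}
	return touched;
}
```

A guard like this only hides the symptom. The interface still has no IPv6
data, so the attach ordering itself would still need a fix.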

From: "Giovanni P. Tirloni" <gpt@tirloni.org>
To: freebsd-gnats-submit@FreeBSD.org, ssw@neo.redjade.org
Cc:  
Subject: Re: kern/70393: [panic] in nd6_slowtimo (pflog related?)
Date: Thu, 16 Sep 2004 11:29:09 -0300

 This still happens in 5.3-BETA4 (RELENG_5).
 
 FreeBSD packet 5.3-BETA4 FreeBSD 5.3-BETA4 #6: Wed Sep 15 12:37:36 BRT 2004     root@packet:/usr/src/sys/i386/compile/PACKET  i386
 
  > trace
 nd6_slowtimo(0)
 softclock(0)
 ithread_loop(c1196480,c583fd48)
 fork_exit(c054615c,c1196480,c583fd48)
 fork_trampoline()
 --- trap 0x1, eip=0, esp=0xc583fd7c, ebp=0 ---
 
 The crashes are random, but they don't take long to happen.
 
 The box is running on an IPv4-only network and it's a fresh install
 updated through CVSup. Please ask for any other details you would find useful.

From: Sangwoo Shim <ssw@neo.redjade.org>
To: "Giovanni P. Tirloni" <gpt@tirloni.org>
Cc: freebsd-gnats-submit@freebsd.org
Subject: Re: kern/70393: [panic] in nd6_slowtimo (pflog related?)
Date: Fri, 17 Sep 2004 10:44:03 +0900

 On Thu, Sep 16, 2004 at 11:29:09AM -0300, Giovanni P. Tirloni wrote:
 > This still happens in 5.3BETA4 (RELENG_5).
 
 Yes, but in -CURRENT it has been worked around.
 I verified this with 6.0-CURRENT of Sep 16: there was no panic for 8 hours.
 Max Laier said in his commit message that the MT5 will happen soon (after 5 days?).

From: "Giovanni P. Tirloni" <gpt@tirloni.org>
To: Sangwoo Shim <ssw@neo.redjade.org>
Cc: freebsd-gnats-submit@freebsd.org
Subject: Re: kern/70393: [panic] in nd6_slowtimo (pflog related?)
Date: Fri, 17 Sep 2004 10:02:25 -0300

 Sangwoo Shim wrote:
 > On Thu, Sep 16, 2004 at 11:29:09AM -0300, Giovanni P. Tirloni wrote:
 > 
 >>This still happens in 5.3BETA4 (RELENG_5).
 > 
 > 
 > Yes, but in -CURRENT, that is work-arounded.
 > I verified it with 6.0-CURRENT of Sep 16. There was no panic for 8 hours.
 > Mr. Max laier said in his commit message MT5 will happen soon. (5days after?)
 
  Sorry, I probably missed that part about the MT5.
 
  Thank you.
 
 --
 Giovanni
State-Changed-From-To: patched->closed 
State-Changed-By: mlaier 
State-Changed-When: Mon Sep 20 15:27:25 GMT 2004 
State-Changed-Why:  
Committed to RELENG_5 as well. Thanks. 

http://www.freebsd.org/cgi/query-pr.cgi?pr=70393 
>Unformatted:
