From land@gx.dnepr.net  Wed Jul 17 01:02:42 2002
Return-Path: <land@gx.dnepr.net>
Received: from mx1.FreeBSD.org (mx1.FreeBSD.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 901C337B400
	for <FreeBSD-gnats-submit@freebsd.org>; Wed, 17 Jul 2002 01:02:42 -0700 (PDT)
Received: from gx.dnepr.net (gx.dnepr.net [217.198.131.109])
	by mx1.FreeBSD.org (Postfix) with ESMTP id B0E0B43E3B
	for <FreeBSD-gnats-submit@freebsd.org>; Wed, 17 Jul 2002 01:02:11 -0700 (PDT)
	(envelope-from land@gx.dnepr.net)
Received: from gx.dnepr.net (localhost.dnepr.net [127.0.0.1])
	by gx.dnepr.net with ESMTP id g6H7njgj055241
	for <FreeBSD-gnats-submit@freebsd.org>; Wed, 17 Jul 2002 10:49:45 +0300 (EEST)
	(envelope-from land@gx.dnepr.net)
Received: (from land@localhost)
	by gx.dnepr.net id g6H7njJ3055240;
	Wed, 17 Jul 2002 10:49:45 +0300 (EEST)
	(envelope-from land)
Message-Id: <200207170749.g6H7njJ3055240@gx.dnepr.net>
Date: Wed, 17 Jul 2002 10:49:45 +0300 (EEST)
From: land@dnepr.net
Reply-To: land@dnepr.net
To: FreeBSD-gnats-submit@freebsd.org
Cc:
Subject: Kernel panics when ripd or ospfd (zebra) killed
X-Send-Pr-Version: 3.113
X-GNATS-Notify:

>Number:         40680
>Category:       kern
>Synopsis:       Kernel panics when ripd or ospfd (zebra) killed
>Confidential:   no
>Severity:       critical
>Priority:       high
>Responsible:    gnats-admin
>State:          closed
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Wed Jul 17 01:10:01 PDT 2002
>Closed-Date:    Thu Jul 18 23:56:02 PDT 2002
>Last-Modified:  Wed Oct 26 04:54:28 GMT 2005
>Originator:     land@dnepr.net
>Release:        FreeBSD 4.6-RELEASE-p2 i386
>Organization:
>Environment:
System: FreeBSD 4.6-RELEASE-p2 Sun Jul 14 12:18:03 EEST 2002

>Description:

I run FreeBSD 4.6-RELEASE with the zebra routing software (zebra-0.93a).
Both ripd and ospfd are running. Intermittently, when I kill the ripd
or ospfd process, the system panics with the following diagnostics:

Fatal trap 12: page fault while in kernel mode                   
fault virtual address   = 0x6                                   
fault code              = supervisor read, page not present   
instruction pointer     = 0x8:0xc01856c7                        
stack pointer           = 0x10:0xca01bc90                       
frame pointer           = 0x10:0xca01bca4                          
code segment            = base 0x0, limit 0xfffff, type 0x1b          
                        = DPL 0, pres 1, def32 1, gran 1               
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 629 (ripd)
interrupt mask          = net
trap number             = 12
panic: page fault

syncing disks... 40 2 1 1 1 1 1 1 1
done

I found that these panics occur only on machines with vlan interfaces.
All of the machines have fxp NICs.

ifconfig:
fxp0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
        inet x.x.x.x netmask 0xffffffe0 broadcast x.x.x.x
        ether 00:03:47:xx:xx:xx
        media: Ethernet autoselect (100baseTX <full-duplex>)
        status: active
vlan0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
        inet x.x.x.x netmask 0xffffffe0 broadcast x.x.x.x
        ether 00:03:47:xx:xx:xx
        vlan: 5 parent interface: fxp0
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> mtu 16384
        inet 127.0.0.1 netmask 0xff000000

Here is output from gdb -k:

(kgdb) where
#0  dumpsys () at ../../kern/kern_shutdown.c:487
#1  0xc01445bf in boot (howto=256) at ../../kern/kern_shutdown.c:316
#2  0xc01449e4 in poweroff_wait (junk=0xc0211d6c, howto=-1071572849)
    at ../../kern/kern_shutdown.c:595
#3  0xc01eb71e in trap_fatal (frame=0xca01bc50, eva=6)
    at ../../i386/i386/trap.c:966
#4  0xc01eb3f1 in trap_pfault (frame=0xca01bc50, usermode=0, eva=6)
    at ../../i386/i386/trap.c:859
#5  0xc01eafdb in trap (frame={tf_fs = -1071448048, tf_es = 6422544,
      tf_ds = -1066074096, tf_edi = -1066046208, tf_esi = 1,
      tf_ebp = -905855836, tf_isp = -905855876, tf_ebx = -1053640192,
      tf_edx = 6, tf_ecx = -905855812, tf_eax = 2, tf_trapno = 12, tf_err = 0,
      tf_eip = -1072146745, tf_cs = 8, tf_eflags = 66050,
      tf_esp = -1053640192, tf_ss = -1052190624}) at ../../i386/i386/trap.c:458
#6  0xc01856c7 in rt_msg1 (type=16, rtinfo=0xca01bcbc)
    at ../../net/rtsock.c:613
#7  0xc0185b35 in rt_newmaddrmsg (cmd=16, ifma=0xc148d860)
    at ../../net/rtsock.c:848
#8  0xc018020c in if_delmulti (ifp=0xc132ba00, sa=0xca01bd3c)
    at ../../net/if.c:1507
#9  0xc01818f5 in vlan_setmulti (ifp=0xc132b400) at ../../net/if_vlan.c:154
#10 0xc0182416 in vlan_ioctl (ifp=0xc132b400, cmd=2149607730, data=0x0)
    at ../../net/if_vlan.c:704
#11 0xc01802e6 in if_delmulti (ifp=0xc132b400, sa=0xc0724040)
    at ../../net/if.c:1548
#12 0xc0188b6f in in_delmulti (inm=0xc14c4820) at ../../netinet/in.c:893
#13 0xc019352c in ip_freemoptions (imo=0xc14fba00)
    at ../../netinet/ip_output.c:1886
#14 0xc01894ad in in_pcbdetach (inp=0xc93dbfc0) at ../../netinet/in_pcb.c:567
#15 0xc019b418 in udp_detach (so=0xc931e940) at ../../netinet/udp_usrreq.c:871
#16 0xc0162511 in soclose (so=0xc931e940) at ../../kern/uipc_socket.c:320
#17 0xc0156a56 in soo_close (fp=0xc14ad600, p=0xc890d6c0)
    at ../../kern/sys_socket.c:195
#18 0xc013a2df in fdrop (fp=0xc14ad600, p=0xc890d6c0) at ../../sys/file.h:217
#19 0xc013a227 in closef (fp=0xc14ad600, p=0xc890d6c0)
    at ../../kern/kern_descrip.c:1277
#20 0xc0139629 in close (p=0xc890d6c0, uap=0xca01bf80)
    at ../../kern/kern_descrip.c:581
#21 0xc01eb9cd in syscall2 (frame={tf_fs = 47, tf_es = 47, tf_ds = 47,
      tf_edi = -1077937712, tf_esi = 0, tf_ebp = -1077938364,
      tf_isp = -905855020, tf_ebx = 134973184, tf_edx = 134754364,
      tf_ecx = 134956992, tf_eax = 6, tf_trapno = 12, tf_err = 2,
      tf_eip = 672846696, tf_cs = 31, tf_eflags = 659, tf_esp = -1077938408,
      tf_ss = 47}) at ../../i386/i386/trap.c:1167
#22 0xc01dfe15 in Xint0x80_syscall ()
#23 0x8049ab8 in ?? ()
#24 0xbfbfffac in ?? ()
#25 0x8049d47 in ?? ()
#26 0x8049909 in ?? ()

(kgdb) up 5
#5  0xc01eafdb in trap (frame={tf_fs = -1071448048, tf_es = 6422544,
      tf_ds = -1066074096, tf_edi = -1066046208, tf_esi = 1,
      tf_ebp = -905855836, tf_isp = -905855876, tf_ebx = -1053640192,
      tf_edx = 6, tf_ecx = -905855812, tf_eax = 2, tf_trapno = 12, tf_err = 0,
      tf_eip = -1072146745, tf_cs = 8, tf_eflags = 66050,
      tf_esp = -1053640192, tf_ss = -1052190624}) at ../../i386/i386/trap.c:458
458                             (void) trap_pfault(&frame, FALSE, eva);

(kgdb) frame frame->tf_ebp frame->tf_eip
#0  rt_msg1 (type=16, rtinfo=0xca01bcbc) at ../../net/rtsock.c:614
614                     dlen = ROUNDUP(sa->sa_len);
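
The frame command above takes its arguments from the trap frame, where kgdb
prints the saved registers as signed decimals. Reinterpreted as unsigned
32-bit values, tf_eip and tf_ebp match the instruction pointer and frame
pointer shown in the panic message. A minimal userland sketch of that
conversion follows; the helper names tf_to_reg and tf_to_hex are made up
for illustration and are not kernel or gdb functions.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: reinterpret a signed 32-bit trap-frame field
 * (as kgdb prints it) as the unsigned register value. */
static uint32_t tf_to_reg(int32_t field)
{
    return (uint32_t)field;   /* conversion is modulo 2^32 */
}

/* Hypothetical helper: format the register value the way the panic
 * message does. buf should hold at least 11 bytes. */
static void tf_to_hex(int32_t field, char *buf, size_t buflen)
{
    snprintf(buf, buflen, "0x%08x", (unsigned int)tf_to_reg(field));
}
```

For example, tf_to_reg(-1072146745) gives 0xc01856c7 (the faulting
instruction pointer) and tf_to_reg(-905855836) gives 0xca01bca4 (the frame
pointer).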

(kgdb) list
609             bzero((caddr_t)rtm, len);
610             for (i = 0; i < RTAX_MAX; i++) {
611                     if ((sa = rtinfo->rti_info[i]) == NULL)
612                             continue;
613                     rtinfo->rti_addrs |= (1 << i);
614                     dlen = ROUNDUP(sa->sa_len);
615                     m_copyback(m, len, dlen, (caddr_t)sa);
616                     len += dlen;
617             }
618             if (m->m_pkthdr.len != len) {
(kgdb) print sa
$1 = (struct sockaddr *) 0x0

> I am not familiar with kernel internals or kernel debugging,
> so I am just wondering how sa could be (struct sockaddr *) 0x0 here.
> We explicitly checked that sa != NULL a few lines earlier (line 611):
>
>        if ((sa = rtinfo->rti_info[i]) == NULL)
>                continue;
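
One possible reading, which nothing in this report confirms: the panic
message gives a fault virtual address of 0x6 and the trap frame shows
tf_edx = 6, so the pointer that was actually dereferenced may have been the
small non-NULL value 0x6 rather than 0x0. The value kgdb prints for a local
variable in an optimized kernel is not always reliable. A pointer like 0x6
passes the NULL test on line 611 and only faults when it is dereferenced.
The userland sketch below illustrates just that point; passes_null_check is
a hypothetical helper, and the bogus pointer is never dereferenced.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper with the same shape as the test on rtsock.c
 * line 611: returns 1 when the loop would go on to use the pointer,
 * 0 when it would "continue". */
static int passes_null_check(const void *sa)
{
    return sa != NULL;
}

/* Build a pointer from a raw address such as 0x6 for illustration.
 * The result must never be dereferenced. */
static const void *bogus_pointer(uintptr_t addr)
{
    return (const void *)addr;
}
```

Here passes_null_check(NULL) is 0, while
passes_null_check(bogus_pointer(0x6)) is 1, so a stale or corrupted address
would get past the check.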

(kgdb) up
#1  0xc0185b35 in rt_newmaddrmsg (cmd=16, ifma=0xc148d860)
    at ../../net/rtsock.c:848
848             if ((m = rt_msg1(cmd, &info)) == NULL)
(kgdb) list
843             /*
844              * If a link-layer address is present, present it as a ``gateway''
845              * (similarly to how ARP entries, e.g., are presented).
846              */
847             gate = ifma->ifma_lladdr;
848             if ((m = rt_msg1(cmd, &info)) == NULL)
849                     return;
850             ifmam = mtod(m, struct ifma_msghdr *);
851             ifmam->ifmam_index = ifp->if_index;
852             ifmam->ifmam_addrs = info.rti_addrs;
(kgdb) up
#2  0xc018020c in if_delmulti (ifp=0xc132ba00, sa=0xca01bd3c)
    at ../../net/if.c:1507
1507            rt_newmaddrmsg(RTM_DELMADDR, ifma);
(kgdb) list
1502            if (ifma->ifma_refcount > 1) {
1503                    ifma->ifma_refcount--;
1504                    return 0;
1505            }
1506
1507            rt_newmaddrmsg(RTM_DELMADDR, ifma);
1508            sa = ifma->ifma_lladdr;
1509            s = splimp();
1510            LIST_REMOVE(ifma, ifma_link);
1511            /*
(kgdb) up
#3  0xc01818f5 in vlan_setmulti (ifp=0xc132b400) at ../../net/if_vlan.c:154
154                     error = if_delmulti(ifp_p, (struct sockaddr *)&sdl);
(kgdb) list
149
150             /* First, remove any existing filter entries. */
151             while(SLIST_FIRST(&sc->vlan_mc_listhead) != NULL) {
152                     mc = SLIST_FIRST(&sc->vlan_mc_listhead);
153                     bcopy((char *)&mc->mc_addr, LLADDR(&sdl), ETHER_ADDR_LEN);
154                     error = if_delmulti(ifp_p, (struct sockaddr *)&sdl);
155                     if (error)
156                             return(error);
157                     SLIST_REMOVE_HEAD(&sc->vlan_mc_listhead, mc_entries);
158                     free(mc, M_VLAN);
(kgdb) up
#4  0xc0182416 in vlan_ioctl (ifp=0xc132b400, cmd=2149607730, data=0x0)
    at ../../net/if_vlan.c:704
704                     error = vlan_setmulti(ifp);
(kgdb) list
699                             error = EINVAL;
700                     }
701                     break;
702             case SIOCADDMULTI:
703             case SIOCDELMULTI:
704                     error = vlan_setmulti(ifp);
705                     break;
706             default:
707                     error = EINVAL;
708             }
(kgdb) up
#5  0xc01802e6 in if_delmulti (ifp=0xc132b400, sa=0xc0724040)
    at ../../net/if.c:1548
1548            ifp->if_ioctl(ifp, SIOCDELMULTI, 0);
(kgdb) list
1543                    return 0;
1544            }
1545
1546            s = splimp();
1547            LIST_REMOVE(ifma, ifma_link);
1548            ifp->if_ioctl(ifp, SIOCDELMULTI, 0);
1549            splx(s);
1550            free(ifma->ifma_addr, M_IFMADDR);
1551            free(sa, M_IFMADDR);
1552            free(ifma, M_IFMADDR);
(kgdb) up
#6  0xc0188b6f in in_delmulti (inm=0xc14c4820) at ../../netinet/in.c:893
893             if_delmulti(ifma->ifma_ifp, ifma->ifma_addr);
(kgdb) list
888                     ifma->ifma_protospec = 0;
889                     LIST_REMOVE(inm, inm_link);
890                     free(inm, M_IPMADDR);
891             }
892             /* XXX - should be separate API for when we have an ifma? */
893             if_delmulti(ifma->ifma_ifp, ifma->ifma_addr);
894             if (my_inm.inm_ifp != NULL)
895                     igmp_leavegroup(&my_inm);
896             splx(s);
897     }

>How-To-Repeat:

	On FreeBSD 4.6-RELEASE with an fxp NIC, configure several vlans.
	Compile and install zebra-0.93a. Start zebra and ripd, then kill and
	restart ripd several times (until the system panics).
>Fix:

	


>Release-Note:
>Audit-Trail:
State-Changed-From-To: open->closed 
State-Changed-By: jon 
State-Changed-When: Thu Jul 18 23:55:39 PDT 2002 
State-Changed-Why:  
duplicate pr, see kern/40723 

http://www.freebsd.org/cgi/query-pr.cgi?pr=40680 
>Unformatted:
