From nobody@FreeBSD.org  Mon Jan 10 12:43:48 2011
Return-Path: <nobody@FreeBSD.org>
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 065A41065670
	for <freebsd-gnats-submit@FreeBSD.org>; Mon, 10 Jan 2011 12:43:48 +0000 (UTC)
	(envelope-from nobody@FreeBSD.org)
Received: from red.freebsd.org (unknown [IPv6:2001:4f8:fff6::22])
	by mx1.freebsd.org (Postfix) with ESMTP id CFBE48FC0A
	for <freebsd-gnats-submit@FreeBSD.org>; Mon, 10 Jan 2011 12:43:47 +0000 (UTC)
Received: from red.freebsd.org (localhost [127.0.0.1])
	by red.freebsd.org (8.14.4/8.14.4) with ESMTP id p0AChlsX010421
	for <freebsd-gnats-submit@FreeBSD.org>; Mon, 10 Jan 2011 12:43:47 GMT
	(envelope-from nobody@red.freebsd.org)
Received: (from nobody@localhost)
	by red.freebsd.org (8.14.4/8.14.4/Submit) id p0AChlOT010420;
	Mon, 10 Jan 2011 12:43:47 GMT
	(envelope-from nobody)
Message-Id: <201101101243.p0AChlOT010420@red.freebsd.org>
Date: Mon, 10 Jan 2011 12:43:47 GMT
From: Petr Lampa <lampa@fit.vutbr.cz>
To: freebsd-gnats-submit@FreeBSD.org
Subject: page fault in icmp6_error2() called from nd6_llinfo_timer()
X-Send-Pr-Version: www-3.1
X-GNATS-Notify:

>Number:         153841
>Category:       kern
>Synopsis:       page fault in icmp6_error2() called from nd6_llinfo_timer()
>Confidential:   no
>Severity:       critical
>Priority:       high
>Responsible:    bz
>State:          closed
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Mon Jan 10 12:50:07 UTC 2011
>Closed-Date:    Tue Jan 11 11:05:27 UTC 2011
>Last-Modified:  Tue Jan 11 11:05:27 UTC 2011
>Originator:     Petr Lampa
>Release:        8.2-PRERELEASE
>Organization:
BUT brno
>Environment:
FreeBSD xxx 8.2-PRERELEASE FreeBSD 8.2-PRERELEASE #16: Tue Nov 30 12:44:18 CET 2010     rrrr@xxxx:/usr/src/sys/i386/compile/GUTA  i386

>Description:
page fault
Stopped at icmp6_error2+0xc3:   movl 0x1dc(%eax),%eax

where:
Tracing pid 11 tid 1000006 td 0xc851e000
icmp6_error2(cad86800,1,3,0,c86d4c00,...) at icmp6_error2+0xc3
nd6_llinfo_timer(cc7da400,c851e000,c838bc40,c851e870,c851e000,...) at nd6_llinfo_timer+0x126
softclock(c07bd760,c851e000,0,109,3b7b822b,...) at softclock+0x22a

icmp6_error2+0xa3:      jmp     icmp6_error2+0x136
icmp6_error2+0xa8:      cmpl    $0x27,0xc(%ebx)
icmp6_error2+0xac:      jnbe    icmp6_error2+0xe0
icmp6_error2+0xae:      addl    $0x1,ip6stat+0x8
icmp6_error2+0xb5:      adcl    $0,ip6stat+0xc
icmp6_error2+0xbc:      movl    0x18(%ebx),%eax
icmp6_error2+0xbf:      testl   %eax,%eax
icmp6_error2+0xc1:      jz      icmp6_error2+0xd3
icmp6_error2+0xc3:      movl    0x1dc(%eax),%eax
icmp6_error2+0xc9:      movl    0(%eax),%eax
icmp6_error2+0xcb:      addl    $0x1,0x30(%eax)
icmp6_error2+0xcf:      adcl    $0,0x34(%eax)
icmp6_error2+0xd3:      movl    %ebx,0(%esp)
icmp6_error2+0xd6:      call    m_freem
icmp6_error2+0xdb:      jmp     icmp6_error2+0x136
icmp6_error2+0xdd:      leal    0(%esi),%esi
icmp6_error2+0xe0:      movl    0x8(%ebx),%esi
icmp6_error2+0xe3:      movl    $0,0x8(%esp)
icmp6_error2+0xeb:      movl    %edi,0x4(%esp)
icmp6_error2+0xef:      leal    0x8(%esi),%eax
icmp6_error2+0xf2:      movl    %eax,0(%esp)
icmp6_error2+0xf5:      call    in6_setscope
icmp6_error2+0xfa:      testl   %eax,%eax
icmp6_error2+0xfc:      jnz     icmp6_error2+0x136
(Sorry if there is some garbage here; the listing is the result of OCR.)

So the location of the page fault corresponds to the last branch of the
IP6_EXTHDR_CHECK() macro as expanded in icmp6_error2():

     if ((m)->m_len < (off) + (hlen)) {                         \
        V_ip6stat.ip6s_tooshort++;                              \
        in6_ifstat_inc(m->m_pkthdr.rcvif, ifs6_in_truncated);   \

The content of mbuf (m) is:

0xcad86800:    0            0            cad86896     1e
0xcad86810:    0            1            0            0
0xcad86820:    5a           0            4            6
0xcad86830:    0            0            5e0000       30000

It seems that the pkthdr is not there, so m->m_pkthdr.rcvif is 0 and is dereferenced without a check.
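
To make the reading of the dump explicit, here is a small userland sketch
that decodes its first seven words as the start of an i386 mbuf. Only the
offsets 0xc (m_len) and 0x18 (rcvif) are confirmed by the disassembly above;
the remaining field offsets and the M_PKTHDR value 0x0002 are assumptions
taken from memory of the FreeBSD 8 mbuf header, and the struct and helper
names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define M_PKTHDR_ASSUMED 0x0002u	/* assumed value of M_PKTHDR */
#define IP6_HDRLEN       40u		/* sizeof(struct ip6_hdr), i.e. 0x28 */

/* Hypothetical word-by-word view of the start of an i386 mbuf. */
struct mbuf_words {
	uint32_t m_next;	/* 0x00 */
	uint32_t m_nextpkt;	/* 0x04 */
	uint32_t m_data;	/* 0x08 */
	uint32_t m_len;		/* 0x0c, compared against 0x27 in the listing */
	uint32_t m_flags;	/* 0x10 (assumed offset) */
	uint32_t m_type;	/* 0x14 (assumed offset) */
	uint32_t rcvif;		/* 0x18, meaningful only if M_PKTHDR is set */
};

/* The first seven words of the dump at 0xcad86800, copied from above. */
static const struct mbuf_words dumped = {
	0x0, 0x0, 0xcad86896, 0x1e, 0x0, 0x1, 0x0
};

/* Nonzero if the flags word claims a packet header is present. */
static int
has_pkthdr(const struct mbuf_words *m)
{
	assert(m != NULL);
	return ((m->m_flags & M_PKTHDR_ASSUMED) != 0);
}

/* Nonzero if fewer than `need' bytes are stored in this mbuf. */
static int
is_truncated(const struct mbuf_words *m, uint32_t need)
{
	assert(m != NULL);
	return (m->m_len < need);
}
```

Under these assumptions is_truncated(&dumped, IP6_HDRLEN) is true (0x1e is
less than 0x28), which matches the branch taken at icmp6_error2+0xac, and
has_pkthdr(&dumped) is false because the flags word is 0.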

>How-To-Repeat:
It happened after a ping6/traceroute6 (not sure) to an unresponsive IPv6 address, which after some time started responding (probably; I'm not really sure).
>Fix:
Check the mbuf flags for M_PKTHDR in IP6_EXTHDR_CHECK() before dereferencing m->m_pkthdr.rcvif.
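
A minimal userland sketch of the suggested check follows. It is not the
kernel macro and not a tested patch: the mock structures, the function name
exthdr_check() and the M_PKTHDR value are assumptions made only to
illustrate the idea that the interface counter is touched solely when the
mbuf really carries a packet header.

```c
#include <assert.h>
#include <stddef.h>

#define M_PKTHDR 0x0002		/* assumed value, mirrors sys/mbuf.h */

/* Hypothetical stand-in for the per-interface statistics. */
struct ifnet_mock {
	unsigned long in_truncated;
};

/* Hypothetical stand-in for the mbuf fields the check needs. */
struct mbuf_mock {
	int m_len;
	int m_flags;
	struct ifnet_mock *rcvif;	/* stands in for m_pkthdr.rcvif */
};

/*
 * Returns 0 if off + hlen bytes are present in the mbuf, -1 otherwise.
 * The truncation counter is bumped only when M_PKTHDR is set and the
 * receive interface pointer is not NULL.
 */
static int
exthdr_check(struct mbuf_mock *m, int off, int hlen)
{
	assert(m != NULL);
	if (m->m_len >= off + hlen)
		return (0);
	if ((m->m_flags & M_PKTHDR) != 0 && m->rcvif != NULL)
		m->rcvif->in_truncated++;
	return (-1);
}
```

With this shape, a short mbuf without M_PKTHDR is still rejected, but
whatever happens to be stored where rcvif would live is never followed.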

>Release-Note:
>Audit-Trail:
Responsible-Changed-From-To: freebsd-bugs->bz 
Responsible-Changed-By: bz 
Responsible-Changed-When: Mon Jan 10 14:23:53 UTC 2011 
Responsible-Changed-Why:  
I'll take a peek.  It might be fixed already. 

http://www.freebsd.org/cgi/query-pr.cgi?pr=153841 
State-Changed-From-To: open->feedback 
State-Changed-By: bz 
State-Changed-When: Mon Jan 10 19:50:25 UTC 2011 
State-Changed-Why:  
Feedback requested on the exact version of nd6.c this occurred with. 

http://www.freebsd.org/cgi/query-pr.cgi?pr=153841 

From: "Bjoern A. Zeeb" <bz@FreeBSD.org>
To: bug-followup@FreeBSD.org, lampa@fit.vutbr.cz
Cc:  
Subject: Re: kern/153841: page fault in icmp6_error2() called from
 nd6_llinfo_timer()
Date: Mon, 10 Jan 2011 19:50:05 +0000 (UTC)

 Hi,
 
 could you check whether you have r216022 (ideally also r216277, but that is
 not too important), so we know whether this panic was fixed already or not?
 
 You can also send me the ident from your sys/netinet6/nd6.c file
 and I can check.
 
 Thanks,
 Bjoern
 
 ------------------------------------------------------------------------
 Revision 216277
 Modified Tue Dec 7 22:43:29 2010 UTC (4 weeks, 5 days ago) by bz
 
 Loosen the locking in nd6_free() again after r216022 to avoid
 a LOR and a recursed lock.
 
 ------------------------------------------------------------------------
 Revision 216022
 Modified Mon Nov 29 00:04:08 2010 UTC (6 weeks ago) by bz
 
 Plug well observed races on la_hold entries with the callout handler.
 
 Call the handler function with the lock held, return unlocked as we
 might free the entry.  Rework functions later in the call graph to be
 either called with the lock held or, only if needed, unlocked.
 
 Place asserts to document and tighten assumptions on various lle locking,
 which were not always true before.
 
 We call nd6_ns_output() unlocked and the assignment of ip6->ip6_src was
 decentralized to minimize possible complexity introduced with the formerly
 missing locking there.  This also resulted in a push down of local
 variable scopes into smaller blocks.
 ------------------------------------------------------------------------
 
 -- 
 Bjoern A. Zeeb                                 You have to have visions!
          <ks> Going to jail sucks -- <bz> All my daemons like it!
    http://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/jails.html

From: "Bjoern A. Zeeb" <bz@FreeBSD.org>
To: bug-followup@FreeBSD.org, lampa@fit.vutbr.cz
Cc:  
Subject: Re: kern/153841: page fault in icmp6_error2() called from
 nd6_llinfo_timer()
Date: Mon, 10 Jan 2011 21:02:51 +0000 (UTC)

 On Mon, 10 Jan 2011, Petr Lampa wrote:
 
 > I cannot check this; the source tree has been cvsup'ed many times since the last system restart.
 > The only sure thing is the kernel date, Nov 30.
 > I generally update sources just before the kernel rebuild.
 
 If you still have an unstripped kernel you could run
 
  	ident /boot/kernel/kernel | grep nd6.c
 
 
 > In any case, IP6_EXTHDR_CHECK macro should be fixed, it must check
 > if referenced mbuf contains pkthdr (it can panic, if this "cannot"
 > happen).
 
 Yes, there are a couple of things to be done.
 
 /bz
 

From: "Bjoern A. Zeeb" <bz@FreeBSD.org>
To: bug-followup@FreeBSD.org, lampa@fit.vutbr.cz
Cc:  
Subject: Re: kern/153841: page fault in icmp6_error2() called from
 nd6_llinfo_timer()
Date: Tue, 11 Jan 2011 10:09:20 +0000 (UTC)

 On Tue, 11 Jan 2011, Petr Lampa wrote:
 
 > $FreeBSD: src/sys/netinet6/nd6.c,v 1.123.2.13 2010/11/28 16:31:39 bz Exp $
 > $FreeBSD: src/sys/netinet6/nd6_nbr.c,v 1.69.2.4 2010/11/28 17:02:02 bz Exp $
 > $FreeBSD: src/sys/netinet6/nd6_rtr.c,v 1.73.2.4 2010/05/06 06:44:19 bz Exp $
 
 OK, that's before the relevant commits.  On RELENG_8 you want nd6.c
 rev. 1.123.2.15.  I strongly assume your problem was fixed with those 2
 commits.
 
 Are you OK with me closing the PR?  You could re-open it should you run into
 the problem again with the updated kernel just by sending a follow-up
 or sending me an email.
 
 /bz
 
State-Changed-From-To: feedback->closed 
State-Changed-By: bz 
State-Changed-When: Tue Jan 11 11:04:29 UTC 2011 
State-Changed-Why:  
The problem has supposedly been fixed since.  In agreement 
with the submitter, close the PR.  We'll re-open should the 
problem happen again.   Thanks a lot! 

http://www.freebsd.org/cgi/query-pr.cgi?pr=153841 
>Unformatted:
