From robert@cyrus.watson.org  Sun Feb 27 19:44:59 2000
Return-Path: <robert@cyrus.watson.org>
Received: from fledge.watson.org (fledge.watson.org [204.156.12.50])
	by hub.freebsd.org (Postfix) with ESMTP
	id E413D37B5B6; Sun, 27 Feb 2000 19:44:56 -0800 (PST)
	(envelope-from robert@cyrus.watson.org)
Received: from fledge.watson.org (robert@fledge.pr.watson.org [192.0.2.3])
	by fledge.watson.org (8.9.3/8.9.3) with SMTP id WAA33665;
	Sun, 27 Feb 2000 22:46:01 -0500 (EST)
	(envelope-from robert@cyrus.watson.org)
Message-Id: <Pine.NEB.3.96L.1000227224040.5881J-100000@fledge.watson.org>
Date: Sun, 27 Feb 2000 22:46:01 -0500 (EST)
From: Robert Watson <robert@cyrus.watson.org>
Reply-To: Robert Watson <robert+freebsd@cyrus.watson.org>
To: csg@waterspout.com
Cc: FreeBSD-gnats-submit@freebsd.org, mdodd@freebsd.org,
	jkh@freebsd.org
In-Reply-To: <200002240550.AAA31092@squall.waterspout.com>
Subject: Re: arpintr() incorrectly checks mbuf chain size

>Number:         17030
>Category:       kern
>Synopsis:       Re: arpintr() incorrectly checks mbuf chain size
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    gnats-admin
>State:          closed
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Sun Feb 27 19:50:01 PST 2000
>Closed-Date:    Sun Feb 27 20:07:55 PST 2000
>Last-Modified:  Wed Oct 26 06:21:13 GMT 2005
>Originator:     
>Release:        
>Organization:
>Environment:
>Description:
 Steve,
 
 I'm having stability problems with this patch.  When an arp lookup is
 attempted, I get a hang.  The kernel is still live, and apparently in
 kernel/interrupt context.  I don't have much debugging turned on in the
 kernel, and don't have a proper debugging setup here, but here's ddb's
 stack trace (from breaking to the debugger):
 
 Debugger(c032a989) at Debugger+0x35
 scgetc(c0398de0,2,c06ae100,c03913a0,ffffffff) at scgetc+0x37e
 sckbdevent(c03913a0,0,c0398de0,c06ae100,1a0f080a) at sckbdevent+0x1b9
 atkbd_intr(c03913a0,0,c0339dac,c02c14d2,c03913a0) at atkbd_intr+0x22
 atkbd_isa_intr(c03913a0,40020000,40020010,10,c06a0010) at atkbd_isa_intr+0x18
 Xresume1() at Xresume1+0x2b
 --- interrupt, eip = 0xc01d2531, esp = 0xc0339d9c, ebp = 0xc0339dac ---
 arpintr(c02c1d7b,0, c0210010,40000010,ffff0010) at arpintr+0xd9
 swi_net_next() at swi_net_next
 
 My guess is you're recursively locking or something from interrupt context,
 or the like.  As the console is still live, it's not as dead as it could
 be, for example :-)
 
 Interestingly, this is (I believe) the second ARP lookup, not the first
 one, if that helps any. Prior to the hang, arp -an reported one entry; it
 was the attempt to contact a second IP that caused the trouble.
 
 
 On Thu, 24 Feb 2000 csg@waterspout.com wrote:
 
 > 
 > >Submitter-Id:   current-users
 > >Originator:     C. Stephen Gunn
 > >Organization:   WaterSpout Communications, Inc.
 > >Confidential:   no
 > >Synopsis:       arpintr() incorrectly checks mbuf chain size
 > >Severity:       serious
 > >Priority:       high
 > >Category:       kern
 > >Release:        FreeBSD 3.4-STABLE i386
 > >Class:          sw-bug
 > >Environment: 
 > 
 > FreeBSD-3.4-STABLE or FreeBSD-4.0 current.
 > 
 > >Description: 
 > 
 > The NETISR_ARP handler arpintr() incorrectly checks m->m_len to
 > determine if we have a complete ARP packet.  It is possible to
 > have a packet spread across several mbufs in the chain.
 > 
 > While this case apparently doesn't happen with normal ethernet
 > interfaces, additional mbuf operations before ARP processing (for
 > 802.1Q Tagged VLANS, Bridged Ethernet over Frame Relay, or perhaps
 > LANE) can cause NETISR_ARP to be presented with a fragmented packet.
 > 
 > >How-To-Repeat: 
 > 
 > Run my yet-to-see-the-light-of-day VLAN improvements, it blows chunks
 > on every inbound ARP packet.
 > 
 > >Fix: 
 > 
 > I've not only fixed the length comparison, I've added several
 > diagnostic error messages to the handler for other out-of-the-norm
 > ARP packets.  This makes the error conditions easier to detect
 > and fix, and makes the code much more readable.
 > 
 > I've put patches for -STABLE and -CURRENT (which are actually
 > identical) online:
 > 
 >    http://www.waterspout.com/FreeBSD/arpintr-patch.current
 > 
 >    http://www.waterspout.com/FreeBSD/arpintr-patch.stable
 > 
 > If someone could perform a sanity check, and get these committed
 > before 4.0-R heads out the door, that would be ideal.
 > 
 >  - Steve
 > 
 > 
 
 
   Robert N M Watson 
 
 robert@fledge.watson.org              http://www.watson.org/~robert/
 PGP key fingerprint: AF B5 5F FF A6 4A 79 37  ED 5F 55 E9 58 04 6A B1
 TIS Labs at Network Associates, Safeport Network Services
 
 
>How-To-Repeat:
>Fix:
>Release-Note:
>Audit-Trail:
State-Changed-From-To: open->closed 
State-Changed-By: steve 
State-Changed-When: Sun Feb 27 20:07:55 PST 2000 
State-Changed-Why:  
Intended as a followup to kern/16950. 
>Unformatted:
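 The length check described in the quoted report can be illustrated
 outside the kernel.  The following is only a minimal user-space sketch,
 not the code from the patch: struct toy_mbuf and the toy_* functions are
 hypothetical stand-ins for the real mbuf structures, and the 28-byte
 figure assumes an Ethernet/IPv4 ARP packet (8-byte ARP header plus two
 6-byte hardware and two 4-byte protocol addresses).

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for a kernel mbuf (illustration only). */
struct toy_mbuf {
    struct toy_mbuf *m_next;  /* next buffer in the chain, or NULL */
    size_t           m_len;   /* bytes of data held in this buffer only */
};

/* Assumed size of an Ethernet/IPv4 ARP packet: 8 + 2 * (6 + 4) bytes. */
#define TOY_ARP_ETHER_IPV4_LEN 28

/* Sum the data length over every buffer in the chain. */
static size_t
toy_chain_length(const struct toy_mbuf *m)
{
    size_t total = 0;

    for (; m != NULL; m = m->m_next)
        total += m->m_len;
    return total;
}

/* The kind of check the report objects to: looks at the first buffer only. */
static int
toy_first_buf_has(const struct toy_mbuf *m, size_t need)
{
    return m != NULL && m->m_len >= need;
}

/* A check over the whole chain: accepts a packet split across buffers. */
static int
toy_chain_has(const struct toy_mbuf *m, size_t need)
{
    return toy_chain_length(m) >= need;
}
```

 With a chain whose first buffer holds 8 bytes and whose second holds 20,
 toy_first_buf_has() rejects the packet while toy_chain_has() accepts it.
 In the kernel, a chain that passes the total-length test would still
 need its header made contiguous (m_pullup() is the usual tool) before
 being accessed as a single structure; whether the patch does so is not
 stated in this report.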
