From nobody@FreeBSD.org  Wed Sep 29 21:02:29 2004
Return-Path: <nobody@FreeBSD.org>
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 130D916A4CE
	for <freebsd-gnats-submit@FreeBSD.org>; Wed, 29 Sep 2004 21:02:29 +0000 (GMT)
Received: from www.freebsd.org (www.freebsd.org [216.136.204.117])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 054E143D39
	for <freebsd-gnats-submit@FreeBSD.org>; Wed, 29 Sep 2004 21:02:29 +0000 (GMT)
	(envelope-from nobody@FreeBSD.org)
Received: from www.freebsd.org (localhost [127.0.0.1])
	by www.freebsd.org (8.12.11/8.12.11) with ESMTP id i8TL2Sdg073977
	for <freebsd-gnats-submit@FreeBSD.org>; Wed, 29 Sep 2004 21:02:28 GMT
	(envelope-from nobody@www.freebsd.org)
Received: (from nobody@localhost)
	by www.freebsd.org (8.12.11/8.12.11/Submit) id i8TL2SHs073976;
	Wed, 29 Sep 2004 21:02:28 GMT
	(envelope-from nobody)
Message-Id: <200409292102.i8TL2SHs073976@www.freebsd.org>
Date: Wed, 29 Sep 2004 21:02:28 GMT
From: Mark Gooderum <mark@verniernetworks.com>
To: freebsd-gnats-submit@FreeBSD.org
Subject: em driver em_intr doesn't start send if pending packets
X-Send-Pr-Version: www-2.3

>Number:         72183
>Category:       misc
>Synopsis:       em driver em_intr doesn't start send if pending packets
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    mlaier
>State:          closed
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Wed Sep 29 21:10:22 GMT 2004
>Closed-Date:    Sun Oct 03 00:55:12 GMT 2004
>Last-Modified:  Sun Oct 03 00:55:12 GMT 2004
>Originator:     Mark Gooderum
>Release:        5.3 Beta 4
>Organization:
Vernier Networks, Inc.
>Environment:
FreeBSD 172.20.1.199 5.3-BETA4 FreeBSD 5.3-BETA4 #6: Tue Sep 28 21:35:19 PDT 2004     root@eagle.jumpadmin.net:/usr/build/ambit2/freebsd5/sys/i386/compile/VNISMP  i386

>Description:
     The em driver has a reversed bit of logic in em_intr(): it restarts transmission only if the send queue is empty, not when packets are pending.  em_poll() does the right thing.  Under high enough load you can wedge a particular instance of an em interface, although an ifconfig down/up will clear the condition.  The bug is still present in 5.3-BETA6.

>How-To-Repeat:
      Blast a _lot_ of packets at the system with very little local activity.  We flushed out this problem doing GigE<->GigE throughput tests with an Ixia.  Once the queue is full the interface will wedge on transmit, because ifq_handoff() will never tickle if_start/em_start and em_intr() will incorrectly not call em_start_locked().  Downing and upping the interface will recover because the queue will be flushed; switching to polling will too, as em_poll() does the right thing.

If there is enough local activity, the local sends will tickle em_start often enough to keep the queue from filling.  It's hard to reproduce the lockup without a piece of gear like an Ixia or SmartBits, but the flaw in the logic is obvious.
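
For readers without such gear, the wedge can be illustrated with a small user-space model.  This is only a sketch under stated assumptions, not driver code: the struct, the model_* functions, RING_SIZE and QUEUE_MAX are all hypothetical, the "oactive" flag merely stands in for IFF_OACTIVE, and the handoff is assumed to drop packets without calling start once the queue is full, as described above.

```c
#include <stdbool.h>

#define RING_SIZE 4   /* hypothetical number of transmit descriptors */
#define QUEUE_MAX 8   /* hypothetical software send-queue limit */

struct model {
    int  qlen;       /* packets waiting in the software send queue */
    int  ring_free;  /* free transmit descriptors */
    bool oactive;    /* stands in for IFF_OACTIVE */
    long sent;       /* packets handed to the simulated hardware */
    long dropped;    /* packets rejected because the queue was full */
};

/* Simplified stand-in for em_start_locked(): drain queue into the ring. */
static void model_start(struct model *m)
{
    while (m->qlen > 0 && m->ring_free > 0) {
        m->qlen--;
        m->ring_free--;
        m->sent++;
    }
    if (m->qlen > 0)        /* ring exhausted with packets still queued */
        m->oactive = true;
}

/* Simplified stand-in for the handoff path: a full queue drops the
 * packet and never calls start (assumption taken from the report). */
static void model_handoff(struct model *m)
{
    if (m->qlen >= QUEUE_MAX) {
        m->dropped++;
        return;
    }
    m->qlen++;
    if (!m->oactive)
        model_start(m);
}

/* Simplified stand-in for em_intr(): reclaim every descriptor, then
 * restart transmission using either the reversed or the fixed test. */
static void model_intr(struct model *m, bool fixed)
{
    bool empty = (m->qlen == 0);

    m->ring_free = RING_SIZE;
    m->oactive = false;
    if (fixed ? !empty : empty)
        model_start(m);
}

/* Offer `burst` packets, then take one interrupt, `rounds` times over.
 * Returns the total number of packets transmitted. */
static long run_model(bool fixed, int rounds, int burst)
{
    struct model m = { 0, RING_SIZE, false, 0, 0 };

    for (int r = 0; r < rounds; r++) {
        for (int i = 0; i < burst; i++)
            model_handoff(&m);
        model_intr(&m, fixed);
    }
    return m.sent;
}
```

In this model, run_model(false, 10, 20) transmits only the first ring's worth of packets and then stays stuck with a full queue, while run_model(true, 10, 20) keeps draining the queue on every interrupt.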


>Fix:
      The following patch has fixed the problem for us:

--- /tmp/tmp.44554.0    Wed Sep 29 15:52:32 2004
+++ /data/work/mark/5.3B4A1.i386/usr/build/ambit2/freebsd5/sys/dev/em/if_em.c   Wed Sep 29 15:51:31 2004
@@ -986,11 +986,11 @@
                         em_clean_transmit_interrupts(adapter);
                 }
                 loop_cnt--;
         }

-        if (ifp->if_flags & IFF_RUNNING && IFQ_DRV_IS_EMPTY(&ifp->if_snd))
+        if (ifp->if_flags & IFF_RUNNING && !IFQ_DRV_IS_EMPTY(&ifp->if_snd))
                 em_start_locked(ifp);

        EM_UNLOCK(adapter);
         return;
 }

>Release-Note:
>Audit-Trail:
Responsible-Changed-From-To: freebsd-bugs->mlaier 
Responsible-Changed-By: rwatson 
Responsible-Changed-When: Thu Sep 30 00:27:43 GMT 2004 
Responsible-Changed-Why:  
This patch looks almost identical to a patch committed by mlaier this 
afternoon in response to the presumably identical realization. 

FWIW, I can reproduce this problem relatively easily in my test 
environment, and a couple of us have been performing intermittent 
head-against-wall banging for the last few weeks wondering what was 
happening. 


http://www.freebsd.org/cgi/query-pr.cgi?pr=72183 
State-Changed-From-To: open->patched 
State-Changed-By: mlaier 
State-Changed-When: Thu Sep 30 04:11:39 GMT 2004 
State-Changed-Why:  
An identical fix (submitted by mtm@) has been committed today. 
I will hold this PR until MT5 has been done. 
Thanks! 

http://www.freebsd.org/cgi/query-pr.cgi?pr=72183 
State-Changed-From-To: patched->closed 
State-Changed-By: mlaier 
State-Changed-When: Sun Oct 3 00:54:28 GMT 2004 
State-Changed-Why:  
Committed to RELENG_5 as well now. Thanks. 

http://www.freebsd.org/cgi/query-pr.cgi?pr=72183 
>Unformatted:
