From nobody@FreeBSD.org  Fri May 14 05:45:14 2004
Return-Path: <nobody@FreeBSD.org>
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 3E89716A4CE
	for <freebsd-gnats-submit@FreeBSD.org>; Fri, 14 May 2004 05:45:14 -0700 (PDT)
Received: from www.freebsd.org (www.freebsd.org [216.136.204.117])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 052E343D41
	for <freebsd-gnats-submit@FreeBSD.org>; Fri, 14 May 2004 05:45:14 -0700 (PDT)
	(envelope-from nobody@FreeBSD.org)
Received: from www.freebsd.org (localhost [127.0.0.1])
	by www.freebsd.org (8.12.11/8.12.11) with ESMTP id i4ECjDXl053183
	for <freebsd-gnats-submit@FreeBSD.org>; Fri, 14 May 2004 05:45:13 -0700 (PDT)
	(envelope-from nobody@www.freebsd.org)
Received: (from nobody@localhost)
	by www.freebsd.org (8.12.11/8.12.11/Submit) id i4ECjDKK053182;
	Fri, 14 May 2004 05:45:13 -0700 (PDT)
	(envelope-from nobody)
Message-Id: <200405141245.i4ECjDKK053182@www.freebsd.org>
Date: Fri, 14 May 2004 05:45:13 -0700 (PDT)
From: Fabien THOMAS <fabien.thomas@netasq.com>
To: freebsd-gnats-submit@FreeBSD.org
Subject: hard lock with em driver
X-Send-Pr-Version: www-2.3

>Number:         66634
>Category:       kern
>Synopsis:       [em] hard lock with em driver
>Confidential:   no
>Severity:       serious
>Priority:       high
>Responsible:    linimon
>State:          closed
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Fri May 14 05:50:29 PDT 2004
>Closed-Date:    Sun Jun 18 21:18:57 GMT 2006
>Last-Modified:  Sun Jun 18 21:18:57 GMT 2006
>Originator:     Fabien THOMAS
>Release:        4.9
>Organization:
NETASQ
>Environment:
FreeBSD build-current 4.9-RELEASE-p2 FreeBSD 4.9-RELEASE-p2 #0: Mon Mar  1 10:22:36 CET 2004     root@build-current:/usr/src/sys/compile/GENERIC  i386
>Description:
We use many Intel gigabit cards, and ever since we started using them
we have experienced strange hard locks of the system (4.9 and
FreeBSD-stable). We have tried several driver versions, so the problem
is not tied to one version. We use the cards in polling mode, but it
seems that the problem can also be triggered in interrupt mode.

What I found while debugging on a fiber card:

1) The original driver did not lock, but when the other end is rebooted
I get around 10 linkup/linkdown events.

2) With the linkup/linkdown printf removed, the driver locks every time
the other end system is rebooted.

3) Removing the "& ~E1000_IMC_RXSEQ" term in em_disable_intr (see the
code below) fixes the lock, but I do not understand why:
     a) Does E1000_IMC_RXSEQ need to be left out when disabling
interrupts?
     b) Why does the system lock completely (even under the debugger)
when just one interrupt source is left enabled?

static void
em_disable_intr(struct adapter *adapter)
{
     E1000_WRITE_REG(&adapter->hw, IMC,
             (0xffffffff));/* & ~E1000_IMC_RXSEQ));*/
     return;
}
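
For readers unfamiliar with the mask registers, the difference between
the two variants can be sketched with a small stand-alone model. This is
only an illustration, not driver code: the sketch_* names and the struct
are hypothetical, the RXSEQ bit value (0x8) is an assumption about the
e1000 headers, and the model assumes the usual semantics where writing a
1 bit to IMS enables a source and writing a 1 bit to IMC disables it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed bit position of the RXSEQ interrupt source (hypothetical name). */
#define SKETCH_IMC_RXSEQ 0x00000008u

/* Hypothetical stand-in for the controller's interrupt mask state. */
struct sketch_regs {
	uint32_t ims;	/* currently enabled interrupt sources */
};

/* Model of an IMS write: 1 bits enable the matching sources. */
void
sketch_write_ims(struct sketch_regs *regs, uint32_t bits)
{
	assert(regs != NULL);
	regs->ims |= bits;
}

/* Model of an IMC write: 1 bits disable sources, 0 bits leave them alone. */
void
sketch_write_imc(struct sketch_regs *regs, uint32_t bits)
{
	assert(regs != NULL);
	regs->ims &= ~bits;
}

/* Variant with the RXSEQ term: every source is masked except RXSEQ. */
void
sketch_disable_intr_keep_rxseq(struct sketch_regs *regs)
{
	sketch_write_imc(regs, 0xffffffffu & ~SKETCH_IMC_RXSEQ);
}

/* Variant from this report: every source is masked, RXSEQ included. */
void
sketch_disable_intr_all(struct sketch_regs *regs)
{
	sketch_write_imc(regs, 0xffffffffu);
}
```

Under these assumptions, the first variant leaves the RXSEQ source
enabled after "disable", while the second leaves no source enabled.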


>How-To-Repeat:
Use an Intel dual-port fiber card, for example one based on the 546 chip.
Remove the driver printf for linkup/linkdown.
Reboot the other end.
>Fix:
      
>Release-Note:
>Audit-Trail:
Responsible-Changed-From-To: freebsd-bugs->tackerman 
Responsible-Changed-By: dwmalone 
Responsible-Changed-When: Sun May 30 09:29:54 PDT 2004 
Responsible-Changed-Why:  
Tony is now looking after the em driver. 

http://www.freebsd.org/cgi/query-pr.cgi?pr=66634 

From: ming fu <fming@borderware.com>
To: freebsd-gnats-submit@FreeBSD.org, fabien.thomas@netasq.com
Cc:  
Subject: Re: kern/66634: hard lock with em driver
Date: Thu, 17 Jun 2004 11:35:50 -0400

 Hi,
 
 I find the following code chunk in em_watchdog strange. I believe the
 watchdog is called when the kernel finds the card not responding. The
 card hardware could be in an insane state at that moment. Why not just
 go straight to stop/init?
 
 I cross-checked the Linux e1000 driver. I am no expert on device
 drivers, but nowhere does the Linux driver do anything similar, such
 as fetching some bits from the hardware to decide whether a stop/init
 is needed.
 
         /* If we are in this routine because of pause frames, then
          * don't reset the hardware.
          */
         if (E1000_READ_REG(&adapter->hw, STATUS) & E1000_STATUS_TXOFF) {
                 ifp->if_timer = EM_TX_TIMEOUT;
                 printf("em watch watchdog E1000_STATUS_TXOFF %d\n", bti_cnt);
                 bti_cnt++;
                 if (bti_cnt < 5)
                         return;
                 else
                         printf("em watchdog E1000_STATUS_TXOFF for too long\n");
         }
         ifp->if_flags &= ~IFF_RUNNING;
 
         em_stop(adapter);
         em_init(adapter);
 
 Any suggestions?
 
 Ming
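
The control flow of the watchdog snippet quoted in the reply above can
be sketched in isolation as follows. This is a simplified model, not the
driver: the sketch_* names, the struct fields, and the timeout value are
hypothetical stand-ins for the STATUS register read, ifp->if_timer,
bti_cnt, and the em_stop/em_init pair.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_TX_TIMEOUT	5	/* hypothetical re-arm value */
#define SKETCH_MAX_TXOFF_DEFERS	5	/* threshold used in the quoted code */

/* Hypothetical stand-in for the adapter and interface state. */
struct sketch_adapter {
	bool	txoff;		/* models STATUS & E1000_STATUS_TXOFF */
	int	if_timer;	/* models ifp->if_timer */
	int	txoff_count;	/* models bti_cnt */
	int	resets;		/* counts stop/init cycles */
};

/* Returns true when the watchdog decided to reset the adapter. */
bool
sketch_watchdog(struct sketch_adapter *sc)
{
	assert(sc != NULL);
	if (sc->txoff) {
		/* Transmit is paused by flow control: re-arm and defer. */
		sc->if_timer = SKETCH_TX_TIMEOUT;
		sc->txoff_count++;
		if (sc->txoff_count < SKETCH_MAX_TXOFF_DEFERS)
			return (false);
		/* Paused for too long: fall through to the reset. */
	}
	sc->resets++;		/* models em_stop() followed by em_init() */
	return (true);
}
```

In this model, a watchdog call with TXOFF clear resets at once, while a
call with TXOFF set is deferred four times and resets on the fifth.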
 
 
State-Changed-From-To: open->analyzed 
State-Changed-By: tackerman 
State-Changed-When: Thu Sep 23 18:07:18 GMT 2004 
State-Changed-Why:  
The code that removes RXSEQ from the Interrupt Mask Clear write appears to be a
workaround for an earlier controller.  The workaround is not required in general
and should therefore be made device-specific.
SCR 41242 
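
One way to make the workaround device-specific, shown only as a sketch:
choose the value written to the Interrupt Mask Clear register from the
controller type. The enum, its members, the function name, and the RXSEQ
bit value (0x8) are hypothetical or assumed here. This PR does not say
which controller needs the workaround or how the driver was changed.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed bit position of the RXSEQ interrupt source (hypothetical name). */
#define SKETCH_IMC_RXSEQ 0x00000008u

/* Hypothetical controller classes; real driver identifiers may differ. */
enum sketch_mac_type {
	SKETCH_MAC_NEEDS_RXSEQ,	/* earlier controller needing the workaround */
	SKETCH_MAC_OTHER	/* every other controller */
};

/* Returns the value to write to IMC when disabling interrupts. */
uint32_t
sketch_disable_mask(enum sketch_mac_type mac)
{
	uint32_t mask = 0xffffffffu;

	assert(mac == SKETCH_MAC_NEEDS_RXSEQ || mac == SKETCH_MAC_OTHER);
	if (mac == SKETCH_MAC_NEEDS_RXSEQ)
		mask &= ~SKETCH_IMC_RXSEQ;	/* leave RXSEQ enabled */
	return (mask);
}
```

With this shape, only the controller class that needs the workaround
keeps RXSEQ enabled; all others mask every interrupt source.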

State-Changed-From-To: analyzed->feedback 
State-Changed-By: linimon 
State-Changed-When: Wed Apr 5 00:34:14 UTC 2006 
State-Changed-Why:  
Is this still a problem with recent versions of FreeBSD? 


Responsible-Changed-From-To: tackerman->linimon 
Responsible-Changed-By: linimon 
Responsible-Changed-When: Wed Apr 5 00:34:14 UTC 2006 
Responsible-Changed-Why:  
Reset PR assigned to inactive committer. 

Hat:	gnats-admin 

State-Changed-From-To: feedback->closed 
State-Changed-By: linimon 
State-Changed-When: Sun Jun 18 21:18:39 UTC 2006 
State-Changed-Why:  
Feedback timeout (> 2 months). 

>Unformatted:
