From nobody@FreeBSD.org  Fri Sep  3 20:14:37 2010
Return-Path: <nobody@FreeBSD.org>
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 43A8D10656B8
	for <freebsd-gnats-submit@FreeBSD.org>; Fri,  3 Sep 2010 20:14:37 +0000 (UTC)
	(envelope-from nobody@FreeBSD.org)
Received: from www.freebsd.org (www.freebsd.org [IPv6:2001:4f8:fff6::21])
	by mx1.freebsd.org (Postfix) with ESMTP id 330648FC13
	for <freebsd-gnats-submit@FreeBSD.org>; Fri,  3 Sep 2010 20:14:37 +0000 (UTC)
Received: from www.freebsd.org (localhost [127.0.0.1])
	by www.freebsd.org (8.14.3/8.14.3) with ESMTP id o83KEa8h038089
	for <freebsd-gnats-submit@FreeBSD.org>; Fri, 3 Sep 2010 20:14:36 GMT
	(envelope-from nobody@www.freebsd.org)
Received: (from nobody@localhost)
	by www.freebsd.org (8.14.3/8.14.3/Submit) id o83KEamZ038088;
	Fri, 3 Sep 2010 20:14:36 GMT
	(envelope-from nobody)
Message-Id: <201009032014.o83KEamZ038088@www.freebsd.org>
Date: Fri, 3 Sep 2010 20:14:36 GMT
From: Vadim Fedorenko <junk@fromru.con>
To: freebsd-gnats-submit@FreeBSD.org
Subject: msk watchdog timeout
X-Send-Pr-Version: www-3.1
X-GNATS-Notify:

>Number:         150257
>Category:       kern
>Synopsis:       [msk] watchdog timeout
>Confidential:   no
>Severity:       serious
>Priority:       high
>Responsible:    yongari
>State:          closed
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Fri Sep 03 20:20:00 UTC 2010
>Closed-Date:    Tue Sep 06 00:49:29 UTC 2011
>Last-Modified:  Tue Sep 06 00:49:29 UTC 2011
>Originator:     Vadim Fedorenko
>Release:        7.3-STABLE
>Organization:
>Environment:
FreeBSD gateway.troyka-stavropol.ru 7.3-STABLE FreeBSD 7.3-STABLE #0: Fri Aug 13 23:24:51 MSD 2010     junk@gateway.troyka-stavropol.ru:/usr/obj/usr/src/sys/PFKERNEL  i386
>Description:
I'm using a D-Link DGE-560T card.
After a couple of days under heavy load, the card's driver starts logging:

msk0: watchdog timeout
msk0: watchdog timeout
msk0: link state changed to DOWN
msk0: link state changed to UP
msk0: watchdog timeout
msk0: link state changed to DOWN
msk0: link state changed to UP

and no traffic can pass through this interface.
If msk is loaded as a module, only kldunload/kldload brings the interface back online.

This looks to me like the same bug as kern/116853, but on different hardware.
pciconf -lv:
mskc0@pci0:2:0:0:       class=0x020000 card=0x4b001186 chip=0x4b001186 rev=0x13 hdr=0x00
    vendor     = 'D-Link System Inc'
    device     = 'DGE-560T PCIe Gigabit Ethernet Adapter'
    class      = network
    subclass   = ethernet

After applying yongari's patches from kern/116853 the card became somewhat more stable, but it still starts reporting "watchdog timeout" after a week of heavy traffic.

If PCI MSI/MSI-X is disabled the bug disappears. That seems strange to me, because PCIe cards support MSI per the specification, so the problem appears to be in software.
>How-To-Repeat:
Use a D-Link DGE-560T adapter and transfer large files (a DVD ISO, for example) through it.
>Fix:
place 

hw.pci.enable_msix=0
hw.pci.enable_msi=0

in /boot/loader.conf
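
As a stopgap when msk(4) is loaded as a module, the kldunload/kldload recovery mentioned in the description could be scripted. The following is only a minimal sketch, assuming the module name is if_msk and that the kernel log lines look like the ones quoted above; count_watchdog and reload_msk are hypothetical helper names, not part of FreeBSD:

```shell
#!/bin/sh

# Hypothetical helper: count "watchdog timeout" lines for one interface.
# Reads kernel log text on stdin; the interface name is the first argument.
# grep -c exits non-zero when the count is 0, so "|| true" keeps the
# function's exit status at 0 in that case.
count_watchdog() {
    grep -c "^$1: watchdog timeout" || true
}

# Hypothetical recovery step. Assumes msk(4) is loaded as the if_msk
# module rather than compiled into the kernel; must be run as root.
reload_msk() {
    kldunload if_msk && kldload if_msk
}

# Example use on the affected box (not executed here):
#   [ "$(dmesg | count_watchdog msk0)" -gt 0 ] && reload_msk
```

This does not address the cause of the timeouts. After the module is reloaded, the interface may also need to be reconfigured, depending on how the system brings up its network.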

>Release-Note:
>Audit-Trail:
Responsible-Changed-From-To: freebsd-bugs->freebsd-net 
Responsible-Changed-By: vwe 
Responsible-Changed-When: Sat Sep 4 09:29:35 UTC 2010 
Responsible-Changed-Why:  

Over to maintainer(s). 

http://www.freebsd.org/cgi/query-pr.cgi?pr=150257 
State-Changed-From-To: open->feedback 
State-Changed-By: yongari 
State-Changed-When: Mon Oct 25 19:17:29 UTC 2010 
State-Changed-Why:  
I also have a DGE-560T but I can't reproduce this issue on my box.
Because you said disabling MSI fixed the issue, my vague guess is that
it could be a silicon bug in which the controller sometimes loses Tx
completion interrupts, or that it could be triggered by an
inappropriately programmed event timer for the controller. Yukon II
controllers seem to be very sensitive to internal timer values, so
that can also trigger the issue. I don't have a permanent solution for
these issues, but you can try the patch at the following URL.
http://people.freebsd.org/~yongari/msk/msk.watchdog.diff 

It does not fix the issue, but it will show the watchdog timeout
message and try to recover from it rather than completely resetting
the controller. If all goes well, it will still show watchdog timeouts
from time to time but won't reset the controller, so you can treat the
watchdog timeouts as informational messages.


Responsible-Changed-From-To: freebsd-net->yongari 
Responsible-Changed-By: yongari 
Responsible-Changed-When: Mon Oct 25 19:17:29 UTC 2010 
Responsible-Changed-Why:  
Grab. 

http://www.freebsd.org/cgi/query-pr.cgi?pr=150257 

From: Vadim Fedorenko <junk@fromru.com>
To: bug-followup@FreeBSD.org
Cc:  
Subject: Re: kern/150257: [msk] watchdog timeout
Date: Wed, 3 Nov 2010 02:39:34 +0300

 The patch doesn't work. After about 36 hours of normal operation with
 msi=1 and msix=1, my box wrote
 msk0: watchdog timeout
 on the console and the msk0 interface stopped working.
 Again, only kldunload/kldload brought it back.
 
 The message was exactly "watchdog timeout", without the "recovering"
 string. So I think the problem is somewhere else.
 
 -- 
 WBR, Vadim 
 

From: Vadim Fedorenko <junk@fromru.com>
To: bug-followup@FreeBSD.org, junk@fromru.com
Cc:  
Subject: Re: kern/150257: [msk] watchdog timeout
Date: Tue, 23 Nov 2010 00:21:16 +0300

 Also, I should mention that disabling MSI doesn't fully fix the issue.
 After about 2 months of stable operation my box said "msk0: watch dog
 timeout" and only kldunload/kldload helped. So the previous fix is
 incomplete.
 
State-Changed-From-To: feedback->open 
State-Changed-By: yongari 
State-Changed-When: Sat Jan 22 00:52:45 UTC 2011 
State-Changed-Why:  
Feedback received. 

http://www.freebsd.org/cgi/query-pr.cgi?pr=150257 
State-Changed-From-To: open->feedback 
State-Changed-By: yongari 
State-Changed-When: Sun May 29 23:32:16 UTC 2011 
State-Changed-Why:  
I received positive reports from msk(4) users who were suffering
from msk(4) instability, and I think this issue might also be fixed
in the latest msk(4) in HEAD.
To make testing easy, I've back-ported msk(4) from HEAD to
8.2-RELEASE. Please download the two files at the following URLs and
rebuild the kernel. Make sure you cold-start your box before
booting the new kernel (i.e. unplug the power cable, wait 10-20
seconds, and boot).
http://people.freebsd.org/~yongari/msk/8.2R/if_msk.c 
http://people.freebsd.org/~yongari/msk/8.2R/if_mskreg.h 

http://www.freebsd.org/cgi/query-pr.cgi?pr=150257 

From: YongHyeon PYUN <pyunyh@gmail.com>
To: junk@fromru.com
Cc: bug-followup@FreeBSD.org
Subject: Re: kern/150257: [msk] watchdog timeout
Date: Sun, 29 May 2011 16:52:56 -0700

 On Sun, May 29, 2011 at 11:32:44PM +0000, yongari@freebsd.org wrote:
 > Synopsis: [msk] watchdog timeout
 > 
 > State-Changed-From-To: open->feedback
 > State-Changed-By: yongari
 > State-Changed-When: Sun May 29 23:32:16 UTC 2011
 > State-Changed-Why: 
 > I received positive reports from msk(4) users who are suffering
 > from msk(4) instability and I think this issue also might be fixed
 > in latest msk(4) in HEAD.
 > For your easy testing, I've back-ported msk(4) of HEAD to
 > 8.2-RELEASE. Please download the two files at the following URL and
 > rebuild kernel. Make sure to you cold-start your box before
 > rebooting to new kernel(i.e. unplug power cable and wait 10-20
 > seconds and boot).
 > http://people.freebsd.org/~yongari/msk/8.2R/if_msk.c
 > http://people.freebsd.org/~yongari/msk/8.2R/if_mskreg.h
 > 
 > http://www.freebsd.org/cgi/query-pr.cgi?pr=150257
 
 Resending after correcting the email address. I have no idea why the
 PR database recorded the wrong address.
State-Changed-From-To: feedback->closed 
State-Changed-By: yongari 
State-Changed-When: Tue Sep 6 00:48:49 UTC 2011 
State-Changed-Why:  
Feedback timeout (> 3 months).

http://www.freebsd.org/cgi/query-pr.cgi?pr=150257 
>Unformatted:
