From nobody@FreeBSD.org  Wed Aug  1 21:14:42 2007
Return-Path: <nobody@FreeBSD.org>
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 7B22A16A418
	for <freebsd-gnats-submit@FreeBSD.org>; Wed,  1 Aug 2007 21:14:42 +0000 (UTC)
	(envelope-from nobody@FreeBSD.org)
Received: from www.freebsd.org (www.freebsd.org [IPv6:2001:4f8:fff6::21])
	by mx1.freebsd.org (Postfix) with ESMTP id 6560113C469
	for <freebsd-gnats-submit@FreeBSD.org>; Wed,  1 Aug 2007 21:14:42 +0000 (UTC)
	(envelope-from nobody@FreeBSD.org)
Received: from www.freebsd.org (localhost [127.0.0.1])
	by www.freebsd.org (8.14.1/8.14.1) with ESMTP id l71LEgio058776
	for <freebsd-gnats-submit@FreeBSD.org>; Wed, 1 Aug 2007 21:14:42 GMT
	(envelope-from nobody@www.freebsd.org)
Received: (from nobody@localhost)
	by www.freebsd.org (8.14.1/8.14.1/Submit) id l71LEgfv058775;
	Wed, 1 Aug 2007 21:14:42 GMT
	(envelope-from nobody)
Message-Id: <200708012114.l71LEgfv058775@www.freebsd.org>
Date: Wed, 1 Aug 2007 21:14:42 GMT
From: "O. Hartmann" <ohartman@zedat.fu-berlin.de>
To: freebsd-gnats-submit@FreeBSD.org
Subject: nfe0: watchdog timeout (missed Tx interrupts) -- recovering (UP with SCHED_ULE)
X-Send-Pr-Version: www-3.0

>Number:         115126
>Category:       amd64
>Synopsis:       [nfe] nfe0: watchdog timeout (missed Tx interrupts) -- recovering (UP with SCHED_ULE)
>Confidential:   no
>Severity:       serious
>Priority:       high
>Responsible:    yongari
>State:          closed
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Wed Aug 01 21:20:01 GMT 2007
>Closed-Date:    Sun Dec 05 10:38:42 UTC 2010
>Last-Modified:  Sun Dec 05 10:38:42 UTC 2010
>Originator:     O. Hartmann
>Release:        FreeBSD 7.0-CURRENT/amd64 UP (SCHED_ULE)
>Organization:
FU Berlin
>Environment:
FreeBSD zbv800.org 7.0-CURRENT FreeBSD 7.0-CURRENT #2: Tue Jul 31 00:26:18 CEST 2007     root@zbv800.org:/usr/obj/usr/src/sys/ZBV800  amd64

>Description:
Under heavy load, the built-in NIC of the nVidia nForce4-X32-SLI chipset drops its connection with the message

nfe0: watchdog timeout (missed Tx interrupts) -- recovering

This happens while the box is under heavy load (compiling world with -j2 on a UP system, running portupgrade, and downloading some torrents). The box then becomes stuck for minutes: the mouse pointer (USB mouse) freezes and terminal output looks like a slide show. Detaching and re-attaching the USB mouse produces the kernel's detach and attach messages 5(!) minutes after the action was taken (normally the messages show up immediately).

The box uses a SCHED_ULE/PREEMPTION kernel.

Below are the outputs of dmesg, pciconf -lvc and vmstat -i. Please have a look at the vmstat -i output: pcm0 shows a lot of interrupts, although the sound device is not in use!


vmstat -i
======

irq1: atkbd0                        9598          0
irq6: fdc0                            32          0
irq14: ata0                          266          0
irq18: pcm0                     21723773       1607
irq20: atapci3                       276          0
irq21: nfe0 ohci0                3913674        289
irq22: ehci0                           3          0
irq23: atapci2                   1154023         85
cpu0: timer                     40532996       2999
irq256: mskc0                    1408680        104
Total                           68743321       5087
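
The table above can be checked mechanically for shared interrupt lines and unusually busy sources. What follows is a minimal Python sketch, assuming the `vmstat -i` layout shown here (source, device names, total, rate). The function name `parse_vmstat_i` and the trimmed sample are illustrative only and not part of any FreeBSD tool.

```python
import re

# Trimmed copy of the vmstat -i output from this report.
SAMPLE = """\
irq18: pcm0                     21723773       1607
irq21: nfe0 ohci0                3913674        289
irq22: ehci0                           3          0
cpu0: timer                     40532996       2999
Total                           68743321       5087
"""

# Source label, one or more device names, then the total and rate columns.
LINE_RE = re.compile(r"^(\S+):\s+(.*?)\s+(\d+)\s+(\d+)\s*$")


def parse_vmstat_i(text):
    """Return a list of (source, devices, total, rate) tuples."""
    rows = []
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if m is None:
            continue  # skips the "Total" line and any header line
        source, names, total, rate = m.groups()
        rows.append((source, names.split(), int(total), int(rate)))
    return rows


rows = parse_vmstat_i(SAMPLE)
# An interrupt line is shared when more than one device is listed on it.
shared = [(src, devs) for src, devs, _, _ in rows if len(devs) > 1]
# Highest-rate hardware interrupt (the cpu timer is excluded).
busiest = max((r for r in rows if r[0].startswith("irq")), key=lambda r: r[3])

print("shared:", shared)
print("busiest irq:", busiest[0], busiest[1], busiest[3])
```

On the sample, the shared list contains the irq21 entry (nfe0 together with ohci0), and the busiest hardware interrupt is the pcm0 line.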


pciconf -lvc
======

none0@pci0:0:0: class=0x050000 card=0x81d21043 chip=0x02f410de rev=0xa2 hdr=0x00
    vendor     = 'Nvidia Corp'
    device     = 'C51 Host Bridge'
    class      = memory
    subclass   = RAM
    cap 08[44] = HT slave
    cap 08[e0] = HT MSI address window enabled at 0xfee00000
none1@pci0:0:1: class=0x050000 card=0x81d21043 chip=0x02fa10de rev=0xa2 hdr=0x00
    vendor     = 'Nvidia Corp'
    device     = 'C51 Memory Controller 0'
    class      = memory
    subclass   = RAM
none2@pci0:0:2: class=0x050000 card=0x81d21043 chip=0x02fe10de rev=0xa2 hdr=0x00
    vendor     = 'Nvidia Corp'
    device     = 'C51 Memory Controller 1'
    class      = memory
    subclass   = RAM
none3@pci0:0:3: class=0x050000 card=0x81d21043 chip=0x02f810de rev=0xa2 hdr=0x00
    vendor     = 'Nvidia Corp'
    device     = 'C51 Memory Controller 5'
    class      = memory
    subclass   = RAM
none4@pci0:0:4: class=0x050000 card=0x81d21043 chip=0x02f910de rev=0xa2 hdr=0x00
    vendor     = 'Nvidia Corp'
    device     = 'C51 Memory Controller 4'
    class      = memory
    subclass   = RAM
none5@pci0:0:5: class=0x050000 card=0x81d21043 chip=0x02ff10de rev=0xa2 hdr=0x00
    vendor     = 'Nvidia Corp'
    device     = 'C51 Host Bridge'
    class      = memory
    subclass   = RAM
    cap 00[44] = unknown
none6@pci0:0:6: class=0x050000 card=0x81d21043 chip=0x027f10de rev=0xa2 hdr=0x00
    vendor     = 'Nvidia Corp'
    device     = 'C51 Memory Controller 3'
    class      = memory
    subclass   = RAM
none7@pci0:0:7: class=0x050000 card=0x81d21043 chip=0x027e10de rev=0xa2 hdr=0x00
    vendor     = 'Nvidia Corp'
    device     = 'C51 Memory Controller 2'
    class      = memory
    subclass   = RAM
pcib1@pci0:2:0: class=0x060400 card=0x000010de chip=0x02fc10de rev=0xa1 hdr=0x01
    vendor     = 'Nvidia Corp'
    device     = 'C51 PCI Express Bridge'
    class      = bridge
    subclass   = PCI-PCI
    cap 0d[40] = PCI Bridge card=0x000010de
    cap 01[48] = powerspec 2  supports D0 D3  current D0
    cap 05[50] = MSI supports 2 messages, 64 bit 
    cap 08[60] = HT MSI address window enabled at 0xfee00000
    cap 10[80] = PCI-Express 1 root port
pcib2@pci0:3:0: class=0x060400 card=0x000010de chip=0x02fd10de rev=0xa1 hdr=0x01
    vendor     = 'Nvidia Corp'
    device     = 'C51 PCI Express Bridge'
    class      = bridge
    subclass   = PCI-PCI
    cap 0d[40] = PCI Bridge card=0x000010de
    cap 01[48] = powerspec 2  supports D0 D3  current D0
    cap 05[50] = MSI supports 2 messages, 64 bit 
    cap 08[60] = HT MSI address window enabled at 0xfee00000
    cap 10[80] = PCI-Express 1 root port
pcib3@pci0:4:0: class=0x060400 card=0x000010de chip=0x02fb10de rev=0xa1 hdr=0x01
    vendor     = 'Nvidia Corp'
    device     = 'C51 PCI Express Bridge'
    class      = bridge
    subclass   = PCI-PCI
    cap 0d[40] = PCI Bridge card=0x000010de
    cap 01[48] = powerspec 2  supports D0 D3  current D0
    cap 05[50] = MSI supports 2 messages, 64 bit 
    cap 08[60] = HT MSI address window enabled at 0xfee00000
    cap 10[80] = PCI-Express 1 root port
none8@pci0:9:0: class=0x058000 card=0x815a1043 chip=0x005e10de rev=0xa4 hdr=0x00
    vendor     = 'Nvidia Corp'
    device     = 'nForce4 Memory Controller'
    class      = memory
    cap 08[44] = HT slave
    cap 08[e0] = HT MSI address window enabled at 0xfee00000
isab0@pci0:10:0:        class=0x060100 card=0x815a1043 chip=0x005010de rev=0xa4 hdr=0x00
    vendor     = 'Nvidia Corp'
    device     = 'nForce4 PCI to ISA Bridge'
    class      = bridge
    subclass   = PCI-ISA
nfsmb0@pci0:10:1:       class=0x0c0500 card=0x815a1043 chip=0x005210de rev=0xa2 hdr=0x00
    vendor     = 'Nvidia Corp'
    device     = 'nForce4 SMBus'
    class      = serial bus
    subclass   = SMBus
    cap 01[44] = powerspec 2  supports D0 D3  current D0
ohci0@pci0:11:0:        class=0x0c0310 card=0x815a1043 chip=0x005a10de rev=0xa2 hdr=0x00
    vendor     = 'Nvidia Corp'
    device     = 'nForce4 USB Controller'
    class      = serial bus
    subclass   = USB
    cap 01[44] = powerspec 2  supports D0 D1 D2 D3  current D0
ehci0@pci0:11:1:        class=0x0c0320 card=0x815a1043 chip=0x005b10de rev=0xa4 hdr=0x00
    vendor     = 'Nvidia Corp'
    device     = 'nForce4 USB 2.0 Controller'
    class      = serial bus
    subclass   = USB
    cap 0a[44] = EHCI Debug Port at offset 0x98 in map 0x14
    cap 01[80] = powerspec 2  supports D0 D1 D2 D3  current D0
atapci1@pci0:15:0:      class=0x01018a card=0x815a1043 chip=0x005310de rev=0xf3 hdr=0x00
    vendor     = 'Nvidia Corp'
    device     = 'NVidia nForce 4 SLI IDE Controller'
    class      = mass storage
    subclass   = ATA
    cap 01[44] = powerspec 2  supports D0 D3  current D0
atapci2@pci0:16:0:      class=0x010485 card=0x815a1043 chip=0x005410de rev=0xf3 hdr=0x00
    vendor     = 'Nvidia Corp'
    device     = 'NVidia nForce 4 SLI IDE Controller'
    class      = mass storage
    subclass   = RAID
    cap 01[44] = powerspec 2  supports D0 D3  current D0
atapci3@pci0:17:0:      class=0x010185 card=0x815a1043 chip=0x005510de rev=0xf3 hdr=0x00
    vendor     = 'Nvidia Corp'
    device     = 'NVidia nForce 4 SLI IDE Controller'
    class      = mass storage
    subclass   = ATA
    cap 01[44] = powerspec 2  supports D0 D3  current D0
pcib4@pci0:18:0:        class=0x060401 card=0x00000000 chip=0x005c10de rev=0xa2 hdr=0x01
    vendor     = 'Nvidia Corp'
    device     = 'nForce4 PCI Bridge'
    class      = bridge
    subclass   = PCI-PCI
nfe0@pci0:19:0: class=0x068000 card=0x81411043 chip=0x005710de rev=0xa3 hdr=0x00
    vendor     = 'Nvidia Corp'
    device     = 'nForce4 Ultra NVidia Network Bus Enumerator'
    class      = bridge
    cap 01[44] = powerspec 2  supports D0 D1 D2 D3  current D0
pcib5@pci0:22:0:        class=0x060400 card=0x00000000 chip=0x005d10de rev=0xa3 hdr=0x01
    vendor     = 'Nvidia Corp'
    device     = 'nForce4 PCI Express Bridge'
    class      = bridge
    subclass   = PCI-PCI
    cap 01[40] = powerspec 2  supports D0 D3  current D0
    cap 05[48] = MSI supports 2 messages, 64 bit 
    cap 08[58] = HT MSI address window enabled at 0xfee00000
    cap 10[80] = PCI-Express 1 root port
pcib6@pci0:23:0:        class=0x060400 card=0x00000000 chip=0x005d10de rev=0xa3 hdr=0x01
    vendor     = 'Nvidia Corp'
    device     = 'nForce4 PCI Express Bridge'
    class      = bridge
    subclass   = PCI-PCI
    cap 01[40] = powerspec 2  supports D0 D3  current D0
    cap 05[48] = MSI supports 2 messages, 64 bit 
    cap 08[58] = HT MSI address window enabled at 0xfee00000
    cap 10[80] = PCI-Express 1 root port
hostb0@pci0:24:0:       class=0x060000 card=0x00000000 chip=0x11001022 rev=0x00 hdr=0x00
    vendor     = 'Advanced Micro Devices (AMD)'
    device     = 'Athlon 64 / Opteron HyperTransport Technology Configuration'
    class      = bridge
    subclass   = HOST-PCI
    cap 08[80] = HT host
hostb1@pci0:24:1:       class=0x060000 card=0x00000000 chip=0x11011022 rev=0x00 hdr=0x00
    vendor     = 'Advanced Micro Devices (AMD)'
    device     = 'Athlon 64 / Opteron Address Map'
    class      = bridge
    subclass   = HOST-PCI
hostb2@pci0:24:2:       class=0x060000 card=0x00000000 chip=0x11021022 rev=0x00 hdr=0x00
    vendor     = 'Advanced Micro Devices (AMD)'
    device     = 'Athlon 64 / Opteron DRAM Controller'
    class      = bridge
    subclass   = HOST-PCI
hostb3@pci0:24:3:       class=0x060000 card=0x00000000 chip=0x11031022 rev=0x00 hdr=0x00
    vendor     = 'Advanced Micro Devices (AMD)'
    device     = 'Athlon 64 / Opteron Miscellaneous Control'
    class      = bridge
    subclass   = HOST-PCI
atapci0@pci1:0:0:       class=0x018000 card=0x819f1043 chip=0x31321095 rev=0x01 hdr=0x00
    vendor     = 'Silicon Image Inc (Was: CMD Technology Inc)'
    device     = 'SiI 3132 PCI Express (1x) to 2 Port SATA300'
    class      = mass storage
    cap 01[54] = powerspec 2  supports D0 D1 D2 D3  current D0
    cap 05[5c] = MSI supports 1 message, 64 bit 
    cap 10[70] = PCI-Express 1 legacy endpoint
mskc0@pci2:0:0: class=0x020000 card=0x81421043 chip=0x436211ab rev=0x15 hdr=0x00
    vendor     = 'Marvell Semiconductor (Was: Galileo Technology Ltd)'
    device     = 'Yukon 88E8053 PCI-E Gigabit Ethernet Controller (Copper)'
    class      = network
    subclass   = ethernet
    cap 01[48] = powerspec 2  supports D0 D1 D2 D3  current D0
    cap 03[50] = VPD
    cap 05[5c] = MSI supports 2 messages, 64 bit enabled with 2 messages
    cap 10[e0] = PCI-Express 1 legacy endpoint
vgapci0@pci3:0:0:       class=0x030000 card=0x81981043 chip=0x00f910de rev=0xa2 hdr=0x00
    vendor     = 'Nvidia Corp'
    device     = 'GeForce 6800 Series GPU [BR02.1]'
    class      = display
    subclass   = VGA
    cap 01[60] = powerspec 2  supports D0 D3  current D0
    cap 05[68] = MSI supports 1 message, 64 bit 
    cap 10[78] = PCI-Express 1 legacy endpoint
pcm0@pci4:8:0:  class=0x040100 card=0x36311412 chip=0x17241412 rev=0x01 hdr=0x00
    vendor     = 'VIA Technologies Inc (Was: IC Ensemble Inc)'
    device     = 'VT1720/24 Envy24PT/HT PCI Multi-Channel Audio Controller'
    class      = multimedia
    subclass   = audio
    cap 01[80] = powerspec 1  supports D0 D2 D3  current D0
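
Which devices advertise MSI can be read off the `cap 05[..] = MSI` lines in the listing above (PCI capability ID 0x05 is MSI). A device without that capability depends on a legacy interrupt line, which may be shared with other devices. The following is a small Python sketch, assuming the `pciconf -lvc` layout shown here; the helper name `msi_capable` and the trimmed sample are illustrative only.

```python
import re

# Trimmed copy of two entries from the pciconf -lvc output in this report.
SAMPLE = """\
nfe0@pci0:19:0: class=0x068000 card=0x81411043 chip=0x005710de rev=0xa3 hdr=0x00
    class      = bridge
    cap 01[44] = powerspec 2  supports D0 D1 D2 D3  current D0
mskc0@pci2:0:0: class=0x020000 card=0x81421043 chip=0x436211ab rev=0x15 hdr=0x00
    cap 01[48] = powerspec 2  supports D0 D1 D2 D3  current D0
    cap 05[5c] = MSI supports 2 messages, 64 bit enabled with 2 messages
"""

# A device entry starts in column 0 with "name@pciN:...".
DEVICE_RE = re.compile(r"^(\w+)@pci")
# Capability ID 05 is MSI; "cap 08 ... HT MSI address window" is a
# different (HyperTransport) capability and is deliberately not matched.
MSI_RE = re.compile(r"^\s+cap 05\[[0-9a-f]+\] = MSI\b")


def msi_capable(text):
    """Map each device name to True if it lists an MSI capability."""
    result = {}
    current = None
    for line in text.splitlines():
        m = DEVICE_RE.match(line)
        if m:
            current = m.group(1)
            result[current] = False
        elif current is not None and MSI_RE.match(line):
            result[current] = True
    return result


for name, has_msi in msi_capable(SAMPLE).items():
    print(name, "MSI" if has_msi else "no MSI capability listed")
```

In the sample, nfe0 lists only the power-management capability, while mskc0 lists MSI and has it enabled.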





dmesg
======

Copyright (c) 1992-2007 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
        The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 7.0-CURRENT #0: Sun Jul 29 03:12:12 CEST 2007
    root@zbv800.org:/usr/obj/usr/src/sys/ZBV800
ACPI APIC Table: <A_M_I_ OEMAPIC >
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: AMD Athlon(tm) 64 Processor 3500+ (2376.07-MHz K8-class CPU)
  Origin = "AuthenticAMD"  Id = 0x10ff0  Stepping = 0
  Features=0x78bfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,MMX,FXSR,SSE,SSE2>
  AMD Features=0xe2500800<SYSCALL,NX,MMX+,FFXSR,LM,3DNow!+,3DNow!>
  AMD Features2=0x1<LAHF>
usable memory = 2138107904 (2039 MB)
avail memory  = 2063650816 (1968 MB)
ioapic0 <Version 1.1> irqs 0-23 on motherboard
netsmb_dev: loaded
kbd1 at kbdmux0
cryptosoft0: <software crypto> on motherboard
acpi0: <A_M_I_ OEMXSDT> on motherboard
acpi0: [ITHREAD]
acpi0: Power Button (fixed)
acpi0: reservation of 0, a0000 (3) failed
acpi0: reservation of 100000, 7ff00000 (3) failed
Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000
acpi_timer0: <24-bit timer at 3.579545MHz> port 0x508-0x50b on acpi0
cpu0: <ACPI CPU> on acpi0
acpi_throttle0: <ACPI CPU Throttling> on cpu0
pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
pci0: <ACPI PCI bus> on pcib0
pci0: <memory, RAM> at device 0.0 (no driver attached)
pci0: <memory, RAM> at device 0.1 (no driver attached)
pci0: <memory, RAM> at device 0.2 (no driver attached)
pci0: <memory, RAM> at device 0.3 (no driver attached)
pci0: <memory, RAM> at device 0.4 (no driver attached)
pci0: <memory, RAM> at device 0.5 (no driver attached)
pci0: <memory, RAM> at device 0.6 (no driver attached)
pci0: <memory, RAM> at device 0.7 (no driver attached)
pcib1: <ACPI PCI-PCI bridge> at device 2.0 on pci0
pci1: <ACPI PCI bus> on pcib1
atapci0: <SiI 3132 SATA300 controller> port 0xcc00-0xcc7f mem 0xdddffc00-0xdddffc7f,0xdddf8000-0xdddfbfff irq 16 at device 0.0 on pci1
atapci0: [ITHREAD]
ata2: <ATA channel 0> on atapci0
ata2: [ITHREAD]
ata3: <ATA channel 1> on atapci0
ata3: [ITHREAD]
pcib2: <ACPI PCI-PCI bridge> at device 3.0 on pci0
pci2: <ACPI PCI bus> on pcib2
mskc0: <Marvell Yukon 88E8053 Gigabit Ethernet> port 0xd800-0xd8ff mem 0xddefc000-0xddefffff irq 17 at device 0.0 on pci2
msk0: <Marvell Technology Group Ltd. Yukon EC Id 0xb6 Rev 0x02> on mskc0
msk0: Ethernet address: XX:XX:XX:XX:XX:XX
miibus0: <MII bus> on msk0
e1000phy0: <Marvell 88E1111 Gigabit PHY> PHY 0 on miibus0
e1000phy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseTX-FDX, auto
mskc0: [FILTER]
pcib3: <ACPI PCI-PCI bridge> at device 4.0 on pci0
pci3: <ACPI PCI bus> on pcib3
vgapci0: <VGA-compatible display> mem 0xdf000000-0xdfffffff,0xc0000000-0xcfffffff,0xde000000-0xdeffffff irq 17 at device 0.0 on pci3
pci0: <memory> at device 9.0 (no driver attached)
isab0: <PCI-ISA bridge> at device 10.0 on pci0
isa0: <ISA bus> on isab0
nfsmb0: <nForce2/3/4 MCP SMBus Controller> port 0xbc00-0xbc1f,0x600-0x63f,0x700-0x73f irq 20 at device 10.1 on pci0
smbus0: <System Management Bus> on nfsmb0
smb0: <SMBus generic I/O> on smbus0
nfsmb1: <nForce2/3/4 MCP SMBus Controller> on nfsmb0
smbus1: <System Management Bus> on nfsmb1
smb1: <SMBus generic I/O> on smbus1
ohci0: <OHCI (generic) USB controller> mem 0xddcfa000-0xddcfafff irq 21 at device 11.0 on pci0
ohci0: [GIANT-LOCKED]
ohci0: [ITHREAD]
usb0: OHCI version 1.0, legacy support
usb0: SMM does not respond, resetting
usb0: <OHCI (generic) USB controller> on ohci0
usb0: USB revision 1.0
uhub0: <nVidia OHCI root hub, class 9/0, rev 1.00/1.00, addr 1> on usb0
uhub0: 10 ports with 10 removable, self powered
ehci0: <NVIDIA nForce4 USB 2.0 controller> mem 0xddcfbc00-0xddcfbcff irq 22 at device 11.1 on pci0
ehci0: [GIANT-LOCKED]
ehci0: [ITHREAD]
usb1: EHCI version 1.0
usb1: companion controller, 4 ports each: usb0
usb1: <NVIDIA nForce4 USB 2.0 controller> on ehci0
usb1: USB revision 2.0
uhub1: <nVidia EHCI root hub, class 9/0, rev 2.00/1.00, addr 1> on usb1
uhub1: 10 ports with 10 removable, self powered
atapci1: <nVidia nForce CK804 UDMA133 controller> port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xffa0-0xffaf at device 15.0 on pci0
ata0: <ATA channel 0> on atapci1
ata0: [ITHREAD]
ata1: <ATA channel 1> on atapci1
ata1: [ITHREAD]
atapci2: <nVidia nForce CK804 SATA300 controller> port 0xb480-0xb487,0xb400-0xb403,0xb080-0xb087,0xb000-0xb003,0xac00-0xac0f mem 0xddcf9000-0xddcf9fff irq 23 at device 16.0 on pci0
atapci2: [ITHREAD]
ata4: <ATA channel 0> on atapci2
ata4: [ITHREAD]
ata5: <ATA channel 1> on atapci2
ata5: [ITHREAD]
atapci3: <nVidia nForce CK804 SATA300 controller> port 0xa880-0xa887,0xa800-0xa803,0xa480-0xa487,0xa400-0xa403,0xa080-0xa08f mem 0xddcf8000-0xddcf8fff irq 20 at device 17.0 on pci0
atapci3: [ITHREAD]
ata6: <ATA channel 0> on atapci3
ata6: [ITHREAD]
ata7: <ATA channel 1> on atapci3
ata7: [ITHREAD]
pcib4: <ACPI PCI-PCI bridge> at device 18.0 on pci0
pci4: <ACPI PCI bus> on pcib4
pcm0: <Envy24GT audio (M-Audio Revolution 5.1)> port 0xec00-0xec1f,0xe880-0xe8ff irq 18 at device 8.0 on pci4
pcm0: [GIANT-LOCKED]
pcm0: [ITHREAD]
pcm0: system configuration
  SubVendorID: 0x1412, SubDeviceID: 0x3631
  XIN2 Clock Source: 49.152MHz(192kHz*256)
  MPU-401 UART(s) #: not implemented
  ADC #: 1
  DAC #: 3
  Multi-track converter type: I2S(with volume, 192KHz support, 24bit resolution, ID#0x0)
  S/PDIF(IN/OUT): 0/1 ID# 0x00
  GPIO(mask/dir/state): 0x3fff85/0x4000fa/0x72
nfe0: <NVIDIA nForce4 CK804 MCP9 Networking Adapter> port 0xa000-0xa007 mem 0xddcf7000-0xddcf7fff irq 21 at device 19.0 on pci0
miibus1: <MII bus> on nfe0
e1000phy1: <Marvell 88E1111 Gigabit PHY> PHY 1 on miibus1
e1000phy1:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseTX-FDX, auto
nfe0: Ethernet address: XX:XX:XX:XX:XX:XX
nfe0: [FILTER]
pcib5: <ACPI PCI-PCI bridge> at device 22.0 on pci0
pci5: <ACPI PCI bus> on pcib5
pcib6: <ACPI PCI-PCI bridge> at device 23.0 on pci0
pci6: <ACPI PCI bus> on pcib6
acpi_button0: <Power Button> on acpi0
fdc0: <floppy drive controller (FDE)> port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on acpi0
fdc0: [FILTER]
fd0: <1440-KB 3.5" drive> on fdc0 drive 0
acpi_aiboost0: <ASUStek AIBOOSTER> on acpi0
atkbdc0: <Keyboard controller (i8042)> port 0x60,0x64 irq 1 on acpi0
atkbd0: <AT Keyboard> irq 1 on atkbdc0
kbd0 at atkbd0
atkbd0: [GIANT-LOCKED]
atkbd0: [ITHREAD]
sio0: configured irq 4 not in bitmap of probed irqs 0
sio0: port may not be enabled
sio0: configured irq 4 not in bitmap of probed irqs 0
sio0: port may not be enabled
sio0: <16550A-compatible COM port> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
sio0: type 16550A
sio0: [FILTER]
sc0: <System console> at flags 0x100 on isa0
sc0: VGA <8 virtual consoles, flags=0x300>
vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
fb0 at vga0
sio1: configured irq 3 not in bitmap of probed irqs 0
sio1: port may not be enabled
ums0: <Logitech USB-PS/2 Optical Mouse, class 0/0, rev 2.00/22.00, addr 2> on uhub0
ums0: 8 buttons and Z dir.
NULL mp in getnewvnode()
Timecounter "TSC" frequency 2376069771 Hz quality 800
Timecounters tick every 1.333 msec
Fast IPsec: Initialized Security Association Processing.
acd0: DVDR <NEC DVD RW ND-3500AG/2.1B> at ata0-master UDMA33
ad8: 238475MB <HDT722525DLA380 V44OA96A> at ata4-master SATA300
ad10: 238475MB <HDT722525DLA380 V44OA96A> at ata5-master SATA300
ad12: 190782MB <SAMSUNG SP2004C VM100-31> at ata6-master SATA300
ar0: 476950MB <nVidia MediaShield RAID0 (stripe 64 KB)> status: READY
ar0: disk0 READY using ad10 at ata5-master
ar0: disk1 READY using ad8 at ata4-master
GEOM: ad12: corrupt or invalid GPT detected.
GEOM: ad12: GPT rejected -- may not be recoverable.
GEOM_LABEL: Label for provider ar0s3 is ufs/HOMES.
GEOM_LABEL: Label for provider ar0s1d is ufs/VAR.
GEOM_LABEL: Label for provider ar0s1e is ufs/USR.
GEOM_LABEL: Label for provider ar0s1f is ufs/LOCAL.
GEOM_LABEL: Label for provider ar0s1g is ufs/OBJ.
GEOM_LABEL: Label for provider ar0s1h is ufs/SRC.
GEOM_LABEL: Label for provider ar0s2d is ufs/COMPAT.
GEOM_LABEL: Label for provider ar0s2e is ufs/DATABASE.
GEOM_LABEL: Label for provider ar0s2f is ufs/PORTS.
GEOM_LABEL: Label for provider ar0s2g is ufs/SCRATCH.
cd0 at ata0 bus 0 target 0 lun 0
cd0: <_NEC DVD_RW ND-3500AG 2.1B> Removable CD-ROM SCSI-0 device 
cd0: 33.000MB/s transfers
cd0: Attempt to query device size failed: NOT READY, Medium not present
Trying to mount root from ufs:/dev/ar0s1a
GEOM_ELI: Device ar0s1b.eli created.
GEOM_ELI: Encryption: AES-CBC 256
GEOM_ELI:     Crypto: software
GEOM_ELI: Device ar0s2b.eli created.
GEOM_ELI: Encryption: AES-CBC 256
GEOM_ELI:     Crypto: software
GEOM_LABEL: Label ufs/VAR removed.
GEOM_LABEL: Label for provider ar0s1d is ufs/VAR.
GEOM_LABEL: Label ufs/COMPAT removed.
GEOM_LABEL: Label for provider ar0s2d is ufs/COMPAT.
GEOM_LABEL: Label ufs/HOMES removed.
GEOM_LABEL: Label for provider ar0s3 is ufs/HOMES.
GEOM_LABEL: Label ufs/USR removed.
GEOM_LABEL: Label for provider ar0s1e is ufs/USR.
GEOM_LABEL: Label ufs/DATABASE removed.
GEOM_LABEL: Label for provider ar0s2e is ufs/DATABASE.
GEOM_LABEL: Label ufs/LOCAL removed.
GEOM_LABEL: Label for provider ar0s1f is ufs/LOCAL.
GEOM_LABEL: Label ufs/OBJ removed.
GEOM_LABEL: Label for provider ar0s1g is ufs/OBJ.
GEOM_LABEL: Label ufs/PORTS removed.
GEOM_LABEL: Label for provider ar0s2f is ufs/PORTS.
GEOM_LABEL: Label ufs/SRC removed.
GEOM_LABEL: Label for provider ar0s1h is ufs/SRC.
GEOM_LABEL: Label ufs/SCRATCH removed.
GEOM_LABEL: Label for provider ar0s2g is ufs/SCRATCH.
GEOM_LABEL: Label ufs/VAR removed.
GEOM_LABEL: Label ufs/COMPAT removed.
GEOM_LABEL: Label ufs/HOMES removed.
GEOM_LABEL: Label ufs/USR removed.
GEOM_LABEL: Label ufs/DATABASE removed.
GEOM_LABEL: Label ufs/LOCAL removed.
GEOM_LABEL: Label ufs/OBJ removed.
GEOM_LABEL: Label ufs/PORTS removed.
GEOM_LABEL: Label ufs/SRC removed.
GEOM_LABEL: Label ufs/SCRATCH removed.
WARNING: ZFS is considered to be an experimental feature in FreeBSD.
ZFS filesystem version 6
ZFS storage pool version 6
GEOM: ad12: corrupt or invalid GPT detected.
GEOM: ad12: GPT rejected -- may not be recoverable.
nfe0: link state changed to UP

>How-To-Repeat:

>Fix:


>Release-Note:
>Audit-Trail:
Responsible-Changed-From-To: freebsd-amd64->yongari 
Responsible-Changed-By: yongari 
Responsible-Changed-When: Thu Aug 2 00:16:19 UTC 2007 
Responsible-Changed-Why:  
Grab. 

http://www.freebsd.org/cgi/query-pr.cgi?pr=115126 

From: Pyun YongHyeon <pyunyh@gmail.com>
To: "O. Hartmann" <ohartman@zedat.fu-berlin.de>
Cc: bug-followup@FreeBSD.org
Subject: Re: amd64/115126: nfe0: watchdog timeout (missed Tx interrupts) -- recovering (UP with SCHED_ULE)
Date: Thu, 2 Aug 2007 09:40:02 +0900

 On Thu, Aug 02, 2007 at 12:16:42AM +0000, yongari@FreeBSD.org wrote:
  > Synopsis: nfe0: watchdog timeout (missed Tx interrupts) -- recovering (UP with SCHED_ULE)
  > 
  > Responsible-Changed-From-To: freebsd-amd64->yongari
  > Responsible-Changed-By: yongari
  > Responsible-Changed-When: Thu Aug 2 00:16:19 UTC 2007
  > Responsible-Changed-Why: 
  > Grab.
  > 
  > http://www.freebsd.org/cgi/query-pr.cgi?pr=115126
 
 It seems that your NIC doesn't have MSI capability and the driver
 uses an interrupt shared with USB. To cope with the interrupt sharing
 case, try enabling polling(4) and let me know the result.
 
 Btw, I don't know why pcm0 (actually, it would be envy24ht(4))
 generates lots of interrupts. You may get a better answer from the
 multimedia ML.
 
 -- 
 Regards,
 Pyun YongHyeon

From: Jason Bacon <jbacon@mcw.edu>
To: bug-followup@freebsd.org, ohartman@zedat.fu-berlin.de
Cc:  
Subject: Re: amd64/115126: [nfe] nfe0: watchdog timeout (missed Tx interrupts) -- recovering (UP with SCHED_ULE)
Date: Fri, 4 Jan 2008 11:04:47 -0600

 I ran into the same problem with PC-BSD 1.4.1.  Kernel rebuilt with ARG_MAX 
 increased to 1048576 and ehci disabled, because we kept losing the ulpt0 
 device (connected to an HP LJ4250).
 
 uname -a:
 
 FreeBSD apu.neuro.mcw.edu 6.3-PRERELEASE FreeBSD 6.3-PRERELEASE #1: Wed Jan  2 
 13:38:26 CST 2008     bacon@apu.neuro.mcw.edu:/usr/obj/usr/src/sys/PCBSD-SMP  
 i386
 
 Motherboard: ASUS M2N4-SLI with onboard NIC and sound.
 
 'vmstat -i' initially showed nfe0 and pcm0 sharing an irq.  I tried turning on 
 polling:
 
 <<<ROOT@apu>>> /home/bacon 456 # ifconfig nfe0 polling
 ifconfig: polling: Invalid argument
 
 I then rebooted and disabled the on-board sound in the BIOS (which is fine 
 with me, since this machine is a file server).  Now nfe0 has its own irq, 
 and everyone is happy.
 
 FreeBSD apu bacon ~ 73: vmstat -i
 interrupt                          total       rate
 irq1: atkbd0                         492          0
 irq6: fdc0                             9          0
 irq15: ata1                        10144          2
 irq16: dc0                             3          0
 irq21: ohci0+                      46205          9
 irq22: nfe0                       693562        137
 cpu0: timer                     10087884       1999
 cpu1: timer                     10080334       1998
 Total                           20918633       4146
 
 dmesg.boot:
 
 (dc0 is just a Netgear PCI NIC I popped in while the system was down, in 
 case disabling the sound in the BIOS didn't do the trick.  It'll come out 
 next time I shut down if nfe0 behaves.)
 
 Copyright (c) 1992-2007 The FreeBSD Project.
 Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
 	The Regents of the University of California. All rights reserved.
 FreeBSD is a registered trademark of The FreeBSD Foundation.
 FreeBSD 6.3-PRERELEASE #1: Wed Jan  2 13:38:26 CST 2008
     bacon@apu.neuro.mcw.edu:/usr/obj/usr/src/sys/PCBSD-SMP
 ACPI APIC Table: <Nvidia AWRDACPI>
 Timecounter "i8254" frequency 1193182 Hz quality 0
 CPU: AMD Athlon(tm) 64 X2 Dual Core Processor 5200+ (2613.41-MHz 686-class 
 CPU)
   Origin = "AuthenticAMD"  Id = 0x40f32  Stepping = 2
   
 Features=0x178bfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,MMX,FXSR,SSE,SSE2,HTT>
   Features2=0x2001<SSE3,CX16>
   AMD Features=0xea500800<SYSCALL,NX,MMX+,FFXSR,RDTSCP,LM,3DNow!+,3DNow!>
   AMD Features2=0x1f<LAHF,CMP,SVM,ExtAPIC,CR8>
   Cores per package: 2
 real memory  = 3756916736 (3582 MB)
 avail memory = 3673157632 (3502 MB)
 FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
  cpu0 (BSP): APIC ID:  0
  cpu1 (AP): APIC ID:  1
 ioapic0: Changing APIC ID to 2
 ioapic0 <Version 1.1> irqs 0-23 on motherboard
 kbd1 at kbdmux0
 ath_hal: 0.9.20.3 (AR5210, AR5211, AR5212, RF5111, RF5112, RF2413, RF5413)
 acpi0: <Nvidia AWRDACPI> on motherboard
 acpi0: Power Button (fixed)
 Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000
 acpi_timer0: <24-bit timer at 3.579545MHz> port 0x4008-0x400b on acpi0
 cpu0: <ACPI CPU> on acpi0
 cpu1: <ACPI CPU> on acpi0
 acpi_button0: <Power Button> on acpi0
 pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
 pci0: <ACPI PCI bus> on pcib0
 pci0: <memory> at device 0.0 (no driver attached)
 isab0: <PCI-ISA bridge> at device 1.0 on pci0
 isa0: <ISA bus> on isab0
 pci0: <serial bus, SMBus> at device 1.1 (no driver attached)
 ohci0: <OHCI (generic) USB controller> mem 0xfe02f000-0xfe02ffff irq 21 at 
 device 2.0 on pci0
 ohci0: [GIANT-LOCKED]
 usb0: OHCI version 1.0, legacy support
 usb0: SMM does not respond, resetting
 usb0: <OHCI (generic) USB controller> on ohci0
 usb0: USB revision 1.0
 uhub0: nVidia OHCI root hub, class 9/0, rev 1.00/1.00, addr 1
 uhub0: 10 ports with 10 removable, self powered
 pci0: <serial bus, USB> at device 2.1 (no driver attached)
 atapci0: <nVidia nForce CK804 UDMA133 controller> port 
 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xf000-0xf00f at device 6.0 on pci0
 ata0: <ATA channel 0> on atapci0
 ata1: <ATA channel 1> on atapci0
 atapci1: <nVidia nForce CK804 SATA300 controller> port 
 0x9f0-0x9f7,0xbf0-0xbf3,0x970-0x977,0xb70-0xb73,0xdc00-0xdc0f mem 
 0xfe02d000-0xfe02dfff irq 23 at device 7.0 on pci0
 ata2: <ATA channel 0> on atapci1
 ata3: <ATA channel 1> on atapci1
 atapci2: <nVidia nForce CK804 SATA300 controller> port 
 0x9e0-0x9e7,0xbe0-0xbe3,0x960-0x967,0xb60-0xb63,0xc800-0xc80f mem 
 0xfe02c000-0xfe02cfff irq 21 at device 8.0 on pci0
 ata4: <ATA channel 0> on atapci2
 ata5: <ATA channel 1> on atapci2
 pcib1: <ACPI PCI-PCI bridge> at device 9.0 on pci0
 pci1: <ACPI PCI bus> on pcib1
 dc0: <82c169 PNIC 10/100BaseTX> port 0xbc00-0xbcff mem 0xfdfef000-0xfdfef0ff 
 irq 16 at device 6.0 on pci1
 miibus0: <MII bus> on dc0
 bmtphy0: <BCM5201 10/100baseTX PHY> on miibus0
 bmtphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
 dc0: Ethernet address: 00:a0:cc:dc:0f:4e
 pci1: <display, VGA> at device 7.0 (no driver attached)
 nfe0: <NVIDIA nForce4 CK804 MCP9 Networking Adapter> port 0xc400-0xc407 mem 
 0xfe02b000-0xfe02bfff irq 22 at device 10.0 on pci0
 miibus1: <MII bus> on nfe0
 ukphy0: <Generic IEEE 802.3u media interface> on miibus1
 ukphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT-FDX, auto
 nfe0: Ethernet address: 00:17:31:5d:8b:13
 nfe0: [FAST]
 pcib2: <ACPI PCI-PCI bridge> at device 11.0 on pci0
 pci2: <ACPI PCI bus> on pcib2
 pcib3: <ACPI PCI-PCI bridge> at device 12.0 on pci0
 pci3: <ACPI PCI bus> on pcib3
 pcib4: <ACPI PCI-PCI bridge> at device 13.0 on pci0
 pci4: <ACPI PCI bus> on pcib4
 pcib5: <ACPI PCI-PCI bridge> at device 14.0 on pci0
 pci5: <ACPI PCI bus> on pcib5
 acpi_tz0: <Thermal Zone> on acpi0
 fdc0: <floppy drive controller> port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on acpi0
 fdc0: [FAST]
 fd0: <1440-KB 3.5" drive> on fdc0 drive 0
 sio0: <16550A-compatible COM port> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
 sio0: type 16550A
 ppc0: <Standard parallel printer port> port 0x378-0x37f irq 7 on acpi0
 ppc0: Generic chipset (NIBBLE-only) in COMPATIBLE mode
 ppbus0: <Parallel port bus> on ppc0
 plip0: <PLIP network interface> on ppbus0
 lpt0: <Printer> on ppbus0
 lpt0: Interrupt-driven port
 ppi0: <Parallel I/O> on ppbus0
 atkbdc0: <Keyboard controller (i8042)> port 0x60,0x64 irq 1 on acpi0
 atkbd0: <AT Keyboard> irq 1 on atkbdc0
 kbd0 at atkbd0
 atkbd0: [GIANT-LOCKED]
 pmtimer0 on isa0
 orm0: <ISA Option ROM> at iomem 0xc0000-0xc7fff on isa0
 sc0: <System console> at flags 0x100 on isa0
 sc0: VGA <16 virtual consoles, flags=0x300>
 sio1: configured irq 3 not in bitmap of probed irqs 0
 sio1: port may not be enabled
 vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
 ulpt0: Hewlett-Packard hp LaserJet 4250, rev 2.00/1.00, addr 2, iclass 7/1
 ulpt0: using bi-directional mode
 ums0: Microsoft Microsoft 3-Button Mouse with IntelliEye(TM), rev 1.10/3.00, 
 addr 3, iclass 3/1
 ums0: 3 buttons and Z dir.
 Timecounters tick every 1.000 msec
 acd0: CDROM <FX4830T/R02J> at ata1-master UDMA33
 ad0: 715404MB <Seagate ST3750640AS 3.AAE> at ata4-master SATA300
 SMP: AP CPU #1 Launched!
 cd0 at ata1 bus 0 target 0 lun 0
 cd0: <MITSUMI CD-ROM FX4830T!B R02J> Removable CD-ROM SCSI-0 device 
 cd0: 33.000MB/s transfers
 cd0: Attempt to query device size failed: NOT READY, Medium not present
 Trying to mount root from ufs:/dev/ad0s1a

From: Jason Bacon <jbacon@mcw.edu>
To: pyunyh@gmail.com
Cc: yongari@FreeBSD.org, bug-followup@FreeBSD.org
Subject: Re: amd64/115126: [nfe] nfe0: watchdog timeout (missed Tx interrupts)
 -- recovering (UP with SCHED_ULE)
Date: Mon, 07 Jan 2008 09:36:25 -0600

 Pyun YongHyeon wrote:
 > polling(4) requires rebuilding the kernel with 'options DEVICE_POLLING'.
 > Without that, 'ifconfig nfe0 polling' has no effect. See polling(4)
 > for more details.
 >
 >   
 Ooops, sorry.  Overlooked the synopsis...  :-(
 
 >    >  ukphy0: <Generic IEEE 802.3u media interface> on miibus1
 >     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 >
 > This is strange. It seems that you didn't apply the necessary phy patches,
 > as the phy hardware normally requires a dedicated driver instead of the
 > generic ukphy(4). Because 6.3-PRERELEASE does not have nfe(4) in the
 > base system, I guess you manually patched your box to get nfe(4)
 > working. Please check that you have the required phy changes to make
 > nfe(4) work on your box. See the following URL to check what phy
 > patches are needed.
 > http://www.f.csce.kyushu-u.ac.jp/~shigeaki/software/freebsd-nfe.html
 >
 >   
 Wasn't my doing.  This must be a PC-BSD enhancement.  I'll try to verify 
 that and pass this on to the PC-BSD developers if that's the case.
 
 Thanks for the feedback,
 
     Jason
 

From: Luigi Rizzo <rizzo@iet.unipi.it>
To: bug-followup@FreeBSD.org, yongari@freebsd.org
Cc: current@freebsd.org
Subject: Re: amd64/115126: [nfe] nfe0: watchdog timeout (missed Tx interrupts) -- recovering (UP with SCHED_ULE)
Date: Tue, 22 Apr 2008 09:28:39 +0200

 related to this bug, i am seeing similar problems with RELENG_7 and amd64,
 with an ASUS M2N-VM DVI motherboard
 http://www.asus.com/products.aspx?modelmenu=1&model=1841&l1=3&l2=101&l3=567&l4=0
 and an Athlon64-BE2400 dual core CPU .
 
 Under heavy load, e.g. scp-ing a large file over the local network,
 and at the same time doing a buildkernel or building a port,
 and with X11 active (using the 'vesa' xorg driver)
 the network card stalls and doesn't recover - i waited over 10 minutes
 hoping for the watchdog or some timeout to kick in, the only way
 to bring the link back up was
 
 	ifconfig nfe0 down ; ifconfig nfe0 up
 	dhclient nfe0
 
 doing only ifconfig down/up or only dhclient did not help, i needed both.
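 
 The recovery sequence above can be sketched as a small shell helper.
 This is only an illustration: the function names and the DRY_RUN
 switch are hypothetical, and the steps are chained with && rather
 than ; so that a failing step stops the sequence.

```sh
#!/bin/sh
# Sketch of the manual recovery sequence described above.
# DRY_RUN is a hypothetical switch added for illustration: when set to 1,
# the commands are printed instead of executed.

run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# recover_iface IFACE: bounce the interface, then renew the DHCP lease.
# Both steps were needed in the report above; neither helped alone.
recover_iface() {
    iface="$1"
    run ifconfig "$iface" down &&
    run ifconfig "$iface" up &&
    run dhclient "$iface"
}
```

 Calling "DRY_RUN=1 recover_iface nfe0" only prints the three commands;
 without DRY_RUN they are executed and need root privileges.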
 
 vmstat -i says the network card has irq256 (???) and it is not shared with
 other devices. Ehci, sound, ohci, ata, and others have low irq numbers
 (6, 14, 20, 21, 22), some shared, some not.
 
 Changing the bios setting for PnP OS from 'yes' to 'no' or vice versa
 does not change the situation.
 
 The stall seems related to the presence of other activity - if i
 let the bulk scp transfer run alone, i get a happy 10-10.5Mbytes/s
 (over a 100meg link).
 
 When the stall occurs, i see no interrupts (the vmstat -i count
 for irq256 confirms this).
 Packets are still transmitted and received on the other side; it's
 the rx side of the card that becomes deaf. I don't see any
 watchdog timeout or other error messages in /var/log/messages.
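 
 The interrupt check described above can be sketched as follows. This
 is a rough illustration, not a tested tool: the helper names are
 hypothetical, and it assumes the usual "name: device  total  rate"
 layout of vmstat -i output, taking the second-to-last column as the
 running total.

```sh
#!/bin/sh
# irq_total NAME: read `vmstat -i` output on stdin and print the "total"
# column for the line whose first field is NAME followed by a colon.
irq_total() {
    awk -v name="$1:" '$1 == name { print $(NF - 1) }'
}

# irq_stalled NAME SECONDS: succeed (exit 0) if the counter for NAME was
# found and did not advance over SECONDS seconds.
irq_stalled() {
    before="$(vmstat -i | irq_total "$1")"
    sleep "$2"
    after="$(vmstat -i | irq_total "$1")"
    [ -n "$before" ] && [ "$before" = "$after" ]
}
```

 For example, "irq_stalled irq256 5 && echo stalled" reports a stall
 when the irq256 counter has not moved for five seconds.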
 
 Also, enabling polling does not help getting traffic in
 (with a kernel built with DEVICE_POLLING,
 doing sysctl kern.polling.enable=1 and "ifconfig nfe0 polling").
 
 So i suspect that for some reason the rx ring becomes confused
 and does not recover.
 
 Hope this helps...
 
 cheers
 luigi

From: Pyun YongHyeon <pyunyh@gmail.com>
To: Luigi Rizzo <rizzo@iet.unipi.it>
Cc: bug-followup@FreeBSD.org, yongari@FreeBSD.org, current@FreeBSD.org
Subject: Re: amd64/115126: [nfe] nfe0: watchdog timeout (missed Tx interrupts) -- recovering (UP with SCHED_ULE)
Date: Wed, 23 Apr 2008 17:22:40 +0900

 On Tue, Apr 22, 2008 at 09:28:39AM +0200, Luigi Rizzo wrote:
  > related to this bug, i am seeing similar problems with RELENG_7 and amd64,
  > with an ASUS M2N-VM DVI motherboard
  > http://www.asus.com/products.aspx?modelmenu=1&model=1841&l1=3&l2=101&l3=567&l4=0
  > and an Athlon64-BE2400 dual core CPU .
  > 
  > Under heavy load, e.g. scp-ing a large file over the local network,
  > and at the same time doing a buildkernel or building a port,
  > and with X11 active (using the 'vesa' xorg driver)
  > the network card stalls and doesn't recover - i waited over 10 minutes
  > hoping for the watchdog or some timeout to kick in, the only way
  > to bring the link back up was
  > 
  > 	ifconfig nfe0 down ; ifconfig nfe0 up
  > 	dhclient nfe0
  > 
  > doing only ifconfig down/up or only dhclient did not help, i needed both.
  > 
  > vmstat -i says the network card has irq256 (???) and it is not shared with
  > other devices. Ehci, sound, ohci, ata, and others have low irq numbers
  > (6, 14, 20, 21, 22), some shared, some not.
  > 
  > Changing the bios setting for PnP OS from 'yes' to 'no' or vice versa
  > does not change the situation.
  > 
 
 Your BIOS may have an ASF-related option for the onboard NIC.
 Try toggling that option and see how it goes.
 
  > The stall seems related to the presence of other activity - if i
  > let the bulk scp transfer run alone, i get a happy 10-10.5Mbytes/s
  > (over a 100meg link).
  > 
  > When the stall occurs, i see no interrupts (the vmstat -i count
  > for irq256 confirms this).
  > Packets are still transmitted and received on the other side; it's
  > the rx side of the card that becomes deaf. I don't see any
  > watchdog timeout or other error messages in /var/log/messages.
  > 
  > Also, enabling polling does not help getting traffic in
  > (with a kernel built with DEVICE_POLLING,
  > doing sysctl kern.polling.enable=1 and "ifconfig nfe0 polling").
  > 
  > So i suspect that for some reason the rx ring becomes confused
  > and does not recover.
  > 
 
 Just a vague guess: how about disabling MSI/MSI-X in loader.conf?
 (hw.nfe.msi_disable="1", hw.nfe.msix_disable="1")
 If you are using jumbo frames, try disabling them too.
 
  > Hope this helps...
  > 
 
 It would be even better if you could post verbose boot messages
 related to nfe(4) and the PHY driver.
 
  > cheers
  > luigi
 
 -- 
 Regards,
 Pyun YongHyeon

From: Luigi Rizzo <rizzo@iet.unipi.it>
To: Pyun YongHyeon <pyunyh@gmail.com>
Cc: bug-followup@FreeBSD.org, yongari@FreeBSD.org, current@FreeBSD.org
Subject: Re: amd64/115126: [nfe] nfe0: watchdog timeout (missed Tx interrupts) -- recovering (UP with SCHED_ULE)
Date: Wed, 23 Apr 2008 11:11:27 +0200

 On Wed, Apr 23, 2008 at 05:22:40PM +0900, Pyun YongHyeon wrote:
 > On Tue, Apr 22, 2008 at 09:28:39AM +0200, Luigi Rizzo wrote:
 >  > related to this bug, i am seeing similar problems with RELENG_7 and amd64,
 >  > with an ASUS M2N-VM DVI motherboard
 >  > http://www.asus.com/products.aspx?modelmenu=1&model=1841&l1=3&l2=101&l3=567&l4=0
 >  > and an Athlon64-BE2400 dual core CPU .
 >  > 
 >  > Under heavy load, e.g. scp-ing a large file over the local network,
 >  > and at the same time doing a buildkernel or building a port,
 >  > and with X11 active (using the 'vesa' xorg driver)
 >  > the network card stalls and doesn't recover - i waited over 10 minutes
 >  > hoping for the watchdog or some timeout to kick in, the only way
 >  > to bring the link back up was
 >  > 
 >  > 	ifconfig nfe0 down ; ifconfig nfe0 up
 >  > 	dhclient nfe0
 >  > 
 >  > doing only ifconfig down/up or only dhclient did not help, i needed both.
 ...
 > Your BIOS may have an ASF-related option for the onboard NIC.
 > Try toggling that option and see how it goes.
 ...
 > Just a vague guess: how about disabling MSI/MSI-X in loader.conf?
 > (hw.nfe.msi_disable="1", hw.nfe.msix_disable="1")
 > If you are using jumbo frames, try disabling them too.
 > 
 >  > Hope this helps...
 >  > 
 > 
 > It would be even better if you could post verbose boot messages
 > related to nfe(4) and the PHY driver.
 
 will try to do all the above, but upon further investigation the
 problem appears even on i386 and really seems related to the
 receive queue filling up and the condition not being detected
 due to a race.
 
 Things like this used to happen in the past in several network drivers,
 and there is a comment suggesting the same thing in one of the
 commit logs for the openbsd nfe driver. So that's the part i am
 going to investigate (i have strong motivations with 5 such machines
 in my lab...)
 
 My preliminary question is the following: is the 'nfe' driver just
 an adaptation from some other driver (possibly trying to guess the
 way the NIC synchronizes with the CPU), or is there someone who
 has carefully studied that specific issue?
 
 	cheers
 	luigi

From: Andriy Gapon <avg@freebsd.org>
To: bug-followup@freebsd.org, ohartman@zedat.fu-berlin.de
Cc:  
Subject: Re: amd64/115126: [nfe] nfe0: watchdog timeout (missed Tx interrupts)
 -- recovering (UP with SCHED_ULE)
Date: Sun, 05 Dec 2010 12:03:10 +0200

 What's the current status of this issue?
 Looks like there have not been any follow-ups for > 2 years.
 
 -- 
 Andriy Gapon
State-Changed-From-To: open->closed 
State-Changed-By: avg 
State-Changed-When: Sun Dec 5 10:38:10 UTC 2010 
State-Changed-Why:  
Closing per submitter's feedback. 

http://www.freebsd.org/cgi/query-pr.cgi?pr=115126 
>Unformatted:
