From nobody@FreeBSD.org  Mon Mar 24 13:33:16 2008
Return-Path: <nobody@FreeBSD.org>
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 3440C1065673
	for <freebsd-gnats-submit@FreeBSD.org>; Mon, 24 Mar 2008 13:33:16 +0000 (UTC)
	(envelope-from nobody@FreeBSD.org)
Received: from www.freebsd.org (www.freebsd.org [IPv6:2001:4f8:fff6::21])
	by mx1.freebsd.org (Postfix) with ESMTP id 2F9BF8FC29
	for <freebsd-gnats-submit@FreeBSD.org>; Mon, 24 Mar 2008 13:33:16 +0000 (UTC)
	(envelope-from nobody@FreeBSD.org)
Received: from www.freebsd.org (localhost [127.0.0.1])
	by www.freebsd.org (8.14.2/8.14.2) with ESMTP id m2ODX5Xs014782
	for <freebsd-gnats-submit@FreeBSD.org>; Mon, 24 Mar 2008 13:33:06 GMT
	(envelope-from nobody@www.freebsd.org)
Received: (from nobody@localhost)
	by www.freebsd.org (8.14.2/8.14.1/Submit) id m2ODX55b014779;
	Mon, 24 Mar 2008 13:33:05 GMT
	(envelope-from nobody)
Message-Id: <200803241333.m2ODX55b014779@www.freebsd.org>
Date: Mon, 24 Mar 2008 13:33:05 GMT
From: "M. E." <unexpectedvalue@yahoo.com>
To: freebsd-gnats-submit@FreeBSD.org
Subject: 6.3-RELEASE kernel crashes on high-volume & high-speed scp
X-Send-Pr-Version: www-3.1
X-GNATS-Notify:

>Number:         122048
>Category:       kern
>Synopsis:       [nve] 6.3-RELEASE kernel crashes on high-volume & high-speed scp
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    freebsd-bugs
>State:          closed
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Mon Mar 24 13:40:01 UTC 2008
>Closed-Date:    Tue Apr 01 06:42:50 UTC 2008
>Last-Modified:  Tue Apr 01 06:42:50 UTC 2008
>Originator:     M. E.
>Release:        6.3-RELEASE
>Organization:
>Environment:
FreeBSD XX.domain.com 6.3-RELEASE FreeBSD 6.3-RELEASE #0: Tue Feb 19 22:30:21 UTC 2008     root@:/usr/src/sys/i386/compile/CUSTOM  i386

added to kernel config before kernel recompiled:

options               IPFIREWALL
options               IPFIREWALL_VERBOSE
options               IPFIREWALL_FORWARD

options               IPDIVERT
options               TCPDEBUG
options               DUMMYNET
options               NETATALK

device                crypto
options               GEOM_ELI

device                atapicam

>Description:
The crash reliably happens while copying with scp between two identical
machines on a 100 Mbit/s LAN. Between 0.07 and 17 GB of data gets
transferred before the crash, which occurs on the machine the data is
copied FROM. scp was run in recursive mode, and the scp user was on the
machine the data was copied TO.

The actual transfer speed was around 70-90 Mbit/s and sshd CPU usage on
the source machine was near 50%.

We tried different individual file sizes between 0.5 and 150 MB and it did
not make a difference.

Switching the direction produces the same result: the crash is always on
the source machine.

Here is dmesg output after the crash:

Copyright (c) 1992-2008 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
        The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 6.3-RELEASE #0: Tue Feb 19 22:30:21 UTC 2008
    root@:/usr/src/sys/i386/compile/CUSTOM
ACPI APIC Table: <HPQOEM SLIC-CPC>
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: AMD Athlon(tm) 64 X2 Dual Core Processor 4000+ (2104.40-MHz 686-class CPU)
  Origin = "AuthenticAMD"  Id = 0x60fb1  Stepping = 1
  Features=0x178bfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,MMX,FXSR,SSE,SSE2,HTT>
  Features2=0x2001<SSE3,CX16>
  AMD Features=0xea500800<SYSCALL,NX,MMX+,FFXSR,RDTSCP,LM,3DNow!+,3DNow!>
  AMD Features2=0x11f<LAHF,CMP,SVM,ExtAPIC,CR8,Prefetch>
  Cores per package: 2
real memory  = 938409984 (894 MB)
avail memory = 904908800 (862 MB)
ioapic0: Changing APIC ID to 2
ioapic0 <Version 1.1> irqs 0-23 on motherboard
kbd1 at kbdmux0
ath_hal: 0.9.20.3 (AR5210, AR5211, AR5212, RF5111, RF5112, RF2413, RF5413)
hptrr: HPT RocketRAID controller driver v1.1 (Feb 19 2008 22:30:08)
acpi0: <HPQOEM SLIC-CPC> on motherboard
acpi0: Power Button (fixed)
Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000
acpi_timer0: <24-bit timer at 3.579545MHz> port 0x4008-0x400b on acpi0
acpi_hpet0: <High Precision Event Timer> iomem 0xfefff000-0xfefff3ff on acpi0
Timecounter "HPET" frequency 25000000 Hz quality 900
cpu0: <ACPI CPU> on acpi0
acpi_button0: <Power Button> on acpi0
pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
pci0: <ACPI PCI bus> on pcib0
pci0: <memory, RAM> at device 0.0 (no driver attached)
pci0: <memory, RAM> at device 0.1 (no driver attached)
pci0: <memory, RAM> at device 0.2 (no driver attached)
pci0: <memory, RAM> at device 0.3 (no driver attached)
pci0: <memory, RAM> at device 0.4 (no driver attached)
pci0: <memory, RAM> at device 0.5 (no driver attached)
pci0: <memory, RAM> at device 0.6 (no driver attached)
pci0: <memory, RAM> at device 0.7 (no driver attached)
pcib1: <ACPI PCI-PCI bridge> at device 4.0 on pci0
pci1: <ACPI PCI bus> on pcib1
pci0: <display, VGA> at device 5.0 (no driver attached)
pci0: <memory, RAM> at device 9.0 (no driver attached)
isab0: <PCI-ISA bridge> at device 10.0 on pci0
isa0: <ISA bus> on isab0
pci0: <serial bus, SMBus> at device 10.1 (no driver attached)
pci0: <memory, RAM> at device 10.2 (no driver attached)
ohci0: <OHCI (generic) USB controller> mem 0xfe02f000-0xfe02ffff irq 21 at device 11.0 on pci0
ohci0: [GIANT-LOCKED]
usb0: OHCI version 1.0, legacy support
usb0: <OHCI (generic) USB controller> on ohci0
usb0: USB revision 1.0
uhub0: nVidia OHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 8 ports with 8 removable, self powered
ehci0: <EHCI (generic) USB 2.0 controller> mem 0xfe02e000-0xfe02e0ff irq 22 at device 11.1 on pci0
ehci0: [GIANT-LOCKED]
usb1: EHCI version 1.0
usb1: companion controller, 8 ports each: usb0
usb1: <EHCI (generic) USB 2.0 controller> on ehci0
usb1: USB revision 2.0
uhub1: nVidia EHCI root hub, class 9/0, rev 2.00/1.00, addr 1
uhub1: 8 ports with 8 removable, self powered
umass0: Generic Mass Storage Device, rev 2.00/1.00, addr 2
ugen0: Ralink 802.11 bg WLAN, rev 2.00/0.01, addr 3
atapci0: <nVidia nForce MCP51 SATA300 controller> port 0x9f0-0x9f7,0xbf0-0xbf3,0x970-0x977,0xb70-0xb73,0xe400-0xe40f mem 0xfe02d000-0xfe02dfff irq 23 at device 14.0 on pci0
ata2: <ATA channel 0> on atapci0
ata3: <ATA channel 1> on atapci0
atapci1: <nVidia nForce MCP51 SATA300 controller> port 0x9e0-0x9e7,0xbe0-0xbe3,0x960-0x967,0xb60-0xb63,0xd000-0xd00f mem 0xfe02c000-0xfe02cfff irq 20 at device 15.0 on pci0
ata4: <ATA channel 0> on atapci1
ata5: <ATA channel 1> on atapci1
pcib2: <ACPI PCI-PCI bridge> at device 16.0 on pci0
pci2: <ACPI PCI bus> on pcib2
fwohci0: <Lucent FW322/323> mem 0xfdbff000-0xfdbfffff irq 16 at device 5.0 on pci2
fwohci0: OHCI version 1.0 (ROM=0)
fwohci0: No. of Isochronous channels is 8.
fwohci0: EUI64 00:11:d8:00:01:45:24:15
fwohci0: Phy 1394a available S400, 2 ports.
fwohci0: Link S400, max_rec 1024 bytes.
fwohci0: max_rec 1024 -> 2048
firewire0: <IEEE1394(FireWire) bus> on fwohci0
fwe0: <Ethernet over FireWire> on firewire0
if_fwe0: Fake Ethernet address: 02:11:d8:45:24:15
fwe0: Ethernet address: 02:11:d8:45:24:15
fwe0: if_start running deferred for Giant
sbp0: <SBP-2/SCSI over FireWire> on firewire0
fwohci0: Initiate bus reset
fwohci0: BUS reset
fwohci0: node_id=0xc800ffc0, gen=1, CYCLEMASTER mode
firewire0: 1 nodes, maxhop <= 0, cable IRM = 0 (me)
firewire0: bus manager 0 (me)
pci2: <simple comms> at device 9.0 (no driver attached)
pci0: <multimedia> at device 16.1 (no driver attached)
nve0: <NVIDIA nForce MCP13 Networking Adapter> port 0xcc00-0xcc07 mem 0xfe02b000-0xfe02bfff irq 22 at device 20.0 on pci0
nve0: Ethernet address 00:1b:fc:23:b7:6e
miibus0: <MII bus> on nve0
rlphy0: <RTL8201L 10/100 media interface> on miibus0
rlphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
nve0: Ethernet address: 00:1b:fc:23:b7:6e
acpi_tz0: <Thermal Zone> on acpi0
atkbdc0: <Keyboard controller (i8042)> port 0x60,0x64 irq 1 on acpi0
atkbd0: <AT Keyboard> irq 1 on atkbdc0
kbd0 at atkbd0
atkbd0: [GIANT-LOCKED]
psm0: <PS/2 Mouse> irq 12 on atkbdc0
psm0: [GIANT-LOCKED]
psm0: model IntelliMouse Explorer, device ID 4
pmtimer0 on isa0
ata0 at port 0x1f0-0x1f7,0x3f6 irq 14 on isa0
ata1 at port 0x170-0x177,0x376 irq 15 on isa0
ppc0: parallel port not found.
sc0: <System console> at flags 0x100 on isa0
sc0: VGA <16 virtual consoles, flags=0x300>
sio0: configured irq 4 not in bitmap of probed irqs 0
sio0: port may not be enabled
sio0 at port 0x3f8-0x3ff irq 4 flags 0x10 on isa0
sio0: type 8250 or not responding
sio1: configured irq 3 not in bitmap of probed irqs 0
sio1: port may not be enabled
vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
Timecounter "TSC" frequency 2104402604 Hz quality 800
Timecounters tick every 1.000 msec
ipfw2 (+ipv6) initialized, divert enabled, rule-based forwarding enabled, default to deny, logging unlimited
hptrr: no controller detected.
ad4: 953869MB <WDC WD10EACS-32ZJB0 01.01B01> at ata2-master SATA150
acd0: DVDR <TSSTcorpCD/DVDW TS-H653L/0514> at ata3-master SATA150
acd0: FAILURE - INQUIRY ILLEGAL REQUEST asc=0x24 ascq=0x00
acd0: FAILURE - INQUIRY ILLEGAL REQUEST asc=0x24 ascq=0x00
cd0 at ata3 bus 0 target 0 lun 0
cd0: <TSSTcorp CD/DVDW TS-H653L 0514> Removable CD-ROM SCSI-0 device
cd0: 3.300MB/s transfers
cd0: Attempt to query device size failed: NOT READY, Medium not present - tray closed
da0 at umass-sim0 bus 0 target 0 lun 0
da0: <Generic USB SD Reader 1.00> Removable Direct Access SCSI-0 device
da0: 40.000MB/s transfers
da0: Attempt to query device size failed: NOT READY, Medium not present
da1 at umass-sim0 bus 0 target 0 lun 1
da1: <Generic USB CF Reader 1.01> Removable Direct Access SCSI-0 device
da1: 40.000MB/s transfers
da1: Attempt to query device size failed: NOT READY, Medium not present
da2 at umass-sim0 bus 0 target 0 lun 2
da2: <Generic USB SM Reader 1.02> Removable Direct Access SCSI-0 device
da2: 40.000MB/s transfers
da2: Attempt to query device size failed: NOT READY, Medium not present
da3 at umass-sim0 bus 0 target 0 lun 3
da3: <Generic USB MS Reader 1.03> Removable Direct Access SCSI-0 device
da3: 40.000MB/s transfers
da3: Attempt to query device size failed: NOT READY, Medium not present
Trying to mount root from ufs:/dev/ad4s1a
WARNING: / was not properly dismounted
Loading configuration files.
Entropy harvesting:
 interrupts
 ethernet
 point_to_point
 kickstart
>How-To-Repeat:
>Fix:
>Release-Note:
>Audit-Trail:

From: "Remko Lodder" <remko@elvandar.org>
To: "M. E." <unexpectedvalue@yahoo.com>
Cc: freebsd-gnats-submit@freebsd.org
Subject: Re: i386/122048: 6.3-RELEASE kernel crashes on high-volume & 
     high-speed scp
Date: Mon, 24 Mar 2008 16:01:27 +0100 (CET)

 On Mon, March 24, 2008 2:33 pm, M. E. wrote:
 
 Hello, thanks for the initial information. It doesn't tell us much so
 far, but let's proceed:
 
 Can you try to obtain kernel crashdumps from the machines? see
 http://www.freebsd.org/doc/en/books/developers-handbook/kerneldebug.html
 for more information on how to do that. We need the backtraces to see what
 crashes and why.
 
 Also please verify whether you have "dumpdev" and "dumpdir" configured in
 /etc/rc.conf so that the dump can be made in the first place.
 
 If the dump results in unreadable information (for us) we might need to
 ask you to recompile the kernel with "-g" support so that we can load the
 symbols, from which we should be able to see what's going on.
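
For reference, a minimal sketch of the rc.conf settings mentioned above. It assumes the stock rc.conf(5) variables and the default crash directory; the values are illustrative and not taken from the submitter's machine. Debug symbols are normally enabled through a makeoptions line in the kernel config file, which is noted here only as a comment.

```sh
# /etc/rc.conf fragment (sketch; assumed values, adjust to the local setup)
dumpdev="AUTO"          # "AUTO" picks the first suitable swap device from /etc/fstab
dumpdir="/var/crash"    # directory where savecore(8) writes vmcore.N after the reboot

# In the kernel config file (not rc.conf), debug symbols are usually enabled with:
#   makeoptions DEBUG=-g
```

The swap device chosen for the dump needs to be at least as large as physical memory for a full dump to fit.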
 
 Thanks,
 remko
 
 
 -- 
 /"\   Best regards,                      | remko@FreeBSD.org
 \ /   Remko Lodder                       | remko@EFnet
  X    http://www.evilcoder.org/          |
 / \   ASCII Ribbon Campaign              | Against HTML Mail and News
 
 

From: True Entropy <unexpectedvalue@yahoo.com>
To: bug-followup@FreeBSD.org
Cc:  
Subject: Re: i386/122048: 6.3-RELEASE kernel crashes on high-volume & high-speed scp
Date: Mon, 24 Mar 2008 07:56:29 -0700 (PDT)

 Apparently the crash is unrelated to the transport method - ftp/ftpd also crash *both*
 machines, source and destination.
 
 
 

From: True Entropy <unexpectedvalue@yahoo.com>
To: bug-followup@FreeBSD.org
Cc: remko@elvandar.org
Subject: Re: i386/122048: 6.3-RELEASE kernel crashes on high-volume & high-speed scp
Date: Tue, 25 Mar 2008 07:41:08 -0700 (PDT)

 Kernel backtrace follows. Looks like an interrupt-handling problem in the nve driver.
 
 
 # kgdb kernel.debug /var/crash/vmcore.0
 kgdb: kvm_nlist(_stopped_cpus):
 kgdb: kvm_nlist(_stoppcbs):
 [GDB will not be able to debug user-mode threads: /usr/lib/libthread_db.so: Undefined
 symbol "ps_pglobal_lookup"]
 GNU gdb 6.1.1 [FreeBSD]
 Copyright 2004 Free Software Foundation, Inc.
 GDB is free software, covered by the GNU General Public License, and you are
 welcome to change it and/or distribute copies of it under certain conditions.
 Type "show copying" to see the conditions.
 There is absolutely no warranty for GDB.  Type "show warranty" for details.
 This GDB was configured as "i386-marcel-freebsd".
 
 Unread portion of the kernel message buffer:
 
 Fatal double fault:
 eip = 0xc090a41a
 esp = 0xe07f9000
 ebp = 0xe07f9008
 panic: double fault
 Uptime: 8h24m13s
 Dumping 894 MB (2 chunks)
   chunk 0: 1MB (159 pages) ... ok
   chunk 1: 894MB (228848 pages) 878 862 846 830 814 798 782 766 750 734 718 702 686 670
 654 638 622 606 590 574 558 542 526 510 494 478 462 446 430 414 398 382 366 350 334 318
 302 286 270 254 238 222 206 190 174 158 142 126 110 94 78 62 46 30 14
 
 #0  doadump () at pcpu.h:165
 165             __asm __volatile("movl %%fs:0,%0" : "=r" (td));
 (kgdb) backtrace
 #0  doadump () at pcpu.h:165
 #1  0xc06af3a2 in boot (howto=260) at ../../../kern/kern_shutdown.c:409
 #2  0xc06af638 in panic (fmt=0xc09d96f8 "double fault") at ../../../kern/kern_shutdown.c:565
 #3  0xc09388c2 in dblfault_handler () at ../../../i386/i386/trap.c:867
 #4  0xc090a41a in nve_dmamap_tx_cb (arg=0xe22ea894, segs=0xc4d08b00, nsegs=-500258640, mapsize=162, error=0)
     at ../../../dev/nve/if_nve.c:280
 #5  0xc0923752 in bus_dmamap_load_mbuf (dmat=0xc4de7500, map=0xc4d07e00, m0=0xc4deea00, callback=0xc090a400 <nve_dmamap_tx_cb>,
     callback_arg=0xe22ea894, flags=1) at ../../../i386/i386/busdma_machdep.c:787
 #6  0xc090b70e in nve_ifstart_locked (ifp=0xc4d0ac00) at ../../../dev/nve/if_nve.c:889
 #7  0xc090c6fa in nve_ospackettx (ctx=0xc4d0f200, id=0xe22ea8b0, success=1) at ../../../dev/nve/if_nve.c:1559
 #8  0xc0872307 in UpdateTransmitDescRingData ()
 #9  0xe22e7238 in ?? ()
 #10 0x00000001 in ?? ()
 #11 0xe24eb000 in ?? ()
 #12 0x00000000 in ?? ()
 #13 0x00000000 in ?? ()
 #14 0xe22e6000 in ?? ()
 #15 0xc08718c7 in ADAPTER_HandleInterruptThroughput ()
 #16 0x00000000 in ?? ()
 #17 0x00000010 in ?? ()
 #18 0xe22ea894 in ?? ()
 #19 0xc4d0ac00 in ?? ()
 #20 0xc4d0f200 in ?? ()
 #21 0xe07f9388 in ?? ()
 #22 0xc090b6de in nve_ifstart_locked (ifp=0xe22e7238) at ../../../dev/nve/if_nve.c:884
 Previous frame inner to this frame (corrupt stack?)
 (kgdb)
 

From: Remko Lodder <remko@elvandar.org>
To: True Entropy <unexpectedvalue@yahoo.com>
Cc: bug-followup@FreeBSD.org
Subject: Re: i386/122048: 6.3-RELEASE kernel crashes on high-volume & high-speed scp
Date: Tue, 25 Mar 2008 15:49:59 +0100

 Now that's something we are aware of. Can you please upgrade to
 releng_7_0 and see whether the NFE driver works? I also recall that
 there was a patch available for 6-STABLE to get NFE supported there,
 but that's not within the tree yet.
 
 Also, however bad it may sound: the nve driver probably will not get fixed
 anymore, since the nfe driver will take over from 7.0 onward (and is
 more stable than the nve driver).
 
 Thanks,
 remko
 --
 /"\   Best regards,					| remko@FreeBSD.org
 \ /   Remko Lodder				| remko@EFnet
 X    http://www.evilcoder.org/		|
 / \   ASCII Ribbon Campaign		| Against HTML Mail and News
 
State-Changed-From-To: open->feedback 
State-Changed-By: linimon 
State-Changed-When: Sat Mar 29 22:36:07 UTC 2008 
State-Changed-Why:  
Sounds like an nve bug.  Submitter has been asked to try 7.0 to see 
if the problem has been fixed. 


Responsible-Changed-From-To: freebsd-i386->freebsd-bugs 
Responsible-Changed-By: linimon 
Responsible-Changed-When: Sat Mar 29 22:36:07 UTC 2008 
Responsible-Changed-Why:  

http://www.freebsd.org/cgi/query-pr.cgi?pr=122048 

From: True Entropy <unexpectedvalue@yahoo.com>
To: bug-followup@FreeBSD.org
Cc:  
Subject: Re: kern/122048: [nve] 6.3-RELEASE kernel crashes on high-volume & high-speed scp
Date: Mon, 31 Mar 2008 14:50:47 -0700 (PDT)

 The problem was fixed by replacing the nve driver with the nfe driver supplied at
 http://www.f.csce.kyushu-u.ac.jp/~shigeaki/software/nfe-20071124.tar.gz and described at
 http://www.f.csce.kyushu-u.ac.jp/~shigeaki/software/freebsd-nfe.html
 
 1. Remove nve from the kernel config and recompile the kernel.
 
 2. Put the files from nfe-20071124.tar.gz into /usr/src/sys/dev/nfe, then run:
 
  cd /usr/src/sys/dev/nfe
  make
  make install
 
 3. Add the line
 
 if_nfe_load="YES"
 
 to /boot/loader.conf.
 
 4. Reboot.
 
 
 This was all tested on 6.3-RELEASE.
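
The steps above could be checked and partly automated with a small sh sketch like the one below. The helper names nve_in_config and enable_nfe are hypothetical, and the kernel config path in the usage note is an assumption based on the CUSTOM kernel name in this report; nothing here comes from the nfe tarball itself.

```sh
# Hypothetical helper: succeed if the given kernel config file still
# contains an active (uncommented) "device nve" line.
nve_in_config() {
    grep -Eq '^[[:space:]]*device[[:space:]]+nve([[:space:]]|$)' "$1"
}

# Hypothetical helper: append the module load line to a loader.conf-style
# file unless an if_nfe_load setting is already present.
enable_nfe() {
    conf="$1"
    grep -q '^if_nfe_load=' "$conf" 2>/dev/null ||
        echo 'if_nfe_load="YES"' >> "$conf"
}
```

As root, this might be used as `nve_in_config /usr/src/sys/i386/conf/CUSTOM && echo "remove device nve first"` before rebuilding the kernel, and `enable_nfe /boot/loader.conf` once the module has been built and installed as in step 2.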
 
 
 
State-Changed-From-To: feedback->closed 
State-Changed-By: remko 
State-Changed-When: Tue Apr 1 06:42:49 UTC 2008 
State-Changed-Why:  
Known bug in nve, which is very unlikely to get fixed. 7.0 and 
later have if_nfe available for this; the submitter confirmed that it 
works properly. He used: 
http://www.f.csce.kyushu-u.ac.jp/~shigeaki/software/nfe-20071124.tar.gz 
and http://www.f.csce.kyushu-u.ac.jp/~shigeaki/software/freebsd-nfe.html 

http://www.freebsd.org/cgi/query-pr.cgi?pr=122048 
>Unformatted:
