From nobody@FreeBSD.org  Thu Sep 13 11:12:14 2007
Return-Path: <nobody@FreeBSD.org>
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 05E5816A417
	for <freebsd-gnats-submit@FreeBSD.org>; Thu, 13 Sep 2007 11:12:14 +0000 (UTC)
	(envelope-from nobody@FreeBSD.org)
Received: from www.freebsd.org (www.freebsd.org [IPv6:2001:4f8:fff6::21])
	by mx1.freebsd.org (Postfix) with ESMTP id F099313C48D
	for <freebsd-gnats-submit@FreeBSD.org>; Thu, 13 Sep 2007 11:12:13 +0000 (UTC)
	(envelope-from nobody@FreeBSD.org)
Received: from www.freebsd.org (localhost [127.0.0.1])
	by www.freebsd.org (8.14.1/8.14.1) with ESMTP id l8DBCDba067661
	for <freebsd-gnats-submit@FreeBSD.org>; Thu, 13 Sep 2007 11:12:13 GMT
	(envelope-from nobody@www.freebsd.org)
Received: (from nobody@localhost)
	by www.freebsd.org (8.14.1/8.14.1/Submit) id l8DBCDt8067660;
	Thu, 13 Sep 2007 11:12:13 GMT
	(envelope-from nobody)
Message-Id: <200709131112.l8DBCDt8067660@www.freebsd.org>
Date: Thu, 13 Sep 2007 11:12:13 GMT
From: Michael Reifenberger <mike@reifenberger.com>
To: freebsd-gnats-submit@FreeBSD.org
Subject: network problems under -current, nfe(4) and jumbo packets
X-Send-Pr-Version: www-3.1
X-GNATS-Notify:

>Number:         116330
>Category:       kern
>Synopsis:       [nfe]: network problems under -current, nfe(4) and jumbo packets
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    yongari
>State:          closed
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Thu Sep 13 11:20:01 GMT 2007
>Closed-Date:    Mon Mar 03 06:58:03 UTC 2008
>Last-Modified:  Mon Mar 03 06:58:03 UTC 2008
>Originator:     Michael Reifenberger
>Release:        FreeBSD 7.0-CURRENT amd64
>Organization:
>Environment:
System: FreeBSD fs.reifenberger.com 7.0-CURRENT FreeBSD 7.0-CURRENT #3: Mon Sep
10 23:21:52 CEST 2007 root@fs.reifenberger.com:/usr/obj/usr/src/sys/fs amd64

>Description:
I have two identical Asus N2M32WS motherboards with built-in
NVIDIA nForce MCP55 network adapters and Marvell 88E1116 Gigabit PHYs.

Both nfe0 interfaces are connected via a Gigabit switch; the nfe1 interfaces
are connected directly (for testing purposes).
One machine acts as a Samba and NFS file server.

After setting an MTU of 9000 on the NICs of the file server, I found that
various clients (Linux SLES10, Gentoo, FreeBSD) with various applications
(NFS over TCP/UDP, iSCSI) were producing errors and corruption when
transferring from/to the file server.

For further testing I switched to the directly connected NICs of the
two computers.
When using benchmarks/netpipe, the integrity check hangs at
packet sizes greater than 1500 bytes and later terminates with an error:

(fs)(root) NPtcp -i -h 10.0.1.2
Doing an integrity check instead of measuring performance
Send and receive buffers are 1048576 and 1048576 bytes
(A bug in Linux doubles the requested buffer sizes)
Now starting the main loop
  0:       5 bytes   2758 times -->  Integrity check passed
  1:       7 bytes   2838 times -->  Integrity check passed
  2:       9 bytes   1978 times -->  Integrity check passed
  3:      13 bytes   2381 times -->  Integrity check passed
  4:      17 bytes   2015 times -->  Integrity check passed
  5:      25 bytes   2585 times -->  Integrity check passed
  6:      33 bytes   1759 times -->  Integrity check passed
  7:      49 bytes   2406 times -->  Integrity check passed
  8:      65 bytes   2172 times -->  Integrity check passed
  9:      97 bytes   3138 times -->  Integrity check passed
 10:     129 bytes   2055 times -->  Integrity check passed
 11:     193 bytes   2842 times -->  Integrity check passed
 12:     257 bytes   1774 times -->  Integrity check passed
 13:     385 bytes   2583 times -->  Integrity check passed
 14:     513 bytes   1667 times -->  Integrity check passed
 15:     769 bytes   2298 times -->  Integrity check passed
 16:    1025 bytes   1341 times -->  Integrity check passed
 17:    1537 bytes   1765 times --> NetPIPE: read: error encountered, errno=60
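
One way to read the log above is to pull out the largest size that passed,
the first size that failed, and the errno. The following is a minimal
sketch, not part of NetPIPE or of the original report. The helper names
(summarize, fits_standard_frame), the regular expression, and the 52-byte
per-packet overhead (IPv4 header, TCP header, and the timestamp option
implied by net.inet.tcp.rfc1323=1) are assumptions made for illustration.
The errno names are the FreeBSD values from sys/errno.h.

```python
import re

# FreeBSD errno values that appear in this PR (from FreeBSD's sys/errno.h).
FREEBSD_ERRNO = {55: "ENOBUFS", 60: "ETIMEDOUT"}

# Assumed overhead per packet: 20-byte IPv4 header, 20-byte TCP header,
# 12 bytes of TCP timestamp option (rfc1323 is enabled in the report).
TCP_IP_OVERHEAD = 20 + 20 + 12

LINE_RE = re.compile(r"^\s*(\d+):\s+(\d+) bytes\s+(\d+) times -->\s*(.*)$")


def summarize(log):
    """Return (largest passing size, first failing size, errno) for NPtcp -i output."""
    last_ok = first_bad = err = None
    for line in log.splitlines():
        m = LINE_RE.match(line)
        if not m:
            continue
        size, result = int(m.group(2)), m.group(4)
        if "Integrity check passed" in result:
            last_ok = size
        elif first_bad is None:
            first_bad = size
            e = re.search(r"errno=(\d+)", result)
            err = int(e.group(1)) if e else None
    return last_ok, first_bad, err


def fits_standard_frame(payload, mtu=1500):
    """True if one TCP segment carrying `payload` bytes fits in an `mtu`-byte IP packet."""
    return payload + TCP_IP_OVERHEAD <= mtu


log = """\
 16:    1025 bytes   1341 times -->  Integrity check passed
 17:    1537 bytes   1765 times --> NetPIPE: read: error encountered, errno=60
"""
last_ok, first_bad, err = summarize(log)
print(last_ok, fits_standard_frame(last_ok))      # 1025 True
print(first_bad, fits_standard_frame(first_bad))  # 1537 False
print(FREEBSD_ERRNO.get(err, "unknown"))          # ETIMEDOUT
```

Under these assumptions, 1537 bytes is the first message size in the run
that no longer fits into a single 1500-byte packet, and errno 60 is a
timeout rather than a detected data mismatch.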

- How to debug this issue?
- Is this a known problem?

My /etc/sysctl.conf:
#security.bsd.see_other_uids=0
compat.linux.osrelease=2.6.16
#kern.maxvnodes=400000
vfs.lookup_shared=1
kern.coredump=0
#net.inet.tcp.path_mtu_discovery=0
net.inet.udp.recvspace=65536
net.inet.raw.recvspace=16384
#hw.pci.enable_msix=0
#hw.pci.enable_msi=0
kern.ipc.nmbclusters=50000
#kern.timecounter.hardware=ACPI-fast
kern.ipc.maxsockbuf=16777216
net.inet.tcp.rfc1323=1
net.inet.tcp.sendspace=1048576
net.inet.tcp.recvspace=1048576
net.inet.tcp.log_debug=0

NFS mount options are:
.. rw,noatime,bg,soft,-i,-3,-T,async,-r=32768,-w=32768

dmesg and kernel config are available on request.

>How-To-Repeat:
Use NFS with big send/receive block sizes or the benchmarks/netpipe integrity check.
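
To illustrate why large block sizes exercise jumbo frames, the sketch below
counts the IPv4 fragments needed for one 32768-byte NFS-over-UDP transfer
(the -r/-w value above) at MTU 1500 and MTU 9000. It is a rough estimate
under stated assumptions: a 20-byte IPv4 header without options, an 8-byte
UDP header, and no RPC/NFS header bytes. The function name ip_fragments is
made up for this example. NFS over TCP is segmented by TCP instead and is
not covered here.

```python
IP_HEADER = 20   # IPv4 header without options (assumption)
UDP_HEADER = 8


def ip_fragments(udp_payload, mtu):
    """Number of IPv4 fragments for one UDP datagram carrying `udp_payload` data bytes."""
    datagram = udp_payload + UDP_HEADER          # UDP header is part of the fragmented data
    per_fragment = (mtu - IP_HEADER) // 8 * 8    # non-final fragments carry a multiple of 8 bytes
    return -(-datagram // per_fragment)          # ceiling division


for mtu in (1500, 9000):
    print(mtu, ip_fragments(32768, mtu))
# 1500 23
# 9000 4
```

With MTU 9000 each of those fragments is close to 9000 bytes on the wire,
so every large NFS read or write depends on the jumbo frame path.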


>Fix:
The only known workaround is to lower the MTU to 1500.

>Release-Note:
>Audit-Trail:
Responsible-Changed-From-To: freebsd-bugs->freebsd-net 
Responsible-Changed-By: remko 
Responsible-Changed-When: Fri Sep 21 06:14:55 UTC 2007 
Responsible-Changed-Why:  
Reassign to networking group 

http://www.freebsd.org/cgi/query-pr.cgi?pr=116330 
Responsible-Changed-From-To: freebsd-net->yongari 
Responsible-Changed-By: yongari 
Responsible-Changed-When: Thu Sep 27 07:20:53 UTC 2007 
Responsible-Changed-Why:  
Grab. 

State-Changed-From-To: open->feedback 
State-Changed-By: yongari 
State-Changed-When: Thu Sep 27 09:52:11 UTC 2007 
State-Changed-Why:  
Finally had time to experiment with nfe(4) and jumbo frames. With the
default send/recv buffer sizes I can't reproduce it. With your buffer size
of 1048576 I can't even get the netpipe receiver to start; I always get
"can't open stream socket! errno=55". So I guess the issue is
in the network stack, not in the nfe(4) driver. Did you get the integrity
check failure with the default send/recv buffer sizes? What about
other drivers that have jumbo frame capability?
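
On FreeBSD errno 55 is ENOBUFS. One possible explanation, not confirmed
anywhere in this PR, is that a 1048576-byte socket buffer exceeds what the
test machine's kern.ipc.maxsockbuf allows, whereas the submitter's
sysctl.conf raises that limit to 16777216. The sketch below models the
limit check as I understand it from the FreeBSD socket buffer code: a
reservation is refused when it exceeds maxsockbuf scaled down for mbuf
overhead. The constants MSIZE, MCLBYTES, and the 256 KB default for
kern.ipc.maxsockbuf are assumptions about a FreeBSD 7-era system, and the
function names are hypothetical.

```python
# Assumed FreeBSD 7-era constants; none of them are stated in this PR.
MSIZE = 256                        # bytes per mbuf
MCLBYTES = 2048                    # bytes per mbuf cluster
DEFAULT_MAXSOCKBUF = 256 * 1024    # assumed default kern.ipc.maxsockbuf


def max_reservable(maxsockbuf):
    """Largest socket buffer that may be reserved, discounting mbuf overhead."""
    return maxsockbuf * MCLBYTES // (MSIZE + MCLBYTES)


def reservation_ok(requested, maxsockbuf):
    """True if a buffer of `requested` bytes passes the modeled limit check."""
    return requested <= max_reservable(maxsockbuf)


requested = 1048576  # net.inet.tcp.sendspace / recvspace from the report
print(reservation_ok(requested, 16777216))            # submitter's kern.ipc.maxsockbuf
print(reservation_ok(requested, DEFAULT_MAXSOCKBUF))  # assumed default
```

If this model applies, the 1048576-byte buffers would only be usable after
kern.ipc.maxsockbuf has been raised as in the submitter's sysctl.conf.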

julia# NPtcp -i -h 192.168.40.2 
Doing an integrity check instead of measuring performance 
Send and receive buffers are 32768 and 65536 bytes 
(A bug in Linux doubles the requested buffer sizes) 
Now starting the main loop 
0:       5 bytes   1276 times -->  Integrity check passed 
1:       7 bytes   1215 times -->  Integrity check passed 
2:       9 bytes    858 times -->  Integrity check passed 
3:      13 bytes   1030 times -->  Integrity check passed 
4:      17 bytes    663 times -->  Integrity check passed 
5:      25 bytes    882 times -->  Integrity check passed 
6:      33 bytes    554 times -->  Integrity check passed 
7:      49 bytes    847 times -->  Integrity check passed 
8:      65 bytes    580 times -->  Integrity check passed 
9:      97 bytes    829 times -->  Integrity check passed 
10:     129 bytes    540 times -->  Integrity check passed 
11:     193 bytes    817 times -->  Integrity check passed 
12:     257 bytes    483 times -->  Integrity check passed 
13:     385 bytes    785 times -->  Integrity check passed 
14:     513 bytes    517 times -->  Integrity check passed 
15:     769 bytes    776 times -->  Integrity check passed 
16:    1025 bytes    476 times -->  Integrity check passed 
17:    1537 bytes    356 times -->  Integrity check passed 
18:    2049 bytes    221 times -->  Integrity check passed 
19:    3073 bytes    393 times -->  Integrity check passed 
20:    4097 bytes    262 times -->  Integrity check passed 
21:    6145 bytes    384 times -->  Integrity check passed 
22:    8193 bytes    160 times -->  Integrity check passed 
23:   12289 bytes    222 times -->  Integrity check passed 
24:   16385 bytes    110 times -->  Integrity check passed 
25:   24577 bytes    146 times -->  Integrity check passed 
26:   32769 bytes     62 times -->  Integrity check passed 
27:   49153 bytes     86 times -->  Integrity check passed 
28:   65537 bytes     40 times -->  Integrity check passed 
29:   98305 bytes     46 times -->  Integrity check passed 
30:  131073 bytes     21 times -->  Integrity check passed 
31:  196609 bytes     24 times -->  Integrity check passed 
32:  262145 bytes      9 times -->  Integrity check passed 
33:  393217 bytes     12 times -->  Integrity check passed 
34:  524289 bytes      5 times -->  Integrity check passed 
35:  786433 bytes      6 times -->  Integrity check passed 
36: 1048577 bytes      3 times -->  Integrity check passed 
37: 1572865 bytes      3 times -->  Integrity check passed 
38: 2097153 bytes      3 times -->  Integrity check passed 
39: 3145729 bytes      3 times -->  Integrity check passed 
40: 4194305 bytes      3 times -->  Integrity check passed 
41: 6291457 bytes      3 times -->  Integrity check passed 
42: 8388609 bytes      3 times -->  Integrity check passed 

nfe0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 9000 
options=10b<RXCSUM,TXCSUM,VLAN_MTU,TSO4> 
ether 00:11:09:d5:04:42  
inet 192.168.40.1 netmask 0xffffff00 broadcast 192.168.40.255 
media: Ethernet autoselect (1000baseTX <full-duplex,flag0,flag1,flag2>) 
status: active 

State-Changed-From-To: feedback->closed 
State-Changed-By: linimon 
State-Changed-When: Mon Mar 3 06:57:26 UTC 2008 
State-Changed-Why:  
Feedback timeout (> 3 months). 

>Unformatted:
