From rse@engelschall.com  Wed Dec 28 20:45:56 2005
Return-Path: <rse@engelschall.com>
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 0DE7216A41F
	for <freebsd-gnats-submit@freebsd.org>; Wed, 28 Dec 2005 20:45:56 +0000 (GMT)
	(envelope-from rse@engelschall.com)
Received: from visp1.engelschall.com (visp1.engelschall.com [195.30.6.144])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 8500243D53
	for <freebsd-gnats-submit@freebsd.org>; Wed, 28 Dec 2005 20:45:55 +0000 (GMT)
	(envelope-from rse@engelschall.com)
Received: by visp1.engelschall.com (Postfix, from userid 21100)
	id 232E01B44833; Wed, 28 Dec 2005 21:46:02 +0100 (CET)
Received: from en1.home.engelschall.com (localhost.engelschall.com [127.0.0.1])
	by en1.engelschall.com (Postfix) with ESMTP id 64764A19BD
	for <FreeBSD-gnats-submit@freebsd.org>; Wed, 28 Dec 2005 21:45:39 +0100 (CET)
Received: (from rse@localhost)
	by en1.home.engelschall.com (8.13.4/8.13.4/Submit) id jBSKjcAs001138;
	Wed, 28 Dec 2005 21:45:38 +0100 (CET)
	(envelope-from rse)
Message-Id: <200512282045.jBSKjcAs001138@en1.home.engelschall.com>
Date: Wed, 28 Dec 2005 21:45:38 +0100 (CET)
From: "Ralf S. Engelschall" <rse@freebsd.org>
Reply-To: "Ralf S. Engelschall" <rse@freebsd.org>
To: FreeBSD-gnats-submit@freebsd.org
Cc:
Subject: invalid IP checksum under if_bridge(4)+em(4) combination
X-Send-Pr-Version: 3.113
X-GNATS-Notify:

>Number:         91032
>Category:       kern
>Synopsis:       [if_bridge] invalid IP checksum under if_bridge(4)+em(4) combination
>Confidential:   no
>Severity:       non-critical
>Priority:       medium
>Responsible:    thompsa
>State:          closed
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Wed Dec 28 20:50:03 GMT 2005
>Closed-Date:    Wed May 31 00:27:49 GMT 2006
>Last-Modified:  Wed May 31 00:27:49 GMT 2006
>Originator:     Ralf S. Engelschall
>Release:        FreeBSD 6.0-STABLE i386
>Organization:
FreeBSD
>Environment:
NIC driven by em(4) attached to a bridge based on if_bridge(4).
FreeBSD en4.engelschall.com 6.0-STABLE FreeBSD 6.0-STABLE #0: Wed Dec 28 19:32:06 CET 2005     root@en4.engelschall.com:/usr/obj/usr/src/sys/EN4  amd64

>Description:
I have an HP DL385 running FreeBSD 6.0-STABLE (as of 2005-12-28)
which has a tap(4) device "tap0" bridged, via the if_bridge(4) device
"bridge0", to the em(4) device "em0".

Without the bridge0 attached to em0, IP packets sent out on em0 have
a correct header checksum. Once the bridge0 is established and em0
attached to it, packets sent out on em0 have an incorrect header
checksum of 0x0000 and are therefore dropped by remote hosts.
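
To illustrate why a header checksum of 0x0000 normally fails verification
on the receiving side, here is a minimal Python sketch of the standard
IPv4 header checksum (the ones' complement sum of the 16-bit header
words). The sample header bytes and the function names are illustrative
assumptions; they are not taken from a capture on the affected box.

```python
import struct


def ipv4_header_checksum(header: bytes) -> int:
    """Compute the IPv4 header checksum, treating bytes 10-11 as zero."""
    if len(header) < 20 or len(header) % 2:
        raise ValueError("expected an IPv4 header of even length >= 20")
    # The checksum field itself is counted as zero while summing.
    zeroed = header[:10] + b"\x00\x00" + header[12:]
    total = sum(struct.unpack("!%dH" % (len(zeroed) // 2), zeroed))
    # Fold the carries back in (ones' complement addition).
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def stored_checksum(header: bytes) -> int:
    """Return the checksum field as it appears in the header."""
    return struct.unpack("!H", header[10:12])[0]


def checksum_ok(header: bytes) -> bool:
    """True if the stored checksum matches the computed one."""
    return stored_checksum(header) == ipv4_header_checksum(header)


# Illustrative 20-byte header with a valid checksum field.
GOOD_HEADER = bytes.fromhex("45000073000040004011b861c0a80001c0a800c7")
# The same header with the checksum field cleared, as seen on the wire
# in this report.
CLEARED_HEADER = GOOD_HEADER[:10] + b"\x00\x00" + GOOD_HEADER[12:]
```

With these definitions, checksum_ok(GOOD_HEADER) holds, while
checksum_ok(CLEARED_HEADER) does not, which is the condition under which
a receiver discards the packet.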

The reasons for the 0x0000 checksum are that...

1. if_bridge(4) for unknown reasons explicitly clears the checksum in
   the function if_bridge.c:bridge_enqueue().

2. em(4) for unknown reasons DOES NOT perform the "checksum offloading",
   i.e., calculate the checksum via hardware assistance, if the packets
   come in via if_bridge(4).

Hence a possible workaround for me was to simply disable the checksum
offloading on "em0" via "ifconfig em0 -txcsum". This effectively solved
the networking problems, but this is just a workaround.

Another workaround would have been to put a 100baseTX NIC driven by
fxp(4) into the box instead of the 1000baseTX NIC driven by em(4),
because I have the combination of if_bridge(4) and fxp(4) running fine
with mostly the same configuration on another server.

I do not understand why em(4) doesn't perform the checksum offloading.
This might be a buglet, perhaps related to the different packet flow
through the system in the cases with and without if_bridge(4). Perhaps
someone who knows both em(4) and the internal packet flows better can
check this.

But the reason why if_bridge(4) _unconditionally_ clears the checksums
of all enqueued packets is totally unclear to me. That a bridge _checks_
the checksums of incoming packets is ok. That a bridge drops packets
with bad checksum I also can accept. But that a bridge clears the
checksum on incoming packets confuses me.

Perhaps it was done because if_bridge(4) not only forwards packets but
also _generates_ new ones when STP is performed. Here if_bridge(4)
perhaps feels lazy and just unconditionally clears the checksum in the
lower level function bridge_enqueue(). But IMHO the correct way would be
to conditionally clear the checksum only for the newly generated packets
(where a new checksum has to be generated) but not for the forwarded
ones (where the checksum already has to exist).

>How-To-Repeat:
Create a bridge with if_bridge(4) between an em(4) interface and, for
instance, a tap(4) interface. Then send out packets on em(4) and capture
them. Then look at the IP header and observe that it contains an
invalid header checksum value of 0x0000.
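
A captured frame can be checked for this symptom roughly as follows.
This is only a sketch: it assumes an untagged Ethernet II frame supplied
as raw bytes (for example, exported from a packet capture), and the
function name, MAC addresses and sample frames are hypothetical.

```python
import struct

ETHERTYPE_IPV4 = 0x0800
ETHER_HEADER_LEN = 14  # untagged Ethernet II: dst(6) + src(6) + type(2)


def ip_checksum_field(frame: bytes):
    """Return the IPv4 header checksum field of a frame, or None.

    None is returned when the frame is too short or does not carry IPv4.
    """
    if len(frame) < ETHER_HEADER_LEN + 20:
        return None
    (ethertype,) = struct.unpack("!H", frame[12:14])
    if ethertype != ETHERTYPE_IPV4:
        return None
    ip = frame[ETHER_HEADER_LEN:]
    if ip[0] >> 4 != 4:  # IP version nibble
        return None
    # The header checksum lives at byte offset 10 of the IP header.
    (csum,) = struct.unpack("!H", ip[10:12])
    return csum


# Hypothetical frames: locally administered MACs plus a sample IP header.
ETHER = bytes.fromhex("020000000001" "020000000002" "0800")
IP_OK = bytes.fromhex("45000073000040004011b861c0a80001c0a800c7")
IP_CLEARED = IP_OK[:10] + b"\x00\x00" + IP_OK[12:]

FRAME_OK = ETHER + IP_OK
FRAME_CLEARED = ETHER + IP_CLEARED
```

A frame showing the problem described here would yield 0 from
ip_checksum_field(), as FRAME_CLEARED does.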
>Fix:
A workaround is to disable the "checksum offloading" on em(4) with
"ifconfig em0 -txcsum". But the real fix IMHO is to conditionally clear
the checksum in if_bridge(4) only for the newly generated packets and
additionally to figure out why em(4) doesn't perform the checksum
(re-)calculation under "txcsum" if the interface is attached to an
if_bridge(4) device.

>Release-Note:
>Audit-Trail:

From: Gleb Smirnoff <glebius@FreeBSD.org>
To: "Ralf S. Engelschall" <rse@FreeBSD.org>
Cc: FreeBSD-gnats-submit@FreeBSD.org
Subject: Re: kern/91032: invalid IP checksum under if_bridge(4)+em(4) combination
Date: Wed, 28 Dec 2005 23:55:23 +0300

 On Wed, Dec 28, 2005 at 09:45:38PM +0100, Ralf S. Engelschall wrote:
 R> Another workaround would have been to put a 100baseTX NIC driven by
 R> fxp(4) into the box instead of the 1000baseTX NIC driven by em(4),
 R> because I have the combination of if_bridge(4) and fxp(4) running fine
 R> with mostly the same configuration on another server.
 
 Not all fxp(4) cards can do checksum offloading. If you put in one
 that can't, then the second workaround doesn't differ from the first
 one. The problem probably isn't em-specific in this case.
 
 So the important question is: can the fxp card you used do offloading?
 
 -- 
 Totus tuus, Glebius.
 GLEBIUS-RIPN GLEB-RIPE

From: "Ralf S. Engelschall" <rse@FreeBSD.org>
To: Gleb Smirnoff <glebius@FreeBSD.org>
Cc: FreeBSD-gnats-submit@FreeBSD.org
Subject: Re: kern/91032: invalid IP checksum under if_bridge(4)+em(4) combination
Date: Wed, 28 Dec 2005 22:05:24 +0100

 On Wed, Dec 28, 2005, Gleb Smirnoff wrote:
 
 > On Wed, Dec 28, 2005 at 09:45:38PM +0100, Ralf S. Engelschall wrote:
 > R> Another workaround would have been to put a 100baseTX NIC driven by
 > R> fxp(4) into the box instead of the 1000baseTX NIC driven by em(4),
 > R> because I have the combination of if_bridge(4) and fxp(4) running fine
 > R> with mostly the same configuration on another server.
 >
 > Not all fxp(4) cards can do checksum offloading. If you put in one
 > that can't, then the second workaround doesn't differ from the first
 > one. The problem probably isn't em-specific in this case.
 
 Yes, of course. I think we actually have two problems here: first
 if_bridge(4) unconditionally clears the checksum, although I see no
 reason for doing this in case of non-generated and just forwarded
 packets. And the second problem (that the checksum is not re-calculated)
 is more em(4) related or at least related to the packet flow through the
 system in case a network device is attached to if_bridge(4).
 
 > So the important question is: can the fxp card you used do offloading?
 
 According to the ifconfig(8) output on that box and the fact that I do
 not see any "TXCSUM" in the "options", I would say this particular card
 is not capable of doing "checksum offloading":
 
 | net0: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> mtu 1500
 |         options=8<VLAN_MTU>
 |         inet 172.16.120.130 netmask 0xffff0000 broadcast 172.16.255.255
 |         ether 00:07:e9:d9:4d:43
 |         media: Ethernet autoselect (100baseTX <full-duplex>)
 |         status: active
 
 Ok, then it is clear why the bridging works fine on this FreeBSD
 6.0-STABLE box while it was broken (and required the "ifconfig em0
 -txcsum") on the other box.
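
The check described above (looking for "TXCSUM" in the "options" line of
the ifconfig(8) output) can be sketched in Python as follows. The
function names are hypothetical, SAMPLE_NET0 repeats the output quoted
above, and SAMPLE_WITH_TXCSUM is an illustrative line that is not taken
from this report.

```python
import re


def interface_options(ifconfig_text: str) -> set:
    """Return the option names listed in the 'options=' line, if any."""
    match = re.search(r"options=[0-9a-fA-F]+<([^>]*)>", ifconfig_text)
    if not match or not match.group(1):
        return set()
    return set(match.group(1).split(","))


def advertises_txcsum(ifconfig_text: str) -> bool:
    """True if transmit checksum offloading is listed as enabled."""
    return "TXCSUM" in interface_options(ifconfig_text)


SAMPLE_NET0 = """\
net0: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> mtu 1500
        options=8<VLAN_MTU>
        inet 172.16.120.130 netmask 0xffff0000 broadcast 172.16.255.255
        ether 00:07:e9:d9:4d:43
        media: Ethernet autoselect (100baseTX <full-duplex>)
        status: active
"""

# Illustrative only: an options line that does list TXCSUM.
SAMPLE_WITH_TXCSUM = "        options=b<RXCSUM,TXCSUM,VLAN_MTU>\n"
```

For the net0 output above, advertises_txcsum() is false, which matches
the conclusion that this card does no transmit checksum offloading.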
 
 --
 rse@FreeBSD.org                        Ralf S. Engelschall
 FreeBSD.org/~rse                       rse@engelschall.com
 FreeBSD committer                      www.engelschall.com
 
Responsible-Changed-From-To: freebsd-bugs->thompsa 
Responsible-Changed-By: thompsa 
Responsible-Changed-When: Mon Jan 16 20:36:30 UTC 2006 
Responsible-Changed-Why:  
Grab, this should be fixed in r1.50 of if_bridge. 

http://www.freebsd.org/cgi/query-pr.cgi?pr=91032 

From: Andrew Thompson <thompsa@freebsd.org>
To: bug-followup@freebsd.org, rse@freebsd.org
Cc:  
Subject: Re: kern/91032: [if_bridge] invalid IP checksum under if_bridge(4) em(4) combination
Date: Tue, 17 Jan 2006 09:36:14 +1300

 Hi,
 
 I have committed code to HEAD to fix this up, see r1.50 of
 if_bridge.c. It's going to be MFC'd in a few more days if you
 want to test in 6-STABLE.
 
 cheers,
 Andrew

From: Andrew Thompson <thompsa@freebsd.org>
To: bug-followup@freebsd.org, rse@freebsd.org
Cc:  
Subject: Re: kern/91032: [if_bridge] invalid IP checksum under if_bridge(4) em(4) combination
Date: Tue, 21 Feb 2006 11:32:09 +1300

 Hi Ralf,
 
 
 Did you get to test if_bridge(4)+em(4) with a newish STABLE? The
 checksum problems should be gone.
 
 
 cheers,
 Andrew
 
 ps. Thanks for your gmirror article, it was very helpful!
State-Changed-From-To: open->patched 
State-Changed-By: thompsa 
State-Changed-When: Wed Apr 5 20:25:56 UTC 2006 
State-Changed-Why:  
This was fixed a while ago, just waiting for Ralf to confirm. 

http://www.freebsd.org/cgi/query-pr.cgi?pr=91032 
State-Changed-From-To: patched->closed 
State-Changed-By: thompsa 
State-Changed-When: Wed May 31 00:27:20 UTC 2006 
State-Changed-Why:  
This has been fixed in 6.1R. 

http://www.freebsd.org/cgi/query-pr.cgi?pr=91032 
>Unformatted:
