From nobody@FreeBSD.org  Sun Jan 23 07:41:53 2011
Return-Path: <nobody@FreeBSD.org>
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id D6422106564A
	for <freebsd-gnats-submit@FreeBSD.org>; Sun, 23 Jan 2011 07:41:53 +0000 (UTC)
	(envelope-from nobody@FreeBSD.org)
Received: from red.freebsd.org (unknown [IPv6:2001:4f8:fff6::22])
	by mx1.freebsd.org (Postfix) with ESMTP id C5E528FC0C
	for <freebsd-gnats-submit@FreeBSD.org>; Sun, 23 Jan 2011 07:41:53 +0000 (UTC)
Received: from red.freebsd.org (localhost [127.0.0.1])
	by red.freebsd.org (8.14.4/8.14.4) with ESMTP id p0N7fr8h020137
	for <freebsd-gnats-submit@FreeBSD.org>; Sun, 23 Jan 2011 07:41:53 GMT
	(envelope-from nobody@red.freebsd.org)
Received: (from nobody@localhost)
	by red.freebsd.org (8.14.4/8.14.4/Submit) id p0N7frAn020136;
	Sun, 23 Jan 2011 07:41:53 GMT
	(envelope-from nobody)
Message-Id: <201101230741.p0N7frAn020136@red.freebsd.org>
Date: Sun, 23 Jan 2011 07:41:53 GMT
From: Alex <alex@ahhyes.net>
To: freebsd-gnats-submit@FreeBSD.org
Subject: High Collision count on "re" network interface
X-Send-Pr-Version: www-3.1
X-GNATS-Notify:

>Number:         154236
>Category:       kern
>Synopsis:       [re] High Collision count on "re" network interface
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    yongari
>State:          closed
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Sun Jan 23 07:50:10 UTC 2011
>Closed-Date:    Fri Jun 24 02:19:40 UTC 2011
>Last-Modified:  Fri Jun 24 02:19:40 UTC 2011
>Originator:     Alex
>Release:        8.2 RC2
>Organization:
>Environment:
FreeBSD srv.mydomain.net 8.2-RC2 FreeBSD 8.2-RC2 #1: Sun Jan 23 12:49:35 EST 2011     alex@nospam.nospam.nospam:/usr/obj/usr/src/sys/custom-server  amd64
>Description:
I can report this issue is also present in 8.1/amd64 (and also occurs with the stock GENERIC kernels in both releases)

I am running FreeBSD on my VPS. The host is using XEN HVM.

There is an unusually high collision rate being reported on the network interface:

Name    Mtu Network       Address              Ipkts Ierrs Idrop    Opkts Oerrs  Coll
re0    1500 <Link#1>      00:16:3e:f0:8b:6c   136929     0     0   104042     0 104036

So, out of 104042 packets transmitted, collisions were detected on 104036 of them.
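
For scale, the ratio can be worked out from the Opkts and Coll columns. A minimal C sketch follows; the helper name `coll_percent` is made up for illustration and is not part of any FreeBSD tool. With the figures above it comes out at roughly 99.99%.

```c
#include <stdint.h>

/*
 * Collisions recorded per transmitted packet, as a percentage.
 * Returns 0.0 when nothing has been transmitted yet, to avoid
 * dividing by zero.
 */
static double
coll_percent(uint64_t opkts, uint64_t coll)
{
	if (opkts == 0)
		return (0.0);
	return (100.0 * (double)coll / (double)opkts);
}
```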

I thought that perhaps this may just be a XEN related issue until I ran into this article on the web:

http://blather.michaelwlucas.com/?p=268&cpage=1

This seems to be the same issue, but when running FreeBSD under KVM. Same network interface (re). Is this likely to be an issue with the "re" driver under virtualized environments?



>How-To-Repeat:
Run FreeBSD 8.1/amd64 in a XEN HVM environment
>Fix:


>Release-Note:
>Audit-Trail:

From: alex <alex@ahhyes.net>
To: bug-followup@FreeBSD.org
Cc:  
Subject: Re: amd64/154236: High Collision count on "re" network interface
Date: Sun, 23 Jan 2011 19:11:50 +1100

 Some additional info:
 
 The interface is also reportedly in full duplex mode. Output of ifconfig:
 
 re0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
          options=9b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM>
          ether 00:16:3e:f0:8b:6c
          inet 109.IP.IP.IP netmask 0xffffff80 broadcast 109.IP.IP.IP
          media: Ethernet autoselect (100baseTX <full-duplex>)
          status: active
 
Responsible-Changed-From-To: freebsd-amd64->freebsd-net 
Responsible-Changed-By: linimon 
Responsible-Changed-When: Sun Jan 23 21:36:46 UTC 2011 
Responsible-Changed-Why:  
reclassify. 

http://www.freebsd.org/cgi/query-pr.cgi?pr=154236 
State-Changed-From-To: open->feedback 
State-Changed-By: yongari 
State-Changed-When: Sun Jan 23 22:54:37 UTC 2011 
State-Changed-Why:  
The counter comes from the status reported by the hardware, so I think
the counter is correct. Since you're using auto-negotiation, please
make sure the link partner also agrees on the resolved speed/duplex of
re(4).

Show me the dmesg output for re(4)/rgephy(4). Also include the
output of the hardware MAC statistics counters. You can get it on your
console after executing the following command.
# sysctl dev.re.0.stats=1


Responsible-Changed-From-To: freebsd-net->yongari 
Responsible-Changed-By: yongari 
Responsible-Changed-When: Sun Jan 23 22:54:37 UTC 2011 
Responsible-Changed-Why:  
Grab. 

http://www.freebsd.org/cgi/query-pr.cgi?pr=154236 

From: alex <alex@ahhyes.net>
To: <bug-followup@FreeBSD.org>
Cc:  
Subject: Re: kern/154236: [re] High Collision count on "re" network interface
Date: Mon, 24 Jan 2011 12:59:48 +1100

  Alright,
 
  I have confirmed with the VPS provider that auto negotiation of the 
  duplex and speed has been handled correctly.
 
  I'll provide the rest of the information when I get home from work.
 
  Thanks.
 
 

From: alex <alex@ahhyes.net>
To: <bug-followup@FreeBSD.org>
Cc:  
Subject: Re: kern/154236: [re] High Collision count on "re" network interface
Date: Mon, 24 Jan 2011 19:38:10 +1100

  Hi,
 
  dmesg information:
  ------------------
 
  re0: <RealTek 8139C+ 10/100BaseTX> port 0xc100-0xc1ff mem 0xf3001000-0xf30010ff irq 32 at device 4.0 on pci0
  re0: Chip rev. 0x74800000
  re0: MAC rev. 0x00000000
  miibus0: <MII bus> on re0
  rlphy0: <RealTek internal media interface> PHY 0 on miibus0
  rlphy0:  10baseT, 10baseT-FDX, 10baseT-FDX-flow, 100baseTX, 100baseTX-FDX, 100baseTX-FDX-flow, auto, auto-flow
  re0: Ethernet address: 00:16:3e:f0:8b:6c
  re0: [FILTER]
 
  Output of sysctl command:
  -------------------------
 
  srv# sysctl dev.re.0.stats=1
  dev.re.0.stats: -1 -> -1
 
 

From: Pyun YongHyeon <pyunyh@gmail.com>
To: alex <alex@ahhyes.net>
Cc: yongari@freebsd.org, bug-followup@FreeBSD.org
Subject: Re: kern/154236: [re] High Collision count on "re" network interface
Date: Mon, 24 Jan 2011 10:49:56 -0800

 On Mon, Jan 24, 2011 at 08:40:07AM +0000, alex wrote:
 > The following reply was made to PR kern/154236; it has been noted by GNATS.
 > 
 > From: alex <alex@ahhyes.net>
 > To: <bug-followup@FreeBSD.org>
 > Cc:  
 > Subject: Re: kern/154236: [re] High Collision count on "re" network interface
 > Date: Mon, 24 Jan 2011 19:38:10 +1100
 > 
 >   Hi,
 >  
 >   dmesg information:
 >   ------------------
 >  
 >   re0: <RealTek 8139C+ 10/100BaseTX> port 0xc100-0xc1ff mem 0xf3001000-0xf30010ff irq 32 at device 4.0 on pci0
 >   re0: Chip rev. 0x74800000
 >   re0: MAC rev. 0x00000000
 >   miibus0: <MII bus> on re0
 >   rlphy0: <RealTek internal media interface> PHY 0 on miibus0
 >   rlphy0:  10baseT, 10baseT-FDX, 10baseT-FDX-flow, 100baseTX, 100baseTX-FDX, 100baseTX-FDX-flow, auto, auto-flow
 >   re0: Ethernet address: 00:16:3e:f0:8b:6c
 >   re0: [FILTER]
 >  
 >   Output of sysctl command:
 >   -------------------------
 >  
 >   srv# sysctl dev.re.0.stats=1
 >   dev.re.0.stats: -1 -> -1
 >  
 
 Hmm, show me the console output after executing sysctl.
 You may be able to find it in dmesg output too.

From: alex <alex@ahhyes.net>
To: <bug-followup@FreeBSD.org>
Cc:  
Subject: Re: kern/154236: [re] High Collision count on "re" network interface
Date: Tue, 25 Jan 2011 08:24:02 +1100

  There seems to be some data now:
 
  re0 statistics:
  Tx frames : 41519
  Rx frames : 289438
  Tx errors : 0
  Rx errors : 178370
  Rx missed frames : 4
  Rx frame alignment errs : 0
  Tx single collisions : 0
  Tx multiple collisions : 0
  Rx unicast frames : 46647
  Rx broadcast frames : 242795
  Rx multicast frames : 0
  Tx aborts : 0
  Tx underruns : 0
 
  [alex@srv ~]$ netstat -ni
  Name    Mtu Network       Address              Ipkts Ierrs Idrop    Opkts Oerrs  Coll
  re0    1500 <Link#1>      00:16:3e:f0:8b:6c   406372     0     0    46491     0 46478
 
 

From: alex <alex@ahhyes.net>
To: <bug-followup@FreeBSD.org>
Cc:  
Subject: Re: kern/154236: [re] High Collision count on "re" network interface
Date: Tue, 25 Jan 2011 08:42:49 +1100

  If the totals seem different from the initial report, I do apologize, I 
  had to restart the VPS for another reason.
 
  What's interesting about those statistics is that the collision count 
  according to the "re" status is zero, however netstat reports a 
  different story. What has me curious is the high amount of RX errors.
 
  It's a virtualized environment, where would all the errors be coming 
  from?
 

From: Pyun YongHyeon <pyunyh@gmail.com>
To: alex <alex@ahhyes.net>
Cc: yongari@freebsd.org, bug-followup@FreeBSD.org
Subject: Re: kern/154236: [re] High Collision count on "re" network interface
Date: Tue, 25 Jan 2011 11:57:16 -0800

 --ew6BAiZeqk4r7MaW
 Content-Type: text/plain; charset=us-ascii
 Content-Disposition: inline
 
 On Mon, Jan 24, 2011 at 09:50:12PM +0000, alex wrote:
 > The following reply was made to PR kern/154236; it has been noted by GNATS.
 > 
 > From: alex <alex@ahhyes.net>
 > To: <bug-followup@FreeBSD.org>
 > Cc:  
 > Subject: Re: kern/154236: [re] High Collision count on "re" network interface
 > Date: Tue, 25 Jan 2011 08:42:49 +1100
 > 
 >   If the totals seem different from the initial report, I do apologize, I 
 >   had to restart the VPS for another reason.
 >  
 >   What's interesting about those statistics is that the collision count 
 >   according to the "re" status is zero, however netstat reports a 
 >   different story. What has me curious is the high amount of RX errors.
 >  
 >   It's a virtualized environment, where would all the errors be coming 
 >   from?
 >  
 
 It seems the error status reported in the TX descriptor is defined
 only for the RTL8139C+. I slightly changed the code that collects TX
 status. Would you try the attached patch and let me know whether it
 makes any difference?
 
 The MAC statistics counter you showed me has an odd entry. It
 shows no TX errors but has a large number of RX errors. These RX
 errors include CRC errors and missed frames. The RX broadcast
 frames counter looks too high (greater than RX unicast frames). Not
 sure whether it's related to your working environment. Honestly
 I have no idea how to interpret the RX MAC counter statistics.
 
 --ew6BAiZeqk4r7MaW
 Content-Type: text/x-diff; charset=us-ascii
 Content-Disposition: attachment; filename="re.stat.8139C+.diff"
 
 Index: sys/dev/re/if_re.c
 ===================================================================
 --- sys/dev/re/if_re.c	(revision 217832)
 +++ sys/dev/re/if_re.c	(working copy)
 @@ -2243,12 +2243,16 @@
  			    ("%s: freeing NULL mbufs!", __func__));
  			m_freem(txd->tx_m);
  			txd->tx_m = NULL;
 -			if (txstat & (RL_TDESC_STAT_EXCESSCOL|
 -			    RL_TDESC_STAT_COLCNT))
 -				ifp->if_collisions++;
 -			if (txstat & RL_TDESC_STAT_TXERRSUM)
 -				ifp->if_oerrors++;
 -			else
 +			if (sc->rl_type == RL_8139CPLUS) {
 +				if (txstat & (RL_TDESC_STAT_TXERRSUM |
 +				    RL_TDESC_STAT_UNDERRUN))
 +					ifp->if_oerrors++;
 +				else {
 +					ifp->if_collisions += (txstat &
 +					    RL_TDESC_STAT_COLCNT) >> 16;
 +					ifp->if_opackets++;
 +				}
 +			} else
  				ifp->if_opackets++;
  		}
  		sc->rl_ldata.rl_tx_free++;
 
 --ew6BAiZeqk4r7MaW--
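
To illustrate what the diff above changes, here is a minimal standalone sketch of the two accounting schemes. Only the `RL_TDESC_STAT_*` names and the `>> 16` shift come from the diff. The mask values, the `tx_counters` struct, the `is_8139cplus` flag and the function names are assumptions made for illustration, and the real definitions should be checked against the driver sources.

```c
#include <stdint.h>

/*
 * Assumed bit layout of the TX descriptor status word, for illustration
 * only. The names and the 16-bit shift come from the diff; the mask
 * values are assumptions and should be checked against the driver headers.
 */
#define RL_TDESC_STAT_COLCNT	0x000F0000u	/* collision count field */
#define RL_TDESC_STAT_EXCESSCOL	0x00100000u	/* excessive collisions */
#define RL_TDESC_STAT_TXERRSUM	0x00800000u	/* TX error summary */
#define RL_TDESC_STAT_UNDERRUN	0x02000000u	/* TX underrun */

/* Hypothetical stand-in for the ifnet counters touched by the diff. */
struct tx_counters {
	uint64_t opackets;
	uint64_t oerrors;
	uint64_t collisions;
};

/* Accounting before the patch: at most one collision per frame. */
static void
tx_account_old(struct tx_counters *c, uint32_t txstat)
{
	if (txstat & (RL_TDESC_STAT_EXCESSCOL | RL_TDESC_STAT_COLCNT))
		c->collisions++;
	if (txstat & RL_TDESC_STAT_TXERRSUM)
		c->oerrors++;
	else
		c->opackets++;
}

/*
 * Accounting after the patch: the status bits are interpreted only on
 * the 8139C+, and the collision count field is added as a number.
 */
static void
tx_account_new(struct tx_counters *c, uint32_t txstat, int is_8139cplus)
{
	if (!is_8139cplus) {
		c->opackets++;
		return;
	}
	if (txstat & (RL_TDESC_STAT_TXERRSUM | RL_TDESC_STAT_UNDERRUN))
		c->oerrors++;
	else {
		c->collisions += (txstat & RL_TDESC_STAT_COLCNT) >> 16;
		c->opackets++;
	}
}
```

With a status word whose count field holds 4, the old scheme adds one collision and the new scheme adds four. Under either scheme, a count field that is non-zero on every completed frame, for whatever reason, makes the collision counter grow with every transmitted packet.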

From: alex <alex@ahhyes.net>
To: <bug-followup@FreeBSD.org>
Cc: <yongari@freebsd.org>
Subject: Re: kern/154236: [re] High Collision count on "re" network interface
Date: Wed, 26 Jan 2011 11:37:03 +1100

  Hi,
 
  The patch doesn't apply cleanly for me.
 
  -----------------------------
 
  Hmm...  Looks like a unified diff to me...
  The text leading up to this was:
  --------------------------
  |Index: sys/dev/re/if_re.c
  |===================================================================
  |--- sys/dev/re/if_re.c  (revision 217832)
  |+++ sys/dev/re/if_re.c  (working copy)
  --------------------------
  Patching file if_re.c using Plan A...
  Hunk #1 failed at 2243.
  1 out of 1 hunks failed--saving rejects to if_re.c.rej
  Hmm...  Ignoring the trailing garbage.
  done
 
  -----------------------------
 
  srv# cat if_re.c.rej
  ***************
  *** 2243,2254 ****
                               ("%s: freeing NULL mbufs!", __func__));
                           m_freem(txd->tx_m);
                           txd->tx_m = NULL;
  -                        if (txstat & (RL_TDESC_STAT_EXCESSCOL|
  -                            RL_TDESC_STAT_COLCNT))
  -                                ifp->if_collisions++;
  -                        if (txstat & RL_TDESC_STAT_TXERRSUM)
  -                                ifp->if_oerrors++;
  -                        else
                                   ifp->if_opackets++;
                   }
                   sc->rl_ldata.rl_tx_free++;
  --- 2243,2258 ----
                               ("%s: freeing NULL mbufs!", __func__));
                           m_freem(txd->tx_m);
                           txd->tx_m = NULL;
  +                        if (sc->rl_type == RL_8139CPLUS) {
  +                                if (txstat & (RL_TDESC_STAT_TXERRSUM |
  +                                    RL_TDESC_STAT_UNDERRUN))
  +                                        ifp->if_oerrors++;
  +                                else {
  +                                        ifp->if_collisions += (txstat &
  +                                            RL_TDESC_STAT_COLCNT) >> 16;
  +                                        ifp->if_opackets++;
  +                                }
  +                        } else
                                   ifp->if_opackets++;
                   }
                   sc->rl_ldata.rl_tx_free++;
 
 
  -------------------------------------------------
 
  The high rate of RX errors would be causing a serious degradation in 
  performance wouldn't it?
 
 

From: alex <alex@ahhyes.net>
To: <bug-followup@FreeBSD.org>
Cc:  
Subject: Re: kern/154236: [re] High Collision count on "re" network interface
Date: Wed, 26 Jan 2011 13:01:24 +1100

  Patched if_re.c manually and did a rebuild.
 
  Rebooted and checked stats:
 
  re0 statistics:
  Tx frames : 167
  Rx frames : 466
  Tx errors : 0
  Rx errors : 168
  Rx missed frames : 0
  Rx frame alignment errs : 0
  Tx single collisions : 0
  Tx multiple collisions : 0
  Rx unicast frames : 233
  Rx broadcast frames : 233
  Rx multicast frames : 0
  Tx aborts : 0
  Tx underruns : 0
 
  srv# netstat -ni
  Name    Mtu Network       Address              Ipkts Ierrs Idrop    Opkts Oerrs  Coll
  re0    1500 <Link#1>      00:16:3e:f0:8b:6c      537     0     0      195     0   768
 
  Netstat still reports collisions. Any ideas?
 

From: Janne Snabb <snabb@epipe.com>
To: bug-followup@FreeBSD.org
Cc:  
Subject: Re: kern/154236: Re: Help! Network issue with freebsd + Xen
Date: Wed, 26 Jan 2011 09:18:10 +0000 (UTC)

 I can confirm seeing it here also, you are not alone:
 
 $ netstat -i
 Name    Mtu Network       Address              Ipkts Ierrs Idrop    Opkts Oerrs  Coll
 re0    1500 <Link#1>      00:16:3e:08:b4:c7 21150230     0     0  4101045     0 4019323
 [..]
 
 I had not noticed it before. I am not encountering any packet loss
 or other networking problems. No idea about the reason.
 
 This is on FreeBSD 8.1-RELEASE-p2 amd64 with GENERIC kernel. Not
 sure about the Xen and dom0 environment because this is tested on
 a commercial VPS provider.
 
 --
 Janne Snabb / EPIPE Communications
 snabb@epipe.com - http://epipe.com/

From: Janne Snabb <snabb@epipe.com>
To: bug-followup@FreeBSD.org
Cc:  
Subject: Re: kern/154236: Re: Help! Network issue with freebsd + Xen
Date: Fri, 28 Jan 2011 07:21:22 +0000 (UTC)

 This also happens on 8.2RC2/RELENG_8_2 amd64 GENERIC kernel running
 on Xen 4.0.1 with openSUSE 11.3 x86_64 kernel 2.6.34.7-0.7-xen
 (openSUSE kernel-standard) dom0 (my current testing environment).
 
 --
 Janne Snabb / EPIPE Communications
 snabb@epipe.com - http://epipe.com/

From: Pyun YongHyeon <pyunyh@gmail.com>
To: Janne Snabb <snabb@epipe.com>
Cc: yongari@freebsd.org, bug-followup@FreeBSD.org
Subject: Re: kern/154236: Re: Help! Network issue with freebsd + Xen
Date: Mon, 31 Jan 2011 13:25:41 -0800

 On Fri, Jan 28, 2011 at 07:30:14AM +0000, Janne Snabb wrote:
 > The following reply was made to PR kern/154236; it has been noted by GNATS.
 > 
 > From: Janne Snabb <snabb@epipe.com>
 > To: bug-followup@FreeBSD.org
 > Cc:  
 > Subject: Re: kern/154236: Re: Help! Network issue with freebsd + Xen
 > Date: Fri, 28 Jan 2011 07:21:22 +0000 (UTC)
 > 
 >  This also happens on 8.2RC2/RELENG_8_2 amd64 GENERIC kernel running
 >  on Xen 4.0.1 with openSUSE 11.3 x86_64 kernel 2.6.34.7-0.7-xen
 >  (openSUSE kernel-standard) dom0 (my current testing environment).
 >  
 
 Honestly I have no idea what's happening here. re(4) does not even
 know whether it runs in virtual environments or real hardware.
 Given that no other user that uses real hardware see the issue I
 suspect it could be a bug in virtual environments. I'm also not
 able to find wrong code in handling collision counters.
 I think other OS will also show similar issue in the virtual
 environments. If you don't see noticeable performance differences
 with large number of collisions in virtual environments I wouldn't
 worry about that.

From: YongHyeon PYUN <pyunyh@gmail.com>
To: Janne Snabb <snabb@epipe.com>
Cc: yongari@freebsd.org, Alex <alex@ahhyes.net>, bug-followup@FreeBSD.org
Subject: Re: kern/154236: Re: Help! Network issue with freebsd + Xen
Date: Fri, 20 May 2011 19:13:53 -0700

 On Mon, Jan 31, 2011 at 01:25:41PM -0800, Pyun YongHyeon wrote:
 > On Fri, Jan 28, 2011 at 07:30:14AM +0000, Janne Snabb wrote:
 > > The following reply was made to PR kern/154236; it has been noted by GNATS.
 > > 
 > > From: Janne Snabb <snabb@epipe.com>
 > > To: bug-followup@FreeBSD.org
 > > Cc:  
 > > Subject: Re: kern/154236: Re: Help! Network issue with freebsd + Xen
 > > Date: Fri, 28 Jan 2011 07:21:22 +0000 (UTC)
 > > 
 > >  This also happens on 8.2RC2/RELENG_8_2 amd64 GENERIC kernel running
 > >  on Xen 4.0.1 with openSUSE 11.3 x86_64 kernel 2.6.34.7-0.7-xen
 > >  (openSUSE kernel-standard) dom0 (my current testing environment).
 > >  
 > 
 > Honestly I have no idea what's happening here. re(4) does not even
 > know whether it runs in virtual environments or real hardware.
 > Given that no other user that uses real hardware see the issue I
 > suspect it could be a bug in virtual environments. I'm also not
 > able to find wrong code in handling collision counters.
 > I think other OS will also show similar issue in the virtual
 > environments. If you don't see noticeable performance differences
 > with large number of collisions in virtual environments I wouldn't
 > worry about that.
 
 I tried it on Linux QEMU and I was not able to see any issues with
 re(4). The interface emulated by QEMU is RTL8139C+ and it worked as
 expected. I strongly think the problem is *NOT* in FreeBSD re(4)
 driver.

From: Alex <alex@ahhyes.net>
To: <bug-followup@FreeBSD.org>
Cc:  
Subject: Re: kern/154236: Re: Help! Network issue with freebsd + Xen
Date: Sat, 21 May 2011 12:35:26 +1000

  No collisions when using a Xen kernel which uses the xn network 
  interface. Perhaps it's a compatibility issue with 're' and Xen.
 
 
  On Fri, 20 May 2011 19:13:53 -0700, YongHyeon PYUN wrote:
 > On Mon, Jan 31, 2011 at 01:25:41PM -0800, Pyun YongHyeon wrote:
 >> On Fri, Jan 28, 2011 at 07:30:14AM +0000, Janne Snabb wrote:
 >> > The following reply was made to PR kern/154236; it has been noted 
 >> by GNATS.
 >> >
 >> > From: Janne Snabb <snabb@epipe.com>
 >> > To: bug-followup@FreeBSD.org
 >> > Cc:
 >> > Subject: Re: kern/154236: Re: Help! Network issue with freebsd + 
 >> Xen
 >> > Date: Fri, 28 Jan 2011 07:21:22 +0000 (UTC)
 >> >
 >> >  This also happens on 8.2RC2/RELENG_8_2 amd64 GENERIC kernel 
 >> running
 >> >  on Xen 4.0.1 with openSUSE 11.3 x86_64 kernel 2.6.34.7-0.7-xen
 >> >  (openSUSE kernel-standard) dom0 (my current testing environment).
 >> >
 >>
 >> Honestly I have no idea what's happening here. re(4) does not even
 >> know whether it runs in virtual environments or real hardware.
 >> Given that no other user that uses real hardware see the issue I
 >> suspect it could be a bug in virtual environments. I'm also not
 >> able to find wrong code in handling collision counters.
 >> I think other OS will also show similar issue in the virtual
 >> environments. If you don't see noticeable performance differences
 >> with large number of collisions in virtual environments I wouldn't
 >> worry about that.
 >
 > I tried it on Linux QEMU and I was not able to see any issues with
 > re(4). The interface emulated by QEMU is RTL8139C+ and it worked as
 > expected. I strongly think the problem is *NOT* in FreeBSD re(4)
 > driver.
 
State-Changed-From-To: feedback->closed 
State-Changed-By: yongari 
State-Changed-When: Fri Jun 24 02:19:19 UTC 2011 
State-Changed-Why:  
Close. Not a re(4) bug. 
Also note that the collision counter is meaningless in virtual
environments, as the actual packet transmission is done by another
mechanism.

http://www.freebsd.org/cgi/query-pr.cgi?pr=154236 
>Unformatted:
