From mike@ms.ha.md.us  Wed Jun 24 13:04:37 1998
Received: from ms.ha.md.us (ha.ha.md.us [192.55.203.244])
          by hub.freebsd.org (8.8.8/8.8.8) with SMTP id NAA17089
          for <FreeBSD-gnats-submit@freebsd.org>; Wed, 24 Jun 1998 13:04:29 -0700 (PDT)
          (envelope-from mike@ms.ha.md.us)
Message-Id: <9806242201.aa01253@ms.ms.ha.md.us>
Date: Wed, 24 Jun 98 16:01:09 EDT
From: mike@ms.ha.md.us
Sender: mike@ms.ha.md.us
Reply-To: mike@ms.ha.md.us
To: FreeBSD-gnats-submit@freebsd.org, mike@ms.ha.md.us
Subject: 3Com3C509: lockups & high packet latency
X-Send-Pr-Version: 3.2

>Number:         7057
>Category:       i386
>Synopsis:       3Com 3C509 locks up, or has >1000ms rtt under 100pps load of RDUMP.
>Confidential:   no
>Severity:       serious
>Priority:       high
>Responsible:    mdodd
>State:          closed
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Wed Jun 24 13:10:03 PDT 1998
>Closed-Date:    Fri Aug 26 11:32:55 GMT 2005
>Last-Modified:  Fri Aug 26 11:32:55 GMT 2005
>Originator:     Mike Muuss <mike>
>Release:        FreeBSD 3.0-980222-SNAP i386
>Organization:
Home
>Environment:

AMD 486DX4/100, Asus SP3G PCI motherboard, 64 MBytes of RAM.

1 3C5x9 board(s) on ISA found at 0x300
ep0 at 0x300-0x30f irq 11 on isa
ep0: aui/utp/bnc[*BNC*] address 00:60:08:27:e4:95

The only other two hosts on this ethernet are ms.ha.md.us, a P233 running
BSD/OS 2.1 unix, and nt.ha.md.us, a P150 dual-booting Win95 & Linux.
The FreeBSD machine is temporarily named ha.ha.md.us.

>Description:

While running RDUMP from the FreeBSD machine over the local ethernet to
the BSD/OS machine's Exabyte-8200 tape drive, two different (related?)
problems are observed:

(1)  The ethernet interface on the FreeBSD system "locks up".  All pings
are lost, and the kernel queue limit of 50 is exceeded, resulting in
"No more buffer space" errors.  ifconfig down followed by ifconfig up
successfully restarts the interface.

(2)  Very high round-trip times are observed to and from the FreeBSD
machine.  The BSD/OS and Win95 machines can ping each other just fine,
but the problem is observed both between the BSD/OS and FreeBSD systems
and between the Win95 and FreeBSD systems, pointing to trouble in the
FreeBSD system.
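
The ifconfig down/up workaround from item (1) could be automated while
the bug is being investigated.  The following is only a rough sketch,
not something that was run as part of this report: the function names
are made up for illustration, the interface name (ep0) and peer host
are passed in by the caller, and it assumes that ping exits with
status 0 only when at least one reply is received.  It needs root to
change the interface state.

```python
import subprocess


def peer_reachable(host, count=3, run=subprocess.run):
    """Return True if ping received at least one reply (exit status 0)."""
    result = run(
        ["ping", "-c", str(count), host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def bounce_interface(ifname, run=subprocess.run):
    """Take the interface down, then bring it back up (requires root)."""
    for state in ("down", "up"):
        run(["ifconfig", ifname, state], check=True)


def check_and_recover(ifname, host, run=subprocess.run):
    """Bounce ifname if host cannot be pinged; return True if bounced."""
    if peer_reachable(host, run=run):
        return False
    bounce_interface(ifname, run=run)
    return True
```

Called periodically as check_and_recover("ep0", "ms"), this only hides
the lockup; it does nothing about the latency problem in item (2).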

Here is the evidence I collected:

1 ms> ping ha
PING ha.ha.md.us (192.55.203.244): 56 data bytes
64 bytes from 192.55.203.244: icmp_seq=0 ttl=255 time=3.332 ms
64 bytes from 192.55.203.244: icmp_seq=1 ttl=255 time=60.674 ms
64 bytes from 192.55.203.244: icmp_seq=2 ttl=255 time=60.558 ms
64 bytes from 192.55.203.244: icmp_seq=3 ttl=255 time=5.72 ms
64 bytes from 192.55.203.244: icmp_seq=4 ttl=255 time=1007.02 ms
64 bytes from 192.55.203.244: icmp_seq=5 ttl=255 time=7.803 ms
64 bytes from 192.55.203.244: icmp_seq=6 ttl=255 time=1010.03 ms
64 bytes from 192.55.203.244: icmp_seq=7 ttl=255 time=11.579 ms
64 bytes from 192.55.203.244: icmp_seq=8 ttl=255 time=1005.51 ms
64 bytes from 192.55.203.244: icmp_seq=9 ttl=255 time=6.384 ms
64 bytes from 192.55.203.244: icmp_seq=10 ttl=255 time=1007.31 ms
64 bytes from 192.55.203.244: icmp_seq=11 ttl=255 time=8.188 ms
64 bytes from 192.55.203.244: icmp_seq=12 ttl=255 time=0.617 ms
64 bytes from 192.55.203.244: icmp_seq=13 ttl=255 time=10.679 ms
64 bytes from 192.55.203.244: icmp_seq=14 ttl=255 time=13.146 ms
64 bytes from 192.55.203.244: icmp_seq=15 ttl=255 time=0.629 ms
^C
--- ha.ha.md.us ping statistics ---
16 packets transmitted, 16 packets received, 0% packet loss
round-trip min/avg/max = 0.617/263.698/1010.03 ms

Note the high variation of round trip times.  Packets are getting stuck
for a full second, and then kicked loose somehow.
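
To make the one-second stalls easier to pick out, the ping output above
can be run through a small script.  This is a sketch written for this
report: the sample text is copied from the ping run above, and the
1000 ms threshold and the function names are assumptions chosen for
illustration.

```python
import re

# Reply lines copied from the ping output in this report.
PING_OUTPUT = """\
64 bytes from 192.55.203.244: icmp_seq=0 ttl=255 time=3.332 ms
64 bytes from 192.55.203.244: icmp_seq=1 ttl=255 time=60.674 ms
64 bytes from 192.55.203.244: icmp_seq=2 ttl=255 time=60.558 ms
64 bytes from 192.55.203.244: icmp_seq=3 ttl=255 time=5.72 ms
64 bytes from 192.55.203.244: icmp_seq=4 ttl=255 time=1007.02 ms
64 bytes from 192.55.203.244: icmp_seq=5 ttl=255 time=7.803 ms
64 bytes from 192.55.203.244: icmp_seq=6 ttl=255 time=1010.03 ms
64 bytes from 192.55.203.244: icmp_seq=7 ttl=255 time=11.579 ms
64 bytes from 192.55.203.244: icmp_seq=8 ttl=255 time=1005.51 ms
64 bytes from 192.55.203.244: icmp_seq=9 ttl=255 time=6.384 ms
64 bytes from 192.55.203.244: icmp_seq=10 ttl=255 time=1007.31 ms
64 bytes from 192.55.203.244: icmp_seq=11 ttl=255 time=8.188 ms
64 bytes from 192.55.203.244: icmp_seq=12 ttl=255 time=0.617 ms
64 bytes from 192.55.203.244: icmp_seq=13 ttl=255 time=10.679 ms
64 bytes from 192.55.203.244: icmp_seq=14 ttl=255 time=13.146 ms
64 bytes from 192.55.203.244: icmp_seq=15 ttl=255 time=0.629 ms
"""

REPLY_RE = re.compile(r"icmp_seq=(\d+) ttl=\d+ time=([\d.]+) ms")


def parse_rtts(text):
    """Return a list of (icmp_seq, rtt_ms) pairs from ping reply lines."""
    return [(int(m.group(1)), float(m.group(2))) for m in REPLY_RE.finditer(text)]


def stalled(rtts, threshold_ms=1000.0):
    """Return the replies whose round-trip time is at least threshold_ms."""
    return [(seq, rtt) for seq, rtt in rtts if rtt >= threshold_ms]


rtts = parse_rtts(PING_OUTPUT)
for seq, rtt in stalled(rtts):
    print(f"icmp_seq={seq} stalled for {rtt:.2f} ms")
```

On this sample it flags icmp_seq 4, 6, 8 and 10, each delayed by just
over one second, while the replies in between take a few milliseconds.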

You can see the consequences of this problem on the data flow of the
RDUMP.  This is from the point of view of the receiving (BSD/OS) system:

4 ms> netstat -i -I ef0 1
   input    (ef0)     output            input   (Total)    output
  packets  errs   packets  errs colls    packets  errs   packets  errs colls
  251315     3   310068     0 39103   5504076    50  5794021     0 39103
      43     0       23     0    35        43     0       23     0    35
     104     0       55     0    12       104     0       55     0    12
       0     0        1     0    26         2     0        2     0    26
     168     0       86     0    10       171     0       89     0    10
       0     0        1     0    45         1     0        3     0    45
      72     0       38     0    11        76     0       42     0    11
       6     0        3     0    12        12     0       10     0    12
      18     0       10     0     8        26     0       17     0     8
     103     0       55     0    14       109     0       58     0    14
       6     0        3     0    30         9     0        9     0    30
      22     0       12     0     8        22     0       12     0     8
     117     0       60     0    12       119     0       63     0    12

And here is the view from the sending (FreeBSD) system:

2 ha ENC> netstat -i -I ep0 1
            input          (ep0)           output
   packets  errs      bytes    packets  errs      bytes colls
        12     0        721         21     0      34311     0
         7     0        421         12     0      25125     0
         0     0          0          0     0       1652     0
        47     0       2822         79     0      94018     0
         5     0        301          8     0      19069     0
         3     0        180         18     0      13764     0
        27     0       1621         33     0      47835     0
        14     0        841         25     0      34209     0
         5     0        301          8     0      19069     0
        19     0       1182         30     0      27624     0
        52     0       3222         85     0     125010     0
        27     0       1718         42     0      28784     0
        75     0       4631        125     0     170565     0
         9     0        624         15     0       1358     0
       112     0       6882        207     0     273682     0
        55     0       3388         96     0     136218     0
        75     0       4520        147     0     204692     0
        31     0       1946         44     0      68778     0
        14     0        966          9     0        912     0
        39     0       2385         96     0     103189     0
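
For a rough idea of how little of the wire the dump is using, the
per-second rows above can be averaged.  This is a sketch only: the rows
are copied from the ep0 netstat output above, the function names are
made up for illustration, and the 10 Mbit/s link speed is an assumption
about this ethernet.

```python
# Per-second rows copied from the "netstat -i -I ep0 1" output above.
# Columns: in pkts, in errs, in bytes, out pkts, out errs, out bytes, colls.
NETSTAT_ROWS = """\
        12     0        721         21     0      34311     0
         7     0        421         12     0      25125     0
         0     0          0          0     0       1652     0
        47     0       2822         79     0      94018     0
         5     0        301          8     0      19069     0
         3     0        180         18     0      13764     0
        27     0       1621         33     0      47835     0
        14     0        841         25     0      34209     0
         5     0        301          8     0      19069     0
        19     0       1182         30     0      27624     0
        52     0       3222         85     0     125010     0
        27     0       1718         42     0      28784     0
        75     0       4631        125     0     170565     0
         9     0        624         15     0       1358     0
       112     0       6882        207     0     273682     0
        55     0       3388         96     0     136218     0
        75     0       4520        147     0     204692     0
        31     0       1946         44     0      68778     0
        14     0        966          9     0        912     0
        39     0       2385         96     0     103189     0
"""


def parse_rows(text):
    """Return a list of (out_pkts, out_bytes) pairs, one per sample row."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 7:
            continue  # skip anything that is not a 7-column data row
        values = [int(f) for f in fields]
        rows.append((values[3], values[5]))
    return rows


def summarize(rows, link_bits_per_sec=10_000_000):
    """Return (mean packets/s, mean bytes/s, percent of assumed link speed)."""
    n = len(rows)
    mean_pps = sum(pkts for pkts, _ in rows) / n
    mean_bytes = sum(nbytes for _, nbytes in rows) / n
    utilization = 100.0 * mean_bytes * 8 / link_bits_per_sec
    return mean_pps, mean_bytes, utilization


mean_pps, mean_bytes, utilization = summarize(parse_rows(NETSTAT_ROWS))
print(f"{mean_pps:.1f} pkts/s, {mean_bytes:.0f} bytes/s, {utilization:.1f}% of link")
```

The averages matter less than the spread: some one-second samples send
over 200 packets and others send fewer than 10.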

I own a half dozen of these 3C509 cards, and they are rock solid and
fast performers on all my other systems.  I'll do some hardware swapping
and other experimenting tomorrow, but this looks like a driver bug.

>How-To-Repeat:

Run RDUMP over a 3C509 card, then ping the FreeBSD machine from another
host.  I'll try to reproduce using TTCP as well.

>Fix:
>Release-Note:
>Audit-Trail:
State-Changed-From-To: open->closed 
State-Changed-By: jkh 
State-Changed-When: Thu Jun 25 15:13:12 PDT 1998 
State-Changed-Why:  
This is a duplicate of PR#7056? 
State-Changed-From-To: closed->open 
State-Changed-By: phk 
State-Changed-When: Sat Jun 27 02:22:17 PDT 1998 
State-Changed-Why:  
right, PR#7056 & PR#7057 were duplicates, but don't close both... 
Responsible-Changed-From-To: freebsd-bugs->mdodd 
Responsible-Changed-By: mdodd 
Responsible-Changed-When: Mon Jul 17 23:48:27 PDT 2000 
Responsible-Changed-Why:  
'ep' seems to be my problem. 
. 

http://www.freebsd.org/cgi/query-pr.cgi?pr=7057 
State-Changed-From-To: open->feedback 
State-Changed-By: matteo 
State-Changed-When: Sun Jun 19 17:33:30 GMT 2005 
State-Changed-Why:  
Should we close this PR because it's so old? 

State-Changed-From-To: feedback->closed 
State-Changed-By: matteo 
State-Changed-When: Fri Aug 26 11:32:13 GMT 2005 
State-Changed-Why:  
Feedback timeout 

>Unformatted:
