From nobody@FreeBSD.org  Tue Nov 27 11:28:26 2001
Return-Path: <nobody@FreeBSD.org>
Received: from freefall.freebsd.org (freefall.FreeBSD.org [216.136.204.21])
	by hub.freebsd.org (Postfix) with ESMTP id C323637B419
	for <freebsd-gnats-submit@FreeBSD.org>; Tue, 27 Nov 2001 11:28:25 -0800 (PST)
Received: (from nobody@localhost)
	by freefall.freebsd.org (8.11.6/8.11.6) id fARJSP584881;
	Tue, 27 Nov 2001 11:28:25 -0800 (PST)
	(envelope-from nobody)
Message-Id: <200111271928.fARJSP584881@freefall.freebsd.org>
Date: Tue, 27 Nov 2001 11:28:25 -0800 (PST)
From: David S Madole <david@madole.net>
To: freebsd-gnats-submit@FreeBSD.org
Subject: Network to disk write performance low under ATA with DMA
X-Send-Pr-Version: www-1.0

>Number:         32338
>Category:       kern
>Synopsis:       [patch] sis(4):Network to disk write performance low under ATA with DMA
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    freebsd-bugs
>State:          closed
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Tue Nov 27 11:30:01 PST 2001
>Closed-Date:    Wed Aug 25 21:40:50 GMT 2004
>Last-Modified:  Wed Aug 25 21:40:50 GMT 2004
>Originator:     David S Madole
>Release:        4.4-RELEASE
>Organization:
>Environment:
FreeBSD 4.4-RELEASE #4: Mon Nov 26 18:39:35 GMT 2001
    dmadole@devel.madole.net:/usr/src/sys/compile/MADOLE
Timecounter "i8254"  frequency 1193182 Hz
Timecounter "TSC"  frequency 551252600 Hz
CPU: AMD-K6(tm)-III Processor (551.25-MHz 586-class CPU)
  Origin = "AuthenticAMD"  Id = 0x5d0  Stepping = 0
  Features=0x8021bf<FPU,VME,DE,PSE,TSC,MSR,MCE,CX8,PGE,MMX>
  AMD Features=0xc0000800<SYSCALL,DSP,3DNow!>
real memory  = 134217728 (131072K bytes)
avail memory = 127799296 (124804K bytes)
Preloaded elf kernel "kernel" at 0xc02d1000.
K6-family MTRR support enabled (2 registers)
Using $PIR table, 6 entries at 0xc00fddf0
npx0: <math processor> on motherboard
npx0: INT 16 interface
pcib0: <Host to PCI bridge> on motherboard
pci0: <PCI bus> on pcib0
pcib2: <VIA 82C598MVP (Apollo MVP3) PCI-PCI (AGP) bridge> at device 1.0 on pci0
pci1: <PCI bus> on pcib2
isab0: <VIA 82C586 PCI-ISA bridge> at device 7.0 on pci0
isa0: <ISA bus> on isab0
atapci0: <VIA 82C586 ATA33 controller> port 0xd000-0xd00f at device 7.1 on pci0
ata0: at 0x1f0 irq 14 on atapci0
ata1: at 0x170 irq 15 on atapci0
pci0: <Cirrus Logic GD5434 SVGA controller> at 8.0
ed0: <NE2000 PCI Ethernet (RealTek 8029)> port 0xd800-0xd81f irq 11 at device 9.0 on pci0
ed0: address 00:00:b4:9d:c5:08, type NE2000 (16 bit)
sis0: <NatSemi DP83815 10/100BaseTX> port 0xdc00-0xdcff mem 0xdf000000-0xdf000fff irq 9 at device 10.0 on pci0
sis0: Ethernet address: 00:a0:cc:75:1b:be
miibus0: <MII bus> on sis0
ukphy0: <Generic IEEE 802.3u media interface> on miibus0
ukphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
oltr0: <Olicom PCI/II 16/4 Adapter (OC-3137)> port 0xe000-0xe03f irq 10 at device 11.0 on pci0
oltr0: MAC address 00:00:83:23:9c:e9
pcib1: <Host to PCI bridge> on motherboard
pci2: <PCI bus> on pcib1
orm0: <Option ROM> at iomem 0xc0000-0xc7fff on isa0
fdc0: <NEC 72065B or clone> at port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on isa0
fdc0: FIFO enabled, 8 bytes threshold
fd0: <1440-KB 3.5" drive> on fdc0 drive 0
atkbdc0: <Keyboard controller (i8042)> at port 0x60,0x64 on isa0
vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
sc0: <System console> at flags 0x100 on isa0
sc0: VGA <16 virtual consoles, flags=0x300>
sio0 at port 0x3f8-0x3ff irq 4 flags 0x10 on isa0
sio0: type 16550A
sio1 at port 0x2f8-0x2ff irq 3 on isa0
sio1: type 16550A
sio2: configured irq 5 not in bitmap of probed irqs 0
ppc0: <Parallel port> at port 0x378-0x37f irq 7 drq 3 on isa0
ppc0: Generic chipset (ECP/PS2/NIBBLE) in COMPATIBLE mode
ppc0: FIFO with 16/16/16 bytes threshold
lpt0: <Printer> on ppbus0
lpt0: Interrupt-driven port
IP packet filtering initialized, divert enabled, rule-based forwarding enabled, default to deny, logging disabled
ad0: 58623MB <Maxtor 96147U8> [119108/16/63] at ata0-master UDMA33
acd0: CD-RW <8X4X32> at ata1-master using PIO4
Mounting root from ufs:/dev/ad0s1a

>Description:
Data transfers from network to ATA hard drive are very slow. Visible
with Samba, HTTP PUT, etc., but most easily demonstrated with FTP.
Only occurs when DMA is enabled on the ATA controller.

Interestingly, only seems to happen when network data is being written
to the drive. Doing a large write to the drive while simultaneously
doing a 'ping -f -s 1400' to another node from another session does
not slow the disk writes significantly.

The drive is a Maxtor 60GB (model number in included dmesg). NIC is
NetGear with sis driver, although same problem occurs with LinkSys
card on dc driver. Also occurs with older Maxtor 6GB drive.

>How-To-Repeat:
The example below uses FTP to fetch from another machine, also running
4.4-RELEASE, connected through NetGear NICs (sis driver) and a
full-duplex 100Mbps switch.

Note that gets to the hard drive are extremely slow, even though
gets to /dev/null are fine. Puts are fine. Disk write speed is fine
when the data does not come from the network. Turn off DMA and
write speed is fine (although CPU usage is very high!)

I am showing a small test file here, but transfer rates are 
similar with large (200MB) test files.

# sysctl hw.atamodes
hw.atamodes: dma,---,pio,---,
# ftp test.madole.net
Connected to test.madole.net.
220 test.madole.net FTP server (Version 6.00LS) ready.
Name (test.madole.net:dmadole): dmadole
331 Password required for dmadole.
Password:
230 User dmadole logged in.
Remote system type is UNIX.
Using binary mode to transfer files.
ftp> get /bin/csh /dev/null
local: /dev/null remote: /bin/csh
150 Opening BINARY mode data connection for '/bin/csh' (636076 bytes).
226 Transfer complete.
636076 bytes received in 0.13 seconds (4.52 MB/s)
ftp> get /bin/csh /tmp/temp
local: /tmp/temp remote: /bin/csh
150 Opening BINARY mode data connection for '/bin/csh' (636076 bytes).
100% |**************************************************|   621 KB    00:00 ETA
226 Transfer complete.
636076 bytes received in 16.04 seconds (38.73 KB/s)
ftp> put /bin/csh /tmp/temp
local: /bin/csh remote: /tmp/temp
150 Opening BINARY mode data connection for '/tmp/temp'.
100% |**************************************************|   621 KB    00:00 ETA
226 Transfer complete.
636076 bytes sent in 0.33 seconds (1.85 MB/s)
ftp> quit
221 Goodbye.
# dd if=/bin/csh of=/tmp/temp
1242+1 records in
1242+1 records out
636076 bytes transferred in 0.082739 secs (7687741 bytes/sec)
# sysctl -w hw.atamodes=pio,---,pio,---,
hw.atamodes: dma,---,pio,---, -> pio,---,pio,---,
# ftp test.madole.net
Connected to test.madole.net.
220 test.madole.net FTP server (Version 6.00LS) ready.
Name (test.madole.net:dmadole): dmadole
331 Password required for dmadole.
Password:
230 User dmadole logged in.
Remote system type is UNIX.
Using binary mode to transfer files.
ftp> get /bin/csh /tmp/temp
local: /tmp/temp remote: /bin/csh
150 Opening BINARY mode data connection for '/bin/csh' (636076 bytes).
100% |**************************************************|   621 KB    00:00 ETA
226 Transfer complete.
636076 bytes received in 0.12 seconds (5.12 MB/s)
ftp> quit
221 Goodbye.
#
>Fix:
I wish I knew where to look!

PIO mode is a kinda-workaround, but it hits the CPU hard enough to
cause other less-serious performance problems.

>Release-Note:
>Audit-Trail:

From: Søren Schmidt <sos@freebsd.dk>
To: David S Madole <david@madole.net>
Cc: freebsd-gnats-submit@FreeBSD.ORG
Subject: Re: kern/32338: Network to disk write performance low under ATA with
 DMA
Date: Tue, 27 Nov 2001 21:30:59 +0100 (CET)

 It seems David S Madole wrote:
 > >Description:
 > Data transfers from network to ATA hard drive are very slow. Visible
 > with Samba, HTTP PUT, etc., but most easily demonstrated with FTP.
 > Only occurs when DMA is enabled on the ATA controller.
 > 
 > Interestingly, only seems to happen when network data is being written
 > to the drive. Doing a large write to the drive while simultaneously
 > doing a 'ping -f -s 1400' to another node from another session does
 > not slow the disk writes significantly.
 > 
 > The drive is a Maxtor 60GB (model number in included dmesg). NIC is
 > NetGear with sis driver, although same problem occurs with LinkSys
 > card on dc driver. Also occurs with older Maxtor 6GB drive.
 
 Hmm, this sounds like the SiS network card may be using DMA too
 and badly at that, making the net card and the ATA driver
 compete for the bus...
 
 > >Fix:
 > I wish I knew where to look!
 
 Try another netcard, if that helps you know where to look :)
 
 -Søren

From: David Madole <david@madole.net>
To: sos@freebsd.dk
Cc: freebsd-gnats-submit <freebsd-gnats-submit@FreeBSD.ORG>
Subject: RE: kern/32338: Network to disk write performance low under ATA with DMA
Date: Thu, 29 Nov 2001 10:27:50 -0500

 >===== Original Message From sos@freebsd.dk =====
 >It seems David S Madole wrote:
 >> >Description:
 >> Data transfers from network to ATA hard drive are very slow. Visible
 >> with Samba, HTTP PUT, etc., but most easily demonstrated with FTP.
 >> Only occurs when DMA is enabled on the ATA controller.
 >>
 >> Interestingly, only seems to happen when network data is being written
 >> to the drive. Doing a large write to the drive while simultaneously
 >> doing a 'ping -f -s 1400' to another node from another session does
 >> not slow the disk writes significantly.
 >>
 >> The drive is a Maxtor 60GB (model number in included dmesg). NIC is
 >> NetGear with sis driver, although same problem occurs with LinkSys
 >> card on dc driver. Also occurs with older Maxtor 6GB drive.
 >
 >Hmm, this sounds like the SiS network card may be using DMA too
 >and badly at that, making the net card and the ATA driver
 >compete for the bus...
 >
 >> >Fix:
 >> I wish I knew where to look!
 >
 >Try another netcard, if that helps you know where to look :)
 
 I think it's some kind of funniness with the chipset. As I mentioned
 originally, I get the same problem with an ADMtek AN985 on the dc driver. I
 also do not get the problem on an Intel Triton-based board with the same NIC
 and hard drive (and a much slower CPU - P-133 instead of K6-III-550).
 
 I added a little bit of instrumentation to the sis driver to keep track of
 receive descriptor/buffer overflows, and they appear to be the problem. If I
 FTP more than about 150K, I get an overflow and slow performance (from TCP
 throttling back in response to packet loss). I changed the sis driver code to
 raise the buffers from 128 to 1536, but it still overflows. It appears that a
 NIC interrupt is getting lost, or the interrupt latency is going way up. So
 what in running the drive under DMA could cause interrupt problems for the NIC?
 
 I also forced the drive to WDMA2 instead of UDMA2 to match what the 
 Triton-based machine was doing, but it made no difference. Didn't really 
 expect it would.
 
 Dave
 

From: "David S. Madole" <david@madole.net>
To: <luigi@FreeBSD.org>, <sos@freebsd.dk>, <wpaul@freebsd.org>
Cc: <freebsd-gnats-submit@FreeBSD.ORG>
Subject: kern/32338: Network to disk write performance low under ATA with DMA
Date: Mon, 7 Jan 2002 20:13:19 -0500

 This is a multi-part message in MIME format.
 
 ------=_NextPart_000_0016_01C197B7.BFC2AB60
 Content-Type: text/plain;
 	charset="Windows-1252"
 Content-Transfer-Encoding: 8bit
 
 >===== Original Message From sos@freebsd.dk =====
 >It seems David S Madole wrote:
 >> >Description:
 >> Data transfers from network to ATA hard drive are very slow. Visible
 >> with Samba, HTTP PUT, etc., but most easily demonstrated with FTP.
 >> Only occurs when DMA is enabled on the ATA controller.
 >>
 >> Interestingly, only seems to happen when network data is being written
 >> to the drive. Doing a large write to the drive while simultaneously
 >> doing a 'ping -f -s 1400' to another node from another session does
 >> not slow the disk writes significantly.
 >>
 >> The drive is a Maxtor 60GB (model number in included dmesg). NIC is
 >> NetGear with sis driver, although same problem occurs with LinkSys
 >> card on dc driver. Also occurs with older Maxtor 6GB drive.
 >
 >Hmm, this sounds like the SiS network card may be using DMA too
 >and badly at that, making the net card and the ATA driver
 >compete for the bus...
 
 (My apologies if anyone has received this already, but I don't think the
 copy I sent previously went out correctly, since it never showed up on
 gnats and I was having mail trouble for a few days.)
 
 Bill, I copied you since this is your driver.
 
 Luigi, I apologize if it is inappropriate to copy you on this, but I see
 you've been active in the sis driver lately and thought you might be
 interested.
 
 Søren, thanks for the response. As I mentioned before, I did have this
 problem with the dc driver as well, on an ADMtek AN985.
 
 It turns out that the NIC was unable to DMA its FIFO into memory fast
 enough, causing it to overflow and packets to get lost, throttling back
 TCP. It seems that my BIOS initializes the PCI latency timer on the card
 to 32, which doesn't let it move enough data at a time to keep up with
 100Mbps while competing with the ATA controller.
 
 I am hesitant to second-guess every BIOS, so I decided to try an adaptive
 approach and bump the timer up a little bit (by 32) each time an overflow
 occurs. On my machine, once it reaches 96 the card is stable and never
 overflows again. This increased my FTP-to-disk throughput to a little over
 9MBps, pretty good. I also changed the driver to not reinitialize the card
 when an overflow or receive error occurs -- it seems unnecessary and takes
 a relatively long time, causing many packets to get lost.
 
 I also adjusted some NIC settings particular to the National chip to make
 them conditional on the silicon version, as the latest datasheet specifies.
 I'm guessing it doesn't really matter much since these are probably just
 defaults on the later chips.
 
 I think some other DMA-based NIC drivers, like dc, could probably benefit
 from this as well. I can fix up the dc driver too, if this seems like the
 best way to go, and see if it fixes the issue there.
 
 By the way, the attached diff is against 4.4-RELEASE.
 
 Thanks,
 Dave
 
 
 
 ------=_NextPart_000_0016_01C197B7.BFC2AB60
 Content-Type: text/plain;
 	name="diff_sis.txt"
 Content-Transfer-Encoding: quoted-printable
 Content-Disposition: attachment;
 	filename="diff_sis.txt"
 
 *** if_sisreg.h.orig	Wed Feb 21 22:17:51 2001
 --- if_sisreg.h	Thu Nov 29 19:28:35 2001
 ***************
 *** 76,81 ****
 --- 76,82 ----
   
   /* NS DP83815 registers */
   #define NS_CLKRUN		0x3C
 + #define NS_SRR			0x58
   #define NS_BMCR			0x80
   #define NS_BMSR			0x84
   #define NS_PHYIDR1		0x88
 ***************
 *** 401,406 ****
 --- 402,408 ----
   	struct sis_list_data	*sis_ldata;
   	struct sis_ring_data	sis_cdata;
   	struct callout_handle	sis_stat_ch;
 + 	device_t		sis_dev;
   };
   
   /*
 *** if_sis.c.orig	Wed Feb 21 22:17:51 2001
 --- if_sis.c	Thu Nov 29 19:41:31 2001
 ***************
 *** 723,728 ****
 --- 723,730 ----
   	unit = device_get_unit(dev);
   	bzero(sc, sizeof(struct sis_softc));
   
 + 	sc->sis_dev = dev;
 + 
   	if (pci_get_device(dev) == SIS_DEVICEID_900)
   		sc->sis_type = SIS_TYPE_900;
   	if (pci_get_device(dev) == SIS_DEVICEID_7016)
 ***************
 *** 1159,1166 ****
   void sis_rxeoc(sc)
   	struct sis_softc	*sc;
   {
   	sis_rxeof(sc);
 - 	sis_init(sc);
   	return;
   }
   
 --- 1161,1183 ----
   void sis_rxeoc(sc)
   	struct sis_softc	*sc;
   {
 + 	int latency;
 + 
 + 	/*
 + 	 * The BIOS may have initialized the maximum latency timer
 + 	 * too low to be able to keep up with a 100Mbps stream when
 + 	 * heavy disk or other DMA is taking place. Try to correct
 + 	 * for this adaptively by bumping it up a little each time
 + 	 * the receive FIFO overflows.
 + 	 */
 + 
 + 	latency = pci_read_config(sc->sis_dev, PCIR_LATTIMER, 1);
 + 	if (latency < 255) {
 + 		if ((latency += 32) > 255) latency = 255;
 + 		pci_write_config(sc->sis_dev, PCIR_LATTIMER, latency, 1);
 + 	}
 + 
   	sis_rxeof(sc);
   	return;
   }
   
 ***************
 *** 1293,1305 ****
   			sis_txeof(sc);
   
   		if ((status & SIS_ISR_RX_DESC_OK) ||
 ! 		    (status & SIS_ISR_RX_OK))
   			sis_rxeof(sc);
   
 ! 		if ((status & SIS_ISR_RX_ERR) ||
 ! 		    (status & SIS_ISR_RX_OFLOW)) {
   			sis_rxeoc(sc);
 - 		}
   
   		if (status & SIS_ISR_SYSERR) {
   			sis_reset(sc);
 --- 1310,1321 ----
   			sis_txeof(sc);
   
   		if ((status & SIS_ISR_RX_DESC_OK) ||
 ! 		    (status & SIS_ISR_RX_OK) ||
 ! 		    (status & SIS_ISR_RX_ERR))
   			sis_rxeof(sc);
   
 ! 		if (status & SIS_ISR_RX_OFLOW)
   			sis_rxeoc(sc);
   
   		if (status & SIS_ISR_SYSERR) {
   			sis_reset(sc);
 ***************
 *** 1562,1569 ****
   	 * performance." Note however that at least three
   	 * of the registers are listed as "reserved" in
   	 * the register map, so who knows what they do.
   	 */
 ! 	if (sc->sis_type == SIS_TYPE_83815) {
   		CSR_WRITE_4(sc, NS_PHY_PAGE, 0x0001);
   		CSR_WRITE_4(sc, NS_PHY_CR, 0x189C);
   		CSR_WRITE_4(sc, NS_PHY_TDATA, 0x0000);
 --- 1578,1593 ----
   	 * performance." Note however that at least three
   	 * of the registers are listed as "reserved" in
   	 * the register map, so who knows what they do.
 + 	 *
 + 	 * An apparently later version (December 2000)
 + 	 * has this data on page 78 and qualifies it as
 + 	 * applying only to silicon version 203h, which
 + 	 * is apparently a typo'd reference to version 302h
 + 	 * as referred to on page 64.
   	 */
 ! 	if (sc->sis_type == SIS_TYPE_83815 &&
 ! 	    CSR_READ_4(sc, NS_SRR) == 0x0302) {
 ! 
   		CSR_WRITE_4(sc, NS_PHY_PAGE, 0x0001);
   		CSR_WRITE_4(sc, NS_PHY_CR, 0x189C);
   		CSR_WRITE_4(sc, NS_PHY_TDATA, 0x0000);
 
 ------=_NextPart_000_0016_01C197B7.BFC2AB60--
 

From: Luigi Rizzo <luigi@FreeBSD.org>
To: "David S. Madole" <david@madole.net>
Cc: sos@freebsd.dk, wpaul@FreeBSD.org,
	freebsd-gnats-submit@FreeBSD.org
Subject: Re: kern/32338: Network to disk write performance low under ATA with DMA
Date: Fri, 8 Feb 2002 17:57:48 -0800

 Hi,
 sorry for the delay in replying, but I only had a chance
 to test this now.
 It seems that the removal of sis_init() from sis_rxeoc() is
 definitely an important part of the fix, as in the presence of
 bidirectional traffic an overflow on the receive side will
 cause a reset of the chip and drop all packets queued on
 the transmit side.
 
 Bill, do you think the removal of sis_init() has ill side
 effects otherwise? I am bombing the interface with 148kpps
 (on a 486-133, so there are plenty of overflows)
 and things seem to work fine, much better indeed than
 when sis_init() was included in sis_rxeoc().
 
 Not sure about the PCI latency register; I would like
 to test this a bit more, as there are many reasons why
 sis_rxeoc() is called.
 
 	cheers
 	luigi
 
 

From: David S Madole <david@madole.net>
To: Luigi Rizzo <luigi@FreeBSD.org>
Cc: wpaul@FreeBSD.org, freebsd-gnats-submit@FreeBSD.org
Subject: Re: kern/32338: Network to disk write performance low under ATA with DMA
Date: Sat, 9 Feb 2002 05:55:07 +0000

 Luigi,
 
 Actually, in my case, it's the PCI latency that made all the
 difference. With the latency timer set to 32 -- the way my
 BIOS sets it up -- overflows occur many times a second. Removing
 sis_init() helped a little because it caused only one or two
 packets to get dropped, instead of tens or hundreds while the card
 was being reset. That change took me from 50KByte/sec to
 500KByte/sec on FTP over 100Mbps full-duplex.
 
 Putting in the latency timer patch causes two overflows to occur
 after boot-up, until the timer reaches 96. Then I never get an
 overflow again. From disk to disk between two machines with sis0
 cards, I get 9.8MByte/sec on FTP - fast! If I go from disk
 to /dev/null (i.e. GET test /dev/null), I get 11.8MByte/sec,
 pretty much full wire rate.
 
 I'm surprised you get a lot of overflows even on the slow
 machine, since the card empties through DMA. I can see mbufs
 might overflow, but not the card FIFO at 33MHz PCI DMA. You may
 need the latency timer increased on that machine, too. You can
 always bump it up by hand just to try (e.g. pciconf -w -b pci0:4:1
 0xd 96; the latency timer is the byte at offset 0xd).
 
 Thanks for looking into this,
 David
 
 
 On Saturday 09 February 2002 01:57 am, you wrote:
 > Hi,
 > sorry for the delay with which I reply but i only had a chance
 > to test this now.
 > It seems that the removal of sis_init from sis_rxeoc() is
 > definitely an important part of the fix, as in presence of
 > bidirectional traffic an overflow on the receive side will
 > cause a reset of the chip and drop all packets queued on
 > the transmit side.
 >
 > Bill, do you think the removal of sis_init() has ill side
 > effects otherwise ? I am bombing the interface with 148kpps
 > (on a 486-133 so there are plenty of overflows)
 > and things seem to work fine, much better indeed than
 > when sis_init() was included in sis_rxeoc().
 >
 > Not sure about the PCI latency register, i would like
 > to test this a bit more, as there are many reasons why
 > rxeoc is called.
 >
 > 	cheers
 > 	luigi
 >
 > On Mon, Jan 07, 2002 at 08:13:19PM -0500, David S. Madole wrote:
 > > >===== Original Message From sos@freebsd.dk =====
 > > >
 > > >It seems David S Madole wrote:
 > > >> >Description:
 > > >>
 > > >> Data transfers from network to ATA hard drive are very slow.
 > > >> Visible with Samba, HTTP PUT, etc., but most easily demonstrated
 > > >> with FTP. Only occurs when DMA is enabled on the ATA controller.
 > > >>
 > > >> Interestingly, only seems to happen when network data is being
 > > >> written to the drive. Doing a large write to the drive while
 > > >> simultaneously doing a 'ping -f -s 1400' to another node from
 > > >> another session does not slow the disk writes significantly.
 > > >>
 > > >> The drive is a Maxtor 60GB (model number in included dmesg). NIC
 > > >> is NetGear with sis driver, although same problem occurs with
 > > >> LinkSys card on dc driver. Also occurs with older Maxtor 6GB
 > > >> drive.
 > > >
 > > >Hmm, this sounds like the SiS network card may be using DMA too
 > > >and badly at that, making the net card and the ATA driver
 > > >compete for the bus...
 > >
 > > ( my apologies if anyone has received this already, but I don't
 > > think the copy I sent previously went out correctly since it never
 > > showed up on gnats
 > >   and I was having mail trouble for a few days )
 > >
 > > Bill, I copied you since this is your driver.
 > >
 > > Luigi, I apologize if inappropriate to copy you on this, but I see
 > > you've been
 > > active in the sis driver lately and thought you might be
 > > interested.
 > >
 > > Sren, thanks for the response. As I mentioned before, I did have
 > > this problem with the dc driver as well, on an ADMtek AN985.
 > >
 > > It turns out that the NIC was unable to DMA it's FIFO into memory
 > > fast enough,
 > > causing it to overflow and packets to get lost, throttling back
 > > TCP. It seems
 > > that my BIOS initializes the PCI latency timer on the card to 32,
 > > which doesn't let it move enough data at a time to keep up with
 > > 100Mbs while competing with the ATA controller.
 > >
 > > I am heistant to second-quess every BIOS, so I decided to try an
 > > adaptive approach and bump the timer up a little bit (32) each time
 > > an overflow occurs.
 > > On my machine, once it reaches 96 the card is stable and never
 > > overflows again. This increased my FTP-to-disk throughput to a
 > > little over 9MBps, pretty
 > > good. I also changed the driver to not reinitialize the card when
 > > an overflow
 > > or receive error occurs -- it seems unnecessary and takes a
 > > relatively long time, causing many packets to get lost.
 > >
 > > I also adjusted some NIC settings particular to the National chip
 > > to make them
 > > conditional on the silicon version, as the latest datasheet
 > > specifies. I'm guessing it doesn't really matter much since these
 > > are probably just defaults
 > > on the later chips.
 > >
 > > I think some other DMA-based NIC drivers, like dc could probably
 > > benefit from
 > > this as well. I can fix up the dc driver as well, if  this seems
 > > like the best
 > > way to go, and see if it fixes the issue there, too.
 > >
 > > By the way, the attached diff is against 4.4-RELEASE.
 > >
 > > Thanks,
 > > Dave
 > >
 > >
 > >
 > > *** if_sisreg.h.orig	Wed Feb 21 22:17:51 2001
 > > --- if_sisreg.h	Thu Nov 29 19:28:35 2001
 > > ***************
 > > *** 76,81 ****
 > > --- 76,82 ----
 > >
 > >   /* NS DP83815 registers */
 > >   #define NS_CLKRUN		0x3C
 > > + #define NS_SRR			0x58
 > >   #define NS_BMCR			0x80
 > >   #define NS_BMSR			0x84
 > >   #define NS_PHYIDR1		0x88
 > > ***************
 > > *** 401,406 ****
 > > --- 402,408 ----
 > >   	struct sis_list_data	*sis_ldata;
 > >   	struct sis_ring_data	sis_cdata;
 > >   	struct callout_handle	sis_stat_ch;
 > > + 	device_t		sis_dev;
 > >   };
 > >
 > >   /*
 > > *** if_sis.c.orig	Wed Feb 21 22:17:51 2001
 > > --- if_sis.c	Thu Nov 29 19:41:31 2001
 > > ***************
 > > *** 723,728 ****
 > > --- 723,730 ----
 > >   	unit = device_get_unit(dev);
 > >   	bzero(sc, sizeof(struct sis_softc));
 > >
 > > + 	sc->sis_dev = dev;
 > > +
 > >   	if (pci_get_device(dev) == SIS_DEVICEID_900)
 > >   		sc->sis_type = SIS_TYPE_900;
 > >   	if (pci_get_device(dev) == SIS_DEVICEID_7016)
 > > ***************
 > > *** 1159,1166 ****
 > >   void sis_rxeoc(sc)
 > >   	struct sis_softc	*sc;
 > >   {
 > >   	sis_rxeof(sc);
 > > - 	sis_init(sc);
 > >   	return;
 > >   }
 > >
 > > --- 1161,1183 ----
 > >   void sis_rxeoc(sc)
 > >   	struct sis_softc	*sc;
 > >   {
 > > + 	int latency;
 > > +
 > > + 	/*
 > > + 	 * The BIOS may have initialized the maximum latency timer
 > > + 	 * too low to be able to keep up with a 100Mbs stream when
 > > + 	 * heavy disk or other DMA is taking place. Try to correct
 > > + 	 * for this adaptively by bumping it up a little each time
 > > + 	 * the receive FIFO overflows.
 > > + 	 */
 > > +
 > > + 	latency = pci_read_config(sc->sis_dev, PCIR_LATTIMER, 1);
 > > + 	if (latency < 255) {
 > > + 		if ((latency += 32) > 255) latency = 255;
 > > + 		pci_write_config(sc->sis_dev, PCIR_LATTIMER, latency, 1);
 > > + 	}
 > > +
 > >   	sis_rxeof(sc);
 > >   	return;
 > >   }
 > >
 > > ***************
 > > *** 1293,1305 ****
 > >   			sis_txeof(sc);
 > >
 > >   		if ((status & SIS_ISR_RX_DESC_OK) ||
 > > ! 		    (status & SIS_ISR_RX_OK))
 > >   			sis_rxeof(sc);
 > >
 > > ! 		if ((status & SIS_ISR_RX_ERR) ||
 > > ! 		    (status & SIS_ISR_RX_OFLOW)) {
 > >   			sis_rxeoc(sc);
 > > - 		}
 > >
 > >   		if (status & SIS_ISR_SYSERR) {
 > >   			sis_reset(sc);
 > > --- 1310,1321 ----
 > >   			sis_txeof(sc);
 > >
 > >   		if ((status & SIS_ISR_RX_DESC_OK) ||
 > > ! 		    (status & SIS_ISR_RX_OK) ||
 > > ! 		    (status & SIS_ISR_RX_ERR))
 > >   			sis_rxeof(sc);
 > >
 > > ! 		if (status & SIS_ISR_RX_OFLOW)
 > >   			sis_rxeoc(sc);
 > >
 > >   		if (status & SIS_ISR_SYSERR) {
 > >   			sis_reset(sc);
 > > ***************
 > > *** 1562,1569 ****
 > >   	 * performance." Note however that at least three
 > >   	 * of the registers are listed as "reserved" in
 > >   	 * the register map, so who knows what they do.
 > >   	 */
 > > ! 	if (sc->sis_type == SIS_TYPE_83815) {
 > >   		CSR_WRITE_4(sc, NS_PHY_PAGE, 0x0001);
 > >   		CSR_WRITE_4(sc, NS_PHY_CR, 0x189C);
 > >   		CSR_WRITE_4(sc, NS_PHY_TDATA, 0x0000);
 > > --- 1578,1593 ----
 > >   	 * performance." Note however that at least three
 > >   	 * of the registers are listed as "reserved" in
 > >   	 * the register map, so who knows what they do.
 > > + 	 *
 > > + 	 * An apparently later version (December 2000)
 > > + 	 * has this data on page 78 and qualifies it as
 > > + 	 * applying only to silicon version 203h, which
 > > + 	 * is apparently a typo'd reference to version 302h
 > > + 	 * as referred to on page 64.
 > >   	 */
 > > ! 	if (sc->sis_type == SIS_TYPE_83815 &&
 > > ! 	    CSR_READ_4(sc, NS_SRR) == 0x0302) {
 > > !
 > >   		CSR_WRITE_4(sc, NS_PHY_PAGE, 0x0001);
 > >   		CSR_WRITE_4(sc, NS_PHY_CR, 0x189C);
 > >   		CSR_WRITE_4(sc, NS_PHY_TDATA, 0x0000);
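
The adaptive approach in the sis_rxeoc() hunk above can be sketched in isolation as a small clamped-increment helper. This is only an illustration under stated assumptions: bump_latency() and its step parameter are hypothetical names that do not appear in the patch, and the real change reads and writes the register through pci_read_config() and pci_write_config() on PCIR_LATTIMER rather than working on a plain integer.

```c
#include <assert.h>
#include <stdint.h>

/* Largest value the 8-bit PCI latency timer register can hold. */
#define LATTIMER_MAX 255

/*
 * Hypothetical helper (not part of the patch): return the latency
 * timer value to program after one receive FIFO overflow, raising the
 * current value by `step` and clamping at LATTIMER_MAX.
 */
static uint8_t
bump_latency(uint8_t current, uint8_t step)
{
	unsigned int next;

	assert(step > 0);		/* a zero step would never adapt */

	next = (unsigned int)current + step;	/* widen to avoid wraparound */
	if (next > LATTIMER_MAX)
		next = LATTIMER_MAX;
	return ((uint8_t)next);
}
```

Starting from the BIOS value of 32 reported above, two overflows with a step of 32 give 64 and then 96, the value at which the submitter's card stopped overflowing.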

From: Luigi Rizzo <luigi@FreeBSD.org>
To: David S Madole <david@madole.net>
Cc: wpaul@FreeBSD.org, freebsd-gnats-submit@FreeBSD.org
Subject: Re: kern/32338: Network to disk write performance low under ATA with DMA
Date: Sat, 9 Feb 2002 07:31:26 -0800

 On Sat, Feb 09, 2002 at 05:55:07AM +0000, David S Madole wrote:
 > Luigi,
 > 
 > Actually, in my case, it's the PCI latency that made all the
 
 yes, i am not denying that, it's just that i haven't had a chance to
 exercise that part of the code.
 
 > I'm surprised you get a lot of overflows even on the slow 
 > machine, since the card empties through DMA. I can see mbufs
 
 well, it's ring buffer overflow, not DMA overflow. But the
 code calls rxeoc in both cases...
 
 	cheers
 	luigi
Responsible-Changed-From-To: freebsd-bugs->sos 
Responsible-Changed-By: kris 
Responsible-Changed-When: Sun Jul 13 02:36:21 PDT 2003 
Responsible-Changed-Why:  
Assign to ATA maintainer 

http://www.freebsd.org/cgi/query-pr.cgi?pr=32338 
Responsible-Changed-From-To: sos->freebsd-bugs 
Responsible-Changed-By: sos 
Responsible-Changed-When: Mon Jul 14 04:15:09 PDT 2003 
Responsible-Changed-Why:  
This is not an ATA issue 

http://www.freebsd.org/cgi/query-pr.cgi?pr=32338 
State-Changed-From-To: open->feedback 
State-Changed-By: arved 
State-Changed-When: Wed Aug 25 20:45:41 GMT 2004 
State-Changed-Why:  
In the last two years several changes were committed to the sis driver. 
Did you test with a recent version of FreeBSD if your problem is fixed? 

http://www.freebsd.org/cgi/query-pr.cgi?pr=32338 

From: "David S. Madole" <david@madole.net>
To: "Tilman Linneweh" <arved@FreeBSD.org>, <freebsd-bugs@FreeBSD.org>
Cc:  
Subject: Re: kern/32338: Network to disk write performance low under ATA with DMA
Date: Wed, 25 Aug 2004 17:05:00 -0400

 From: "Tilman Linneweh" <arved@FreeBSD.org>
 >
 > Synopsis: Network to disk write performance low under ATA with DMA
 >
 > State-Changed-From-To: open->feedback
 > State-Changed-By: arved
 > State-Changed-When: Wed Aug 25 20:45:41 GMT 2004
 > State-Changed-Why:
 > In the last two years several changes were committed to the sis driver.
 > Did you test with a recent version of FreeBSD if your problem is fixed?
 
 I don't know, as I no longer have a machine running day-to-day that is a
 good test case.
 
 After submitting this, I resigned myself to the fact that no one was
 interested, probably rightly so, as the driver is probably not the best
 place to fix what is really a BIOS PCI bus initialization issue, although
 there were other drivers, at the time at least, that did this, too.
 
 I maintained a local kernel patch for a little while, then realized it's
 really just as easy to do something like
 
      pciconf -w -b pci0:9:0 0xd 0x60
 
 in rc.local, and that's what I did for a long time. In the last few
 months I have upgraded the machine that was the test case for this and it
 now has an Intel NIC.
 
 David
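
For readers decoding the pciconf line above: 0xd is the offset of the latency timer byte in PCI configuration space (PCIR_LATTIMER in the diff), and 0x60 is 96 decimal, the value at which the card stopped overflowing. The pci0:9:0 selector is specific to the submitter's machine. The sketch below estimates what such a value means in bus time and data per tenure. It assumes a 32-bit, 33 MHz PCI bus and ignores address phases and wait states, none of which is stated in this PR; the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: conventional 33 MHz PCI clock, 32-bit (4-byte) data path. */
#define PCI_CLOCK_HZ	33333333ULL
#define PCI_BUS_BYTES	4U

/* Bus time, in nanoseconds, covered by a latency timer of `clocks`. */
static uint64_t
lattimer_ns(uint8_t clocks)
{
	return ((uint64_t)clocks * 1000000000ULL / PCI_CLOCK_HZ);
}

/*
 * Idealized upper bound on bytes moved in one tenure, assuming one
 * data phase per clock. Real transfers move less.
 */
static uint32_t
lattimer_max_bytes(uint8_t clocks)
{
	return ((uint32_t)clocks * PCI_BUS_BYTES);
}
```

Under these assumptions a timer of 32 covers roughly 0.96 us and at most about 128 bytes per tenure once another bus master is requesting the bus, while 0x60 covers roughly 2.88 us and at most about 384 bytes.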
State-Changed-From-To: feedback->closed 
State-Changed-By: arved 
State-Changed-When: Wed Aug 25 21:35:53 GMT 2004 
State-Changed-Why:  
Okay, since this is not easily reproducible anymore, I will close the PR. 

The pciconf-workaround is now documented in the mail-archive, in case the  
driver modifications of the last year(s) didn't fix this problem and  
someone else is running into it. 

Thanks for your quick response and your patience.  

http://www.freebsd.org/cgi/query-pr.cgi?pr=32338 
>Unformatted:
