From nobody@FreeBSD.org  Wed Jul 19 07:49:59 2006
Return-Path: <nobody@FreeBSD.org>
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id D693B16A4DE
	for <freebsd-gnats-submit@FreeBSD.org>; Wed, 19 Jul 2006 07:49:59 +0000 (UTC)
	(envelope-from nobody@FreeBSD.org)
Received: from www.freebsd.org (www.freebsd.org [216.136.204.117])
	by mx1.FreeBSD.org (Postfix) with ESMTP id A3E0C43D53
	for <freebsd-gnats-submit@FreeBSD.org>; Wed, 19 Jul 2006 07:49:59 +0000 (GMT)
	(envelope-from nobody@FreeBSD.org)
Received: from www.freebsd.org (localhost [127.0.0.1])
	by www.freebsd.org (8.13.1/8.13.1) with ESMTP id k6J7nxwU023183
	for <freebsd-gnats-submit@FreeBSD.org>; Wed, 19 Jul 2006 07:49:59 GMT
	(envelope-from nobody@www.freebsd.org)
Received: (from nobody@localhost)
	by www.freebsd.org (8.13.1/8.13.1/Submit) id k6J7nxK5023181;
	Wed, 19 Jul 2006 07:49:59 GMT
	(envelope-from nobody)
Message-Id: <200607190749.k6J7nxK5023181@www.freebsd.org>
Date: Wed, 19 Jul 2006 07:49:59 GMT
From: Arthur Hartwig <arthur.hartwig@nokia.com>
To: freebsd-gnats-submit@FreeBSD.org
Subject: Suboptimal network polling
X-Send-Pr-Version: www-2.3

>Number:         100519
>Category:       kern
>Synopsis:       [netisr] suggestion to fix suboptimal network polling
>Confidential:   no
>Severity:       non-critical
>Priority:       medium
>Responsible:    freebsd-net
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Wed Jul 19 08:00:34 GMT 2006
>Closed-Date:    
>Last-Modified:  Mon Feb 26 11:56:26 GMT 2007
>Originator:     Arthur Hartwig
>Release:        6.0
>Organization:
Nokia
>Environment:
FreeBSD xxx.nokia.com 6.0-RELEASE FreeBSD 6.0-RELEASE #3: Wed Mar  1 10:46:02 EST 2006     hartwig@xxx.nokia.com:/usr/src/sys/i386/compile/oz-net-10  i386

>Description:
Network polling makes unnecessary calls to the scheduler. Each call
acquires sched_lock, which imposes a variable delay depending on
contention for this lock.

In netisr_pollmore() in kern/kern_poll.c there are two calls to
schednetisrbits(). schednetisrbits() is defined in net/netisr.h to set
some bits in netisr and call legacy_setsoftnet(). legacy_setsoftnet() in
net/netisr.c calls swi_sched(). swi_sched() in kern/kern_intr.c in turn
calls ithread_schedule() in the same file. ithread_schedule() acquires
and releases sched_lock.


>How-To-Repeat:

>Fix:
Since the netisr is running when netisr_pollmore() is executing, and
swi_net(), the main netisr despatcher, loops until netisr is zero, it is
sufficient in netisr_pollmore() to just set the bits in netisr and not
also call legacy_setsoftnet():

replace the two instances of:
    schednetisrbits(1 << NETISR_POLL | 1 << NETISR_POLLMORE);

in netisr_pollmore() by:
    atomic_set_rel_int(&netisr, (1 << NETISR_POLL | 1 << NETISR_POLLMORE));
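
The argument above can be illustrated with a small user-space model. This
is only a sketch, not kernel code: the sim_* names, the bit numbers and
the way residual_burst is decremented are assumptions made for the
illustration. It models a dispatcher that loops until the pending mask is
zero, and counts how often the scheduler path is entered when the
pollmore step uses a schednetisrbits()-style call versus setting the bits
only.

```c
#include <assert.h>

/* Bit numbers for this model only; the kernel defines its own values. */
enum { SIM_POLL = 0, SIM_POLLMORE = 1 };
#define SIM_BITS ((1u << SIM_POLL) | (1u << SIM_POLLMORE))

struct sim {
	unsigned int pending;	/* stands in for the global netisr mask */
	int residual_burst;	/* remaining poll passes wanted this tick */
	int sched_calls;	/* times the scheduler path was entered */
	int poll_runs;		/* times the poll handler ran */
	int use_sched;		/* 1: schednetisrbits() style, 0: bits only */
};

/* Like schednetisrbits(): set the bits and call into the scheduler. */
static void
sim_sched_bits(struct sim *s, unsigned int bits)
{
	s->pending |= bits;
	s->sched_calls++;
}

/* Like the proposed atomic_set_rel_int(): set the bits, nothing else. */
static void
sim_set_bits(struct sim *s, unsigned int bits)
{
	s->pending |= bits;
}

/* Like netisr_pollmore(): ask for another pass while work remains. */
static void
sim_pollmore(struct sim *s)
{
	if (s->residual_burst > 0) {
		s->residual_burst--;
		if (s->use_sched)
			sim_sched_bits(s, SIM_BITS);
		else
			sim_set_bits(s, SIM_BITS);
	}
}

/* Like swi_net(): keep dispatching until no bits are pending. */
static void
sim_swi_net(struct sim *s)
{
	while (s->pending != 0) {
		unsigned int bits = s->pending;

		s->pending = 0;
		if (bits & (1u << SIM_POLL))
			s->poll_runs++;
		if (bits & (1u << SIM_POLLMORE))
			sim_pollmore(s);
	}
}

/* One clock tick: the tick path schedules once, then the SWI drains. */
static struct sim
sim_run(int burst, int use_sched)
{
	struct sim s = { 0, burst, 0, 0, use_sched };

	sim_sched_bits(&s, SIM_BITS);	/* hardclock_device_poll() role */
	sim_swi_net(&s);
	assert(s.pending == 0);
	return (s);
}
```

In this model, sim_run(3, 1) and sim_run(3, 0) both run the poll handler
four times, but the first enters the scheduler path four times and the
second only once (the initial call from the clock tick).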


>Release-Note:
>Audit-Trail:

From: Gleb Smirnoff <glebius@FreeBSD.org>
To: Arthur Hartwig <arthur.hartwig@nokia.com>
Cc: freebsd-gnats-submit@FreeBSD.org
Subject: Re: kern/100519: Suboptimal network polling
Date: Fri, 11 Aug 2006 17:35:10 +0400

   Arthur,
 
 On Wed, Jul 19, 2006 at 07:49:59AM +0000, Arthur Hartwig wrote:
 A> >Fix:
 A> Since the netisr is running when netisr_pollmore() is executing and swi_net() the main netisr despatcher loops until netisr is zero, it is sufficient in netisr_pollmore() to just set the bits in netisr and not also call legacy_setsoftnet():
 A> 
 A> replace the two instances of:
 A>     schednetisrbits(1 << NETISR_POLL | 1 << NETISR_POLLMORE);
 A> 
 A> in netisr_pollmore() by:
 A>     atomic_set_rel_int(&netisr, (1 << NETISR_POLL | 1 << NETISR_POLLMORE));
 
 Hmm, interesting. Have you done any profiling?
 
 -- 
 Totus tuus, Glebius.
 GLEBIUS-RIPN GLEB-RIPE

From: Gleb Smirnoff <glebius@FreeBSD.org>
To: freebsd-gnats-submit@FreeBSD.org
Cc:  
Subject: Re: kern/100519: Suboptimal network polling
Date: Thu, 28 Sep 2006 21:37:43 +0400

   Attaching this to the PR so that it won't be lost in
 my mailbox.
 
 ----- Forwarded message from Arthur Hartwig <Arthur.Hartwig@nokia.com> -----
 
 >Hmm, interesting. Have you done any profiling?
 
 G'day Gleb,
    My work was then largely concerned with fast forwarding performance 
 between two em interfaces on FreeBSD 6.0 and in particular with trying 
 to get some performance improvement at high rates by using additional 
 CPUs. I did some timing using the tsc and found that the em driver in 
 polling mode was using roughly half the time spent in 
 ether_input() to the point of putting a frame on the output queue. So I 
 thought if I ran the polling code on one CPU and ether_input() on 
 another CPU I would get an observable improvement. This turned out to be 
 the case.
 
    At some stage I noticed that vmstat showed lots of interrupts even 
 though I was using polling mode.  I noticed v_intr was incremented in 
 swi_sched() and intr_execute_handlers(). Another system not using 
 polling mode didn't show the same high interrupt rate. After making the 
 described change, I found the interrupt rate reported by vmstat dropped 
 significantly. The change didn't result in a significant performance 
 improvement but clearly reduced the overhead in the polling thread, thus 
 making more CPU time available to the system.
 
 Arthur
 
 
 
    I found interrupt driven fast forwarding performed significantly 
 worse on a dual CPU system than on a single CPU system. Some profiling 
 showed a lot of contention for the routing table locks, sometimes with 
 significant delays between a thread attempting to acquire a lock and its 
 actual acquisition. Changing to polling mode seemed as if it would help 
 reduce lock contention and so improve performance. A number of tweaks 
 combined to get no-loss fast forwarding performance between two GigE 
 interfaces on a dual CPU system up to about what it was on a single CPU 
 interrupt driven system.
 
 Arthur
 
 ----- End forwarded message -----
Responsible-Changed-From-To: freebsd-bugs->bms 
Responsible-Changed-By: bms 
Responsible-Changed-When: Thu Sep 28 17:49:11 UTC 2006 
Responsible-Changed-Why:  
I'll take this for the sake of convenience... 

http://www.freebsd.org/cgi/query-pr.cgi?pr=100519 

From: Bruce M Simpson <bms@incunabulum.net>
To: freebsd-gnats-submit@FreeBSD.org
Cc:  
Subject: Re: kern/100519: [netisr] suggestion to fix suboptimal network polling
Date: Sat, 03 Feb 2007 04:29:37 +0000

 This is a multi-part message in MIME format.
 --------------030601000007040609010601
 Content-Type: text/plain; charset=ISO-8859-1; format=flowed
 Content-Transfer-Encoding: 7bit
 
 For convenience, I've attached a diff against bleeding edge HEAD.
 
 When I put fxp0 into polling mode with this patch, network operations 
 appear to wedge.
 
 anglepoise# sysctl debug.mpsafenet
 debug.mpsafenet: 1
 anglepoise# uname -a
 FreeBSD anglepoise.lon.incunabulum.net 7.0-CURRENT FreeBSD 7.0-CURRENT 
 #19: Sat Feb  3 04:20:32 GMT 2007     
 bms@anglepoise.lon.incunabulum.net:/usr/obj/usr/home/bms/head/src/sys/ANGLEPOISE7  
 amd64
 anglepoise# ifconfig fxp0
 fxp0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
         options=48<VLAN_MTU,POLLING>
         inet 192.168.123.17 netmask 0xffffff00 broadcast 192.168.123.255
         ether 00:90:27:59:40:2c
         media: Ethernet autoselect (10baseT/UTP)
         status: active
 anglepoise# ifconfig fxp0 -polling
 
  -> network operations resume
 
 
 --------------030601000007040609010601
 Content-Type: text/x-patch;
  name="schednetisr1.diff"
 Content-Transfer-Encoding: 7bit
 Content-Disposition: inline;
  filename="schednetisr1.diff"
 
 Index: sys/kern/kern_poll.c
 ===================================================================
 RCS file: /home/ncvs/src/sys/kern/kern_poll.c,v
 retrieving revision 1.28
 diff -u -p -r1.28 kern_poll.c
 --- sys/kern/kern_poll.c	6 Dec 2006 06:34:55 -0000	1.28
 +++ sys/kern/kern_poll.c	3 Feb 2007 04:12:25 -0000
 @@ -314,7 +314,13 @@ hardclock_device_poll(void)
  		if (phase != 0)
  			suspect++;
  		phase = 1;
 +#if 0
  		schednetisrbits(1 << NETISR_POLL | 1 << NETISR_POLLMORE);
 +#else
 +		/* optimize out unnecessary calls to the scheduler. */
 +		atomic_set_rel_int(&netisr,
 +		    (1 << NETISR_POLL | 1 << NETISR_POLLMORE));
 +#endif
  		phase = 2;
  	}
  	if (pending_polls++ > 0)
 @@ -371,7 +377,13 @@ netisr_pollmore()
  	mtx_lock(&poll_mtx);
  	phase = 5;
  	if (residual_burst > 0) {
 +#if 0
  		schednetisrbits(1 << NETISR_POLL | 1 << NETISR_POLLMORE);
 +#else
 +		/* optimize out unnecessary calls to the scheduler. */
 +		atomic_set_rel_int(&netisr,
 +		    (1 << NETISR_POLL | 1 << NETISR_POLLMORE));
 +#endif
  		mtx_unlock(&poll_mtx);
  		/* will run immediately on return, followed by netisrs */
  		return;
 
 --------------030601000007040609010601--

From: Bruce M Simpson <bms@incunabulum.net>
To: Luigi Rizzo <rizzo@icir.org>
Cc: freebsd-gnats-submit@FreeBSD.org
Subject: Re: kern/100519 suggested netisr performance improvement
Date: Mon, 05 Feb 2007 01:06:06 +0000

 This is a multi-part message in MIME format.
 --------------050603080006090002060202
 Content-Type: text/plain; charset=ISO-8859-1; format=flowed
 Content-Transfer-Encoding: 7bit
 
 I tried the attached patch. With SWI_DELAY, nothing happens and
 traffic stalls. Without it, it's business as usual.
 
 Of course, with !SWI_DELAY, this will force a call to
 intr_event_schedule_thread(), making no real difference.
 
 I should point out all my testing was done on a uniprocessor
 amd64 system with an fxp card, though debug.mpsafenet is 1.
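
One possible reading of the stall can be sketched in user space. The
model below assumes that a delayed swi_sched() only marks the handler as
needing to run, and that only a call without the delay flag hands the
thread to the scheduler. The sim_* names and the flag value are
hypothetical and exist only in this sketch; it is not the kernel's
implementation.

```c
#include <assert.h>

#define SIM_SWI_DELAY 0x2	/* flag value chosen for this model only */

struct sim_swi {
	int need;	/* handler has been marked as needing to run */
	int runnable;	/* thread has been handed to the scheduler */
	int runs;	/* handler executions */
};

/* Model of swi_sched(): always mark, wake only without the delay flag. */
static void
sim_swi_sched(struct sim_swi *t, int flags)
{
	t->need = 1;
	if (!(flags & SIM_SWI_DELAY))
		t->runnable = 1;
}

/* Model of the scheduler: only a runnable thread gets to run. */
static void
sim_scheduler_pass(struct sim_swi *t)
{
	if (!t->runnable)
		return;
	while (t->need) {
		t->need = 0;
		t->runs++;
	}
	t->runnable = 0;	/* nothing left to do, back to sleep */
}

/* Post one request per tick and report how many times the handler ran. */
static int
sim_ticks(int nticks, int flags)
{
	struct sim_swi t = { 0, 0, 0 };
	int i;

	assert(nticks >= 0);
	for (i = 0; i < nticks; i++) {
		sim_swi_sched(&t, flags);
		sim_scheduler_pass(&t);
	}
	return (t.runs);
}
```

Under these assumptions, sim_ticks(5, SIM_SWI_DELAY) never runs the
handler, while sim_ticks(5, 0) runs it once per tick. That would match
the observation that traffic stalls with SWI_DELAY and behaves as before
without it.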
 
 
 --------------050603080006090002060202
 Content-Type: text/plain;
  name="netisr.diff"
 Content-Transfer-Encoding: 7bit
 Content-Disposition: inline;
  filename="netisr.diff"
 
 Index: src/sys/kern/kern_poll.c
 ===================================================================
 RCS file: /home/ncvs/src/sys/kern/kern_poll.c,v
 retrieving revision 1.28
 diff -u -p -r1.28 kern_poll.c
 --- src/sys/kern/kern_poll.c	6 Dec 2006 06:34:55 -0000	1.28
 +++ src/sys/kern/kern_poll.c	5 Feb 2007 00:50:30 -0000
 @@ -32,6 +32,8 @@ __FBSDID("$FreeBSD: src/sys/kern/kern_po
  
  #include <sys/param.h>
  #include <sys/systm.h>
 +#include <sys/bus.h>
 +#include <sys/interrupt.h>
  #include <sys/kernel.h>
  #include <sys/socket.h>			/* needed by net/if.h		*/
  #include <sys/sockio.h>
 @@ -314,7 +316,10 @@ hardclock_device_poll(void)
  		if (phase != 0)
  			suspect++;
  		phase = 1;
 -		schednetisrbits(1 << NETISR_POLL | 1 << NETISR_POLLMORE);
 +		/* optimize out unnecessary calls to the scheduler. */
 +		atomic_set_rel_int(&netisr,
 +		    (1 << NETISR_POLL | 1 << NETISR_POLLMORE));
 +		swi_sched(net_ih, SWI_DELAY);
  		phase = 2;
  	}
  	if (pending_polls++ > 0)
 @@ -371,7 +376,10 @@ netisr_pollmore()
  	mtx_lock(&poll_mtx);
  	phase = 5;
  	if (residual_burst > 0) {
 -		schednetisrbits(1 << NETISR_POLL | 1 << NETISR_POLLMORE);
 +		/* optimize out unnecessary calls to the scheduler. */
 +		atomic_set_rel_int(&netisr,
 +		    (1 << NETISR_POLL | 1 << NETISR_POLLMORE));
 +		swi_sched(net_ih, SWI_DELAY);
  		mtx_unlock(&poll_mtx);
  		/* will run immediately on return, followed by netisrs */
  		return;
 Index: src/sys/net/netisr.h
 ===================================================================
 RCS file: /home/ncvs/src/sys/net/netisr.h,v
 retrieving revision 1.33
 diff -u -p -r1.33 netisr.h
 --- src/sys/net/netisr.h	7 Jan 2005 01:45:35 -0000	1.33
 +++ src/sys/net/netisr.h	5 Feb 2007 00:50:30 -0000
 @@ -69,6 +69,7 @@
  #ifdef _KERNEL
  
  void legacy_setsoftnet(void);
 +extern void *net_ih;
  
  extern volatile unsigned int	netisr;	/* scheduling bits for network */
  #define	schednetisr(anisr) do {						\
 Index: src/sys/net/netisr.c
 ===================================================================
 RCS file: /home/ncvs/src/sys/net/netisr.c,v
 retrieving revision 1.18
 diff -u -p -r1.18 netisr.c
 --- src/sys/net/netisr.c	28 Nov 2006 11:19:36 -0000	1.18
 +++ src/sys/net/netisr.c	5 Feb 2007 00:50:30 -0000
 @@ -82,7 +82,7 @@ struct netisr {
  	int		ni_flags;
  } netisrs[32];
  
 -static void *net_ih;
 +void *net_ih;
  
  /*
   * Not all network code is currently capable of running MPSAFE; however,
 
 --------------050603080006090002060202--

From: Luigi Rizzo <rizzo@icir.org>
To: Bruce M Simpson <bms@incunabulum.net>
Cc: freebsd-gnats-submit@FreeBSD.org
Subject: Re: kern/100519 suggested netisr performance improvement
Date: Sun, 4 Feb 2007 23:49:52 -0800

 On Mon, Feb 05, 2007 at 01:06:06AM +0000, Bruce M Simpson wrote:
 > I tried the attached patch. With SWI_DELAY, nothing happens and
 
 [for the benefit of the archives]
 
 hmmm... so either the analysis in the PR is not correct and
 the call is done while the isr is not under service, or
 there is something that is done in swi_sched which we are
 not taking into account properly.
 
 	cheers
 	luigi
 
 > traffic stalls. Without it, it's business as usual.
 > 
 > Of course, with !SWI_DELAY. this will force a call to
 > intr_event_schedule_thread(), making no real difference.
 > 
 > I should point out all my testing was done on a uniprocessor
 > amd64 system with an fxp card, though debug.mpsafenet is 1.
 > 
 
 > Index: src/sys/kern/kern_poll.c
 > ===================================================================
 > RCS file: /home/ncvs/src/sys/kern/kern_poll.c,v
 > retrieving revision 1.28
 > diff -u -p -r1.28 kern_poll.c
 > --- src/sys/kern/kern_poll.c	6 Dec 2006 06:34:55 -0000	1.28
 > +++ src/sys/kern/kern_poll.c	5 Feb 2007 00:50:30 -0000
 > @@ -32,6 +32,8 @@ __FBSDID("$FreeBSD: src/sys/kern/kern_po
 >  
 >  #include <sys/param.h>
 >  #include <sys/systm.h>
 > +#include <sys/bus.h>
 > +#include <sys/interrupt.h>
 >  #include <sys/kernel.h>
 >  #include <sys/socket.h>			/* needed by net/if.h		*/
 >  #include <sys/sockio.h>
 > @@ -314,7 +316,10 @@ hardclock_device_poll(void)
 >  		if (phase != 0)
 >  			suspect++;
 >  		phase = 1;
 > -		schednetisrbits(1 << NETISR_POLL | 1 << NETISR_POLLMORE);
 > +		/* optimize out unnecessary calls to the scheduler. */
 > +		atomic_set_rel_int(&netisr,
 > +		    (1 << NETISR_POLL | 1 << NETISR_POLLMORE));
 > +		swi_sched(net_ih, SWI_DELAY);
 >  		phase = 2;
 >  	}
 >  	if (pending_polls++ > 0)
 > @@ -371,7 +376,10 @@ netisr_pollmore()
 >  	mtx_lock(&poll_mtx);
 >  	phase = 5;
 >  	if (residual_burst > 0) {
 > -		schednetisrbits(1 << NETISR_POLL | 1 << NETISR_POLLMORE);
 > +		/* optimize out unnecessary calls to the scheduler. */
 > +		atomic_set_rel_int(&netisr,
 > +		    (1 << NETISR_POLL | 1 << NETISR_POLLMORE));
 > +		swi_sched(net_ih, SWI_DELAY);
 >  		mtx_unlock(&poll_mtx);
 >  		/* will run immediately on return, followed by netisrs */
 >  		return;
 > Index: src/sys/net/netisr.h
 > ===================================================================
 > RCS file: /home/ncvs/src/sys/net/netisr.h,v
 > retrieving revision 1.33
 > diff -u -p -r1.33 netisr.h
 > --- src/sys/net/netisr.h	7 Jan 2005 01:45:35 -0000	1.33
 > +++ src/sys/net/netisr.h	5 Feb 2007 00:50:30 -0000
 > @@ -69,6 +69,7 @@
 >  #ifdef _KERNEL
 >  
 >  void legacy_setsoftnet(void);
 > +extern void *net_ih;
 >  
 >  extern volatile unsigned int	netisr;	/* scheduling bits for network */
 >  #define	schednetisr(anisr) do {						\
 > Index: src/sys/net/netisr.c
 > ===================================================================
 > RCS file: /home/ncvs/src/sys/net/netisr.c,v
 > retrieving revision 1.18
 > diff -u -p -r1.18 netisr.c
 > --- src/sys/net/netisr.c	28 Nov 2006 11:19:36 -0000	1.18
 > +++ src/sys/net/netisr.c	5 Feb 2007 00:50:30 -0000
 > @@ -82,7 +82,7 @@ struct netisr {
 >  	int		ni_flags;
 >  } netisrs[32];
 >  
 > -static void *net_ih;
 > +void *net_ih;
 >  
 >  /*
 >   * Not all network code is currently capable of running MPSAFE; however,
 
State-Changed-From-To: open->feedback 
State-Changed-By: bms 
State-Changed-When: Mon Feb 5 11:33:28 UTC 2007 
State-Changed-Why:  
Think we're back to the drawing board here, at least for the current 
code base... 


From: Arthur Hartwig <Arthur.Hartwig@nokia.com>
To: bug-followup@FreeBSD.org, arthur.hartwig@nokia.com
Cc:  
Subject: Re: kern/100519: [netisr] suggestion to fix suboptimal network polling
Date: Tue, 06 Feb 2007 07:31:32 +1000

 This is a multi-part message in MIME format.
 --------------040104080801080900000409
 Content-Type: text/plain; charset=ISO-8859-1; format=flowed
 Content-Transfer-Encoding: 7bit
 
 I haven't yet looked at the 7.0 code, but in the 6.0 code the call to 
 schednetisrbits() in hardclock_device_poll() was necessary to get the 
 netisr thread running.  But in netisr_pollmore() the call to the 
 scheduler was unnecessary because the netisr was already running. The 
 patch by bms did rather more than I suggested (/replace the two instances . . 
 . in netisr_pollmore()/) in that it removed the call to 
 schednetisrbits() in hardclock_device_poll().
 
 I'm about to go to work. I'll try to take a look at the 7.0 code later today.
 
 Arthur
 
 
 
 --------------040104080801080900000409--

From: Arthur Hartwig <Arthur.Hartwig@nokia.com>
To: bug-followup@FreeBSD.org, arthur.hartwig@nokia.com
Cc:  
Subject: Re: kern/100519: [netisr] suggestion to fix suboptimal network polling
Date: Tue, 06 Feb 2007 11:35:17 +1000

 I've looked at the HEAD version of kern_poll.c and it seems similar 
 enough to the 6.0 version.
 The call to schednetisrbits() in hardclock_device_poll() needs to remain 
 to ensure the netisr is scheduled to run. However, in netisr_pollmore() 
 the two calls to schednetisrbits() can be replaced as I originally 
 described.
 
 My reading of the diffs is that you replaced the call to 
 schednetisrbits() in hardclock_device_poll() (which I didn't suggest) 
 and replaced only one of the two calls to schednetisrbits() in 
 netisr_pollmore() (which is only part of what I suggested.)
 
 Arthur
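
The distinction between the two call sites can be sketched as a
user-space comparison. This is a simplified model under stated
assumptions, not the kernel code: the sim_* names, the single pending
bit and the sleeping-thread flag are invented for the illustration. Each
call site is configured either to go through the scheduler or to set
bits only, and the model reports how many poll passes ran and how many
scheduler calls were made.

```c
#include <assert.h>

enum site_mode { SITE_SCHED, SITE_BITS_ONLY };

struct sim_result {
	int handled;		/* poll passes actually executed */
	int sched_calls;	/* calls that reached the scheduler */
};

struct sim_state {
	unsigned int pending;	/* stands in for the netisr mask */
	int awake;		/* netisr thread is currently running */
	struct sim_result r;
};

/* Post work from a call site, optionally going through the scheduler. */
static void
sim_post(struct sim_state *st, enum site_mode mode)
{
	st->pending |= 1u;
	if (mode == SITE_SCHED) {
		st->r.sched_calls++;
		st->awake = 1;
	}
}

static struct sim_result
sim_compare(int nticks, int burst, enum site_mode tick_site,
    enum site_mode more_site)
{
	struct sim_state st = { 0, 0, { 0, 0 } };
	int i, residual;

	assert(nticks >= 0 && burst >= 0);
	for (i = 0; i < nticks; i++) {
		residual = burst;
		sim_post(&st, tick_site);	/* hardclock_device_poll() role */
		if (!st.awake)
			continue;	/* thread asleep: bits sit unserviced */
		while (st.pending != 0) {	/* swi_net() role */
			st.pending = 0;
			st.r.handled++;
			if (residual > 0) {	/* netisr_pollmore() role */
				residual--;
				sim_post(&st, more_site);
			}
		}
		st.awake = 0;
	}
	return (st.r);
}
```

With two ticks and a burst of three, the model gives 8 passes and 8
scheduler calls when both sites schedule, 8 passes and 2 scheduler calls
when only the tick site schedules, and no passes at all when neither
site schedules.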
 
State-Changed-From-To: feedback->open 
State-Changed-By: bms 
State-Changed-When: Sun Feb 25 16:18:13 UTC 2007 
State-Changed-Why:  
Back to the net pool 


Responsible-Changed-From-To: bms->net 
Responsible-Changed-By: bms 
Responsible-Changed-When: Sun Feb 25 16:18:13 UTC 2007 
Responsible-Changed-Why:  
Back to the net pool 

Responsible-Changed-From-To: net->freebsd-net 
Responsible-Changed-By: rwatson 
Responsible-Changed-When: Mon Feb 26 11:55:47 UTC 2007 
Responsible-Changed-Why:  
Assign to freebsd-net instead of net, since that's the more usual name 
for net@ assignments. 


>Unformatted:
