From nobody@FreeBSD.org  Fri Aug 17 20:45:06 2012
Return-Path: <nobody@FreeBSD.org>
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 2ED25106566C
	for <freebsd-gnats-submit@FreeBSD.org>; Fri, 17 Aug 2012 20:45:06 +0000 (UTC)
	(envelope-from nobody@FreeBSD.org)
Received: from red.freebsd.org (red.freebsd.org [IPv6:2001:4f8:fff6::22])
	by mx1.freebsd.org (Postfix) with ESMTP id 05D438FC18
	for <freebsd-gnats-submit@FreeBSD.org>; Fri, 17 Aug 2012 20:45:06 +0000 (UTC)
Received: from red.freebsd.org (localhost [127.0.0.1])
	by red.freebsd.org (8.14.4/8.14.4) with ESMTP id q7HKj5lh094140
	for <freebsd-gnats-submit@FreeBSD.org>; Fri, 17 Aug 2012 20:45:05 GMT
	(envelope-from nobody@red.freebsd.org)
Received: (from nobody@localhost)
	by red.freebsd.org (8.14.4/8.14.4/Submit) id q7HKj5Da094133;
	Fri, 17 Aug 2012 20:45:05 GMT
	(envelope-from nobody)
Message-Id: <201208172045.q7HKj5Da094133@red.freebsd.org>
Date: Fri, 17 Aug 2012 20:45:05 GMT
From: Garrett Cooper <yaneurabeya@gmail.com>
To: freebsd-gnats-submit@FreeBSD.org
Subject: [cxgb] Driver must be loaded after boot due to timing issues checking for kern.ipc.nmb* values set via /boot/loader.conf
X-Send-Pr-Version: www-3.1
X-GNATS-Notify:

>Number:         170713
>Category:       kern
>Synopsis:       [cxgb] Driver must be loaded after boot due to timing issues checking for kern.ipc.nmb* values set via /boot/loader.conf
>Confidential:   no
>Severity:       non-critical
>Priority:       low
>Responsible:    np
>State:          closed
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Fri Aug 17 20:50:04 UTC 2012
>Closed-Date:    Thu Oct 25 19:00:07 UTC 2012
>Last-Modified:  Sun Feb 03 22:30:51 UTC 2013
>Originator:     Garrett Cooper
>Release:        7.3-RELEASE/7.4-RELEASE
>Organization:
EMC Isilon
>Environment:
FreeBSD  7.4-RELEASE-p10 FreeBSD 7.4-RELEASE-p10 #0: Fri Aug 17 07:15:01 UTC 2012     root@:/usr/obj/mnt/freebsd/releng/7.4/sys/ISI-GENERIC  amd64
>Description:
There is a timing issue: the following check in cxgb_sge.c always fails on 7.3-RELEASE/7.4-RELEASE when the driver is loaded at boot, because the values being checked still hold their defaults rather than the values set in /boot/loader.conf (in particular, the jumbo_q_size check fails):

    #if __FreeBSD_version >= 700111
            if (cxgb_use_16k_clusters)
                    jumbo_q_size = min(nmbjumbo16/(3*nqsets), JUMBO_Q_SIZE);
            else
                    jumbo_q_size = min(nmbjumbo9/(3*nqsets), JUMBO_Q_SIZE);
    #else
            jumbo_q_size = min(nmbjumbop/(3*nqsets), JUMBO_Q_SIZE);
    #endif
            while (!powerof2(jumbo_q_size))
                    jumbo_q_size--;

            if (fl_q_size < (FL_Q_SIZE / 4) || jumbo_q_size < (JUMBO_Q_SIZE / 2))
                    device_printf(adap->dev,
                        "Insufficient clusters and/or jumbo buffers.\n");

Example (7.3-RELEASE with the patch noted below):

cxgbc0: <Chelsio T320, 2 ports> mem 0xfaf7e000-0xfaf7efff,0xfaf7f000-0xfaf7ffff irq 24 at device 0.0 on pci8
cxgbc0: Insufficient clusters and/or jumbo buffers (4096 < 1024 or 256 < 512).
cxgbc0: using MSI-X interrupts (9 vectors)
cxgb0: <Port 0 10GBASE-R> on cxgbc0
cxgb0: Ethernet address: 00:07:43:07:41:1c
cxgb1: <Port 1 10GBASE-R> on cxgbc0
cxgb1: Ethernet address: 00:07:43:07:41:1d
cxgbc0: Firmware Version 7.8.0

Example (7.4-RELEASE):

cxgbc0: <Chelsio T320, 2 ports> mem 0xfaf7e000-0xfaf7efff,0xfaf7f000-0xfaf7ffff irq 24 at device 0.0 on pci8
cxgbc0: Insufficient clusters and/or jumbo buffers.
cxgbc0: using MSI-X interrupts (9 vectors)
cxgb0: <Port 0 10GBASE-R> on cxgbc0
cxgb0: Ethernet address: 00:07:43:07:41:1c
cxgb1: <Port 1 10GBASE-R> on cxgbc0
cxgb1: Ethernet address: 00:07:43:07:41:1d
cxgbc0: Firmware Version 7.8.0

I've worked around this issue by loading the driver later in the boot process,
but this chicken-and-egg problem should be resolved properly, with the necessary calls being made to kern_ipc.c before sge is attached to the kernel.

Please note that this issue does not appear on 9.0-RELEASE, at least in the limited testing I've done (the race might still be there but ameliorated). I did not verify whether the issue exists on 8.x.

More details:

# cat /boot/loader.conf 
if_cxgb_load="YES"
if_em_load="YES"

kern.ipc.nmbclusters="250000"
kern.ipc.nmbjumbo9=262144
kern.ipc.nmbjumbo16=262144
kern.ipc.nmbclusters=262144
kern.ipc.nmbjumbop=262144
kern.ipc.maxsockbuf=2097152
>How-To-Repeat:
1. Set the kern.ipc values in /boot/loader.conf as shown above.
2. Load if_cxgb at boot.
3. Boot the kernel.
>Fix:
Something needs to be backported from 9.x to 7.x to fix this chicken-and-egg problem, or the sequencing needs to be improved so that the mbuf tunables are read before the driver is probed and attached.

>Release-Note:
>Audit-Trail:
Responsible-Changed-From-To: freebsd-bugs->freebsd-net 
Responsible-Changed-By: linimon 
Responsible-Changed-When: Sat Aug 18 20:40:36 UTC 2012 
Responsible-Changed-Why:  
Over to maintainer(s). 

http://www.freebsd.org/cgi/query-pr.cgi?pr=170713 

From: Navdeep Parhar <np@FreeBSD.org>
To: bug-followup@FreeBSD.org, yanegomi@gmail.com
Cc:  
Subject: Re: kern/170713: [cxgb] Driver must be loaded after boot due to timing
 issues checking for kern.ipc.nmb* values set via /boot/loader.conf
Date: Wed, 22 Aug 2012 17:56:34 -0700

 First, note that only kern.ipc.nmbclusters is a valid tunable.  The rest 
 of the nmbXXX settings in your loader.conf have no effect.  There are 
 sysctls but no tunables for the rest.
 
 Take a look at tunable_mbinit in kern_mbuf.c -- on recent FreeBSD 
 versions it starts with nmbclusters and sizes others based on this.  You 
 can set nmbclusters really high and influence the values of the other 
 parameters.
 
 In my opinion we should have a TUNABLE_INT_FETCH for all of the nmbXXX 
 and autocalculate the ones that are not set, just like what we do for 
 nmbclusters today.
 
 static void
 tunable_mbinit(void *dummy)
 {
 	TUNABLE_INT_FETCH("kern.ipc.nmbclusters", &nmbclusters);
 
 	/* This has to be done before VM init. */
 	if (nmbclusters == 0)
 		nmbclusters = 1024 + maxusers * 64;
 	nmbjumbop = nmbclusters / 2;
 	nmbjumbo9 = nmbjumbop / 2;
 	nmbjumbo16 = nmbjumbo9 / 2;
 }
 
 
 Compare this to the tunable_mbinit in 7 and you can see why the 
 nmbclusters tunable does not affect the others -- it is updated after 
 the other values have already been calculated.
 
 static void
 tunable_mbinit(void *dummy)
 {
 
 	/* This has to be done before VM init. */
 	nmbclusters = 1024 + maxusers * 64;
 	nmbjumbop = nmbclusters / 2;
 	nmbjumbo9 = nmbjumbop / 2;
 	nmbjumbo16 = nmbjumbo9 / 2;
 	TUNABLE_INT_FETCH("kern.ipc.nmbclusters", &nmbclusters);
 }
 
 Regards,
 Navdeep

From: Garrett Cooper <yaneurabeya@gmail.com>
To: Navdeep Parhar <np@freebsd.org>
Cc: bug-followup@freebsd.org
Subject: Re: kern/170713: [cxgb] Driver must be loaded after boot due to
 timing issues checking for kern.ipc.nmb* values set via /boot/loader.conf
Date: Wed, 22 Aug 2012 22:48:23 -0700

 On Wed, Aug 22, 2012 at 5:56 PM, Navdeep Parhar <np@freebsd.org> wrote:
 > First, note that only kern.ipc.nmbclusters is a valid tunable.  The rest of
 > the nmbXXX settings in your loader.conf have no effect.  There are sysctls
 > but no tunables for the rest.
 
 Yes. I've been tweaking things in both places, and I agree that the
 nmb* values other than nmbclusters can only be set via sysctl.
 
 > Take a look at tunable_mbinit in kern_mbuf.c -- on recent FreeBSD versions
 > it starts with nmbclusters and sizes others based on this.  You can set
 > nmbclusters really high and influence the values of the other parameters.
 >
 > In my opinion we should have a TUNABLE_INT_FETCH for all of the nmbXXX and
 > autocalculate the ones that are not set, just like what we do for
 > nmbclusters today.
 >
 > static void
 > tunable_mbinit(void *dummy)
 > {
 >         TUNABLE_INT_FETCH("kern.ipc.nmbclusters", &nmbclusters);
 >
 >         /* This has to be done before VM init. */
 >         if (nmbclusters == 0)
 >                 nmbclusters = 1024 + maxusers * 64;
 >         nmbjumbop = nmbclusters / 2;
 >         nmbjumbo9 = nmbjumbop / 2;
 >         nmbjumbo16 = nmbjumbo9 / 2;
 > }
 >
 >
 > Compare this to the tunable_mbinit in 7 and you can see why the nmbclusters
 > tunable does not affect the others -- it is updated after the other values
 > have already been calculated.
 > static void
 > tunable_mbinit(void *dummy)
 > {
 >
 >         /* This has to be done before VM init. */
 >         nmbclusters = 1024 + maxusers * 64;
 >         nmbjumbop = nmbclusters / 2;
 >         nmbjumbo9 = nmbjumbop / 2;
 >         nmbjumbo16 = nmbjumbo9 / 2;
 >         TUNABLE_INT_FETCH("kern.ipc.nmbclusters", &nmbclusters);
 > }
 
 Funny. I'll do some poking around in our source base and see whether
 this has been backported.
 
 Thanks,
 -Garrett
Responsible-Changed-From-To: freebsd-net->np 
Responsible-Changed-By: linimon 
Responsible-Changed-When: Fri Oct 5 17:23:14 UTC 2012 
Responsible-Changed-Why:  
by request. 

State-Changed-From-To: open->closed 
State-Changed-By: np 
State-Changed-When: Thu Oct 25 18:59:11 UTC 2012 
State-Changed-Why:  
Fixed in r239624, MFC'd to stable/9 in r241468. 

>Unformatted:
