From nobody@FreeBSD.org  Thu May 14 21:00:12 2009
Return-Path: <nobody@FreeBSD.org>
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id B16EC10656DC
	for <freebsd-gnats-submit@FreeBSD.org>; Thu, 14 May 2009 21:00:12 +0000 (UTC)
	(envelope-from nobody@FreeBSD.org)
Received: from www.freebsd.org (www.freebsd.org [IPv6:2001:4f8:fff6::21])
	by mx1.freebsd.org (Postfix) with ESMTP id 5A6338FC18
	for <freebsd-gnats-submit@FreeBSD.org>; Thu, 14 May 2009 21:00:12 +0000 (UTC)
	(envelope-from nobody@FreeBSD.org)
Received: from www.freebsd.org (localhost [127.0.0.1])
	by www.freebsd.org (8.14.3/8.14.3) with ESMTP id n4EL0CSS057185
	for <freebsd-gnats-submit@FreeBSD.org>; Thu, 14 May 2009 21:00:12 GMT
	(envelope-from nobody@www.freebsd.org)
Received: (from nobody@localhost)
	by www.freebsd.org (8.14.3/8.14.3/Submit) id n4EL0CiJ057184;
	Thu, 14 May 2009 21:00:12 GMT
	(envelope-from nobody)
Message-Id: <200905142100.n4EL0CiJ057184@www.freebsd.org>
Date: Thu, 14 May 2009 21:00:12 GMT
From: Alexander Sack <pisymbol@gmail.com>
To: freebsd-gnats-submit@FreeBSD.org
Subject: bge(4) panics on shutdown under heavy traffic load
X-Send-Pr-Version: www-3.1
X-GNATS-Notify: delphij@FreeBSD.org

>Number:         134548
>Category:       kern
>Synopsis:       [bge] bge panics on shutdown under heavy traffic load
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    delphij
>State:          closed
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Thu May 14 21:10:01 UTC 2009
>Closed-Date:    Wed May 20 21:18:32 UTC 2009
>Last-Modified:  Wed May 20 21:20:01 UTC 2009
>Originator:     Alexander Sack
>Release:        CURRENT (8.x) and 7.1-RELEASE-amd64
>Organization:
Niksun
>Environment:
>Description:
While shutting down an interface, either via ioctls or ifconfig bgeX down etc., the bge driver panics in bge_rxeof() with a kernel page fault (I believe it was trying to access an mbuf).

The problem is a race between bge_stop() and bge_rxeof() for the softc lock.  What is happening is the following:

- bge_intr()
- bge_rxeof()
- process rings in while loop
- bge_stop() is called while bge_rxeof() is in the middle of processing BDs
- bge_rxeof() releases the softc lock with BGE_UNLOCK() before calling the input routine
- bge_stop() is let through, stops the hardware, and clears the IFF_DRV_RUNNING flag on the ifp
- bge_rxeof() continues to process the RX rings (BDs) and panics because the memory maps have been unloaded and the resources released
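
The sequence above can be sketched in user space.  This is only a toy model under loud assumptions: struct toy_softc, toy_init(), toy_stop() and toy_rxeof_unguarded() are hypothetical names that do not exist in if_bge.c, the ring has 8 slots, and the "other thread" is simulated by calling the stop routine at the point where the real loop has dropped the lock.  Instead of faulting, the model counts how many descriptors are visited after their buffers were released.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NDESC 8	/* hypothetical ring size, far smaller than a real RX ring */

/* Hypothetical stand-in for the driver state; not the real struct bge_softc. */
struct toy_softc {
	bool	running;	/* models IFF_DRV_RUNNING */
	int	store[NDESC];	/* backing storage for the fake buffers */
	int	*buf[NDESC];	/* NULL once "unloaded" by toy_stop() */
	int	cons, prod;	/* consumer and producer indices */
	int	stop_after;	/* packet count at which the stop sneaks in */
	int	stale;		/* descriptors visited after buffer release */
};

/* Queue npkts packets and arrange for a stop after stop_after of them. */
static void
toy_init(struct toy_softc *sc, int npkts, int stop_after)
{
	assert(npkts >= 0 && npkts < NDESC);
	for (int i = 0; i < NDESC; i++)
		sc->buf[i] = &sc->store[i];
	sc->running = true;
	sc->cons = 0;
	sc->prod = npkts;
	sc->stop_after = stop_after;
	sc->stale = 0;
}

/* Models bge_stop(): release the buffers and clear the running flag. */
static void
toy_stop(struct toy_softc *sc)
{
	for (int i = 0; i < NDESC; i++)
		sc->buf[i] = NULL;
	sc->running = false;
}

/* Models the unpatched loop; returns the number of descriptors visited. */
static int
toy_rxeof_unguarded(struct toy_softc *sc)
{
	int seen = 0;

	while (sc->cons != sc->prod) {
		if (sc->buf[sc->cons] == NULL)
			sc->stale++;	/* the real driver would fault here */
		sc->cons = (sc->cons + 1) % NDESC;
		seen++;
		/*
		 * The lock is dropped around the input routine; this is
		 * the window in which another thread can run the stop path.
		 */
		if (seen == sc->stop_after)
			toy_stop(sc);
	}
	return (seen);
}
```

With five packets queued and the stop arriving after the second one, the unguarded loop still visits all five descriptors, and the last three of those visits touch released buffers.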



>How-To-Repeat:
Connect two BGE ports on any amd64 system running CURRENT or 7.1-RELEASE+ and push large amounts of traffic through them, then shut an interface down as described above.  I was sending GIGE traffic through two ports at 100% utilization (it was actually SmartBit traffic).
>Fix:
--- if_bge.c.CURRENT	2009-05-14 14:39:39.000000000 -0400
+++ if_bge.c	2009-05-14 16:57:02.000000000 -0400
@@ -3073,8 +3073,9 @@
 		bus_dmamap_sync(sc->bge_cdata.bge_rx_jumbo_ring_tag,
 		    sc->bge_cdata.bge_rx_jumbo_ring_map, BUS_DMASYNC_POSTREAD);
 
-	while(sc->bge_rx_saved_considx !=
-	    sc->bge_ldata.bge_status_block->bge_idx[0].bge_rx_prod_idx) {
+	while (sc->bge_rx_saved_considx !=
+	    sc->bge_ldata.bge_status_block->bge_idx[0].bge_rx_prod_idx && 
+		(ifp->if_drv_flags & IFF_DRV_RUNNING)) {
 		struct bge_rx_bd	*cur_rx;
 		uint32_t		rxidx;
 		struct mbuf		*m = NULL;


The patch above follows the same approach as the if_em logic: before continuing to process the RX ring, we check that the driver is still running.  This prevents all panics at the cost of one extra check on every loop iteration.  My testing has not yet shown a performance penalty significant enough to cause drops, but I realize this execution path is very sensitive.
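
The effect of putting the running check in the loop condition can be shown with a small user-space model.  This is a sketch under assumptions, not driver code: struct toy_ring and toy_rxeof_loop_guard() are hypothetical names, the ring has 8 slots, and the concurrent stop is simulated by clearing the flag at the point where the real loop has dropped the lock.

```c
#include <assert.h>
#include <stdbool.h>

#define NDESC 8	/* hypothetical ring size */

/* Hypothetical stand-in for the driver state; not the real struct bge_softc. */
struct toy_ring {
	bool	running;	/* models IFF_DRV_RUNNING */
	int	cons, prod;	/* consumer and producer indices */
	int	stop_after;	/* packet count at which a stop arrives; 0 = never */
};

/* RX loop with the flag tested in the loop condition; returns packets seen. */
static int
toy_rxeof_loop_guard(struct toy_ring *r)
{
	int seen = 0;

	assert(r->prod >= 0 && r->prod < NDESC);
	while (r->cons != r->prod && r->running) {
		r->cons = (r->cons + 1) % NDESC;
		seen++;
		/*
		 * Lock dropped for the input routine, then re-taken.  A
		 * stop that ran in between has cleared the running flag.
		 */
		if (seen == r->stop_after)
			r->running = false;
	}
	return (seen);
}
```

With five packets queued and a stop arriving after the second, the loop exits after two packets and leaves the remaining descriptors untouched.  The price is one extra flag test per iteration, and any code placed after the loop still runs once the loop has exited.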

>Release-Note:
>Audit-Trail:

From: Xin LI <delphij@delphij.net>
To: bug-followup@FreeBSD.org, pisymbol@gmail.com
Cc:  
Subject: Re: kern/134548: bge(4) panics on shutdown under heavy traffic load
Date: Thu, 14 May 2009 14:40:59 -0700

 This is a multi-part message in MIME format.
 --------------010504020606080505020502
 Content-Type: text/plain; charset=ISO-8859-1
 Content-Transfer-Encoding: 7bit
 
 My version of the patch:
 
  - Check IFF_DRV_RUNNING right after the softc lock is re-obtained.
  - Avoid txeof calls after rxeof calls if IFF_DRV_RUNNING has been cleared.
 
 --------------010504020606080505020502
 Content-Type: text/plain;
  name="if_bge.c.diff"
 Content-Transfer-Encoding: 7bit
 Content-Disposition: inline;
  filename="if_bge.c.diff"
 
 Index: if_bge.c
 ===================================================================
 --- if_bge.c	(revision 191995)
 +++ if_bge.c	(working copy)
 @@ -3073,7 +3073,7 @@ bge_rxeof(struct bge_softc *sc)
  		bus_dmamap_sync(sc->bge_cdata.bge_rx_jumbo_ring_tag,
  		    sc->bge_cdata.bge_rx_jumbo_ring_map, BUS_DMASYNC_POSTREAD);
  
 -	while(sc->bge_rx_saved_considx !=
 +	while (sc->bge_rx_saved_considx !=
  	    sc->bge_ldata.bge_status_block->bge_idx[0].bge_rx_prod_idx) {
  		struct bge_rx_bd	*cur_rx;
  		uint32_t		rxidx;
 @@ -3193,6 +3193,9 @@ bge_rxeof(struct bge_softc *sc)
  		BGE_UNLOCK(sc);
  		(*ifp->if_input)(ifp, m);
  		BGE_LOCK(sc);
 +
 +		if (!(ifp->if_drv_flags & IFF_DRV_RUNNING))
 +			return;
  	}
  
  	if (stdcnt > 0)
 @@ -3301,6 +3304,10 @@ bge_poll(struct ifnet *ifp, enum poll_cmd cmd, int
  
  	sc->rxcycles = count;
  	bge_rxeof(sc);
 +	if (!(ifp->if_drv_flags & IFF_DRV_RUNNING)) {
 +		BGE_UNLOCK(sc);
 +		return;
 +	}
  	bge_txeof(sc);
  	if (!IFQ_DRV_IS_EMPTY(&ifp->if_snd))
  		bge_start_locked(ifp);
 @@ -3370,7 +3377,9 @@ bge_intr(void *xsc)
  	if (ifp->if_drv_flags & IFF_DRV_RUNNING) {
  		/* Check RX return ring producer/consumer. */
  		bge_rxeof(sc);
 +	}
  
 +	if (ifp->if_drv_flags & IFF_DRV_RUNNING) {
  		/* Check TX ring producer/consumer. */
  		bge_txeof(sc);
  	}
 
 --------------010504020606080505020502--
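
The shape of the revised patch above (test the flag immediately after re-taking the lock, return early, and have the caller re-test before the TX pass) can be modelled in user space.  This is a sketch under assumptions: struct toy_nic, toy_rxeof(), toy_txeof() and toy_intr() are hypothetical names, the ring has 8 slots, the refills counter merely stands for whatever work follows the loop, and the concurrent stop is simulated by clearing the flag inside the unlocked window.

```c
#include <assert.h>
#include <stdbool.h>

#define NDESC 8	/* hypothetical ring size */

/* Hypothetical stand-in for the driver state; not the real struct bge_softc. */
struct toy_nic {
	bool	running;	/* models IFF_DRV_RUNNING */
	int	cons, prod;	/* consumer and producer indices */
	int	stop_after;	/* packet count at which a stop arrives; 0 = never */
	int	refills;	/* times the post-loop work ran */
	int	tx_passes;	/* times the TX completion pass ran */
};

/* RX loop that returns as soon as it notices the interface was stopped. */
static void
toy_rxeof(struct toy_nic *n)
{
	int seen = 0;

	assert(n->prod >= 0 && n->prod < NDESC);
	while (n->cons != n->prod) {
		n->cons = (n->cons + 1) % NDESC;
		seen++;
		/* Lock dropped for the input routine, then re-taken. */
		if (seen == n->stop_after)
			n->running = false;	/* stop path ran meanwhile */
		if (!n->running)
			return;		/* also skips the post-loop work */
	}
	n->refills++;
}

/* Stand-in for the TX completion pass. */
static void
toy_txeof(struct toy_nic *n)
{
	n->tx_passes++;
}

/* Interrupt handler shape: the flag is tested again before the TX pass. */
static void
toy_intr(struct toy_nic *n)
{
	if (n->running)
		toy_rxeof(n);
	if (n->running)		/* toy_rxeof() may have observed a stop */
		toy_txeof(n);
}
```

In this model a stop that arrives after the second packet ends the RX pass at once and skips both the post-loop work and the TX pass, while an undisturbed interrupt runs all three.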
State-Changed-From-To: open->patched 
State-Changed-By: delphij 
State-Changed-When: Thu May 14 22:33:56 UTC 2009 
State-Changed-Why:  
Patch applied against -HEAD, MFC reminder. 


Responsible-Changed-From-To: freebsd-bugs->delphij 
Responsible-Changed-By: delphij 
Responsible-Changed-When: Thu May 14 22:33:56 UTC 2009 
Responsible-Changed-Why:  
I touch it so I buy it... 

http://www.freebsd.org/cgi/query-pr.cgi?pr=134548 

From: dfilter@FreeBSD.ORG (dfilter service)
To: bug-followup@FreeBSD.org
Cc:  
Subject: Re: kern/134548: commit references a PR
Date: Thu, 14 May 2009 22:33:48 +0000 (UTC)

 Author: delphij
 Date: Thu May 14 22:33:37 2009
 New Revision: 192127
 URL: http://svn.freebsd.org/changeset/base/192127
 
 Log:
   Try to workaround a race where bge_stop() may sneak in when bge_rxeof()
   drops and re-grabs the softc mutex in the middle, resulting in kernel
   trap 12.  This may happen when a lot of traffic is being hammered on
   one bge(4) interface while the system is shutting down.
   
   Reported by:	Alexander Sack <pisymbol gmail com>
   PR:		kern/134548
   MFC After:	2 weeks
 
 Modified:
   head/sys/dev/bge/if_bge.c
 
 Modified: head/sys/dev/bge/if_bge.c
 ==============================================================================
 --- head/sys/dev/bge/if_bge.c	Thu May 14 22:13:17 2009	(r192126)
 +++ head/sys/dev/bge/if_bge.c	Thu May 14 22:33:37 2009	(r192127)
 @@ -3193,6 +3193,9 @@ bge_rxeof(struct bge_softc *sc)
  		BGE_UNLOCK(sc);
  		(*ifp->if_input)(ifp, m);
  		BGE_LOCK(sc);
 +
 +		if (!(ifp->if_drv_flags & IFF_DRV_RUNNING))
 +			return;
  	}
  
  	if (stdcnt > 0)
 @@ -3301,6 +3304,10 @@ bge_poll(struct ifnet *ifp, enum poll_cm
  
  	sc->rxcycles = count;
  	bge_rxeof(sc);
 +	if (!(ifp->if_drv_flags & IFF_DRV_RUNNING)) {
 +		BGE_UNLOCK(sc);
 +		return;
 +	}
  	bge_txeof(sc);
  	if (!IFQ_DRV_IS_EMPTY(&ifp->if_snd))
  		bge_start_locked(ifp);
 @@ -3370,7 +3377,9 @@ bge_intr(void *xsc)
  	if (ifp->if_drv_flags & IFF_DRV_RUNNING) {
  		/* Check RX return ring producer/consumer. */
  		bge_rxeof(sc);
 +	}
  
 +	if (ifp->if_drv_flags & IFF_DRV_RUNNING) {
  		/* Check TX ring producer/consumer. */
  		bge_txeof(sc);
  	}
 _______________________________________________
 svn-src-all@freebsd.org mailing list
 http://lists.freebsd.org/mailman/listinfo/svn-src-all
 To unsubscribe, send any mail to "svn-src-all-unsubscribe@freebsd.org"
 
State-Changed-From-To: patched->closed 
State-Changed-By: delphij 
State-Changed-When: Wed May 20 21:18:04 UTC 2009 
State-Changed-Why:  
Patch applied against 7-STABLE, thanks for reporting the problem! 

http://www.freebsd.org/cgi/query-pr.cgi?pr=134548 

From: dfilter@FreeBSD.ORG (dfilter service)
To: bug-followup@FreeBSD.org
Cc:  
Subject: Re: kern/134548: commit references a PR
Date: Wed, 20 May 2009 21:17:30 +0000 (UTC)

 Author: delphij
 Date: Wed May 20 21:17:10 2009
 New Revision: 192478
 URL: http://svn.freebsd.org/changeset/base/192478
 
 Log:
   MFC r192127:
   
   Try to workaround a race where bge_stop() may sneak in when bge_rxeof()
   drops and re-grabs the softc mutex in the middle, resulting in kernel
   trap 12.  This may happen when a lot of traffic is being hammered on
   one bge(4) interface while the system is shutting down.
   
   Reported by:	Alexander Sack <pisymbol gmail com>
   PR:		kern/134548
 
 Modified:
   stable/7/sys/   (props changed)
   stable/7/sys/dev/bge/if_bge.c
 
 Modified: stable/7/sys/dev/bge/if_bge.c
 ==============================================================================
 --- stable/7/sys/dev/bge/if_bge.c	Wed May 20 21:13:49 2009	(r192477)
 +++ stable/7/sys/dev/bge/if_bge.c	Wed May 20 21:17:10 2009	(r192478)
 @@ -3193,6 +3193,9 @@ bge_rxeof(struct bge_softc *sc)
  		BGE_UNLOCK(sc);
  		(*ifp->if_input)(ifp, m);
  		BGE_LOCK(sc);
 +
 +		if (!(ifp->if_drv_flags & IFF_DRV_RUNNING))
 +			return;
  	}
  
  	if (stdcnt > 0)
 @@ -3301,6 +3304,10 @@ bge_poll(struct ifnet *ifp, enum poll_cm
  
  	sc->rxcycles = count;
  	bge_rxeof(sc);
 +	if (!(ifp->if_drv_flags & IFF_DRV_RUNNING)) {
 +		BGE_UNLOCK(sc);
 +		return;
 +	}
  	bge_txeof(sc);
  	if (!IFQ_DRV_IS_EMPTY(&ifp->if_snd))
  		bge_start_locked(ifp);
 @@ -3370,7 +3377,9 @@ bge_intr(void *xsc)
  	if (ifp->if_drv_flags & IFF_DRV_RUNNING) {
  		/* Check RX return ring producer/consumer. */
  		bge_rxeof(sc);
 +	}
  
 +	if (ifp->if_drv_flags & IFF_DRV_RUNNING) {
  		/* Check TX ring producer/consumer. */
  		bge_txeof(sc);
  	}
 
>Unformatted:
