From nobody@FreeBSD.org  Fri Mar 23 21:02:40 2012
Return-Path: <nobody@FreeBSD.org>
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 880561065670
	for <freebsd-gnats-submit@FreeBSD.org>; Fri, 23 Mar 2012 21:02:40 +0000 (UTC)
	(envelope-from nobody@FreeBSD.org)
Received: from red.freebsd.org (red.freebsd.org [IPv6:2001:4f8:fff6::22])
	by mx1.freebsd.org (Postfix) with ESMTP id 725CB8FC0A
	for <freebsd-gnats-submit@FreeBSD.org>; Fri, 23 Mar 2012 21:02:40 +0000 (UTC)
Received: from red.freebsd.org (localhost [127.0.0.1])
	by red.freebsd.org (8.14.4/8.14.4) with ESMTP id q2NL2e5R034766
	for <freebsd-gnats-submit@FreeBSD.org>; Fri, 23 Mar 2012 21:02:40 GMT
	(envelope-from nobody@red.freebsd.org)
Received: (from nobody@localhost)
	by red.freebsd.org (8.14.4/8.14.4/Submit) id q2NL2eTW034765;
	Fri, 23 Mar 2012 21:02:40 GMT
	(envelope-from nobody)
Message-Id: <201203232102.q2NL2eTW034765@red.freebsd.org>
Date: Fri, 23 Mar 2012 21:02:40 GMT
From: Adrian Chadd <adrian@FreeBSD.org>
To: freebsd-gnats-submit@FreeBSD.org
Subject: [ath] 802.11n TX stall when the first _AND_ last frame in the BAW is in the software queue
X-Send-Pr-Version: www-3.1
X-GNATS-Notify:

>Number:         166357
>Category:       kern
>Synopsis:       [ath] 802.11n TX stall when the first frame in the BAW is in the software queue
>Confidential:   no
>Severity:       non-critical
>Priority:       low
>Responsible:    freebsd-wireless
>State:          patched
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Fri Mar 23 21:10:06 UTC 2012
>Closed-Date:    
>Last-Modified:  Mon Apr 02 16:39:17 UTC 2012
>Originator:     Adrian Chadd
>Release:        10.0-CURRENT
>Organization:
>Environment:
mips24k (ar913x) (TP-WR1043ND)
>Description:
I'm seeing TX stalls after a bunch of bidirectional traffic in a lossy 802.11n environment.

Every single time, I see:

* the TID isn't scheduled;
* The software queue for that TID has the first and last seqno in the BAW in the software queue (along with other frames here and there);
* .. i _think_ the HW transmission has failed, so the frames appear again in the SWQ;
* .. but it's thus not being rescheduled.

I'll post further information soon.
>How-To-Repeat:

>Fix:


>Release-Note:
>Audit-Trail:
Responsible-Changed-From-To: freebsd-bugs->freebsd-wireless 
Responsible-Changed-By: adrian 
Responsible-Changed-When: Fri Mar 23 21:14:27 UTC 2012 
Responsible-Changed-Why:  
reassign to -wireless. 


http://www.freebsd.org/cgi/query-pr.cgi?pr=166357 

From: dfilter@FreeBSD.ORG (dfilter service)
To: bug-followup@FreeBSD.org
Cc:  
Subject: Re: kern/166357: commit references a PR
Date: Sun, 25 Mar 2012 23:50:44 +0000 (UTC)

 Author: adrian
 Date: Sun Mar 25 23:50:34 2012
 New Revision: 233480
 URL: http://svn.freebsd.org/changeset/base/233480
 
 Log:
   Add some more debugging to try and nail down exactly what's going on when
   I see traffic stalls.
   
   It turns out that the bug isn't because the first and last frame in the
   BAW is in the software queue.  It is more likely that it's because
   the first frame in the BAW is still in the software queue and thus there's
   no more room to allocate and do subsequent TX.
   
   PR:		kern/166357
 
 Modified:
   head/sys/dev/ath/if_ath_tx.c
 
 Modified: head/sys/dev/ath/if_ath_tx.c
 ==============================================================================
 --- head/sys/dev/ath/if_ath_tx.c	Sun Mar 25 21:54:36 2012	(r233479)
 +++ head/sys/dev/ath/if_ath_tx.c	Sun Mar 25 23:50:34 2012	(r233480)
 @@ -2357,6 +2357,7 @@ ath_tx_xmit_aggr(struct ath_softc *sc, s
  	/* paused? queue */
  	if (tid->paused) {
  		ATH_TXQ_INSERT_TAIL(tid, bf, bf_list);
 +		/* XXX don't sched - we're paused! */
  		return;
  	}
  
 @@ -2647,13 +2648,19 @@ ath_tx_tid_drain(struct ath_softc *sc, s
  		if (t == 0) {
  			device_printf(sc->sc_dev,
  			    "%s: node %p: bf=%p: addbaw=%d, dobaw=%d, "
 -			    "seqno_assign=%d, seqno_required=%d, seqno=%d\n",
 +			    "seqno_assign=%d, seqno_required=%d, seqno=%d, retry=%d\n",
  			    __func__, ni, bf,
  			    bf->bf_state.bfs_addedbaw,
  			    bf->bf_state.bfs_dobaw,
  			    bf->bf_state.bfs_need_seqno,
  			    bf->bf_state.bfs_seqno_assigned,
 -			    SEQNO(bf->bf_state.bfs_seqno));
 +			    SEQNO(bf->bf_state.bfs_seqno),
 +			    bf->bf_state.bfs_retries);
 +			device_printf(sc->sc_dev,
 +			    "%s: node %p: bf=%p: tid txq_depth=%d hwq_depth=%d\n",
 +			    __func__, ni, bf,
 +			    tid->axq_depth,
 +			    tid->hwq_depth);
  			device_printf(sc->sc_dev,
  			    "%s: node %p: bf=%p: tid %d: txq_depth=%d, "
  			    "txq_aggr_depth=%d, sched=%d, paused=%d, "
 _______________________________________________
 svn-src-all@freebsd.org mailing list
 http://lists.freebsd.org/mailman/listinfo/svn-src-all
 To unsubscribe, send any mail to "svn-src-all-unsubscribe@freebsd.org"
 

From: dfilter@FreeBSD.ORG (dfilter service)
To: bug-followup@FreeBSD.org
Cc:  
Subject: Re: kern/166357: commit references a PR
Date: Mon, 26 Mar 2012 16:05:38 +0000 (UTC)

 Author: adrian
 Date: Mon Mar 26 16:05:19 2012
 New Revision: 233514
 URL: http://svn.freebsd.org/changeset/base/233514
 
 Log:
   Use the assigned sequence number when checking if a retried packet is
   within the BAW.
   
   This regression was introduced in ane earlier commit by me to fix the
   BAW seqno allocation-but-not-insertion-into-BAW race.  Since it was only
   ever using the to-be allocated sequence number, any frame retries
   with the first frame in the BAW still in the software queue would
   have constantly failed, as ni_txseqs[tid] would always be outside
   the BAW.
   
   TODO:
   
   * Extract out the mostly common code here in the agg and non-agg ADDBA
     case and stuff it into a single function.
   
   PR:		kern/166357
 
 Modified:
   head/sys/dev/ath/if_ath_tx.c
   head/sys/dev/ath/if_ath_tx_ht.c
 
 Modified: head/sys/dev/ath/if_ath_tx.c
 ==============================================================================
 --- head/sys/dev/ath/if_ath_tx.c	Mon Mar 26 15:38:17 2012	(r233513)
 +++ head/sys/dev/ath/if_ath_tx.c	Mon Mar 26 16:05:19 2012	(r233514)
 @@ -2348,7 +2348,6 @@ ath_tx_xmit_aggr(struct ath_softc *sc, s
  	struct ath_tid *tid = &an->an_tid[bf->bf_state.bfs_tid];
  	struct ath_txq *txq = bf->bf_state.bfs_txq;
  	struct ieee80211_tx_ampdu *tap;
 -	int seqno;
  
  	ATH_TXQ_LOCK_ASSERT(txq);
  
 @@ -2384,13 +2383,31 @@ ath_tx_xmit_aggr(struct ath_softc *sc, s
  	 * the TIDs that map to it.  Ugh.
  	 */
  	if (bf->bf_state.bfs_dobaw) {
 -		if (! BAW_WITHIN(tap->txa_start, tap->txa_wnd,
 -		    ni->ni_txseqs[bf->bf_state.bfs_tid])) {
 +		ieee80211_seq seqno;
 +
 +		/*
 +		 * If the sequence number is allocated, use it.
 +		 * Otherwise, use the sequence number we WOULD
 +		 * allocate.
 +		 */
 +		if (bf->bf_state.bfs_seqno_assigned)
 +			seqno = SEQNO(bf->bf_state.bfs_seqno);
 +		else
 +			seqno = ni->ni_txseqs[bf->bf_state.bfs_tid];
 +
 +		/*
 +		 * Check whether either the currently allocated
 +		 * sequence number _OR_ the to-be allocated
 +		 * sequence number is inside the BAW.
 +		 */
 +		if (! BAW_WITHIN(tap->txa_start, tap->txa_wnd, seqno)) {
  			ATH_TXQ_INSERT_TAIL(tid, bf, bf_list);
  			ath_tx_tid_sched(sc, tid);
  			return;
  		}
  		if (! bf->bf_state.bfs_seqno_assigned) {
 +			int seqno;
 +
  			seqno = ath_tx_tid_seqno_assign(sc, ni, bf, bf->bf_m);
  			if (seqno < 0) {
  				device_printf(sc->sc_dev,
 
 Modified: head/sys/dev/ath/if_ath_tx_ht.c
 ==============================================================================
 --- head/sys/dev/ath/if_ath_tx_ht.c	Mon Mar 26 15:38:17 2012	(r233513)
 +++ head/sys/dev/ath/if_ath_tx_ht.c	Mon Mar 26 16:05:19 2012	(r233514)
 @@ -652,7 +652,6 @@ ath_tx_form_aggr(struct ath_softc *sc, s
  	int status = ATH_AGGR_DONE;
  	int prev_frames = 0;	/* XXX for AR5416 burst, not done here */
  	int prev_al = 0;	/* XXX also for AR5416 burst */
 -	int seqno;
  
  	ATH_TXQ_LOCK_ASSERT(sc->sc_ac2q[tid->ac]);
  
 @@ -747,13 +746,32 @@ ath_tx_form_aggr(struct ath_softc *sc, s
  		 * see ath_tx_xmit_aggr() for more info.
  		 */
  		if (bf->bf_state.bfs_dobaw) {
 +			ieee80211_seq seqno;
 +
 +			/*
 +			 * If the sequence number is allocated, use it.
 +			 * Otherwise, use the sequence number we WOULD
 +			 * allocate.
 +			 */
 +			if (bf->bf_state.bfs_seqno_assigned)
 +				seqno = SEQNO(bf->bf_state.bfs_seqno);
 +			else
 +				seqno = ni->ni_txseqs[bf->bf_state.bfs_tid];
 +
 +			/*
 +			 * Check whether either the currently allocated
 +			 * sequence number _OR_ the to-be allocated
 +			 * sequence number is inside the BAW.
 +			 */
  			if (! BAW_WITHIN(tap->txa_start, tap->txa_wnd,
 -			    ni->ni_txseqs[bf->bf_state.bfs_tid])) {
 +			    seqno)) {
  				status = ATH_AGGR_BAW_CLOSED;
  				break;
  			}
 +
  			/* XXX check for bfs_need_seqno? */
  			if (! bf->bf_state.bfs_seqno_assigned) {
 +				int seqno;
  				seqno = ath_tx_tid_seqno_assign(sc, ni, bf, bf->bf_m);
  				if (seqno < 0) {
  					device_printf(sc->sc_dev,
 _______________________________________________
 svn-src-all@freebsd.org mailing list
 http://lists.freebsd.org/mailman/listinfo/svn-src-all
 To unsubscribe, send any mail to "svn-src-all-unsubscribe@freebsd.org"
 
State-Changed-From-To: open->patched 
State-Changed-By: adrian 
State-Changed-When: Mon Apr 2 16:38:30 UTC 2012 
State-Changed-Why:  
Fix 


http://www.freebsd.org/cgi/query-pr.cgi?pr=166357 
>Unformatted:
