From nobody@FreeBSD.org  Fri Nov 18 02:10:20 2011
Return-Path: <nobody@FreeBSD.org>
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id F40D0106566B
	for <freebsd-gnats-submit@FreeBSD.org>; Fri, 18 Nov 2011 02:10:19 +0000 (UTC)
	(envelope-from nobody@FreeBSD.org)
Received: from red.freebsd.org (red.freebsd.org [IPv6:2001:4f8:fff6::22])
	by mx1.freebsd.org (Postfix) with ESMTP id CAD868FC15
	for <freebsd-gnats-submit@FreeBSD.org>; Fri, 18 Nov 2011 02:10:19 +0000 (UTC)
Received: from red.freebsd.org (localhost [127.0.0.1])
	by red.freebsd.org (8.14.4/8.14.4) with ESMTP id pAI2AJRE018121
	for <freebsd-gnats-submit@FreeBSD.org>; Fri, 18 Nov 2011 02:10:19 GMT
	(envelope-from nobody@red.freebsd.org)
Received: (from nobody@localhost)
	by red.freebsd.org (8.14.4/8.14.4/Submit) id pAI2AJNG018120;
	Fri, 18 Nov 2011 02:10:19 GMT
	(envelope-from nobody)
Message-Id: <201111180210.pAI2AJNG018120@red.freebsd.org>
Date: Fri, 18 Nov 2011 02:10:19 GMT
From: Adrian Chadd <adrian@FreeBSD.org>
To: freebsd-gnats-submit@FreeBSD.org
Subject: [ath] 11n TX aggregation session / TX hang
X-Send-Pr-Version: www-3.1
X-GNATS-Notify:

>Number:         162647
>Category:       kern
>Synopsis:       [ath] 11n TX aggregation session / TX hang
>Confidential:   no
>Severity:       non-critical
>Priority:       medium
>Responsible:    freebsd-wireless
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Fri Nov 18 02:20:02 UTC 2011
>Closed-Date:    
>Last-Modified:  Fri Nov 18 03:56:51 UTC 2011
>Originator:     Adrian Chadd
>Release:        10.0-CURRENT
>Organization:
FreeBSD
>Environment:
FreeBSD unknown 10.0-CURRENT FreeBSD 10.0-CURRENT #37: Thu Jan  1 08:00:00 WST 1970     adrian@dummy:/home/adrian/work/freebsd/git/adrianchadd-freebsd-work/obj/mipseb/mips.mipseb/home/adrian/work/freebsd/git/adrianchadd-freebsd-work/adrianchadd-freebsd-work/sys/RSPRO  mips

(it's a recent -HEAD, ignore the date.)

hostap: AR9227

ath1: <Atheros 9227> irq 1 at device 18.0 on pci0
ath1: [HT] enabling HT modes
ath1: [HT] enabling short-GI in 20MHz mode
ath1: [HT] 2 RX streams; 2 TX streams
ath1: AR9227 mac 384.2 RF5133 phy 15.15

This kernel also includes my git fixes for correctly handling packet queue flushes during reset, but this bug will occur regardless.
>Description:
A node flush is leaving the block-ack (BA) window in an inconsistent state, resulting in TX timeouts.

It's currently unknown why ath_tx_tid_drain() was called. It has two callers:

* ath_tx_txq_drain()
* ath_tx_node_flush()

The former is called from ath_tx_draintxq(); the latter is called from ath_node_cleanup(). So the drain is being triggered either by ath_reset() (with ATH_RESET_DEFAULT or ATH_RESET_FULL) or through ic_node_cleanup.

A log snippet:

ath1: ath_tx_aggr_comp_aggr: TID 0: send BAR; seq 3678
ath1: ath_tx_aggr_comp_aggr: TID 0: send BAR; seq 3718
ath1: ath_tx_aggr_comp_aggr: TID 0: send BAR; seq 3742
ath1: ath_tx_aggr_comp_aggr: TID 0: send BAR; seq 3784
ath1: stuck beacon; resetting (bmiss count 4)
ath1: ath_tx_tid_drain: node 0xc0927000: tid 0: txq_depth=2, txq_aggr_depth=2, sched=0, paused=0, hwq_depth=2, incomp=0, baw_head=103, baw_tail=38 txa_start=3396, ni_txseqs=3861
ath1: ath_tx_tid_drain: wasn't added: seqno 3459
ath1: ath_tx_tid_drain: wasn't added: seqno 3460
.
.
ath1: ath_tx_tid_drain: wasn't added: seqno 3857
ath1: ath_tx_tid_drain: wasn't added: seqno 3858
ath1: ath_tx_tid_drain: wasn't added: seqno 3859
ath1: ath_tx_tid_drain: wasn't added: seqno 3860
ath1: ath_tx_default_comp: dobaw should've been cleared!
ath1: ath_tx_default_comp: dobaw should've been cleared!
ath1: ath_tx_default_comp: dobaw should've been cleared!
ath1: ath_tx_default_comp: dobaw should've been cleared!
ath1: ath_tx_default_comp: dobaw should've been cleared!
ath1: ath_tx_default_comp: dobaw should've been cleared!
ath1: ath_tx_default_comp: dobaw should've been cleared!
ath1: ath_tx_default_comp: dobaw should've been cleared!
ath1: device timeout
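
To make the numbers in the ath_tx_tid_drain line easier to read, here is a rough sketch of the modular arithmetic involved. It is not the driver code: the helper names ba_index() and ring_span() are made up for illustration, and the 128-slot ring size is an assumption about the per-TID tracking array. The 4096 sequence range comes from the 12-bit 802.11 sequence number.

```c
#define SEQ_RANGE 4096u  /* 802.11 sequence numbers are 12 bits wide */
#define BAW_RING  128u   /* ASSUMED size of the per-TID BAW tracking ring */

/* Hypothetical helper: offset of seqno from the BA window start,
 * taken modulo the sequence space so that wraparound is handled. */
static unsigned ba_index(unsigned start, unsigned seqno)
{
	return (seqno - start) & (SEQ_RANGE - 1u);
}

/* Hypothetical helper: number of ring slots between head and tail,
 * taken modulo the (assumed) ring size. */
static unsigned ring_span(unsigned head, unsigned tail)
{
	return (tail - head) & (BAW_RING - 1u);
}
```

With the logged values, ring_span(103, 38) is 63 and ba_index(3396, 3459) is also 63, so the first "wasn't added" seqno sits exactly at the ring tail. ba_index(3396, 3860) is 464, so the last queued frame is far beyond the 64-frame maximum that an 802.11n BA window allows past txa_start.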

>How-To-Repeat:
Just general hostap use. The question is how/why the node flush occurred.
>Fix:


>Release-Note:
>Audit-Trail:
Responsible-Changed-From-To: freebsd-bugs->freebsd-wireless 
Responsible-Changed-By: adrian 
Responsible-Changed-When: Fri Nov 18 03:56:39 UTC 2011 
Responsible-Changed-Why:  
Punt 


http://www.freebsd.org/cgi/query-pr.cgi?pr=162647 
>Unformatted:
