From nakaji@xa12.heimat.gr.jp  Fri Jun 18 01:31:52 2004
Return-Path: <nakaji@xa12.heimat.gr.jp>
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP
	id EF50116A4CE; Fri, 18 Jun 2004 01:31:52 +0000 (GMT)
Received: from pcat.heimat.gr.jp (catv-118-241.tees.ne.jp [203.141.118.241])
	by mx1.FreeBSD.org (Postfix) with ESMTP
	id 4FA4143D39; Fri, 18 Jun 2004 01:31:52 +0000 (GMT)
	(envelope-from nakaji@xa12.heimat.gr.jp)
Received: from xa12.heimat.gr.jp (xa12.heimat.gr.jp [202.216.136.35])
	by pcat.heimat.gr.jp (8.12.11/8.12.11) with ESMTP id i5I1VBVY023036;
	Fri, 18 Jun 2004 10:31:11 +0900 (JST)
	(envelope-from nakaji@xa12.heimat.gr.jp)
Received: (from nakaji@localhost)
	by xa12.heimat.gr.jp (8.12.10/8.12.10/Submit) id i5I1VA0D021169;
	Fri, 18 Jun 2004 10:31:10 +0900 (JST)
	(envelope-from nakaji)
Message-Id: <200406180131.i5I1VA0D021169@xa12.heimat.gr.jp>
Date: Fri, 18 Jun 2004 10:31:10 +0900 (JST)
From: NAKAJI Hiroyuki <nakaji@jp.freebsd.org>
Reply-To: NAKAJI Hiroyuki <nakaji@jp.freebsd.org>
To: FreeBSD-gnats-submit@freebsd.org
Cc: sos@freebsd.org
Subject: recent kernel cannot boot on pc98 with ata disks
X-Send-Pr-Version: 3.113
X-GNATS-Notify:

>Number:         68066
>Category:       kern
>Synopsis:       recent kernel cannot boot on pc98 with ata disks
>Confidential:   no
>Severity:       critical
>Priority:       high
>Responsible:    sos
>State:          closed
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Fri Jun 18 01:40:20 GMT 2004
>Closed-Date:    Mon Oct 04 11:26:23 GMT 2004
>Last-Modified:  Mon Oct 04 11:26:23 GMT 2004
>Originator:     NAKAJI Hiroyuki
>Release:        FreeBSD 5.2-CURRENT i386
>Organization:
>Environment:
System: FreeBSD xa12.heimat.gr.jp 5.2-CURRENT FreeBSD 5.2-CURRENT #2: Sun Jan 4 12:30:30 JST 2004 root@xa12.heimat.gr.jp:/usr/obj/usr/src/sys/NAKAJI i386



>Description:
	Recent kernels cannot boot on my PC-9821Xa12 with two IDE
drives. A 'boot -v' log of the failure is available at
http://heimat.jp/~nakaji/FreeBSD/pc98-Jun08.log, for example. The
kernel of Jan 4 2004 JST boots and works fine.
	I tried cvsup and build{world,kernel} again and again with
various date=... values and finally found that the change to
sys/dev/ata/ata-queue.c between 1.20 and 1.21 on Feb 17 19:24 2004
causes the boot failure: the kernel of date=2004.02.17.19.00.00
boots and works well, and that of date=2004.02.17.20.00.00 does
not.
	I'm not sure what the change is aimed at, but since this
commit my pc98 box cannot boot any newer kernel.
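
	The date search described above can be treated as a bisection.
The following is only a sketch under assumptions that are not in the
report: the checkouts are numbered 0..n-1 in date order, the oldest
one boots, the newest one does not, and every checkout after the
first failing one also fails. The names first_bad_snapshot,
boots_fn and boots_before_7 are hypothetical; in practice the
predicate is a cvsup with date=..., a build, and a reboot.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical predicate: 1 if the kernel built from checkout i boots. */
typedef int (*boots_fn)(size_t i);

/* Stand-in predicate for illustration only: checkouts 0..6 boot. */
static int
boots_before_7(size_t i)
{
	return (i < 7);
}

/*
 * Return the index of the first checkout that fails to boot.
 * Assumes checkout 0 boots, checkout n-1 does not, and failures
 * are contiguous from the first bad checkout onward.
 */
static size_t
first_bad_snapshot(size_t n, boots_fn boots)
{
	size_t good = 0;	/* known to boot */
	size_t bad;		/* known to fail */

	assert(n >= 2);
	bad = n - 1;
	while (bad - good > 1) {
		size_t mid = good + (bad - good) / 2;

		if (boots(mid))
			good = mid;
		else
			bad = mid;
	}
	return (bad);
}
```

	Each probe halves the remaining date range, so an hour-wide
window like the one above is reached in a handful of builds.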

	The kernel configuration is as follows.

include	GENERIC
ident	NAKAJI
nooption	INVARIANTS
nooption	INVARIANT_SUPPORT
nooption	WITNESS
nooption	WITNESS_SKIPSPIN
options		NETATALK
options		QUOTA
options		MSGBUF_SIZE=40960
device	pcm


>How-To-Repeat:
	n/a
>Fix:
	n/a
>Release-Note:
>Audit-Trail:
Responsible-Changed-From-To: freebsd-bugs->sos 
Responsible-Changed-By: nyan 
Responsible-Changed-When: Fri Jun 18 13:58:59 GMT 2004 
Responsible-Changed-Why:  
Over to maintainer. 

http://www.freebsd.org/cgi/query-pr.cgi?pr=68066 

Adding to audit trail from misfiled followups pending/68177,
pending/68183, and pending/68216:

Mon Jun 21 20:40:29 GMT 2004

 Urhm, recent? Jan 4th isn't recent to me; please try something really
 recent, like 21 Jun 2004.
 
 -- 
 -Søren
 
Mon Jun 21 23:00:31 GMT 2004:

 I know Jan 4 is not recent, but there is a reason: the kernels from
 April, May, and June cannot boot.
 
 I tried some kernels with the HEAD on some dates, and found
 
 1. the kernel on date=2004.02.17.19.00.00 is ok (can boot)
 2. the kernel on date=2004.02.17.20.00.00 is ng (cannot boot)
 
 And,
 
 3. all kernels after 2004.02.17.20.00.00 fail to boot; they all drop
    into the debugger with the same message:
    http://heimat.jp/~nakaji/FreeBSD/pc98-Jun19.log
 
 ...
 
 Good news now!!
 
 Just before sending this email, I tried HEAD with a small change,
 and it booted successfully. /var/run/dmesg.boot is attached below.
 
 The diff is just a backout of 1.21 -> 1.20.
 
 Index: ata-queue.c
 -------------------------------------------------------------------
 RCS file: /net/pcat/home/ncvs/src/sys/dev/ata/ata-queue.c,v
 retrieving revision 1.29
 diff -u -u -r1.29 ata-queue.c
 --- ata-queue.c	1 Jun 2004 12:26:08 -0000	1.29
 +++ ata-queue.c	21 Jun 2004 14:16:38 -0000
 @@ -215,11 +215,11 @@
  	ata_completed(request, 0);
      }
      else {
 -	if (request->bio && !(request->flags & ATA_R_TIMEOUT))
 +	if (request->bio)
  	    bio_taskqueue(request->bio, (bio_task_t *)ata_completed, request);
  	else {
  	    TASK_INIT(&request->task, 0, ata_completed, request);
 -	    taskqueue_enqueue(taskqueue_thread, &request->task);
 +	    taskqueue_enqueue(taskqueue_swi, &request->task);
  	}
      }
  }
 
 Thanks.
 --
 NAKAJI Hiroyuki

Wed Jun 23 00:00:49 GMT 2004:

 OK. I tried Jun 22nd kernel with one line change:
 
 Index: ata-queue.c
 ---------------------------------------------------------------------
 RCS file: /net/pcat/home/ncvs/src/sys/dev/ata/ata-queue.c,v
 retrieving revision 1.29
 diff -u -u -r1.29 ata-queue.c
 --- ata-queue.c	1 Jun 2004 12:26:08 -0000	1.29
 +++ ata-queue.c	22 Jun 2004 23:38:57 -0000
 @@ -219,7 +219,7 @@
  	    bio_taskqueue(request->bio, (bio_task_t *)ata_completed, request);
  	else {
  	    TASK_INIT(&request->task, 0, ata_completed, request);
 -	    taskqueue_enqueue(taskqueue_thread, &request->task);
 +	    taskqueue_enqueue(taskqueue_swi, &request->task);
  	}
      }
  }
 
 This kernel CAN BOOT, so I think the use of taskqueue_thread causes
 the problem, at least on pc98.
 
 $ uname -a
 FreeBSD xa12.heimat.gr.jp 5.2-CURRENT FreeBSD 5.2-CURRENT #3: Tue Jun 22 23:20:42 JST 2004     root@xa12.heimat.gr.jp:/usr/obj/home/nakaji/FreeBSD-PC98/src/sys/NAKAJI  i386
 
 You committed a two-line change on 2004.02.17.19.24.11,
 
   revision 1.21
   date: 2004/02/17 19:24:11;  author: sos;  state: Exp;  lines: +2 -2
   Dont use the bio_taskqueue if we are in timeout.
   Use taskqueue_thread rather than taskqueue_swi (maybe we should have
   a taskqueue_ata).
 
 and the latter seems bad for pc98.
 
 Thanks.
 --
 NAKAJI Hiroyuki

From: NAKAJI Hiroyuki <nakaji@tutrp.tut.ac.jp>
To: FreeBSD-gnats-submit@FreeBSD.org
Cc: freebsd-bugs@FreeBSD.org, sos@freebsd.org
Subject: Re: kern/68066: recent kernel cannot boot on pc98 with ata disks
Date: Mon, 28 Jun 2004 23:42:27 +0900

 I found that this change alone is enough:
 
 Index: ata-queue.c
 ===================================================================
 RCS file: /net/pcat/home/ncvs/src/sys/dev/ata/ata-queue.c,v
 retrieving revision 1.29
 diff -u -u -r1.29 ata-queue.c
 --- ata-queue.c	1 Jun 2004 12:26:08 -0000	1.29
 +++ ata-queue.c	22 Jun 2004 23:38:57 -0000
 @@ -219,7 +219,7 @@
  	    bio_taskqueue(request->bio, (bio_task_t *)ata_completed, request);
  	else {
  	    TASK_INIT(&request->task, 0, ata_completed, request);
 -	    taskqueue_enqueue(taskqueue_thread, &request->task);
 +	    taskqueue_enqueue(taskqueue_swi, &request->task);
  	}
      }
  }
 
 But what is the difference between taskqueue_thread and
 taskqueue_swi? And why does taskqueue_thread cause the error?
 
 Anyway, I am now running a newer kernel and userland.
 
 $ uname -a
 FreeBSD xa12.heimat.gr.jp 5.2-CURRENT FreeBSD 5.2-CURRENT #0: Sun Jun 27 20:52:34 JST 2004     root@xa12.heimat.gr.jp:/usr/obj/home/nakaji/FreeBSD-PC98/src/sys/NAKAJI  i386
 -- 
 NAKAJI Hiroyuki
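
 On the question above: per taskqueue(9), taskqueue_swi is run from a
 software interrupt handler, while taskqueue_thread is run from a
 dedicated kernel thread. In both cases taskqueue_enqueue() only
 records the task; the handler (here ata_completed) runs later, in
 whatever context drains that queue. The following is a user-space
 model of that deferral, not the kernel implementation. All names
 (tq_init, tq_enqueue, tq_run, task_init, count_completion) are
 hypothetical, and it does not explain why the thread context fails
 on pc98.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*task_fn)(void *arg, int pending);

struct task {
	struct task *next;
	task_fn      fn;
	void        *arg;
	int          pending;	/* enqueue count since the last run */
};

struct taskqueue {
	const char  *context;	/* label only: "swi" or "thread" */
	struct task *head;
	struct task *tail;
};

static void
task_init(struct task *t, task_fn fn, void *arg)
{
	t->next = NULL;
	t->fn = fn;
	t->arg = arg;
	t->pending = 0;
}

static void
tq_init(struct taskqueue *q, const char *context)
{
	assert(q != NULL);
	q->context = context;
	q->head = NULL;
	q->tail = NULL;
}

/* Record the task; nothing runs here. A repeat enqueue only counts. */
static void
tq_enqueue(struct taskqueue *q, struct task *t)
{
	if (t->pending++ > 0)
		return;
	t->next = NULL;
	if (q->tail != NULL)
		q->tail->next = t;
	else
		q->head = t;
	q->tail = t;
}

/* Drain the queue in the caller's context; return tasks run. */
static int
tq_run(struct taskqueue *q)
{
	struct task *t;
	int ran = 0;

	while ((t = q->head) != NULL) {
		int pending = t->pending;

		q->head = t->next;
		if (q->head == NULL)
			q->tail = NULL;
		t->pending = 0;
		t->fn(t->arg, pending);
		ran++;
	}
	return (ran);
}

/* Example handler: add the pending count to an int counter. */
static void
count_completion(void *arg, int pending)
{
	*(int *)arg += pending;
}
```

 In this model a task enqueued on the "thread" queue is untouched when
 the "swi" queue is drained, which is the only thing the one-line
 diff changes: which context ends up calling the handler.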

From: NAKAJI Hiroyuki <nakaji@jp.freebsd.org>
To: =?ISO-8859-1?Q?S=F8ren?= Schmidt <sos@DeepCore.dk>
Cc: FreeBSD-gnats-submit@FreeBSD.ORG, sos@FreeBSD.ORG
Subject: Re: kern/68066: recent kernel cannot boot on pc98 with ata disks
Date: Sun, 25 Jul 2004 23:24:08 +0900

 Hello, Sos,
 
 I tried today's kernel on my new PC-9821Ra333 which has
 
 o one IDE HDD
 o one ATAPI CD-ROM and
 o one SCSI Card (detected as sym0) with one HDD connected
 
 and which runs 5.2.1-RELEASE just fine.
 
 Today's kernel cannot boot; it enters the debugger just after
 detecting ad0.
 
 ad0: 3079MB <QUANTUM FIREBALL SE3.2A> [46368/8/17] at ata0-master PIO4
 Sleeping on "atabnk" with the following non-sleepable locks held:
 exclusive sleep mutex ATA queue lock r = 0 (0xc1468d20) locked @ /usr/src/sys/dev/ata/ata-queue.c:164
 KDB: stack backtrace:
 kdb_backtrace(1,c14031bc,c13f32c0,c0475fb4,ca2ebc74) at kdb_backtrace+0x29
 witness_warn(5,0,c06f0de3,c06dd4e3,c07b2bf8) at witness_warn+0x18e
 msleep(c0475fb4,0,4c,c06dd4e3,1) at msleep+0x42
 ata_cbus_banking(c1468c00,1,c1468d20,0,c06dc9ff,a4) at ata_cbus_banking+0x72
 ata_start(c1468c00,1,c15a9d34,ca2ebd10,c05591ab) at ata_start+0x39
 ata_completed(c15a9ca8,1,c1461958,0,c06f314d) at ata_completed+0x3c4
 taskqueue_run(c1461940,ca2ebd34,c052ab04,0,ca2ebd48) at taskqueue_run+0x83
 taskqueue_thread_loop(0,ca2ebd48,0,c0559238,0) at taskqueue_thread_loop+0x2b
 fork_exit(c0559238,0,ca2ebd48) at fork_exit+0x98
 fork_trampoline() at fork_trampoline+0x8
 --- trap 0x1, eip = 0, esp = 0xca2ebd7c, ebp = 0 ---
 panic: sleeping thread (pid 5) owns a non-sleepable lock
 KDB: enter: panic
 [thread 100024]
 Stopped at      kdb_enter+0x2b: nop
 db> tr
 kdb_enter(c06f0605) at kdb_enter+0x2b
 panic(c06f3298,5,38,c078d628,c07893e0) at panic+0xbb
 propagate_priority(c13d52c0,c078d718,c07893e0,c13d52c0,0) at propagate_priority+0x91
 turnstile_wait(0,c07893e0,c0785840,c07893e0,2,c06ef991,20e) at turnstile_wait+0x2a9
 _mtx_lock_sleep(c07893e0,0,c06f16ee,f6) at _mtx_lock_sleep+0x103
 _mtx_lock_flags(c07893e0,0,c06f16ee,f6,0) at _mtx_lock_flags+0x7f
 softclock(0) at softclock+0x166
 ithread_loop(c13c9580,c9aefd48,c13c9580,c052b53c,0) at ithread_loop+0x134
 fork_exit(c052b53c,c13c9580,c9aefd48) at fork_exit+0x98
 fork_trampoline() at fork_trampoline+0x8
 --- trap 0x1, eip = 0, esp = 0xc9aefd7c, ebp = 0 ---
 db> c
 Uptime: 2s
 Cannot dump. No dump device defined.
 panic: Assertion td->td_turnstile != NULL failed at /usr/src/sys/kern/subr_turnstile.c:478
 KDB: enter: panic
 [thread 100024]
 Stopped at      kdb_enter+0x2b: nop
 db> th
 [thread 100024]
 kdb_enter+0x2b: nop
 db> reset
 
 The whole serial console log captured by "Hyper Terminal on Windows
 2000" is available at http://heimat.jp/~nakaji/FreeBSD/ra333.log
 
 With "-v": http://heimat.jp/~nakaji/FreeBSD/ra333-v.txt
 When booted with "-v", it hangs (or gets into an infinite loop)
 without entering the debugger.
 
 The kernel configuration is
 
 ======
 include GENERIC
 ident   RA333
 options MSGBUF_SIZE=81920
 ======
 
 I reported earlier that s/taskqueue_thread/taskqueue_swi/ in
 dev/ata/ata-queue.c works around this boot failure, but after your
 last commit to dev/ata/ata-lowlevel.c, this substitution no longer
 helps. Unfortunately, the situation has gotten worse.
 
 Thanks in advance for any help.
 -- 
 NAKAJI Hiroyuki
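
 The first trace above shows msleep() being reached from
 ata_cbus_banking() via ata_start() while the "ATA queue lock" taken
 at ata-queue.c:164 is still held, which is what the "non-sleepable
 locks held" warning and the later panic complain about. The
 following is a toy user-space model of that rule only. The names
 (model_mutex, model_sleep, start_holding_lock, start_dropping_lock)
 are hypothetical, and dropping the lock around the wait is one common
 remedy, not necessarily the change that was committed.

```c
#include <assert.h>
#include <stddef.h>

struct model_mutex {
	int held;	/* 1 while the non-sleepable lock is owned */
};

enum { SLEEP_OK = 0, SLEEP_WITH_LOCK_HELD = -1 };

/* Model of the check: sleeping is refused while the lock is held. */
static int
model_sleep(const struct model_mutex *lock)
{
	assert(lock != NULL);
	return (lock->held ? SLEEP_WITH_LOCK_HELD : SLEEP_OK);
}

/* Pattern seen in the trace: wait while the queue lock is owned. */
static int
start_holding_lock(struct model_mutex *queue_lock)
{
	int error;

	queue_lock->held = 1;
	error = model_sleep(queue_lock);
	queue_lock->held = 0;
	return (error);
}

/* Remedy sketched here: release the lock, wait, then retake it. */
static int
start_dropping_lock(struct model_mutex *queue_lock)
{
	int error;

	queue_lock->held = 1;
	/* ... work that needs the lock ... */
	queue_lock->held = 0;
	error = model_sleep(queue_lock);
	queue_lock->held = 1;
	/* ... work that needs the lock again ... */
	queue_lock->held = 0;
	return (error);
}
```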

From: NAKAJI Hiroyuki <nakaji@jp.freebsd.org>
To: =?ISO-8859-1?Q?S=F8ren?= Schmidt <sos@DeepCore.dk>
Cc: FreeBSD-gnats-submit@FreeBSD.ORG, sos@FreeBSD.ORG
Subject: Re: kern/68066: recent kernel cannot boot on pc98 with ata disks
Date: Sat, 07 Aug 2004 22:13:22 +0900

 Good news! My problem has gone. Thank you very much.
 
 The kernel with sys/dev/ata/ata-lowlevel.c rev.1.42 can boot with no
 panic.
 
 description:
 ----------------------------
 revision 1.42
 date: 2004/08/06 22:23:53;  author: njl;  state: Exp;  lines: +1 -1
 Fix a panic in ata_generic_transaction().  The DMA pointer of the channel
 was being unconditionally dereferenced but was NULL for PIO requests.
 Check the request flags for a DMA transaction before dereferencing.
 
 Reported by:    ceri
 Tested by:      Radek Kozlowski <radek -at- raadradd.com>
 =============================================================================
 
 Just before this commit, the kernel panicked with fatal trap 12 in
 ata_generic_transaction().
 
 ====
 Fatal trap 12: page fault while in kernel mode
 fault virtual address   = 0x3c
 fault code              = supervisor read, page not present
 instruction pointer     = 0x8:0xc0478bfe
 stack pointer           = 0x10:0xc0c21cd4
 frame pointer           = 0x10:0xc0c21ce8
 code segment            = base 0x0, limit 0xfffff, type 0x1b
                         = DPL 0, pres 1, def32 1, gran 1
 processor eflags        = resume, IOPL = 0
 current process         = 0 (swapper)
 [thread 0]
 Stopped at      ata_generic_transaction+0x6e2:  testb   $0x2,0x3c(%eax)
 ====
 
 Thank you again. Please close the PR.
 -- 
 NAKAJI Hiroyuki
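
 The trap above fits the rev. 1.42 log message: a fault address of
 0x3c together with "testb $0x2,0x3c(%eax)" is what one would expect if
 %eax held a NULL DMA pointer and the code read a flag byte at offset
 0x3c from it. The following is a user-space sketch of the guard the
 log describes (check the request flags before dereferencing). The
 struct layouts and the names REQ_DMA, DMA_FLAG and dma_flag_set are
 hypothetical and are not taken from ata-lowlevel.c.

```c
#include <assert.h>
#include <stddef.h>

#define REQ_DMA   0x01	/* hypothetical: request uses DMA */
#define DMA_FLAG  0x02	/* hypothetical: the 0x2 bit tested in the trap */

struct dma_state {
	char          pad[0x3c];	/* places flags at offset 0x3c */
	unsigned char flags;
};

struct channel {
	struct dma_state *dma;	/* NULL when the channel only does PIO */
};

struct request {
	int flags;
};

/*
 * Return 1 if the request is a DMA request and the DMA flag is set.
 * For PIO requests ch->dma is never dereferenced.
 */
static int
dma_flag_set(const struct channel *ch, const struct request *req)
{
	if ((req->flags & REQ_DMA) == 0)
		return (0);
	assert(ch->dma != NULL);
	return ((ch->dma->flags & DMA_FLAG) != 0);
}
```

 With the request-flag test first, a PIO request on a channel whose
 dma pointer is NULL returns early instead of faulting at offset 0x3c.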
State-Changed-From-To: open->closed 
State-Changed-By: sos 
State-Changed-When: Mon Oct 4 11:25:09 GMT 2004 
State-Changed-Why:  
Problem fixed. 

http://www.freebsd.org/cgi/query-pr.cgi?pr=68066 
>Unformatted:
