From nobody@FreeBSD.org  Wed Sep  1 13:07:05 2010
Return-Path: <nobody@FreeBSD.org>
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id AB8E510656BC
	for <freebsd-gnats-submit@FreeBSD.org>; Wed,  1 Sep 2010 13:07:05 +0000 (UTC)
	(envelope-from nobody@FreeBSD.org)
Received: from www.freebsd.org (www.freebsd.org [IPv6:2001:4f8:fff6::21])
	by mx1.freebsd.org (Postfix) with ESMTP id 99A6C8FC08
	for <freebsd-gnats-submit@FreeBSD.org>; Wed,  1 Sep 2010 13:07:05 +0000 (UTC)
Received: from www.freebsd.org (localhost [127.0.0.1])
	by www.freebsd.org (8.14.3/8.14.3) with ESMTP id o81D75wS088530
	for <freebsd-gnats-submit@FreeBSD.org>; Wed, 1 Sep 2010 13:07:05 GMT
	(envelope-from nobody@www.freebsd.org)
Received: (from nobody@localhost)
	by www.freebsd.org (8.14.3/8.14.3/Submit) id o81D75bE088529;
	Wed, 1 Sep 2010 13:07:05 GMT
	(envelope-from nobody)
Message-Id: <201009011307.o81D75bE088529@www.freebsd.org>
Date: Wed, 1 Sep 2010 13:07:05 GMT
From: David Evans <dave.evans55@googlemail.com>
To: freebsd-gnats-submit@FreeBSD.org
Subject: [Parallels Desktop] CDROM disconnected leads to panic, eventually
X-Send-Pr-Version: www-3.1
X-GNATS-Notify:

>Number:         150186
>Category:       kern
>Synopsis:       [parallels] [panic] Parallels Desktop: CDROM disconnected leads to panic, eventually
>Confidential:   no
>Severity:       non-critical
>Priority:       low
>Responsible:    freebsd-emulation
>State:          analyzed
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Wed Sep 01 13:10:03 UTC 2010
>Closed-Date:    
>Last-Modified:  Thu Dec 16 19:20:09 UTC 2010
>Originator:     David Evans
>Release:        8.1 STABLE
>Organization:
>Environment:
FreeBSD eight.pearl 8.1-STABLE FreeBSD 8.1-STABLE #1: Mon Aug 30 03:24:09 BST 2010     root@:/usr/obj/usr/src/sys/GENERIC  i386

>Description:
I've put Parallels in the subject line so that anyone
interested in Parallels Desktop will find this report
in a single-line search.  At the moment I do not think this
is a Desktop bug.

For the last few months I've had the occasional panic with
my FreeBSD 8 installation running in a VM under Parallels
Desktop for Mac versions 4.0 and 5.0. Most of the time
the panic message disappeared off the screen before I could
make a note of it. In the last few days I have finally managed
to capture the panic report.

In an effort to track down the bug I have tried various things
such as increasing the memory or setting the number of CPUs from
2 to 1. Nothing worked. I was lucky to get an uptime greater than
60 minutes. Finally, I removed the CDROM device from the Desktop's
list of virtual hardware. This seems to have fixed the problem.

Here is an annotated log of the panics.

--------
Machine: eight.pearl
Desktop-Name: FBSD-8-new-precious (eight)
Parallels-Version: 4.0
FreeBSD version: 8.0, 2010-05-28
--------
CPU: 2
Type: i386
Date: 2010-05-24 16:00
Crash:

Page not found
----------
Date: 2010-08-28
Information: I installed Parallels 5.0
----------
Date: 2010-08-29 23:00
CPU: 1
Crash:
Page not found after finishing cvsup ports
---------
Date: 2010-08-30
CPU: 1
Information:

Single User
Now rebuilding kernel and world with sources from 2010-08-29 cvsup
Single CPU,  Single USER.  No crashes even after 3 hours.

After installing the kernel took snapshot-1
Kernel is now dated Mon Aug 30 03:24:09 BST 2010

Installed world, then took snapshot-2

Reboot CPU:2, multiuser and test by repeated buildworlds

---------
Date: 2010-08-30
Information:
FreeBSD-version: 8.1-STABLE Aug 30 03:24:09 2010

---------
Date: 2010-08-30 12:32
Crash:
CPU: 2

Spontaneous reboot during make buildworld after about 30 minutes;
not sure whether the page not found message appeared.

Could this be related to DHCP? The lease time is around 30 minutes.
No, I don't think so.

Action: set CPU to 1 and revert to snapshot-2, test with buildworld again.

Oops! forgot to set CPU to one, so:

--------
Date: 2010-08-30 13:13
Crash:
CPU: 2

while buildworld in multi user mode
got:

acd0: warning - PREVENT_ALLOW taskqueue timeout - completing request directly

Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address = 0x1a4
fault code = supervisor read, page not present
current process = 12 (swi6: task queue)
trap number = 12
panic: page fault
uptime: 33m0s


Action: set CPU:1 revert to snapshot-2, try buildworld again
Note: when reverting to a snapshot, you get the CPUs active when that
      snapshot was taken. Need to configure CPU:1 again.
---------
Date: 2010-08-30 15:00
Crash:
CPU: 1

Still crashes during buildworld in multiuser mode with CPU=1; it just takes longer (about an hour)
---------
Date: 2010-08-30  18:48
Crash:
CPU: 2

During buildworld:
panic message

acd0: WARNING - unknown CMD (0x4a) taskqueue timeout - completing request directly
acd0: WARNING - unknown CMD (0x4a) freeing taskqueue zombie request

Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address = 0x1a4
fault code = supervisor read, page not present
instruction pointer   = 0x20:0xc0894f5f
stack pointer         = 0x28:0xe683d960
frame pointer         = 0x28:0xe683d978
code segment          = base 0x0, limit 0xdffff, type =0x1b
                      = DPL 0, pres 1, def32 1, gran 1
processor eflags      = interrupt enabled, resume, IOPL=0
current process       = 1184 (initial thread)
trap number = 12
panic: page fault
cpuid = 0
Uptime: 57m10s
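
An instruction pointer such as 0xc0894f5f can be mapped to a function name
by looking it up in the kernel symbol table. The sketch below assumes the
symbols were saved with "nm -n /boot/kernel/kernel", which lists them in
ascending address order. nearest_symbol is a hypothetical helper name, not
a system tool, and the result is only meaningful for the exact kernel that
panicked.

```shell
# nearest_symbol ADDR FILE
# FILE holds `nm -n` output: "address type name", sorted by address.
# Prints the last symbol whose address is <= ADDR (nothing if none).
nearest_symbol() {
    # Normalise ADDR: drop a leading 0x and lower-case the hex digits.
    addr=$(printf '%s' "$1" | sed 's/^0[xX]//' | tr 'A-F' 'a-f')
    # nm pads addresses to a fixed width, so plain string comparison
    # orders them correctly; lines without an address (undefined
    # symbols) fail the width check and are skipped.
    LC_ALL=C awk -v a="$addr" '
        length($1) == length(a) && ($1 "") <= (a "") { best = $3 }
        END { if (best != "") print best }
    ' "$2"
}

# Example use (the file name is an assumption):
#   nm -n /boot/kernel/kernel > /tmp/kernel.syms
#   nearest_symbol 0xc0894f5f /tmp/kernel.syms
```

With a saved crash dump, kgdb(1) run against the same kernel and vmcore
gives the same answer along with a full backtrace.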

Action: try removing the CDROM device from the Config screen hardware list.
Previously I had it installed but not connected.

Save the configuration as snapshot-3
Seems to be working, one buildworld and cvs co ports succeeded.

-----------
Date: 2010-09-01 12:00
CPU: 2
Information:

Since removing the CDROM device I have experienced no panics.
Current uptime is 1d1h26m.

I have successfully done several buildworlds, a couple of cvsups
for ports and src, and various other tasks.

======================================

Conclusion:

Whenever I have captured a panic it is always preceded by a message
from the acd0 device.  I had the CDROM installed but not connected to
a real CD or disk image. Uninstalling the CDROM makes the panics go
away. Having the CDROM connected to a real CD or disk image also works.
Fortuitously, I had most of my other VMs connected to a disk image.


>How-To-Repeat:
Install FreeBSD 8-STABLE on a Parallels Desktop version 4 or 5 VM.

Add a CDROM to the Desktop list of installed hardware.  Disconnect the CDROM.

Boot FreeBSD and do some processor and disk intensive task, such as make buildworld.
Within a couple of hours you will get a panic.
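
The load can be kept going unattended with a small loop. This is a minimal
sketch, not the exact commands used for this report: repeat_build is a
hypothetical helper name, and the example assumes the sources live in
/usr/src.

```shell
# repeat_build COUNT COMMAND [ARGS...]
# Run COMMAND up to COUNT times, stopping at the first failure.
repeat_build() {
    count=$1; shift
    i=1
    while [ "$i" -le "$count" ]; do
        echo "=== pass $i of $count: $(date) ==="
        "$@" || { echo "pass $i failed" >&2; return 1; }
        i=$((i + 1))
    done
}

# Example use:
#   cd /usr/src && repeat_build 7 make buildworld
```

If the VM panics, the last "=== pass" line on the console shows roughly how
far the run got.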
>Fix:

Workaround.
Reconnect or uninstall the CDROM.

I am currently building world on another VM with ZFS and a connected
CDROM and it seems to be ok so far.


>Release-Note:
>Audit-Trail:
Responsible-Changed-From-To: freebsd-bugs->freebsd-emulation 
Responsible-Changed-By: linimon 
Responsible-Changed-When: Mon Sep 6 07:35:57 UTC 2010 
Responsible-Changed-Why:  
Over to maintainer(s). 

http://www.freebsd.org/cgi/query-pr.cgi?pr=150186 
State-Changed-From-To: open->feedback 
State-Changed-By: jh 
State-Changed-When: Mon Sep 6 14:53:47 UTC 2010 
State-Changed-Why:  
Could you try to reproduce this with CAM(4) subsystem (ATA_CAM kernel 
option)? 

http://www.freebsd.org/cgi/query-pr.cgi?pr=150186 
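
A kernel with ATA_CAM could be set up roughly as follows. This is a sketch
under assumptions, not a tested configuration: the config name ATACAM and
the helper write_atacam_conf are made up for illustration, and the list of
legacy drivers to disable should be checked against the ATA_CAM entry in
sys/conf/NOTES for the source tree in use. If GENERIC already contains
"device ada", that line can be dropped.

```shell
# write_atacam_conf DIR
# Create a hypothetical kernel config file named ATACAM in DIR.
write_atacam_conf() {
    cat > "$1/ATACAM" <<'EOF'
include   GENERIC
ident     ATACAM
options   ATA_CAM
# Legacy ata(4) peripheral drivers are not used together with ATA_CAM.
nodevice  atadisk
nodevice  ataraid
nodevice  atapicd
nodevice  atapifd
nodevice  atapist
# CAM driver for ATA disks.
device    ada
EOF
}

# Example use on i386:
#   write_atacam_conf /usr/src/sys/i386/conf
#   cd /usr/src && make buildkernel KERNCONF=ATACAM
#   make installkernel KERNCONF=ATACAM
```

With this option the disks and the CD-ROM attach through CAM, so device
names change (adaN and cdN instead of adN and acdN) and /etc/fstab may
need adjusting before rebooting.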

From: David Evans <dave.evans55@googlemail.com>
To: bug-followup@FreeBSD.org, dave.evans55@googlemail.com
Cc:  
Subject: Re: kern/150186: [parallels] [panic] Parallels Desktop: CDROM disconnected
 leads to panic, eventually
Date: Fri, 10 Sep 2010 11:49:17 +0100

  Summary:
 --------
 In a Desktop VM with the CDrom installed, but not connected, and with
 the hald and dbus daemons running, and running buildworld or background
 fsck or both, there is a high probability of a panic within a few minutes.
 After disabling  hald and dbus in /etc/rc.conf, I successfully ran
 make buildworld in a loop 7 times without any problems. This amounts to
 about 11 hours of runtime.
 
 Environment:
 ------------
 
 Parallels Desktop 5 for Mac build 5.0.9376. A slightly older version was
        also used.
 
 Mac OS X Snow Leopard 10.6.4, 4G of ram, 1G allocated to VM
 
 CVS tag RELENG_8 src cvsup'ed at 2010-09-02 22:27 UTC
 
 Ports cvsup'ed at 2010-08-31 15:05 UTC
 
 Events so far:
 --------------
 
 My main development VM, known as eight.pearl, has been running for
 the last five days without the CDROM installed. It has successfully
 built world, run a major portupgrade and done a few dump(8)s without any
 panics. It is far too precious to risk any data corruption, so I made a clone.
 
 The cloned VM is known imaginatively as clone8.pearl
 
 In clone8.pearl, I installed the CDROM and disconnected it. I then started
 make buildworld.  Within a few minutes there was a panic. I rebooted and
 tried another buildworld.  Again, there was soon another panic.  Each panic
 appeared to be preceded by a message from /dev/acd0.  Fortunately I had
 enabled dumps (see below).
 
 In clone8.pearl I then disabled hald and dbus in /etc/rc.conf.  I then
 ran make buildworld in a loop 7 times overnight. This morning I found
 the VM was still running.
 
 Additional VMs created
 ----------------------
 
 I created two more VMs: cdpanic.pearl and cam.pearl. Both were the
 minimum installation from the FreeBSD cdrom 1 of November 2009. I updated
 the world and kernel from my local sources. No ports were installed.
 cdpanic.pearl had a standard GENERIC kernel with DDB. cam.pearl also
 was a standard debugging kernel with option ATA_CAM, as suggested by jh
 earlier in this bug report.
 
 I installed and disconnected the CDrom on both VMs and started a buildworld.
 Both completed successfully with no panics.
 
 hald and dbus
 -------------
 These two ports run as daemons checking the status of devices.
 hald comes from sysutils/hal. dbus is from devel/dbus.  They
 are the only two daemons I can see that access the CDrom device.
 I am now convinced they are tickling a bug in the acd device
 which causes a panic.  
 
 To trigger the bug you need to run something disk-intensive.
 make buildworld is good. So is background fsck.
 
 The acd0 device needs to report NOT READY status when it is not connected.
 This is probably a Desktop problem.
 
 To Do
 -----
 I must create another clone of eight.pearl and install a CAM kernel
 on it.
 
 Dumps
 -----
 I managed to obtain two dumps. Here is the output of dmesg.  I realise
 they are not much use, but I need to hone my kernel debugging skills
 to get more useful information. Both stopped at the same instruction pointer.
 
 -------
 ata1: WARNING - READ_TOC read data overrun 18>12
 acd0: WARNING - READ_TOC taskqueue timeout - completing request directly
 
 
 Fatal trap 12: page fault while in kernel mode
 cpuid = 0; apic id = 00
 fault virtual address    = 0x1a4
 fault code        = supervisor read, page not present
 instruction pointer    = 0x20:0xc08a119f
 stack pointer            = 0x28:0xe4521b44
 frame pointer            = 0x28:0xe4521b5c
 code segment        = base 0x0, limit 0xfffff, type 0x1b
             = DPL 0, pres 1, def32 1, gran 1
 processor eflags    = interrupt enabled, resume, IOPL = 0
 current process        = 12 (swi6: task queue)
 panic: from debugger
 cpuid = 0
 Uptime: 6m0s
 Physical memory: 1011 MB
 Dumping 148 MB: 133 117 101 85 69 53 37 21 5
 -------------------------
 acd0: WARNING - READ_TOC taskqueue timeout - completing request directly
 
 
 Fatal trap 12: page fault while in kernel mode
 cpuid = 0; apic id = 00
 fault virtual address    = 0x1a4
 fault code        = supervisor read, page not present
 instruction pointer    = 0x20:0xc08a119f
 stack pointer            = 0x28:0xe4521b44
 frame pointer            = 0x28:0xe4521b5c
 code segment        = base 0x0, limit 0xfffff, type 0x1b
             = DPL 0, pres 1, def32 1, gran 1
 processor eflags    = interrupt enabled, resume, IOPL = 0
 current process        = 12 (swi6: task queue)
 panic: from debugger
 cpuid = 0
 Uptime: 21m2s
 Physical memory: 1011 MB
 Dumping 138 MB: 123 107 91 75 59 43 27 11
 

From: David Evans <dave.evans55@googlemail.com>
To: bug-followup@FreeBSD.org
Cc:  
Subject: Re: kern/150186: [parallels] [panic] Parallels Desktop: CDROM disconnected
 leads to panic, eventually
Date: Mon, 13 Sep 2010 12:17:28 +0100

  I have now tried a kernel built with the ATA_CAM option.
 
 I then cloned another VM from my eight.pearl development system.
 I set the CDrom to installed and disconnected, and also made sure
 that dbus and hald were running with the new kernel.
 
 This VM now passes the "build the world seven times" test with no
 panics. This means 14 hours of continuous running, which is far longer
 than the 20 minutes it managed before.
 
 This is good news.
 

From: David Evans <dave.evans55@googlemail.com>
To: bug-followup@FreeBSD.org, dave.evans55@googlemail.com
Cc:  
Subject: Re: kern/150186: [parallels] [panic] Parallels Desktop: CDROM disconnected
 leads to panic, eventually
Date: Mon, 13 Sep 2010 12:32:11 +0100

  I forgot to mention in that last posting that I had accidentally omitted
 "device ada" from my kernel configuration.  It did not seem to affect
 anything, as ada0 etc. still appeared in my /dev/ directory.
 
State-Changed-From-To: feedback->analyzed 
State-Changed-By: jh 
State-Changed-When: Sat Sep 25 17:26:05 UTC 2010 
State-Changed-Why:  
Feedback received. Seems to be an issue with acd(4) or ata(4). 

http://www.freebsd.org/cgi/query-pr.cgi?pr=150186 
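
For anyone who cannot switch to ATA_CAM, the follow-up of 10 Sep 2010
reports that disabling hald and dbus also avoided the panic. A minimal
sketch of that change, assuming the stock rc.d scripts installed by
sysutils/hal and devel/dbus, which read these two variables:

```shell
# /etc/rc.conf fragment: keep the HAL and D-Bus daemons from starting at boot.
hald_enable="NO"
dbus_enable="NO"
```

The setting takes effect at the next boot. Desktop software that relies on
hald for device notification may lose that functionality while it is
disabled.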

From: "Li-Lun Wang (Leland Wang)" <llwang@infor.org>
To: bug-followup@FreeBSD.org, dave.evans55@googlemail.com
Cc:  
Subject: Re: kern/150186: [parallels] [panic] Parallels Desktop: CDROM
 disconnected leads to panic, eventually
Date: Fri, 17 Dec 2010 03:13:51 +0800

 -----BEGIN PGP SIGNED MESSAGE-----
 Hash: SHA1
 
 Hi,
 
 I think I may have stumbled upon the same issue after I updated my
 installed ports, including hald.  I run a FreeBSD 8.1-stable amd64 (not
 the latest but a few months old) in virtual box on a windows 7 x64
 host.  If I disable hald or the cdrom device in virtual box, or run the
 same FreeBSD installation natively, the problem doesn't seem to occur.
 When the problem does occur, I get the following messages (not
 necessarily in any particular order):
 
 ata0: WARNING - unknown CMD (0x4a) read data overrun 18>8
 ata0: WARNING - READ_TOC read data overrun 18>12
 ata0: WARNING - PREVENT_ALLOW read data overrun 18>0
 ata0: WARNING - TEST_UNIT_READY read data overrun 18>0
 
 These messages repeat seemingly at random for a few times.  Eventually
 the box might panic.  Here is a backtrace:
 
 Fatal trap 12: page fault while in kernel mode
 cpuid = 0; apic id = 00
 fault virtual address	= 0x290
 fault code		= supervisor read data, page not present
 instruction pointer	= 0x20:0xffffffff8025aec5
 stack pointer		= 0x28:0xffffff800007bae0
 frame pointer		= 0x28:0xffffff800007bb00
 code segment		= base 0x0, limit 0xfffff, type 0x1b
 			= DPL 0, pres 1, long 1, def32 0, gran 1
 processor eflags	= interrupt enabled, resume, IOPL=0
 current process		= 11 (swi6: task queue)
 [thread pid 11 tid 100018 ]
 Stopped at	_mtx_lock_sleep+0x4e:	movl	0x290(%rcx),%esi
 db> bt
 Tracing pid 11 tid 100018 td 0xffffff00024bcba0
 _mtx_lock_sleep() at _mtx_lock_sleep+0x4e
 _sema_post() at _sema_post+0x89
 ata_completed() at ata_completed+0x46e
 taskqueue_run() at taskqueue_run+0x94
 intr_event_execute_handlers() at intr_event_execute_handlers+0xf9
 ithread_loop() at ithread_loop+0x8e
 fork_exit() at fork_exit+0x118
 fork_trampoline() at fork_trampoline+0xe
 - --- trap 0, rip = 0, rsp = 0xffffff800007bd30, rdp = 0 ---
 db>
 
 - -- llwang
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.12 (FreeBSD)
 
 iD8DBQFNCmTrCQM7t5B2mhARAoHkAJ9zNOQ8QApAP5gDgmSgUABt39es8wCghPyr
 g82JpbmVYsjLnyGkU+/JQ3k=
 =VHTI
 -----END PGP SIGNATURE-----
>Unformatted:
