From nobody@FreeBSD.org  Mon May 28 18:31:13 2012
Return-Path: <nobody@FreeBSD.org>
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 7A16D1065673
	for <freebsd-gnats-submit@FreeBSD.org>; Mon, 28 May 2012 18:31:13 +0000 (UTC)
	(envelope-from nobody@FreeBSD.org)
Received: from red.freebsd.org (red.freebsd.org [IPv6:2001:4f8:fff6::22])
	by mx1.freebsd.org (Postfix) with ESMTP id 5AA108FC0C
	for <freebsd-gnats-submit@FreeBSD.org>; Mon, 28 May 2012 18:31:13 +0000 (UTC)
Received: from red.freebsd.org (localhost [127.0.0.1])
	by red.freebsd.org (8.14.4/8.14.4) with ESMTP id q4SIVD00001369
	for <freebsd-gnats-submit@FreeBSD.org>; Mon, 28 May 2012 18:31:13 GMT
	(envelope-from nobody@red.freebsd.org)
Received: (from nobody@localhost)
	by red.freebsd.org (8.14.4/8.14.4/Submit) id q4SIVDMt001368;
	Mon, 28 May 2012 18:31:13 GMT
	(envelope-from nobody)
Message-Id: <201205281831.q4SIVDMt001368@red.freebsd.org>
Date: Mon, 28 May 2012 18:31:13 GMT
From: Mark Felder <feld@feld.me>
To: freebsd-gnats-submit@FreeBSD.org
Subject: OS hangs when guest on VMWare ESX
X-Send-Pr-Version: www-3.1
X-GNATS-Notify:

>Number:         168416
>Category:       kern
>Synopsis:       [hang] OS hangs when guest on VMWare ESX
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    freebsd-bugs
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Mon May 28 18:40:01 UTC 2012
>Closed-Date:    
>Last-Modified:  Tue Jul 24 17:00:24 UTC 2012
>Originator:     Mark Felder
>Release:        8.x, 9.x
>Organization:
>Environment:
Known to happen on any 8.x or 9.x installation on ESX/ESXi 4.0, 4.1, and 5.0 at any patch level.
>Description:
This problem has been discussed in depth on the freebsd-hackers@ and freebsd-questions@ mailing lists under the subject "Please help me diagnose this crazy VMWare/FreeBSD 8.x crash". Q3 2012 will mark 2 years since we started trying to track down this bug.

Symptoms:
FreeBSD hangs with what appears to be I/O starvation. Anything in memory functions, but any processes that need disk access hang in a blocked state. VMWare's performance info shows that at the time of the issue CPU usage of the VM spikes to 100%. There is no panic even if you leave the VM running for long periods of time. Pausing the VM does not fix it, and trying to migrate it to another host hoping that will kick I/O back to life does not work either. The only recourse is a hard reboot.

Things We Have Done (everything imaginable, really):
- Rebuilt crashing VMs and VM templates from scratch with verified media multiple times, even changing OS versions -- 8.0 through 8.3, i386 and amd64. We have very few 9.0 servers and haven't seen a crash there yet, but Dane Foster has confirmed the crash on 9.0.
- Changed every VM setting imaginable -- including undocumented things suggested by VMWare Support
- Ruled out specific software. There is no common denominator so far; just FreeBSD
- Replaced ESX hardware
- Replaced iSCSI switches
- Replaced SANs
- Verified that it crashes on Local Disk -- a SAN is not required for this crash to happen
- Changed ESX versions (can replicate this on ESXi 4.0 - 5.0)


Possibly Valuable Info:
On one machine that started to crash regularly I built a full debugging kernel and managed to drop it into DDB when it finally crashed. The results are here, though I am not sure of their value: http://feld.me/pub/freebsd/esx_crash/

After further discussion on the lists it was suggested that the problem might be interrupt-related. This helped us narrow down the issue, and we think we are on the right track.

A trend I immediately noticed:

VMs that are known to crash share an IRQ with em0 and mpt0:

$ vmstat -i
interrupt                          total       rate
irq1: atkbd0                         378          0
irq6: fdc0                             9          0
irq15: ata1                           34          0
irq16: em1                        687237          1
irq18: em0 mpt0                319094024        539
cpu0: timer                    236770821        400
Total                          556552503        940

VMs that we have never seen crash before are not sharing an IRQ:

$ vmstat -i
interrupt                          total       rate
irq1: atkbd0                          38          0
irq6: fdc0                             9          0
irq15: ata1                           34          0
irq16: em1                          2811         15
irq17: em2                             5          0
cpu0: timer                        71013        398
irq256: mpt0                       12163         68
Total                              86073        483
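
The comparison between the two listings above can be checked mechanically. The following is a minimal sketch, not part of the original investigation: it parses `vmstat -i` output and reports every IRQ line that lists more than one device. The function name `shared_irqs` and the regular expression are illustrative assumptions, and the sample data is copied from the two listings above.

```python
import re

# Matches rows such as "irq18: em0 mpt0   319094024   539".
# Rows that do not start with "irqN:" (header, cpu timers, Total) are ignored.
ROW = re.compile(r"^(irq\d+):\s+(.*?)\s+(\d+)\s+(\d+)\s*$")


def shared_irqs(vmstat_output):
    """Return {irq: [devices]} for every IRQ line listing more than one device."""
    shared = {}
    for line in vmstat_output.splitlines():
        match = ROW.match(line.strip())
        if not match:
            continue
        irq, devices = match.group(1), match.group(2).split()
        if len(devices) > 1:
            shared[irq] = devices
    return shared


# Output from a VM that is known to crash.
CRASHING = """\
interrupt                          total       rate
irq1: atkbd0                         378          0
irq6: fdc0                             9          0
irq15: ata1                           34          0
irq16: em1                        687237          1
irq18: em0 mpt0                319094024        539
cpu0: timer                    236770821        400
Total                          556552503        940
"""

# Output from a VM that has never been seen to crash.
STABLE = """\
interrupt                          total       rate
irq1: atkbd0                          38          0
irq6: fdc0                             9          0
irq15: ata1                           34          0
irq16: em1                          2811         15
irq17: em2                             5          0
cpu0: timer                        71013        398
irq256: mpt0                       12163         68
Total                              86073        483
"""

print(shared_irqs(CRASHING))  # {'irq18': ['em0', 'mpt0']}
print(shared_irqs(STABLE))    # {}
```

Running this against a fleet of guests would only show which ones have the em0/mpt0 sharing pattern; it says nothing about whether the sharing is the cause of the hang.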

It was suggested that we could use hint.mpt.0.msi_enable="1" to prevent that behavior and possibly prevent the crash. So far the effectiveness of this is unconfirmed.

Dane Foster can no longer reproduce the crash on demand when he applies the following settings to FreeBSD 8.x (the workaround does not help on 9.x) and leaves the em0 NIC unused (disconnected in VMWare, so there is no link and the guest sees it as unplugged):

hw.pci.enable_msi="0"
hw.pci.enable_msix="0"

His results are as follows:

samael:~:% vmstat -i
interrupt                          total       rate
irq1: atkbd0                           6          0
irq18: em0 mpt0                  3061100         15
irq19: em1                       6891706         35
cpu0: timer                    166383735        868
cpu1: timer                    166382123        868
cpu3: timer                    166382123        868
cpu2: timer                    166382121        868
Total                          675482914       3525



I hope this is enough information. If any other details are required please let me know. I believe both Dane and I are ready and willing to test any suggested workarounds or patches that are made available.
>How-To-Repeat:
At this moment I do not have a way to reliably reproduce it with our workload. Some of our VMs can go nearly 90 days without a crash. Some will begin to crash multiple times a week, and then mysteriously stop. It is very unpredictable for us. 

Dane Foster can reproduce it at will with his workload (handbrake video encoding) and we will make VM images available and provide detailed instructions on how to reproduce it.
>Fix:
FreeBSD 7.x is unaffected, so our fix on machines we have declared too valuable to have crash is reverting to 7.4 or cloning the server off to its own hardware. Unfortunately 7.4 is getting quite old and its support ends early next year, so a solution is desperately needed.

>Release-Note:
>Audit-Trail:

From: Mark Felder <feld@feld.me>
To: bug-followup@freebsd.org
Cc:  
Subject: Re: kern/168416: [hang] OS hangs when guest on VMWare ESX
Date: Thu, 7 Jun 2012 13:42:33 -0500

 I have wonderful news: we can now reproduce the crash on demand. We  
 discovered that if we stress em and mpt at the same time by doing I/O on a  
 HAST device, we can easily reproduce this issue.
 
 I also have a coredump I took from breaking into DDB and running "dump"  
 and also a picture of the backtrace:
 
 http://feld.me/pub/freebsd/esx_crash/bt.png
 http://feld.me/pub/freebsd/esx_crash/vmcore.0.gz
 
 
 Requirements:
 
 VMWare ESXi 5: RAM 1GB CPUs 1
 FreeBSD 9 (-RELEASE, or STABLE... produced this coredump on -STABLE from  
 Jun 3rd)
 HAST
 iozone or bonnie++ (seems that iozone crashes it faster and more  
 consistently)
 
 / 40GB  UFS+SUJ
 /dev/hast/hast0 (mounted on /mnt) 8GB   UFS+SUJ
 SWAP 2GB (I put my swap on a separate disk as well to help get a
 successful dump.)
 
 So in this environment I have 2 servers (node1 and node2) for proper HAST,  
 so it actually does try to transfer changes to the secondary. It's merely  
 there to receive the data; it's not otherwise involved in this test.
 
 hast.conf:
 # global section
 timeout 5
 compression hole
 
 resource hast0 {
 	on node1 {
 		local /dev/da1
 		remote 192.168.44.2
 	}
 	on node2 {
 		local /dev/da1
 		remote 192.168.44.1
 	}
 }
 	
 
 Kernel config "DEBUG" I used for getting this coredump:
 
 include GENERIC
 makeoptions	DEBUG=-g
 options	INVARIANTS
 options	INVARIANT_SUPPORT
 options	WITNESS
 options	DEBUG_LOCKS
 options	DEBUG_VFS_LOCKS
 options	DIAGNOSTIC
 options KDB
 options DDB
 options BREAK_TO_DEBUGGER
 options ALT_BREAK_TO_DEBUGGER
 options KTR
 options KTR_ENTRIES=1024
 options KTR_COMPILE=(KTR|KTR_PROC)
 options KTR_MASK=KTR_SCHED
 options KTR_CPUMASK=("0x3")
 options KTR_VERBOSE
 
 
 And the iozone command that works quite consistently (I ran it in a loop  
 just in case it wouldn't crash the first time):
 iozone -M -e -+u -T -t 8 -r 128k -s 40960 -i 0 -i 1 -i 2 -i 8 -+p 70 -C -F  
 /mnt/io.1 /mnt/io.2 /mnt/io.3 /mnt/io.4 /mnt/io.5 /mnt/io.6 /mnt/io.7  
 /mnt/io.8
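
The loop mentioned above can be sketched as follows. This is a minimal illustration, assuming iozone is installed and the HAST device is mounted on /mnt; the helper names `iozone_argv` and `run_until_hang` and the `max_runs` limit are hypothetical and were not part of the original test. The flags are the ones from the command above.

```python
import subprocess


def iozone_argv(mount="/mnt", threads=8):
    """Build the iozone command line shown above, one scratch file per thread."""
    files = [f"{mount}/io.{n}" for n in range(1, threads + 1)]
    return ["iozone", "-M", "-e", "-+u", "-T", "-t", str(threads),
            "-r", "128k", "-s", "40960",
            "-i", "0", "-i", "1", "-i", "2", "-i", "8",
            "-+p", "70", "-C", "-F", *files]


def run_until_hang(max_runs=100, run=subprocess.call):
    """Run iozone repeatedly; returns the number of passes that completed.

    If the guest hangs, the current pass never returns, so the last
    "iozone pass N" line printed identifies the pass that triggered it.
    max_runs is a hypothetical safety limit.
    """
    for attempt in range(1, max_runs + 1):
        print(f"iozone pass {attempt}", flush=True)
        run(iozone_argv())
    return max_runs
```

Calling `run_until_hang()` on the primary node starts the loop. The `run` parameter exists only so the loop can be exercised without iozone installed.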
 
 Bonnie++ command we can get to cause the crash sometimes:
 bonnie++ -u root -d /mnt/ -s 3552M -n 10:102400:1024:1024
 
 The only other tip I have for you if you want to rebuild this entire  
 environment is to change your keybind. You can't break in to the debugger  
 on VMWare with CTRL+ALT+ESC because CTRL+ALT drops your focus on the VM.  
 You have to override this. I tend to use CTRL+ALT+SHIFT. The following is  
 how you change that:
 
 XP: C:\Documents and Settings\USERNAME\Application  
 Data\VMware\preferences.ini
 Vista/7: C:\users\<username>\appdata\roaming\vmware\preferences.ini
 
 pref.hotkey.shift = "true"
 pref.hotkey.control = "true"
 pref.hotkey.alt = "true"
 
 
 Please let me know if there is anything I can do to help resolve
 this issue.

From: Mark Felder <feld@feld.me>
To: bug-followup@freebsd.org
Cc:  
Subject: Re: kern/168416: [hang] OS hangs when guest on VMWare ESX
Date: Thu, 7 Jun 2012 15:57:22 -0500

 I've been informed that without the exact kernel used, vmcore.0 is of
 no value. I've updated the vmcore.0.gz file (md5:
 e2d2fb3d3f4601d6a4055a939d547dbd) and have also uploaded the kernel,
 which was built with the same config as previously described but is
 based on RELENG_9_0.
 
 http://feld.me/pub/freebsd/esx_crash/vmcore.0.gz
 http://feld.me/pub/freebsd/esx_crash/kernel.tar.gz
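
Anyone fetching the updated vmcore can confirm they have the right file by comparing its MD5 against the checksum quoted above. A minimal sketch, assuming the file has been saved locally as vmcore.0.gz (the local path and the helper names `md5_of` and `matches_expected` are illustrative):

```python
import hashlib

# Checksum quoted above for the updated vmcore.0.gz.
EXPECTED_MD5 = "e2d2fb3d3f4601d6a4055a939d547dbd"


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_MD5):
    """True if the file at path has the expected MD5 checksum."""
    return md5_of(path) == expected.lower()
```

For example, `matches_expected("vmcore.0.gz")` returns True only for a file whose digest equals the value above. On FreeBSD itself, `md5 vmcore.0.gz` reports the same digest.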
 
 I guess the backtrace screenshot was useless because it only showed the
 entry into DDB. That makes sense because the hang is not a real
 crash/panic. I've provided a couple of others that might be useful, though:
 
 http://feld.me/pub/freebsd/esx_crash/chains.png
 http://feld.me/pub/freebsd/esx_crash/showthreads.png
 
 Hopefully this will be more useful.

From: Mark Felder <feld@feld.me>
To: bug-followup@freebsd.org
Cc:  
Subject: Re: kern/168416: [hang] OS hangs when guest on VMWare ESX
Date: Tue, 24 Jul 2012 11:56:11 -0500

 Just wanted to follow up with news that I have reproduced this crash on
 FreeBSD 10 / HEAD as of 2012-07-24 as well.
>Unformatted:
