From nobody@FreeBSD.org  Fri Mar 23 16:18:44 2012
Return-Path: <nobody@FreeBSD.org>
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
	by hub.freebsd.org (Postfix) with ESMTP id 0DAC61065676
	for <freebsd-gnats-submit@FreeBSD.org>; Fri, 23 Mar 2012 16:18:44 +0000 (UTC)
	(envelope-from nobody@FreeBSD.org)
Received: from red.freebsd.org (red.freebsd.org [IPv6:2001:4f8:fff6::22])
	by mx1.freebsd.org (Postfix) with ESMTP id D33BF8FC1F
	for <freebsd-gnats-submit@FreeBSD.org>; Fri, 23 Mar 2012 16:18:43 +0000 (UTC)
Received: from red.freebsd.org (localhost [127.0.0.1])
	by red.freebsd.org (8.14.4/8.14.4) with ESMTP id q2NGIhPk036717
	for <freebsd-gnats-submit@FreeBSD.org>; Fri, 23 Mar 2012 16:18:43 GMT
	(envelope-from nobody@red.freebsd.org)
Received: (from nobody@localhost)
	by red.freebsd.org (8.14.4/8.14.4/Submit) id q2NGIhp4036716;
	Fri, 23 Mar 2012 16:18:43 GMT
	(envelope-from nobody)
Message-Id: <201203231618.q2NGIhp4036716@red.freebsd.org>
Date: Fri, 23 Mar 2012 16:18:43 GMT
From: Christian Esken <christian.esken@trivago.com>
To: freebsd-gnats-submit@FreeBSD.org
Subject: Process under FreeBSD 9.0 hangs in uninterruptable sleep with apparently no syscall (empty wchan)
X-Send-Pr-Version: www-3.1
X-GNATS-Notify:

>Number:         166340
>Category:       kern
>Synopsis:       [kernel] Process under FreeBSD 9.0 hangs in uninterruptable sleep with apparently no syscall (empty wchan)
>Confidential:   no
>Severity:       non-critical
>Priority:       low
>Responsible:    freebsd-bugs
>State:          closed
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Fri Mar 23 16:20:11 UTC 2012
>Closed-Date:    Thu Feb 28 09:32:21 UTC 2013
>Last-Modified:  Thu Feb 28 09:32:21 UTC 2013
>Originator:     Christian Esken
>Release:        FreeBSD 9.0
>Organization:
>Environment:
FreeBSD dev 9.0-RELEASE FreeBSD 9.0-RELEASE #0: Thu Mar  1 17:17:19 CET 2012     root@dev:/usr/obj/usr/src/sys/GENERIC  amd64
>Description:
I have a process that sometimes goes into uninterruptible sleep under FreeBSD 9.0. "ps" shows the process in state "D" with wchan "-". The same process works reliably under various other operating systems, including several FreeBSD 8 versions.

I presume a process should never be in uninterruptible sleep without wchan showing what it is waiting on in the kernel. What could be the reason, and how can I investigate the situation further?

The process is a remote logger. It reads access logs from a pipe on STDIN (Apache httpd CustomLog) and sends them via TCP to a "scribed" logging server. More details can be found on Stack Overflow [1].

Below is some more information, such as "procstat -k" and uname output.

---------------

This is the result from "procstat -k" and "procstat -kk":

# procstat -k 38013
  PID    TID COMM             TDNAME           KSTACK                       
38013 100817 serelog          -                mi_switch sleepq_check_timeout sleepq_timedwait_sig _sleep soreceive_generic kern_recvit recvit sys_recvfrom amd64_syscall Xfast_syscall 

# procstat -kk 38013
  PID    TID COMM             TDNAME           KSTACK                       
38013 100817 serelog          -                mi_switch+0x174 sleepq_check_timeout+0x80 sleepq_timedwait_sig+0x20 _sleep+0x1b1 soreceive_generic+0xf95 kern_recvit+0x205 recvit+0x21 sys_recvfrom+0x82 amd64_syscall+0x450 Xfast_syscall+0xf7 

---------------

uname -a:
FreeBSD dev 9.0-RELEASE FreeBSD 9.0-RELEASE #0: Thu Mar  1 17:17:19 CET 2012     root@dev:/usr/obj/usr/src/sys/GENERIC  amd64

There is no NFS involved on the system.

Best regards,
     Christian

[1] http://stackoverflow.com/questions/9786501/process-under-freebsd-9-0-hangs-in-uninterruptable-sleep-with-apparently-no-sysc
>How-To-Repeat:
I can reproduce the issue quickly by sending a lot of HTTP requests. The problem does not occur while the process is being traced with truss.
If the description alone is not enough to solve this, I will isolate a small test case from the full source code.
>Fix:


>Release-Note:
>Audit-Trail:

From: Andriy Gapon <avg@FreeBSD.org>
To: bug-followup@FreeBSD.org, christian.esken@trivago.com
Cc:  
Subject: Re: misc/166340: Process under FreeBSD 9.0 hangs in uninterruptable
 sleep with apparently no syscall (empty wchan)
Date: Fri, 23 Mar 2012 23:17:12 +0200

 Hmm, sleepq_timedwait_sig() is supposed to react to signals.
 At least that's what its description says:
 
 /*
  * Block the current thread until it is awakened from its sleep queue,
  * it is interrupted by a signal, or it times out waiting to be awakened.
  */
 int
 sleepq_timedwait_sig(void *wchan, int pri)
 {
         int rcatch, rvalt, rvals;
 
         rcatch = sleepq_catch_signals(wchan, pri);
         rvalt = sleepq_check_timeout();
         rvals = sleepq_check_signals();
         thread_unlock(curthread);
         if (rcatch)
                 return (rcatch);
         if (rvals)
                 return (rvals);
         return (rvalt);
 }
 
 
 -- 
 Andriy Gapon

From: Kostik Belousov <kostikbel@gmail.com>
To: bug-followup@FreeBSD.org, christian.esken@trivago.com, avg@freebsd.org
Cc:  
Subject: Re: misc/166340: Process under FreeBSD 9.0 hangs in uninterruptable
 sleep with apparently no syscall (empty wchan)
Date: Sat, 24 Mar 2012 00:54:03 +0200

 Please attach kgdb to the running system while the process is hung
 with the '-' wchan.
 Use a command like "kgdb /boot/kernel/kernel.symbols /dev/mem".
 
 Then run the shell command ps -o pid,paddr | grep <pid>, where <pid> is
 the pid of the hung process. Take the printed address A and, from kgdb, do:
 p *(struct proc *)A
 p/x *(((struct proc *)A)->p_threads.tqh_first)
 and show us the output.

From: Christian Esken <Christian.Esken@trivago.com>
To: Kostik Belousov <kostikbel@gmail.com>
Cc: bug-followup@FreeBSD.org, avg@freebsd.org
Subject: Re: misc/166340: Process under FreeBSD 9.0 hangs in uninterruptable
 sleep with apparently no syscall (empty wchan)
Date: Mon, 26 Mar 2012 11:58:12 +0200

 On Saturday, 24.03.2012, at 00:54 +0200, Kostik Belousov wrote:
 > Please attach kgdb to the running system while the process is hung
 > with the '-' wchan.
 > Use a command like "kgdb /boot/kernel/kernel.symbols /dev/mem".
 > 
 > Then run the shell command ps -o pid,paddr | grep <pid>, where <pid> is
 > the pid of the hung process. Take the printed address A and, from kgdb, do:
 > p *(struct proc *)A
 > p/x *(((struct proc *)A)->p_threads.tqh_first)
 > and show us the output.
 
 Thanks for the quick response. Requested information can be found below.
 I am also showing signal status, as signals seem to be relevant in
 the kernel code section shown by Andriy.
 
 
  Christian
 
 # ps  -o user,pid,ppid,pending,caught,ignored,blocked,stat,wchan  1780
 USER     PID  PPID  PENDING   CAUGHT  IGNORED  BLOCKED STAT WCHAN
 nobody  1780  1776        0 80005006 79fa8011        0 D    -
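
The PENDING, CAUGHT, IGNORED and BLOCKED columns above are hexadecimal bit masks. The following is a minimal sketch for turning such a value into signal numbers. It assumes that ps prints the first 32-bit word of each signal set and that signal n is stored in bit n-1; decode_sigmask() and print_sigmask() are hypothetical helpers written for this example, not part of any FreeBSD tool.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Assumption: one bit per signal, with signal n stored in bit (n - 1)
 * of a 32-bit word.
 *
 * Writes the signal numbers set in `mask` into out[] (capacity 32) and
 * returns how many were found.
 */
int
decode_sigmask(uint32_t mask, int out[32])
{
        int count = 0;

        assert(out != NULL);
        for (int bit = 0; bit < 32; bit++) {
                if (mask & (UINT32_C(1) << bit))
                        out[count++] = bit + 1;   /* bit 0 is signal 1 */
        }
        return (count);
}

/* Print one ps column value as a list of signal numbers. */
void
print_sigmask(const char *label, uint32_t mask)
{
        int sigs[32];
        int n = decode_sigmask(mask, sigs);

        printf("%-8s 0x%08x:", label, (unsigned)mask);
        for (int i = 0; i < n; i++)
                printf(" %d", sigs[i]);
        printf("\n");
}
```

Under these assumptions, calling decode_sigmask(0x80005006, sigs) on the CAUGHT value fills sigs with 2, 3, 13, 15 and 32, and a mask of 0x4000 decodes to the single signal 15.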
 
 # ps wwwaux -o pid,paddr | grep 1780
 nobody   1780   0.0  0.0  55252   6040  ??  D    12:11PM   0:01.09 /usr/local/bin/s  1780 fffffe003eb71488
 
 
 (kgdb) p *(struct proc *)0xfffffe003eb71488
 $1 = {p_list = {le_next = 0xfffffe003eb71910, le_prev = 0xfffffe003e20f488}, p_threads = {
     tqh_first = 0xfffffe000bcf7460, tqh_last = 0xfffffe000bcf7470}, p_slock = {lock_object = {
       lo_name = 0xffffffff80ccb0fc "process slock", lo_flags = 720896, lo_data = 0, 
       lo_witness = 0xffffff8000689f80}, mtx_lock = 4}, p_ucred = 0xfffffe003e850200, 
   p_fd = 0xfffffe003ec9aa00, p_fdtol = 0x0, p_stats = 0xfffffe003eca3400, 
   p_limit = 0xfffffe0008372500, p_limco = {c_links = {sle = {sle_next = 0x0}, tqe = {
         tqe_next = 0x0, tqe_prev = 0x0}}, c_time = 0, c_arg = 0x0, c_func = 0, 
     c_lock = 0xfffffe003eb71580, c_flags = 0, c_cpu = 0}, p_sigacts = 0xfffffe003e9de000, 
   p_flag = 268452096, p_state = PRS_NORMAL, p_pid = 1780, p_hash = {le_next = 0x0, 
     le_prev = 0xfffffe03889e00b8}, p_pglist = {le_next = 0xfffffe003eb71910, 
     le_prev = 0xfffffe01936a39d8}, p_pptr = 0xfffffe003e99b488, p_sibling = {
     le_next = 0xfffffe003eb71910, le_prev = 0xfffffe01936a39f0}, p_children = {
     lh_first = 0x0}, p_mtx = {lock_object = {lo_name = 0xffffffff80ccb0ef "process lock", 
       lo_flags = 21168128, lo_data = 0, lo_witness = 0xffffff8000688400}, mtx_lock = 4}, 
   p_ksi = 0xfffffe000b0a6380, p_sigqueue = {sq_signals = {__bits = {0, 0, 0, 0}}, sq_kill = {
       __bits = {0, 0, 0, 0}}, sq_list = {tqh_first = 0x0, tqh_last = 0xfffffe003eb715c8}, 
     sq_proc = 0xfffffe003eb71488, sq_flags = 1}, p_oppid = 0, p_dbg_child = 0, 
   p_vmspace = 0xfffffe000b1e7620, p_swtick = 249340, p_realtimer = {it_interval = {
       tv_sec = 0, tv_usec = 0}, it_value = {tv_sec = 0, tv_usec = 0}}, p_ru = {ru_utime = {
       tv_sec = 0, tv_usec = 0}, ru_stime = {tv_sec = 0, tv_usec = 0}, ru_maxrss = 0, 
     ru_ixrss = 0, ru_idrss = 0, ru_isrss = 0, ru_minflt = 0, ru_majflt = 0, ru_nswap = 0, 
     ru_inblock = 0, ru_oublock = 0, ru_msgsnd = 0, ru_msgrcv = 0, ru_nsignals = 0, 
     ru_nvcsw = 0, ru_nivcsw = 0}, p_rux = {rux_runtime = 2178328093, rux_uticks = 77, 
     rux_sticks = 39, rux_iticks = 0, rux_uu = 722939, rux_su = 366163, rux_tu = 1089103}, 
   p_crux = {rux_runtime = 0, rux_uticks = 0, rux_sticks = 0, rux_iticks = 0, rux_uu = 0, 
     rux_su = 0, rux_tu = 0}, p_profthreads = 0, p_exitthreads = 0, p_traceflag = 0, 
   p_tracevp = 0x0, p_tracecred = 0x0, p_textvp = 0xfffffe003ebdfb40, p_lock = 0, 
   p_sigiolst = {slh_first = 0x0}, p_sigparent = 20, p_sig = 0, p_code = 0, p_stops = 0, 
   p_stype = 0, p_step = 0 '\0', p_pfsflags = 0 '\0', p_nlminfo = 0x0, p_aioinfo = 0x0, 
   p_singlethread = 0x0, p_suspcount = 0, p_xthread = 0x0, p_boundary_count = 0, 
   p_pendingcnt = 0, p_itimers = 0x0, p_procdesc = 0x0, p_magic = 3203398350, 
   p_osrel = 900044, p_comm = "serelog", '\0' <repeats 12 times>, p_pgrp = 0xfffffe003e76a200, 
   p_sysent = 0xffffffff81066440, p_args = 0xfffffe003e77f700, 
   p_cpulimit = 9223372036854775807, p_nice = 0 '\0', p_fibnum = 0, p_xstat = 0, p_klist = {
     kl_list = {slh_first = 0x0}, kl_lock = 0xffffffff807d4880 <knlist_mtx_lock>, 
     kl_unlock = 0xffffffff807d48a0 <knlist_mtx_unlock>, 
     kl_assert_locked = 0xffffffff807d4700 <knlist_mtx_assert_locked>, 
     kl_assert_unlocked = 0xffffffff807d4710 <knlist_mtx_assert_unlocked>, 
     kl_lockarg = 0xfffffe003eb71580}, p_numthreads = 1, p_md = {md_ldt = 0x0, md_ldt_sd = {
       sd_lolimit = 0, sd_lobase = 0, sd_type = 0, sd_dpl = 0, sd_p = 0, sd_hilimit = 0, 
       sd_xx0 = 0, sd_gran = 0, sd_hibase = 0, sd_xx1 = 0, sd_mbz = 0, sd_xx2 = 0}}, 
   p_itcallout = {c_links = {sle = {sle_next = 0x0}, tqe = {tqe_next = 0x0, tqe_prev = 0x0}}, 
     c_time = 0, c_arg = 0x0, c_func = 0, c_lock = 0x0, c_flags = 16, c_cpu = 0}, 
   p_acflag = 0, p_peers = 0x0, p_leader = 0xfffffe003eb71488, p_emuldata = 0x0, 
   p_label = 0x0, p_sched = 0xfffffe003eb71910, p_ktr = {stqh_first = 0x0, 
     stqh_last = 0xfffffe003eb718c0}, p_mqnotifier = {lh_first = 0x0}, p_dtrace = 0x0, 
   p_pwait = {cv_description = 0xffffffff80ccb64f "ppwait", cv_waiters = 0}, p_dbgwait = {
     cv_description = 0xffffffff80ccb656 "dbgwait", cv_waiters = 0}, p_prev_runtime = 0, 
   p_racct = 0x0}
 Current language:  auto; currently asm
 (kgdb) p/x *(((struct proc *)0xfffffe003eb71488)->p_threads.tqh_first)
 $2 = {td_lock = 0xffffffff810d9400, td_proc = 0xfffffe003eb71488, td_plist = {tqe_next = 0x0,      
     tqe_prev = 0xfffffe003eb71498}, td_runq = {tqe_next = 0x0,                                     
     tqe_prev = 0xffffffff810d9628}, td_slpq = {tqe_next = 0x0, 
     tqe_prev = 0xfffffe000b1f0b80}, td_lockq = {tqe_next = 0x0,                                       
     tqe_prev = 0xffffff8465523920}, td_hash = {le_next = 0x0, le_prev = 0xffffff80006b3c10}, 
   td_cpuset = 0xfffffe0008349dc8, td_sel = 0xfffffe003e77f900,                                                       
   td_sleepqueue = 0xfffffe000b1f0b80, td_turnstile = 0xfffffe000bc6ec00,                                             
   td_umtxq = 0xfffffe000bce0680, td_tid = 0x18b82, td_sigqueue = {sq_signals = {__bits = {                           
         0x0, 0x0, 0x0, 0x0}}, sq_kill = {__bits = {0x0, 0x0, 0x0, 0x0}}, sq_list = {                                 
       tqh_first = 0x0, tqh_last = 0xfffffe000bcf7510}, sq_proc = 0xfffffe003eb71488,                                 
     sq_flags = 0x1}, td_lend_user_pri = 0xff, td_flags = 0x14, td_inhibitors = 0x2,                                  
   td_pflags = 0x0, td_dupfd = 0x0, td_sqqueue = 0x0, td_wchan = 0x0, td_wmesg = 0x0,                                 
   td_lastcpu = 0x3, td_oncpu = 0xff, td_owepreempt = 0x0, td_tsqueue = 0x0, td_locks = 0x1,                          
   td_rw_rlocks = 0x0, td_lk_slocks = 0x0, td_blocked = 0x0, td_lockname = 0x0, 
   td_contested = {lh_first = 0x0}, td_sleeplocks = 0xffffffff81249590,                                                              
   td_intr_nesting_level = 0x0, td_pinned = 0x0, td_ucred = 0xfffffe003e850200,                                                      
   td_estcpu = 0x0, td_slptick = 0x0, td_blktick = 0x0, td_swvoltick = 0x83351, td_ru = {                                            
     ru_utime = {tv_sec = 0x0, tv_usec = 0x0}, ru_stime = {tv_sec = 0x0, tv_usec = 0x0},                                             
     ru_maxrss = 0x1788, ru_ixrss = 0xd5e0, ru_idrss = 0x2c5cc, ru_isrss = 0x3a00,                                                   
     ru_minflt = 0x1e4, ru_majflt = 0x0, ru_nswap = 0x0, ru_inblock = 0x0, ru_oublock = 0x0,                                         
     ru_msgsnd = 0x1e18, ru_msgrcv = 0x3c2e, ru_nsignals = 0x0, ru_nvcsw = 0x6eef,                                                   
     ru_nivcsw = 0xcd7}, td_rux = {rux_runtime = 0x81d6a61d, rux_uticks = 0x4d, 
     rux_sticks = 0x27, rux_iticks = 0x0, rux_uu = 0xb07fb, rux_su = 0x59653,                                                                   
     rux_tu = 0x109e4f}, td_incruntime = 0x0, td_runtime = 0x81d6a61d, td_pticks = 0x0,                                                         
   td_sticks = 0x0, td_iticks = 0x0, td_uticks = 0x0, td_intrval = 0x0, td_oldsigmask = {                                                       
     __bits = {0x0, 0x0, 0x0, 0x0}}, td_sigmask = {__bits = {0x0, 0x0, 0x0, 0x0}},                                                              
   td_generation = 0x7bc6, td_sigstk = {ss_sp = 0x0, ss_size = 0x0, ss_flags = 0x4},                                                            
   td_xsig = 0x0, td_profil_addr = 0x0, td_profil_ticks = 0x0, td_name = {0x73, 0x65, 0x72, 
     0x65, 0x6c, 0x6f, 0x67, 0x0 <repeats 13 times>}, td_fpop = 0x0, td_dbgflags = 0x0, 
   td_dbgksi = {ksi_link = {tqe_next = 0x0, tqe_prev = 0x0}, ksi_info = {si_signo = 0x0, 
       si_errno = 0x0, si_code = 0x0, si_pid = 0x0, si_uid = 0x0, si_status = 0x0, 
       si_addr = 0x0, si_value = {sival_int = 0x0, sival_ptr = 0x0, sigval_int = 0x0, 
         sigval_ptr = 0x0}, _reason = {_fault = {_trapno = 0x0}, _timer = {_timerid = 0x0, 
           _overrun = 0x0}, _mesgq = {_mqd = 0x0}, _poll = {_band = 0x0}, __spare__ = {
           __spare1__ = 0x0, __spare2__ = {0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}}}}, 
     ksi_flags = 0x0, ksi_sigq = 0x0}, td_ng_outbound = 0x0, td_osd = {osd_nslots = 0x0, 
     osd_slots = 0x0, osd_next = {le_next = 0x0, le_prev = 0x0}}, td_map_def_user = 0x0, 
   td_dbg_forked = 0x0, td_rqindex = 0x1e, td_base_pri = 0x7a, td_priority = 0x7a, 
   td_pri_class = 0x3, td_user_pri = 0x7a, td_base_user_pri = 0x7a, 
   td_pcb = 0xffffff8465262d10, td_state = 0x1, td_retval = {0x0, 0x4}, td_slpcallout = {
     c_links = {sle = {sle_next = 0x0}, tqe = {tqe_next = 0x0, 
         tqe_prev = 0xffffff800083c110}}, c_time = 0x35fa11, c_arg = 0xfffffe000bcf7460, 
     c_func = 0xffffffff8083da40, c_lock = 0x0, c_flags = 0x16, c_cpu = 0x3}, 
   td_frame = 0xffffff8465262c50, td_kstack_obj = 0xfffffe003ecfad80, 
   td_kstack = 0xffffff846525f000, td_kstack_pages = 0x4, td_critnest = 0x1, td_md = {
     md_spinlock_count = 0x1, md_saved_flags = 0x246}, td_sched = 0xfffffe000bcf7888, 
   td_ar = 0x0, td_lprof = {{lh_first = 0x0}, {lh_first = 0x0}}, td_dtrace = 0x0, 
   td_errno = 0x0, td_vnet = 0x0, td_vnet_lpush = 0x0, td_intr_frame = 0x0}
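
The thread dump above shows td_state = 0x1, td_inhibitors = 0x2, td_wchan = 0x0 and td_wmesg = 0x0. The sketch below decodes the first two fields. The numeric values are assumptions recalled from the FreeBSD 9 <sys/proc.h> and should be checked against the header that matches the running kernel; the EX_ names and both helper functions are hypothetical and exist only in this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed copies of the TDI_* inhibitor bits from <sys/proc.h>. */
#define EX_TDI_SUSPENDED 0x0001
#define EX_TDI_SLEEPING  0x0002
#define EX_TDI_SWAPPED   0x0004
#define EX_TDI_LOCK      0x0008
#define EX_TDI_IWAIT     0x0010

/* Assumed order of the td_state enumeration, starting at 0. */
const char *
describe_state(unsigned state)
{
        static const char *const names[] = {
                "INACTIVE", "INHIBITED", "CAN_RUN", "RUNQ", "RUNNING"
        };

        if (state >= sizeof(names) / sizeof(names[0]))
                return ("UNKNOWN");
        return (names[state]);
}

/*
 * Format the inhibitor bits as a '|'-separated list in buf.
 * Returns buf so the call can be used directly as a printf() argument.
 */
char *
describe_inhibitors(unsigned inhibitors, char *buf, size_t len)
{
        static const struct {
                unsigned bit;
                const char *name;
        } names[] = {
                { EX_TDI_SUSPENDED, "SUSPENDED" },
                { EX_TDI_SLEEPING,  "SLEEPING"  },
                { EX_TDI_SWAPPED,   "SWAPPED"   },
                { EX_TDI_LOCK,      "LOCK"      },
                { EX_TDI_IWAIT,     "IWAIT"     },
        };

        assert(buf != NULL && len > 0);
        buf[0] = '\0';
        for (size_t i = 0; i < sizeof(names) / sizeof(names[0]); i++) {
                if ((inhibitors & names[i].bit) == 0)
                        continue;
                if (buf[0] != '\0')
                        strncat(buf, "|", len - strlen(buf) - 1);
                strncat(buf, names[i].name, len - strlen(buf) - 1);
        }
        return (buf);
}
```

If the assumed values are right, the dump describes an inhibited thread whose only inhibitor is SLEEPING while its wait channel is NULL, which would match the "D" state with "-" in the WCHAN column.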
 
 
 

From: Konstantin Belousov <kostikbel@gmail.com>
To: Christian Esken <Christian.Esken@trivago.com>
Cc: bug-followup@freebsd.org, avg@freebsd.org
Subject: Re: misc/166340: Process under FreeBSD 9.0 hangs in uninterruptable sleep with apparently no syscall (empty wchan)
Date: Mon, 26 Mar 2012 16:51:42 +0300

 Thank you for the data. Semi-obviously, the callout_stop() call in
 sleepq_check_timeout() must have returned 0, otherwise we would not call
 mi_switch() there. But I do not see how this can happen, because
 the callout state, printed from kgdb, still indicates that the callout
 is pending. The callout cannot be reset while in the sleepq code.
 
 So there are two possible ways forward. The preferable one is for
 you to extract a self-contained C program that illustrates
 the issue and send this sample to me. The second is to recompile your
 kernel with INVARIANTS/WITNESS and possibly KTR and see what happens.
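
The "callout is pending" observation can be checked against the c_flags = 0x16 value of td_slpcallout in the kgdb thread dump. The sketch below does that under the assumption that the flag bits match the FreeBSD 9 <sys/callout.h>; the values are recalled, not copied from the source tree, and the EX_ names and helper functions are hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/* Assumed copies of the CALLOUT_* flag bits from <sys/callout.h>. */
#define EX_CALLOUT_LOCAL_ALLOC    0x0001
#define EX_CALLOUT_ACTIVE         0x0002
#define EX_CALLOUT_PENDING        0x0004
#define EX_CALLOUT_MPSAFE         0x0008
#define EX_CALLOUT_RETURNUNLOCKED 0x0010

/* Nonzero if the callout is still queued and has not fired yet. */
int
ex_callout_pending(int c_flags)
{
        return ((c_flags & EX_CALLOUT_PENDING) != 0);
}

/* Nonzero if the callout has been armed and not deactivated. */
int
ex_callout_active(int c_flags)
{
        return ((c_flags & EX_CALLOUT_ACTIVE) != 0);
}

/* Write a one-line summary of the two state bits to `out`. */
void
ex_callout_report(FILE *out, int c_flags)
{
        assert(out != NULL);
        fprintf(out, "c_flags=0x%x pending=%d active=%d\n",
            (unsigned)c_flags, ex_callout_pending(c_flags),
            ex_callout_active(c_flags));
}
```

With these assumed values, 0x16 has both the ACTIVE and the PENDING bit set, so ex_callout_pending(0x16) returns 1.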

From: Christian Esken <Christian.Esken@trivago.com>
To: bug-followup@freebsd.org
Cc: Konstantin Belousov <kostikbel@gmail.com>, avg@freebsd.org
Subject: Re: misc/166340: Process under FreeBSD 9.0 hangs in
 uninterruptable sleep with apparently no syscall (empty wchan)
Date: Tue, 27 Mar 2012 17:30:27 +0200

 --=-0r5Lk3awdhxqDAjyo6lR
 Content-Type: text/plain; charset="UTF-8"
 Content-Transfer-Encoding: 7bit
 
 Konstantin Belousov wrote:
 > Thank you for the data. Semi-obviously, the callout_stop() call in
 > sleepq_check_timeout() must have returned 0, otherwise we would not call
 > mi_switch() there. But I do not see how this can happen, because
 > the callout state, printed from kgdb, still indicates that the callout
 > is pending. The callout cannot be reset while in the sleepq code.
 > 
 > So there are two possible ways forward. The preferable one is for
 > you to extract a self-contained C program that illustrates
 > the issue and send this sample to me. The second is to recompile your
 > kernel with INVARIANTS/WITNESS and possibly KTR and see what happens.
 
 I repeated the test with INVARIANTS/WITNESS and KTR compiled in
 (actually WITNESS was already included during the last test).
 
 I ran KTR with nothing filtered out, and formatted the dump with
 "ktrdump -cftH  -i ktr.out". The whole log is very large (1GB), so
 I have extracted two short sections (see attachment).
 
 The first section shows the last actions of the application, namely a
 successful sendto() to a TCP socket, followed by waiting for an answer via
 recvfrom().
 The second section illustrates the lock/unlock sequence of the sleep
 mutex for the recvfrom(). It goes LOCK, LOCK, UNLOCK.
 
 This time the signal status is different. We have a pending signal:
 USER     PID  PPID  PENDING   CAUGHT  IGNORED  BLOCKED STAT WCHAN
 nobody  9163     1     4000 80005006 79f88010        0 D    -     
 
 Looks like SIGTERM (15): 0x4000 is bit 14, and signal n is stored in bit n-1. Just wondering where it comes from.
 
 
 
 By the way: I evaluated the possibility of implementing a standalone test
 case. It would be extremely complicated, as the issue occurs while writing
 to the socket, and thus it would require extracting the socket code from
 the Thrift project (http://thrift.apache.org/ ).
 
   Christian
 
 
 
 --=-0r5Lk3awdhxqDAjyo6lR
 Content-Disposition: attachment; filename="wait_recvfrom.txt"
 Content-Type: text/plain; name="wait_recvfrom.txt"; charset="UTF-8"
 Content-Transfer-Encoding: 7bit
 
 Last actions of pid 9163 (serelog):
 sendto() successful
 recvfrom() waits for data, using sleep mutex
 
 
 7463551   5    5644560314159 /usr/src/sys/kern/kern_sx.c:352          0xfffffe01972b2480 XUNLOCK (sx) so_snd_sx 0xfffffe0344b07490 r = 0 at /usr/src/sys/kern/uipc_sockbuf.c:160
 7463552   1    5644560316280 /usr/src/sys/kern/kern_mutex.c:205       0xfffffe01bcf4d000 LOCK (sleep mutex) kqueue 0xfffffe03eef5eb00 r = 0 at /usr/src/sys/kern/kern_event.c:1779
 7463553   5    5644560319107 /usr/src/sys/amd64/amd64/../../kern/subr_syscall.c:149 0xfffffe01972b2480 syscall: p=0xfffffe01bc1b0910 error=0 return 0x77d 0x77d
 7463557   5    5644560329931 /usr/src/sys/kern/subr_trap.c:101        0xfffffe01972b2480 userret: thread 0xfffffe01972b2480 (pid 9163, serelog)
 7463559   4    5644560336733 /usr/src/sys/vm/uma_core.c:1975          0xfffffe0008364480 uma_zalloc_arg thread 8364480 zone mbuf flags 1
 7463561   6    5644560344432 /usr/src/sys/amd64/amd64/../../kern/subr_syscall.c:84 0xfffffe01972ae900 syscall: td=0xfffffe01972ae900 pid 9767 gettimeofday (0x7fffffffac40, 0, 0x43fd18)
 7463562   1    5644560347528 /usr/src/sys/kern/kern_mutex.c:222       0xfffffe01bcf4d000 UNLOCK (sleep mutex) kqueue 0xfffffe03eef5eb00 r = 0 at /usr/src/sys/kern/kern_event.c:1796
 7463563   6    5644560348788 /usr/src/sys/amd64/amd64/../../kern/subr_syscall.c:149 0xfffffe01972ae900 syscall: p=0xfffffe01bc1ad000 error=0 return 0 0x43fd18
 7463564   5    5644560351047 /usr/src/sys/amd64/amd64/../../kern/subr_syscall.c:187 0xfffffe01972b2480 syscall sendto exit thread 0xfffffe01972b2480 pid 9163 proc serelog
 7463565   1    5644560354848 /usr/src/sys/kern/kern_mutex.c:205       0xfffffe01bcf4d000 LOCK (sleep mutex) kqueue 0xfffffe02f7525700 r = 0 at /usr/src/sys/kern/kern_event.c:1779
 7463567   5    5644560360499 /usr/src/sys/amd64/amd64/../../kern/subr_syscall.c:84 0xfffffe01972b2480 syscall: td=0xfffffe01972b2480 pid 9163 gettimeofday (0x7fffffffc600, 0, 0xffffffff)
 7463568   2    5644560364617 /usr/src/sys/kern/kern_mutex.c:356       0xfffffe004e384000 _mtx_lock_sleep: taskqueue contested (lock=0xfffffe0008375000) at /usr/src/sys/kern/kern_mutex.c:147
 7463569   5    5644560366559 /usr/src/sys/amd64/amd64/../../kern/subr_syscall.c:149 0xfffffe01972b2480 syscall: p=0xfffffe01bc1b0910 error=0 return 0 0xffffffff
 7463570   3    5644560369374 /usr/src/sys/kern/kern_mutex.c:222       0xfffffe0008375000 UNLOCK (sleep mutex) taskqueue 0xfffffe004e2604b0 r = 0 at /usr/src/sys/kern/subr_taskqueue.c:216
 7463572   6    5644560378372 /usr/src/sys/kern/subr_trap.c:101        0xfffffe01972ae900 userret: thread 0xfffffe01972ae900 (pid 9767, httpd)
 7463573   3    5644560380378 /usr/src/sys/kern/kern_mutex.c:205       0xfffffe0008375000 LOCK (sleep mutex) bio queue 0xffffffff81146ad0 r = 0 at /usr/src/sys/geom/geom_io.c:77
 7463574   4    5644560385085 /usr/src/sys/kern/kern_mutex.c:205       0xfffffe0008364480 LOCK (sleep mutex) mbuf 0xfffffe043ffe5e10 r = 0 at /usr/src/sys/vm/uma_core.c:2013
 7463575   1    5644560388724 /usr/src/sys/kern/kern_mutex.c:222       0xfffffe01bcf4d000 UNLOCK (sleep mutex) kqueue 0xfffffe02f7525700 r = 0 at /usr/src/sys/kern/kern_event.c:1796
 7463576   6    5644560391648 /usr/src/sys/amd64/amd64/../../kern/subr_syscall.c:187 0xfffffe01972ae900 syscall gettimeofday exit thread 0xfffffe01972ae900 pid 9767 proc httpd
 7463577   3    5644560394202 /usr/src/sys/kern/kern_mutex.c:222       0xfffffe0008375000 UNLOCK (sleep mutex) bio queue 0xffffffff81146ad0 r = 0 at /usr/src/sys/geom/geom_io.c:84
 7463578   1    5644560396480 /usr/src/sys/kern/kern_mutex.c:205       0xfffffe01bcf4d000 LOCK (sleep mutex) kqueue 0xfffffe01e96bd000 r = 0 at /usr/src/sys/kern/kern_event.c:1779
 7463580   5    5644560404315 /usr/src/sys/kern/subr_trap.c:101        0xfffffe01972b2480 userret: thread 0xfffffe01972b2480 (pid 9163, serelog)
 7463581   2    5644560408289 /usr/src/sys/kern/kern_mutex.c:374       0xfffffe004e384000 _mtx_lock_sleep: spinning on 0xfffffe004e2604b0 held by 0xfffffe0008375000
 7463582   5    5644560410639 /usr/src/sys/amd64/amd64/../../kern/subr_syscall.c:187 0xfffffe01972b2480 syscall gettimeofday exit thread 0xfffffe01972b2480 pid 9163 proc serelog
 7463583   2    5644560414981 /usr/src/sys/kern/kern_mutex.c:205       0xfffffe004e384000 LOCK (sleep mutex) taskqueue 0xfffffe004e2604b0 r = 0 at /usr/src/sys/kern/kern_mutex.c:147
 7463584   5    5644560419167 /usr/src/sys/amd64/amd64/../../kern/subr_syscall.c:84 0xfffffe01972b2480 syscall: td=0xfffffe01972b2480 pid 9163 recvfrom (0x6, 0x7fffffffc734, 0x4)
 7463585   1    5644560421688 /usr/src/sys/kern/kern_mutex.c:222       0xfffffe01bcf4d000 UNLOCK (sleep mutex) kqueue 0xfffffe01e96bd000 r = 0 at /usr/src/sys/kern/kern_event.c:1796
 7463586   5    5644560423707 /usr/src/sys/kern/kern_sx.c:291          0xfffffe01972b2480 XLOCK (sx) so_rcv_sx 0xfffffe0344b073a0 r = 0 at /usr/src/sys/kern/uipc_sockbuf.c:148
 7463587   3    5644560426330 /usr/src/sys/geom/geom_io.c:678          0xfffffe0008375000 g_up biodone bp 0xfffffe004e8a0740 provider da0s1 off 134525138944 len 131072
 7463588   5    5644560428231 /usr/src/sys/kern/kern_mutex.c:205       0xfffffe01972b2480 LOCK (sleep mutex) so_rcv 0xfffffe0344b07380 r = 0 at /usr/src/sys/kern/uipc_socket.c:1488
 
 This is the last place where I see 0xfffffe01972b2480 or "pid 9163".
 The value behind so_rcv is 0xfffffe0344b07380, and it appears twice more:
 
 
 7464953   4    5644563693233 /usr/src/sys/kern/kern_mutex.c:205       0xfffffe0008364480 LOCK (sleep mutex) so_rcv 0xfffffe0344b07380 r = 0 at /usr/src/sys/netinet/tcp_input.c:2834
 7464954   0    5644563696896 /usr/src/sys/kern/kern_mutex.c:205       0xfffffe004e75a900 LOCK (sleep mutex) KNOTE 0xfffffe004e104010 r = 0 at /usr/src/sys/vm/uma_core.c:2013
 7464955   6    5644563699004 /usr/src/sys/kern/subr_trap.c:101        0xfffffe01972ae900 userret: thread 0xfffffe01972ae900 (pid 9767, httpd)
 7464956   1    5644563701340 /usr/src/sys/kern/kern_mutex.c:222       0xfffffe01bcf4d000 UNLOCK (sleep mutex) kqueue 0xfffffe01f3fb8900 r = 0 at /usr/src/sys/kern/kern_event.c:1796
 7464957   6    5644563704636 /usr/src/sys/amd64/amd64/../../kern/subr_syscall.c:187 0xfffffe01972ae900 syscall gettimeofday exit thread 0xfffffe01972ae900 pid 9767 proc httpd
 7464958   3    5644563706986 /usr/src/sys/geom/geom_io.c:165          0xfffffe0008375000 #2 0xffffffff807c5a45 at g_io_schedule_up+0x175
 7464959   4    5644563708981 /usr/src/sys/kern/kern_mutex.c:222       0xfffffe0008364480 UNLOCK (sleep mutex) so_rcv 0xfffffe0344b07380 r = 0 at /usr/src/sys/kern/uipc_sockbuf.c:201
 
 What I find noteworthy is that the lock/unlock sequence for so_rcv 0xfffffe0344b07380 looks odd:
 LOCK, LOCK, UNLOCK
 
 Naively one would expect to see one more UNLOCK, like this:
 LOCK, UNLOCK, LOCK, UNLOCK
 
 --=-0r5Lk3awdhxqDAjyo6lR--
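
The LOCK/UNLOCK comparison in the attachment can be made mechanical. The following is a minimal sketch of a checker for the events recorded against a single lock address, assuming the events have already been extracted from the ktrdump output in trace order. The enum and first_unbalanced() are hypothetical names for this example. A reported anomaly only means that the excerpt does not alternate; it may just as well point to a record missing from the excerpt as to a lock that was taken twice.

```c
#include <assert.h>
#include <stddef.h>

enum lock_event { EV_LOCK, EV_UNLOCK };

/*
 * Walk the events recorded for one lock address, in trace order.
 * Returns the index of the first event that breaks strict LOCK/UNLOCK
 * alternation, or -1 if the sequence alternates cleanly. *held is set
 * to 1 if the lock is still held after the last event, else 0.
 */
long
first_unbalanced(const enum lock_event *ev, size_t n, int *held)
{
        int is_held = 0;
        long bad = -1;

        assert(held != NULL);
        assert(ev != NULL || n == 0);
        for (size_t i = 0; i < n; i++) {
                int want_held = (ev[i] == EV_LOCK);

                /* LOCK while already held, or UNLOCK while free. */
                if (is_held == want_held && bad < 0)
                        bad = (long)i;
                is_held = want_held;
        }
        *held = is_held;
        return (bad);
}
```

For the sequence LOCK, LOCK, UNLOCK the function returns index 1 (the second LOCK), and for LOCK, UNLOCK, LOCK, UNLOCK it returns -1.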
 

From: Christian Esken <Christian.Esken@trivago.com>
To: bug-followup@freebsd.org
Cc: Konstantin Belousov <kostikbel@gmail.com>, avg@freebsd.org
Subject: Re: misc/166340: Process under FreeBSD 9.0 hangs in uninterruptable
 sleep with apparently no syscall (empty wchan)
Date: Tue, 27 Mar 2012 17:30:48 +0200

 --=-3sl93MaMYlu/dkvvUUBe
 Content-Type: text/plain; charset="UTF-8"
 Content-Transfer-Encoding: 7bit
 
 Konstantin Belousov wrote:
 > Thank you for the data. Semi-obviously, the callout_stop() call in
 > sleepq_check_timeout() must have returned 0, otherwise we would not call
 > mi_switch() there. But I do not see how this can happen, because
 > the callout state, printed from kgdb, still indicates that the callout
 > is pending. The callout cannot be reset while in the sleepq code.
 > 
 > So there are two possible ways forward. The preferable one is for
 > you to extract a self-contained C program that illustrates
 > the issue and send this sample to me. The second is to recompile your
 > kernel with INVARIANTS/WITNESS and possibly KTR and see what happens.
 
 I repeated the test with INVARIANTS/WITNESS and KTR compiled in
 (actually WITNESS was already included during the last test).
 
 I ran KTR with nothing filtered out, and formatted the dump with
 "ktrdump -cftH  -i ktr.out". The whole log is very large (1GB), so
 I have extracted two short sections (see attachment).
 
 The first section shows the last actions of the application, namely a
 successful sendto() to a TCP socket, followed by waiting for an answer via
 recvfrom().
 The second section illustrates the lock/unlock sequence of the sleep
 mutex for the recvfrom(). It goes LOCK, LOCK, UNLOCK.
 
 This time the signal status is different. We have a pending signal:
 USER     PID  PPID  PENDING   CAUGHT  IGNORED  BLOCKED STAT WCHAN
 nobody  9163     1     4000 80005006 79f88010        0 D    -     
 
 Looks like SIGTERM (15): 0x4000 is bit 14, and signal n is stored in bit n-1. Just wondering where it comes from.
 
 
 
 By the way: I evaluated the possibility of implementing a standalone test
 case. It would be extremely complicated, as the issue occurs while writing
 to the socket, and thus it would require extracting the socket code from
 the Thrift project (http://thrift.apache.org/ ).
 
   Christian
 
 
 
 
 --=-3sl93MaMYlu/dkvvUUBe
 Content-Disposition: attachment; filename="wait_recvfrom.txt"
 Content-Type: text/plain; name="wait_recvfrom.txt"; charset="UTF-8"
 Content-Transfer-Encoding: 7bit
 
 Last actions of pid 9163 (serelog):
 sendto() successful
 recvfrom() waits for data, using sleep mutex
 
 
 7463551   5    5644560314159 /usr/src/sys/kern/kern_sx.c:352          0xfffffe01972b2480 XUNLOCK (sx) so_snd_sx 0xfffffe0344b07490 r = 0 at /usr/src/sys/kern/uipc_sockbuf.c:160
 7463552   1    5644560316280 /usr/src/sys/kern/kern_mutex.c:205       0xfffffe01bcf4d000 LOCK (sleep mutex) kqueue 0xfffffe03eef5eb00 r = 0 at /usr/src/sys/kern/kern_event.c:1779
 7463553   5    5644560319107 /usr/src/sys/amd64/amd64/../../kern/subr_syscall.c:149 0xfffffe01972b2480 syscall: p=0xfffffe01bc1b0910 error=0 return 0x77d 0x77d
 7463557   5    5644560329931 /usr/src/sys/kern/subr_trap.c:101        0xfffffe01972b2480 userret: thread 0xfffffe01972b2480 (pid 9163, serelog)
 7463559   4    5644560336733 /usr/src/sys/vm/uma_core.c:1975          0xfffffe0008364480 uma_zalloc_arg thread 8364480 zone mbuf flags 1
 7463561   6    5644560344432 /usr/src/sys/amd64/amd64/../../kern/subr_syscall.c:84 0xfffffe01972ae900 syscall: td=0xfffffe01972ae900 pid 9767 gettimeofday (0x7fffffffac40, 0, 0x43fd18)
 7463562   1    5644560347528 /usr/src/sys/kern/kern_mutex.c:222       0xfffffe01bcf4d000 UNLOCK (sleep mutex) kqueue 0xfffffe03eef5eb00 r = 0 at /usr/src/sys/kern/kern_event.c:1796
 7463563   6    5644560348788 /usr/src/sys/amd64/amd64/../../kern/subr_syscall.c:149 0xfffffe01972ae900 syscall: p=0xfffffe01bc1ad000 error=0 return 0 0x43fd18
 7463564   5    5644560351047 /usr/src/sys/amd64/amd64/../../kern/subr_syscall.c:187 0xfffffe01972b2480 syscall sendto exit thread 0xfffffe01972b2480 pid 9163 proc serelog
 7463565   1    5644560354848 /usr/src/sys/kern/kern_mutex.c:205       0xfffffe01bcf4d000 LOCK (sleep mutex) kqueue 0xfffffe02f7525700 r = 0 at /usr/src/sys/kern/kern_event.c:1779
 7463567   5    5644560360499 /usr/src/sys/amd64/amd64/../../kern/subr_syscall.c:84 0xfffffe01972b2480 syscall: td=0xfffffe01972b2480 pid 9163 gettimeofday (0x7fffffffc600, 0, 0xffffffff)
 7463568   2    5644560364617 /usr/src/sys/kern/kern_mutex.c:356       0xfffffe004e384000 _mtx_lock_sleep: taskqueue contested (lock=0xfffffe0008375000) at /usr/src/sys/kern/kern_mutex.c:147
 7463569   5    5644560366559 /usr/src/sys/amd64/amd64/../../kern/subr_syscall.c:149 0xfffffe01972b2480 syscall: p=0xfffffe01bc1b0910 error=0 return 0 0xffffffff
 7463570   3    5644560369374 /usr/src/sys/kern/kern_mutex.c:222       0xfffffe0008375000 UNLOCK (sleep mutex) taskqueue 0xfffffe004e2604b0 r = 0 at /usr/src/sys/kern/subr_taskqueue.c:216
 7463572   6    5644560378372 /usr/src/sys/kern/subr_trap.c:101        0xfffffe01972ae900 userret: thread 0xfffffe01972ae900 (pid 9767, httpd)
 7463573   3    5644560380378 /usr/src/sys/kern/kern_mutex.c:205       0xfffffe0008375000 LOCK (sleep mutex) bio queue 0xffffffff81146ad0 r = 0 at /usr/src/sys/geom/geom_io.c:77
 7463574   4    5644560385085 /usr/src/sys/kern/kern_mutex.c:205       0xfffffe0008364480 LOCK (sleep mutex) mbuf 0xfffffe043ffe5e10 r = 0 at /usr/src/sys/vm/uma_core.c:2013
 7463575   1    5644560388724 /usr/src/sys/kern/kern_mutex.c:222       0xfffffe01bcf4d000 UNLOCK (sleep mutex) kqueue 0xfffffe02f7525700 r = 0 at /usr/src/sys/kern/kern_event.c:1796
 7463576   6    5644560391648 /usr/src/sys/amd64/amd64/../../kern/subr_syscall.c:187 0xfffffe01972ae900 syscall gettimeofday exit thread 0xfffffe01972ae900 pid 9767 proc httpd
 7463577   3    5644560394202 /usr/src/sys/kern/kern_mutex.c:222       0xfffffe0008375000 UNLOCK (sleep mutex) bio queue 0xffffffff81146ad0 r = 0 at /usr/src/sys/geom/geom_io.c:84
 7463578   1    5644560396480 /usr/src/sys/kern/kern_mutex.c:205       0xfffffe01bcf4d000 LOCK (sleep mutex) kqueue 0xfffffe01e96bd000 r = 0 at /usr/src/sys/kern/kern_event.c:1779
 7463580   5    5644560404315 /usr/src/sys/kern/subr_trap.c:101        0xfffffe01972b2480 userret: thread 0xfffffe01972b2480 (pid 9163, serelog)
 7463581   2    5644560408289 /usr/src/sys/kern/kern_mutex.c:374       0xfffffe004e384000 _mtx_lock_sleep: spinning on 0xfffffe004e2604b0 held by 0xfffffe0008375000
 7463582   5    5644560410639 /usr/src/sys/amd64/amd64/../../kern/subr_syscall.c:187 0xfffffe01972b2480 syscall gettimeofday exit thread 0xfffffe01972b2480 pid 9163 proc serelog
 7463583   2    5644560414981 /usr/src/sys/kern/kern_mutex.c:205       0xfffffe004e384000 LOCK (sleep mutex) taskqueue 0xfffffe004e2604b0 r = 0 at /usr/src/sys/kern/kern_mutex.c:147
 7463584   5    5644560419167 /usr/src/sys/amd64/amd64/../../kern/subr_syscall.c:84 0xfffffe01972b2480 syscall: td=0xfffffe01972b2480 pid 9163 recvfrom (0x6, 0x7fffffffc734, 0x4)
 7463585   1    5644560421688 /usr/src/sys/kern/kern_mutex.c:222       0xfffffe01bcf4d000 UNLOCK (sleep mutex) kqueue 0xfffffe01e96bd000 r = 0 at /usr/src/sys/kern/kern_event.c:1796
 7463586   5    5644560423707 /usr/src/sys/kern/kern_sx.c:291          0xfffffe01972b2480 XLOCK (sx) so_rcv_sx 0xfffffe0344b073a0 r = 0 at /usr/src/sys/kern/uipc_sockbuf.c:148
 7463587   3    5644560426330 /usr/src/sys/geom/geom_io.c:678          0xfffffe0008375000 g_up biodone bp 0xfffffe004e8a0740 provider da0s1 off 134525138944 len 131072
 7463588   5    5644560428231 /usr/src/sys/kern/kern_mutex.c:205       0xfffffe01972b2480 LOCK (sleep mutex) so_rcv 0xfffffe0344b07380 r = 0 at /usr/src/sys/kern/uipc_socket.c:1488
 
 This is the last place where I see 0xfffffe01972b2480 or "pid 9163".
 The address logged for so_rcv is 0xfffffe0344b07380, and it appears twice more:
 
 
 7464953   4    5644563693233 /usr/src/sys/kern/kern_mutex.c:205       0xfffffe0008364480 LOCK (sleep mutex) so_rcv 0xfffffe0344b07380 r = 0 at /usr/src/sys/netinet/tcp_input.c:2834
 7464954   0    5644563696896 /usr/src/sys/kern/kern_mutex.c:205       0xfffffe004e75a900 LOCK (sleep mutex) KNOTE 0xfffffe004e104010 r = 0 at /usr/src/sys/vm/uma_core.c:2013
 7464955   6    5644563699004 /usr/src/sys/kern/subr_trap.c:101        0xfffffe01972ae900 userret: thread 0xfffffe01972ae900 (pid 9767, httpd)
 7464956   1    5644563701340 /usr/src/sys/kern/kern_mutex.c:222       0xfffffe01bcf4d000 UNLOCK (sleep mutex) kqueue 0xfffffe01f3fb8900 r = 0 at /usr/src/sys/kern/kern_event.c:1796
 7464957   6    5644563704636 /usr/src/sys/amd64/amd64/../../kern/subr_syscall.c:187 0xfffffe01972ae900 syscall gettimeofday exit thread 0xfffffe01972ae900 pid 9767 proc httpd
 7464958   3    5644563706986 /usr/src/sys/geom/geom_io.c:165          0xfffffe0008375000 #2 0xffffffff807c5a45 at g_io_schedule_up+0x175
 7464959   4    5644563708981 /usr/src/sys/kern/kern_mutex.c:222       0xfffffe0008364480 UNLOCK (sleep mutex) so_rcv 0xfffffe0344b07380 r = 0 at /usr/src/sys/kern/uipc_sockbuf.c:201
 
 What I find noteworthy is that the lock/unlock sequence for so_rcv 0xfffffe0344b07380 looks odd:
 LOCK, LOCK, UNLOCK
 
 Naively one would expect to see one more UNLOCK, like this:
 LOCK, UNLOCK, LOCK, UNLOCK
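
The pairing check described above can be sketched in C. This is only an illustration: the enum and the function name count_unpaired_locks() are hypothetical, and the caller is assumed to have already reduced the KTR lines for one lock address to a sequence of LOCK/UNLOCK events. It only reports what the log shows; an unpaired LOCK can also mean that a release was simply not logged.

```c
#include <stddef.h>

/* One logged event for a single lock address (hypothetical type). */
enum lock_op { OP_LOCK, OP_UNLOCK };

/*
 * Count LOCK events that are not followed by a logged UNLOCK before
 * the next LOCK (or before the end of the trace).
 */
size_t
count_unpaired_locks(const enum lock_op *ops, size_t n)
{
	size_t i, unpaired = 0;
	int held = 0;

	for (i = 0; i < n; i++) {
		if (ops[i] == OP_LOCK) {
			if (held)
				unpaired++;	/* previous LOCK never released in the log */
			held = 1;
		} else {
			held = 0;
		}
	}
	if (held)
		unpaired++;		/* trace ends with the lock still held */
	return (unpaired);
}
```

For the sequence LOCK, LOCK, UNLOCK the function returns 1; for LOCK, UNLOCK, LOCK, UNLOCK it returns 0.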
 
 --=-3sl93MaMYlu/dkvvUUBe--
 

From: Konstantin Belousov <kostikbel@gmail.com>
To: Christian Esken <Christian.Esken@trivago.com>
Cc: bug-followup@freebsd.org, avg@freebsd.org
Subject: Re: misc/166340: Process under FreeBSD 9.0 hangs in uninterruptable sleep with apparently no syscall (empty wchan)
Date: Tue, 27 Mar 2012 20:46:26 +0300

 --KldKAdupQSLqpq2E
 Content-Type: text/plain; charset=us-ascii
 Content-Disposition: inline
 Content-Transfer-Encoding: quoted-printable
 
 On Tue, Mar 27, 2012 at 05:30:48PM +0200, Christian Esken wrote:
 > Konstantin Belousov wrote:
 > > Thank you for the data. Semi-obviously, the callout_stop() call in
 > > sleepq_check_timeout() has to return 0, otherwise we would not call
 > > mi_switch() there. But I do not see how this can happen, because
 > > the callout state, printed from kgdb, still indicates that the callout
 > > is pending. The callout cannot be reset while in the sleepq code.
 > >
 > > So there are two possible routes forward: preferable is for
 > > you to extract a self-contained C program that illustrates
 > > the issue and send this sample to me. The second is to recompile your
 > > kernel with INVARIANTS/WITNESS and possibly KTR and see what happens.
 >
 > I repeated the test with INVARIANTS/WITNESS and KTR compiled in
 > (actually WITNESS was already included during the last test).
 >
 > I ran KTR with nothing filtered out, and formatted the dump with
 > "ktrdump -cftH  -i ktr.out". The whole log is excessive (1GB), so
 > I have extracted two short sections (see attachment).
 >
 > The first section shows the last action of the application, namely a
 > successful sendto() to a TCP socket, and then waiting for an answer via
 > recvfrom().
 > The second section illustrates the lock/unlock sequence of the sleep
 > mutex for the recvfrom(). It goes like LOCK, LOCK, UNLOCK.
 >
 > This time the signal status is different. We have a pending signal:
 > USER     PID  PPID  PENDING   CAUGHT  IGNORED  BLOCKED STAT WCHAN
 > nobody  9163     1     4000 80005006 79f88010        0 D    -
 >
 > Looks like SIGPROF (27). Just wondering where it comes from.
 >
 This is irrelevant, and probably a red herring. The issue here is
 callout_stop() failing while the callout still seems to be pending. Also,
 the mask 0x4000 of the pending signals indicates that SIGTERM is pending,
 not SIGPROF.
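
The mask arithmetic can be checked with a few lines of C. This is a minimal sketch, assuming the usual convention that signal n is represented by bit n - 1 of the mask printed by ps; the helper names are hypothetical. SIGTERM is signal 15 and SIGPROF is signal 27 on FreeBSD.

```c
#include <stdint.h>

/* Build the mask bit for one signal number (1..32). */
uint32_t
signal_bit(int signo)
{
	return ((uint32_t)1 << (signo - 1));
}

/*
 * Return the lowest signal number whose bit is set in a pending
 * mask, or 0 if the mask is empty.
 */
int
lowest_pending_signal(uint32_t mask)
{
	int signo;

	for (signo = 1; signo <= 32; signo++) {
		if (mask & signal_bit(signo))
			return (signo);
	}
	return (0);
}
```

With these helpers, lowest_pending_signal(0x4000) yields 15 (SIGTERM), whereas SIGPROF (27) would show up as 0x4000000.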
 
 I probably want the data from your ktr dump, either all entries for
 the stuck process and all entries for facility CALLOUT, or just the
 whole dump.
 
 The last entries of your log excerpt do not make much sense, since the
 process must enter the _sleep() function, which logs this fact right after
 locking the sleepq. But the log ends at the so_rcv mutex lock.
 
 Please, when collecting the data, collect the whole set, i.e.
 include procstat -kk <pid> output together with the ktr, as well as kgdb
 output, so that I can be sure that we are chasing one, and not N bugs.
 
 --KldKAdupQSLqpq2E
 Content-Type: application/pgp-signature
 Content-Disposition: inline
 
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.12 (FreeBSD)
 
 iEYEARECAAYFAk9x/PIACgkQC3+MBN1Mb4hbeACfYyUTEE5GV/SeDO4fNf4ErfHY
 27oAoIGj2TMOBtQRi5P+q/v+nrKOFhFb
 =0tFs
 -----END PGP SIGNATURE-----
 
 --KldKAdupQSLqpq2E--

From: Christian Esken <Christian.Esken@trivago.com>
To: bug-followup@freebsd.org
Cc: Konstantin Belousov <kostikbel@gmail.com>, avg@freebsd.org
Subject: Re: misc/166340: Process under FreeBSD 9.0 hangs in uninterruptable
 sleep with apparently no syscall (empty wchan)
Date: Wed, 28 Mar 2012 15:30:29 +0200

 --=-KhwYlamRji7Rmc2IYneE
 Content-Type: text/plain; charset="UTF-8"
 Content-Transfer-Encoding: 7bit
 
 Yes. It sounds like a good idea to get a consistent "snapshot" of
 everything. So here it is. pid = 2527
 
 
 # ps -o user,pid,ppid,pending,caught,ignored,blocked,stat,wchan 2527
 USER     PID  PPID  PENDING   CAUGHT  IGNORED  BLOCKED STAT WCHAN
 nobody  2527  2525        0 80005006 79f88010        0 D    -     
 
 # ps -o pid,paddr 2527
   PID PADDR
  2527 fffffe025a476488
 
 # procstat -kk 2527
   PID    TID COMM             TDNAME           KSTACK
  2527 101206 serelog          -                mi_switch+0x2eb
 sleepq_check_timeout+0x9f sleepq_timedwait_sig+0x20 _sleep+0x2a1
 soreceive_generic+0xea9 kern_recvit+0x1c4 recvit+0x21 sys_recvfrom+0x82
 amd64_syscall+0x478 Xfast_syscall+0xf7 
 
 
 The kgdb data is attached.
 
 The smallest ktr trace I could produce is approximately 200MB. It's
 still rather big, so if you want a filtered version I would need to know
 how exactly this should be done (I haven't found a way to filter for a
 pid).
 Posting the full 200MB as followup is surely not appreciated, so I can
 either upload/send it somewhere or will find a place where you can
 download it. Any favored solution for you, Konstantin? We can use direct
 mail if preferred for discussing the details.
 
  
  Christian
 
 
 
 --=-KhwYlamRji7Rmc2IYneE
 Content-Disposition: attachment; filename="kgdb.2527.txt"
 Content-Type: text/plain; name="kgdb.2527.txt"; charset="UTF-8"
 Content-Transfer-Encoding: 7bit
 
 (kgdb) p *(struct proc *)0xfffffe025a476488
 $1 = {p_list = {le_next = 0xfffffe01a3f44910, le_prev = 0xfffffe02c73f8488}, p_threads = {tqh_first = 0xfffffe004edc0480, tqh_last = 0xfffffe004edc0490}, p_slock = {lock_object = {
       lo_name = 0xffffffff80d3f62c "process slock", lo_flags = 720896, lo_data = 0, lo_witness = 0xffffff8000689f80}, mtx_lock = 4}, p_ucred = 0xfffffe021f48d000, p_fd = 0xfffffe026f3bc600, p_fdtol = 0x0, 
   p_stats = 0xfffffe018e9c6a00, p_limit = 0xfffffe004ed0fb00, p_limco = {c_links = {sle = {sle_next = 0x0}, tqe = {tqe_next = 0x0, tqe_prev = 0x0}}, c_time = 0, c_arg = 0x0, c_func = 0, c_lock = 0xfffffe025a476580, 
     c_flags = 0, c_cpu = 0}, p_sigacts = 0xfffffe035ebe5000, p_flag = 268452096, p_state = PRS_NORMAL, p_pid = 2527, p_hash = {le_next = 0x0, le_prev = 0xffffff80006aaef8}, p_pglist = {le_next = 0x0, 
     le_prev = 0xfffffe01a3827790}, p_pptr = 0xfffffe0008353910, p_sibling = {le_next = 0xfffffe004e8a6910, le_prev = 0xfffffe01b9f639f0}, p_children = {lh_first = 0x0}, p_mtx = {lock_object = {
       lo_name = 0xffffffff80d3f61f "process lock", lo_flags = 21168128, lo_data = 0, lo_witness = 0xffffff8000688400}, mtx_lock = 4}, p_ksi = 0xfffffe004e0f10e0, p_sigqueue = {sq_signals = {__bits = {16384, 0, 0, 0}}, 
     sq_kill = {__bits = {0, 0, 0, 0}}, sq_list = {tqh_first = 0xfffffe02f5841540, tqh_last = 0xfffffe0287f372a0}, sq_proc = 0xfffffe025a476488, sq_flags = 1}, p_oppid = 0, p_dbg_child = 0, p_vmspace = 0xfffffe018e5d7000, 
   p_swtick = 340601, p_realtimer = {it_interval = {tv_sec = 0, tv_usec = 0}, it_value = {tv_sec = 0, tv_usec = 0}}, p_ru = {ru_utime = {tv_sec = 0, tv_usec = 0}, ru_stime = {tv_sec = 0, tv_usec = 0}, ru_maxrss = 0, 
     ru_ixrss = 0, ru_idrss = 0, ru_isrss = 0, ru_minflt = 0, ru_majflt = 0, ru_nswap = 0, ru_inblock = 0, ru_oublock = 0, ru_msgsnd = 0, ru_msgrcv = 0, ru_nsignals = 0, ru_nvcsw = 0, ru_nivcsw = 0}, p_rux = {
     rux_runtime = 798780190, rux_uticks = 12, rux_sticks = 45, rux_iticks = 0, rux_uu = 84077, rux_su = 315289, rux_tu = 399367}, p_crux = {rux_runtime = 0, rux_uticks = 0, rux_sticks = 0, rux_iticks = 0, rux_uu = 0, 
     rux_su = 0, rux_tu = 0}, p_profthreads = 0, p_exitthreads = 0, p_traceflag = 0, p_tracevp = 0x0, p_tracecred = 0x0, p_textvp = 0xfffffe007d9de000, p_lock = 0, p_sigiolst = {slh_first = 0x0}, p_sigparent = 20, 
   p_sig = 0, p_code = 0, p_stops = 0, p_stype = 0, p_step = 0 '\0', p_pfsflags = 0 '\0', p_nlminfo = 0x0, p_aioinfo = 0x0, p_singlethread = 0x0, p_suspcount = 0, p_xthread = 0x0, p_boundary_count = 0, p_pendingcnt = 2, 
   p_itimers = 0x0, p_procdesc = 0x0, p_magic = 3203398350, p_osrel = 900044, p_comm = "serelog", '\0' <repeats 12 times>, p_pgrp = 0xfffffe01a3827780, p_sysent = 0xffffffff810f1380, p_args = 0xfffffe020cf38300, 
   p_cpulimit = 9223372036854775807, p_nice = 0 '\0', p_fibnum = 0, p_xstat = 0, p_klist = {kl_list = {slh_first = 0x0}, kl_lock = 0xffffffff807f6ce0 <knlist_mtx_lock>, kl_unlock = 0xffffffff807f6d00 <knlist_mtx_unlock>, 
     kl_assert_locked = 0xffffffff807f6fe0 <knlist_mtx_assert_locked>, kl_assert_unlocked = 0xffffffff807f6fc0 <knlist_mtx_assert_unlocked>, kl_lockarg = 0xfffffe025a476580}, p_numthreads = 1, p_md = {md_ldt = 0x0, 
     md_ldt_sd = {sd_lolimit = 0, sd_lobase = 0, sd_type = 0, sd_dpl = 0, sd_p = 0, sd_hilimit = 0, sd_xx0 = 0, sd_gran = 0, sd_hibase = 0, sd_xx1 = 0, sd_mbz = 0, sd_xx2 = 0}}, p_itcallout = {c_links = {sle = {
         sle_next = 0x0}, tqe = {tqe_next = 0x0, tqe_prev = 0x0}}, c_time = 0, c_arg = 0x0, c_func = 0, c_lock = 0x0, c_flags = 16, c_cpu = 0}, p_acflag = 0, p_peers = 0x0, p_leader = 0xfffffe025a476488, p_emuldata = 0x0, 
   p_label = 0x0, p_sched = 0xfffffe025a476910, p_ktr = {stqh_first = 0x0, stqh_last = 0xfffffe025a4768c0}, p_mqnotifier = {lh_first = 0x0}, p_dtrace = 0x0, p_pwait = {cv_description = 0xffffffff80d4021d "ppwait", 
     cv_waiters = 0}, p_dbgwait = {cv_description = 0xffffffff80d40224 "dbgwait", cv_waiters = 0}, p_prev_runtime = 0, p_racct = 0x0}
 (kgdb) p/x *(((struct proc *)0xfffffe025a476488)->p_threads.tqh_first)
 $2 = {td_lock = 0xffffffff81179740, td_proc = 0xfffffe025a476488, td_plist = {tqe_next = 0x0, tqe_prev = 0xfffffe025a476498}, td_runq = {tqe_next = 0x0, tqe_prev = 0xffffffff81179968}, td_slpq = {tqe_next = 0x0, 
     tqe_prev = 0xfffffe01f7993280}, td_lockq = {tqe_next = 0x0, tqe_prev = 0xffffff800029a5f0}, td_hash = {le_next = 0x0, le_prev = 0xffffff80006b4ab0}, td_cpuset = 0xfffffe0008349dc8, td_sel = 0xfffffe020cfbe980, 
   td_sleepqueue = 0xfffffe01f7993280, td_turnstile = 0xfffffe030ac39a80, td_umtxq = 0xfffffe004e47a100, td_tid = 0x18b56, td_sigqueue = {sq_signals = {__bits = {0x0, 0x0, 0x0, 0x0}}, sq_kill = {__bits = {0x0, 0x0, 0x0, 
         0x0}}, sq_list = {tqh_first = 0x0, tqh_last = 0xfffffe004edc0530}, sq_proc = 0xfffffe025a476488, sq_flags = 0x1}, td_lend_user_pri = 0xff, td_flags = 0x20814, td_inhibitors = 0x2, td_pflags = 0x0, td_dupfd = 0x0, 
   td_sqqueue = 0x0, td_wchan = 0x0, td_wmesg = 0x0, td_lastcpu = 0x2, td_oncpu = 0xff, td_owepreempt = 0x0, td_tsqueue = 0xff, td_locks = 0x1, td_rw_rlocks = 0x0, td_lk_slocks = 0x0, td_blocked = 0x0, td_lockname = 0x0, 
   td_contested = {lh_first = 0x0}, td_sleeplocks = 0xffffffff812e1ee8, td_intr_nesting_level = 0x0, td_pinned = 0x0, td_ucred = 0xfffffe021f48d000, td_estcpu = 0x0, td_slptick = 0x0, td_blktick = 0x0, 
   td_swvoltick = 0x603cc, td_ru = {ru_utime = {tv_sec = 0x0, tv_usec = 0x0}, ru_stime = {tv_sec = 0x0, tv_usec = 0x0}, ru_maxrss = 0x17a0, ru_ixrss = 0x67c4, ru_idrss = 0x15af8, ru_isrss = 0x1c80, ru_minflt = 0x1e4, 
     ru_majflt = 0x0, ru_nswap = 0x0, ru_inblock = 0x0, ru_oublock = 0x0, ru_msgsnd = 0x294, ru_msgrcv = 0x526, ru_nsignals = 0x0, ru_nvcsw = 0xe8e, ru_nivcsw = 0x23c}, td_rux = {rux_runtime = 0x2f9c6b1e, 
     rux_uticks = 0xc, rux_sticks = 0x2d, rux_iticks = 0x0, rux_uu = 0x1486d, rux_su = 0x4cf99, rux_tu = 0x61807}, td_incruntime = 0x0, td_runtime = 0x2f9c6b1e, td_pticks = 0x0, td_sticks = 0x0, td_iticks = 0x0, 
   td_uticks = 0x0, td_intrval = 0x0, td_oldsigmask = {__bits = {0x0, 0x0, 0x0, 0x0}}, td_sigmask = {__bits = {0x0, 0x0, 0x0, 0x0}}, td_generation = 0x10ca, td_sigstk = {ss_sp = 0x0, ss_size = 0x0, ss_flags = 0x4}, 
   td_xsig = 0x0, td_profil_addr = 0x0, td_profil_ticks = 0x0, td_name = {0x73, 0x65, 0x72, 0x65, 0x6c, 0x6f, 0x67, 0x0 <repeats 13 times>}, td_fpop = 0x0, td_dbgflags = 0x0, td_dbgksi = {ksi_link = {tqe_next = 0x0, 
       tqe_prev = 0x0}, ksi_info = {si_signo = 0x0, si_errno = 0x0, si_code = 0x0, si_pid = 0x0, si_uid = 0x0, si_status = 0x0, si_addr = 0x0, si_value = {sival_int = 0x0, sival_ptr = 0x0, sigval_int = 0x0, 
         sigval_ptr = 0x0}, _reason = {_fault = {_trapno = 0x0}, _timer = {_timerid = 0x0, _overrun = 0x0}, _mesgq = {_mqd = 0x0}, _poll = {_band = 0x0}, __spare__ = {__spare1__ = 0x0, __spare2__ = {0x0, 0x0, 0x0, 0x0, 
             0x0, 0x0, 0x0}}}}, ksi_flags = 0x0, ksi_sigq = 0x0}, td_ng_outbound = 0x0, td_osd = {osd_nslots = 0x0, osd_slots = 0x0, osd_next = {le_next = 0x0, le_prev = 0x0}}, td_map_def_user = 0x0, td_dbg_forked = 0x0, 
   td_rqindex = 0x1e, td_base_pri = 0x7b, td_priority = 0x7b, td_pri_class = 0x3, td_user_pri = 0x7b, td_base_user_pri = 0x7b, td_pcb = 0xffffff8465698d10, td_state = 0x1, td_retval = {0x0, 0x4}, td_slpcallout = {
     c_links = {sle = {sle_next = 0x0}, tqe = {tqe_next = 0x0, tqe_prev = 0xffffff800078d8d0}}, c_time = 0x33ca8d, c_arg = 0xfffffe004edc0480, c_func = 0xffffffff80870e60, c_lock = 0x0, c_flags = 0x16, c_cpu = 0x2}, 
   td_frame = 0xffffff8465698c50, td_kstack_obj = 0xfffffe026f851510, td_kstack = 0xffffff8465695000, td_kstack_pages = 0x4, td_critnest = 0x1, td_md = {md_spinlock_count = 0x1, md_saved_flags = 0x246}, 
   td_sched = 0xfffffe004edc08a8, td_ar = 0x0, td_lprof = {{lh_first = 0x0}, {lh_first = 0x0}}, td_dtrace = 0x0, td_errno = 0x0, td_vnet = 0x0, td_vnet_lpush = 0x0, td_intr_frame = 0x0}
 
 --=-KhwYlamRji7Rmc2IYneE--
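
The td_slpcallout entry in the kgdb output shows c_flags = 0x16. A small decoder makes the value easier to read. This is a sketch only: the function name callout_flags_str() is hypothetical, and the bit values for LOCAL_ALLOC, ACTIVE and PENDING are assumed from the sys/sys/callout.h of that era (only 0x0008 and above appear in this thread).

```c
#include <stdio.h>
#include <string.h>

struct flag_name {
	int		 bit;
	const char	*name;
};

/* Flag bits; values below 0x0008 are assumptions, see the text above. */
static const struct flag_name callout_flag_names[] = {
	{ 0x0001, "LOCAL_ALLOC" },
	{ 0x0002, "ACTIVE" },
	{ 0x0004, "PENDING" },
	{ 0x0008, "MPSAFE" },
	{ 0x0010, "RETURNUNLOCKED" },
	{ 0x0020, "SHAREDLOCK" },
	{ 0x0040, "DFRMIGRATION" },
};

/* Write the names of all set flags into buf, separated by '|'. */
char *
callout_flags_str(int flags, char *buf, size_t len)
{
	size_t i, used = 0;
	int n;

	if (len == 0)
		return (buf);
	buf[0] = '\0';
	for (i = 0; i < sizeof(callout_flag_names) /
	    sizeof(callout_flag_names[0]); i++) {
		if ((flags & callout_flag_names[i].bit) == 0)
			continue;
		n = snprintf(buf + used, len - used, "%s%s",
		    used > 0 ? "|" : "", callout_flag_names[i].name);
		if (n < 0 || (size_t)n >= len - used)
			break;		/* buffer too small, output truncated */
		used += (size_t)n;
	}
	return (buf);
}
```

Under these assumptions 0x16 decodes to ACTIVE|PENDING|RETURNUNLOCKED, which matches the earlier observation that the callout still looks pending.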
 

From: dfilter@FreeBSD.ORG (dfilter service)
To: bug-followup@FreeBSD.org
Cc:  
Subject: Re: kern/166340: commit references a PR
Date: Thu,  3 May 2012 10:38:14 +0000 (UTC)

 Author: kib
 Date: Thu May  3 10:38:02 2012
 New Revision: 234952
 URL: http://svn.freebsd.org/changeset/base/234952
 
 Log:
   When callout_reset_on() cannot immediately migrate a callout since it
   is running on other cpu, the CALLOUT_PENDING flag is temporarily
   cleared. Then, callout_stop() on this, in fact active, callout fails
   because CALLOUT_PENDING is not set, and callout_stop() returns 0.
   
   Now, in sleepq_check_timeout(), the failed callout_stop() causes the
   sleepq code to execute mi_switch() without even setting the wmesg,
   since the switch-out is supposed to be transient. In fact, the thread
   is put off the CPU for full timeout interval, instead of being put on
   runq immediately.  Until timeout fires, the process is unkillable for
   obvious reasons.
   
   Fix this by marking the migrating callouts with CALLOUT_DFRMIGRATION
   flag. The flag is cleared by callout_stop_safe() when the function
   detects a migration, besides returning the success. The softclock()
   rechecks the flag for migrating callout and cancels its execution if
   the flag was cleared meantime.
   
   PR:	 misc/166340
   Reported, debugging traces provided and tested by:
   	Christian Esken <christian.esken trivago com>
   Reviewed by:	 avg, jhb
   MFC after:	 1 week
 
 Modified:
   head/sys/kern/kern_timeout.c
   head/sys/sys/callout.h
 
 Modified: head/sys/kern/kern_timeout.c
 ==============================================================================
 --- head/sys/kern/kern_timeout.c	Thu May  3 10:26:33 2012	(r234951)
 +++ head/sys/kern/kern_timeout.c	Thu May  3 10:38:02 2012	(r234952)
 @@ -645,6 +645,32 @@ softclock(void *arg)
  					cc_cme_cleanup(cc);
  
  					/*
 +					 * Handle deferred callout stops
 +					 */
 +					if ((c->c_flags & CALLOUT_DFRMIGRATION)
 +					    == 0) {
 +						CTR3(KTR_CALLOUT,
 +					"deferred cancelled %p func %p arg %p",
 +						    c, new_func, new_arg);
 +						if (cc->cc_next == c) {
 +							cc->cc_next =
 +							    TAILQ_NEXT(c,
 +							    c_links.tqe);
 +						}
 +						if (c->c_flags &
 +						    CALLOUT_LOCAL_ALLOC) {
 +							c->c_func = NULL;
 +							SLIST_INSERT_HEAD(
 +							    &cc->cc_callfree, c,
 +							    c_links.sle);
 +						}
 +						goto nextc;
 +					} else {
 +						c->c_flags &= ~
 +						    CALLOUT_DFRMIGRATION;
 +					}
 +
 +					/*
  					 * It should be assert here that the
  					 * callout is not destroyed but that
  					 * is not easy.
 @@ -659,6 +685,9 @@ softclock(void *arg)
  					panic("migration should not happen");
  #endif
  				}
 +#ifdef SMP
 +nextc:
 +#endif
  				steps = 0;
  				c = cc->cc_next;
  			}
 @@ -814,6 +843,7 @@ callout_reset_on(struct callout *c, int 
  			cc->cc_migration_ticks = to_ticks;
  			cc->cc_migration_func = ftn;
  			cc->cc_migration_arg = arg;
 +			c->c_flags |= CALLOUT_DFRMIGRATION;
  			CTR5(KTR_CALLOUT,
  		    "migration of %p func %p arg %p in %d to %u deferred",
  			    c, c->c_func, c->c_arg, to_ticks, cpu);
 @@ -984,6 +1014,12 @@ again:
  			CC_UNLOCK(cc);
  			KASSERT(!sq_locked, ("sleepqueue chain locked"));
  			return (1);
 +		} else if ((c->c_flags & CALLOUT_DFRMIGRATION) != 0) {
 +			c->c_flags &= ~CALLOUT_DFRMIGRATION;
 +			CTR3(KTR_CALLOUT, "postponing stop %p func %p arg %p",
 +			    c, c->c_func, c->c_arg);
 +			CC_UNLOCK(cc);
 +			return (1);
  		}
  		CTR3(KTR_CALLOUT, "failed to stop %p func %p arg %p",
  		    c, c->c_func, c->c_arg);
 
 Modified: head/sys/sys/callout.h
 ==============================================================================
 --- head/sys/sys/callout.h	Thu May  3 10:26:33 2012	(r234951)
 +++ head/sys/sys/callout.h	Thu May  3 10:38:02 2012	(r234952)
 @@ -46,6 +46,7 @@
  #define	CALLOUT_MPSAFE		0x0008 /* callout handler is mp safe */
  #define	CALLOUT_RETURNUNLOCKED	0x0010 /* handler returns with mtx unlocked */
  #define	CALLOUT_SHAREDLOCK	0x0020 /* callout lock held in shared mode */
 +#define	CALLOUT_DFRMIGRATION	0x0040 /* callout in deferred migration mode */
  
  struct callout_handle {
  	struct callout *callout;
 _______________________________________________
 svn-src-all@freebsd.org mailing list
 http://lists.freebsd.org/mailman/listinfo/svn-src-all
 To unsubscribe, send any mail to "svn-src-all-unsubscribe@freebsd.org"
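
The interaction described in the commit log can be modelled in user space. The following is a heavily simplified sketch, not the kernel code: struct model_callout and the model_*() functions are hypothetical, locking and per-CPU state are left out, and the values of CALLOUT_ACTIVE and CALLOUT_PENDING are assumed from sys/sys/callout.h.

```c
#define CALLOUT_ACTIVE		0x0002	/* assumed value */
#define CALLOUT_PENDING		0x0004	/* assumed value */
#define CALLOUT_DFRMIGRATION	0x0040	/* added by r234952 */

struct model_callout {
	int c_flags;
};

/*
 * callout_reset_on() while the callout runs on another CPU: the
 * migration is deferred and PENDING is temporarily cleared.
 */
void
model_defer_migration(struct model_callout *c, int with_fix)
{
	c->c_flags &= ~CALLOUT_PENDING;
	if (with_fix)
		c->c_flags |= CALLOUT_DFRMIGRATION;
}

/* callout_stop(): returns 1 if the callout was stopped, else 0. */
int
model_callout_stop(struct model_callout *c)
{
	if (c->c_flags & CALLOUT_PENDING) {
		c->c_flags &= ~(CALLOUT_ACTIVE | CALLOUT_PENDING);
		return (1);
	}
	if (c->c_flags & CALLOUT_DFRMIGRATION) {
		/* Stop arrived during a deferred migration. */
		c->c_flags &= ~CALLOUT_DFRMIGRATION;
		return (1);
	}
	return (0);
}

/*
 * softclock() side: returns 1 if the migrating callout should still
 * be rescheduled, 0 if a stop cancelled it in the meantime.
 */
int
model_softclock_migrate(struct model_callout *c)
{
	if ((c->c_flags & CALLOUT_DFRMIGRATION) == 0)
		return (0);
	c->c_flags &= ~CALLOUT_DFRMIGRATION;
	return (1);
}
```

In this model, calling model_defer_migration() with with_fix = 0 and then model_callout_stop() returns 0, which corresponds to the failed callout_stop() seen in sleepq_check_timeout(). With with_fix = 1 the stop returns 1 and model_softclock_migrate() subsequently cancels the callout.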
 

From: Christian Esken <Christian.Esken@trivago.com>
To: <bug-followup@freebsd.org>
Cc: <avg@freebsd.org>, Konstantin Belousov <kostikbel@gmail.com>
Subject: Re: misc/166340: Process under FreeBSD 9.0 hangs in uninterruptable
 sleep with apparently no syscall (empty wchan) [SOLVED]
Date: Wed, 13 Jun 2012 15:32:33 +0200

 Hello,
 
 Immediately after the fix discussed in
 http://lists.freebsd.org/pipermail/freebsd-bugs/2012-May/048610.html
 made it into 9.0-STABLE, we deployed it on various machines. The updated
 machines have been running for several weeks, and the issue has not
 happened again.
 So I can now confirm that the problem is solved. A big extra thanks for
 his continuous support goes to Konstantin. :-)
 
   Christian
 
 
State-Changed-From-To: open->closed 
State-Changed-By: linimon 
State-Changed-When: Thu Feb 28 09:32:00 UTC 2013 
State-Changed-Why:  
Submitter noted some time ago that this had been fixed. 

http://www.freebsd.org/cgi/query-pr.cgi?pr=166340 
>Unformatted:
