From nobody@FreeBSD.org  Sat Dec 24 14:17:40 2011
Return-Path: <nobody@FreeBSD.org>
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 3766B106564A
	for <freebsd-gnats-submit@FreeBSD.org>; Sat, 24 Dec 2011 14:17:40 +0000 (UTC)
	(envelope-from nobody@FreeBSD.org)
Received: from red.freebsd.org (red.freebsd.org [IPv6:2001:4f8:fff6::22])
	by mx1.freebsd.org (Postfix) with ESMTP id 072618FC14
	for <freebsd-gnats-submit@FreeBSD.org>; Sat, 24 Dec 2011 14:17:40 +0000 (UTC)
Received: from red.freebsd.org (localhost [127.0.0.1])
	by red.freebsd.org (8.14.4/8.14.4) with ESMTP id pBOEHdY1011139
	for <freebsd-gnats-submit@FreeBSD.org>; Sat, 24 Dec 2011 14:17:39 GMT
	(envelope-from nobody@red.freebsd.org)
Received: (from nobody@localhost)
	by red.freebsd.org (8.14.4/8.14.4/Submit) id pBOEHdv6011138;
	Sat, 24 Dec 2011 14:17:39 GMT
	(envelope-from nobody)
Message-Id: <201112241417.pBOEHdv6011138@red.freebsd.org>
Date: Sat, 24 Dec 2011 14:17:39 GMT
From: Oleg Ginzburg <olevole@olevole.ru>
To: freebsd-gnats-submit@FreeBSD.org
Subject: cpuset by twice kill SMP functionality
X-Send-Pr-Version: www-3.1
X-GNATS-Notify:

>Number:         163585
>Category:       kern
>Synopsis:       cpuset(1) by twice kill SMP functionality
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    freebsd-bugs
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Sat Dec 24 14:20:09 UTC 2011
>Closed-Date:    
>Last-Modified:  Tue Feb 18 20:00:00 UTC 2014
>Originator:     Oleg Ginzburg
>Release:        9.0-RC3, 10-CURRENT, 9-STABLE
>Organization:
>Environment:
FreeBSD gizmo.my.domain 9.0-PRERELEASE FreeBSD 9.0-PRERELEASE #12: Sun Dec 11 00:00:04 MSK 2011     root@gizmo.my.domain:/usr/obj/usr/src/sys/GENERIC  amd64
>Description:
The problem occurs when the CPU list of a process is set several times with cpuset(1). For example:

Before the problem (head of top(1) output):
146 processes: 1 running, 145 sleeping
CPU 0:  4.7% user,  0.0% nice,  0.4% system,  0.4% interrupt, 94.5% idle
CPU 1:  2.8% user,  0.0% nice,  0.4% system,  0.0% interrupt, 96.9% idle
CPU 2:  0.8% user,  0.0% nice,  0.0% system,  0.0% interrupt, 99.2% idle
CPU 3:  1.2% user,  0.0% nice,  0.0% system,  0.0% interrupt, 98.8% idle
CPU 4:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 5:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
Mem: 4195M Active, 287M Inact, 2699M Wired, 13M Cache, 8642M Free
Swap: 4096M Total, 4096M Free

  PID USERNAME    THR PRI NICE   SIZE    RES STATE   C   TIME   WCPU COMMAND
 3629 oleg          4  20    0   321M 54816K select  0   0:47  1.46% psi
 3122 oleg          1  20    0   928M 51348K select  1   1:26  1.07% Xorg

After a "Resource deadlock avoided" error occurs, SMP becomes unusable:
all new processes are scheduled on a single core (here, make -j6 -C /usr/src buildworld):

177 processes: 8 running, 169 sleeping
CPU 0:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 1:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 2:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 3:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 4:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 5: 74.4% user,  0.0% nice, 22.4% system,  0.4% interrupt,  2.8% idle
Mem: 4393M Active, 300M Inact, 2759M Wired, 13M Cache, 8371M Free
Swap: 4096M Total, 4096M Free

  PID USERNAME    THR PRI NICE   SIZE    RES STATE   C   TIME   WCPU COMMAND
30442 root          1  72    0 63204K 48324K RUN     5   0:00  2.59% cc1plus
30448 root          1  72    0 61156K 47264K RUN     5   0:00  2.29% cc1plus
 3629 oleg          4  20    0   321M 54880K select  5   0:55  2.20% psi
30452 root          1  72    0 60132K 44768K RUN     5   0:00  2.10% cc1plus
30454 root          1  72    0 49868K 38392K RUN     5   0:00  2.10% cc1plus

 3122 oleg          1  21    0   928M 51320K select  5   1:37  0.88% Xorg
30455 root          1  52    0  6280K  3992K piperd  5   0:00  0.29% as
 
 (after cpuset was used to pin Xorg to CPU 5)
 
 This leaves one core 100% busy while the other cores are 100% idle.
 The situation cannot be corrected without a reboot.
>How-To-Repeat:
Run cpuset a few times against a running process:
cpuset -l N -p <PID>

>Fix:
>Release-Note:
>Audit-Trail:

From: Ivan Klymenko <fidaj@ukr.net>
To: Oleg Ginzburg <olevole@olevole.ru>
Cc: freebsd-gnats-submit@FreeBSD.org
Subject: Re: kern/163585: cpuset by twice kill SMP functionality
Date: Sat, 24 Dec 2011 16:25:41 +0200

 > It leads to a situation when one core is 100 % occupied, the others
 > core - 100% idle. There is no possibility to correct a situation
 > without reboot
 
 This also applies to FreeBSD 10.0-CURRENT, with both schedulers
 (ULE and 4BSD).

From: Ivan Klymenko <fidaj@ukr.net>
To: bug-followup@FreeBSD.org, olevole@olevole.ru
Cc:  
Subject: Re: kern/163585: cpuset(1) by twice kill SMP functionality
Date: Tue, 8 May 2012 13:30:00 +0300

 We should check whether the situation has changed since the changes to
 the ULE scheduler.

From: Andrey Chernov <ache@freebsd.org>
To: fidaj@ukr.net, bug-followup@FreeBSD.org
Cc:  
Subject: Re: kern/163585: cpuset(1) by twice kill SMP functionality
Date: Wed, 15 Jan 2014 03:53:28 +0400

 I now see the same behavior on 9.2-STABLE, but only with SCHED_ULE;
 SCHED_4BSD appears to work.
 See http://lists.freebsd.org/pipermail/freebsd-stable/2014-January/076894.html
 (cpuminer calls cpuset_setaffinity(CPU_LEVEL_WHICH, CPU_WHICH_CPUSET),
 so a few restarts show the bug)
 

From: John Baldwin <jhb@freebsd.org>
To: bug-followup@freebsd.org,
 olevole@olevole.ru
Cc:  
Subject: Re: kern/163585: cpuset(1) by twice kill SMP functionality
Date: Tue, 18 Feb 2014 14:49:25 -0500

 I suspect you were using 'cpuset -c -l N -p <PID>' rather than
 'cpuset -l N -p <PID>' in which case this is working as designed.  When you 
 use '-c', you change the mask of the global cpuset that the process belongs 
 to, not the mask that is private to the process itself.
 
 The cpuminer bug referenced above was due to exactly this: it was using
 CPU_WHICH_CPUSET incorrectly, which is equivalent to using cpuset -c.
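 
 To illustrate the difference between the two masks, here is a toy model
 in Python. It is only a sketch of the semantics described above, not
 kernel code: the names SharedCpuset, Proc, set_private and set_shared
 are hypothetical, and treating a new process as simply joining the same
 shared set is a simplifying assumption. set_private stands in for
 'cpuset -l N -p <PID>' and set_shared for 'cpuset -c -l N -p <PID>'.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass
class SharedCpuset:
    """Hypothetical stand-in for a cpuset shared by many processes."""
    mask: Set[int]


@dataclass
class Proc:
    """Hypothetical process: member of a shared set, optional private mask."""
    pid: int
    cpuset: SharedCpuset
    private_mask: Optional[Set[int]] = None

    def allowed(self) -> Set[int]:
        """CPUs the process may use: private mask limited by the shared set."""
        if self.private_mask is None:
            return set(self.cpuset.mask)
        return self.private_mask & self.cpuset.mask


def set_private(proc: Proc, cpus: Set[int]) -> None:
    """Models 'cpuset -l N -p PID': only this process is affected."""
    proc.private_mask = set(cpus)


def set_shared(proc: Proc, cpus: Set[int]) -> None:
    """Models 'cpuset -c -l N -p PID': every member of the set is affected."""
    proc.cpuset.mask = set(cpus)


def demo():
    default_set = SharedCpuset(set(range(6)))  # six CPUs, as in the report
    xorg = Proc(3122, default_set)
    psi = Proc(3629, default_set)

    # Private mask: Xorg is pinned, psi keeps all six CPUs.
    set_private(xorg, {5})
    after_private = {"xorg": sorted(xorg.allowed()),
                     "psi": sorted(psi.allowed())}

    # Shared mask: the set itself shrinks, so psi and any process that
    # later joins the same set (assumption: cc1plus does) are confined too.
    set_shared(xorg, {5})
    cc1plus = Proc(30442, default_set)
    after_shared = {"xorg": sorted(xorg.allowed()),
                    "psi": sorted(psi.allowed()),
                    "cc1plus": sorted(cc1plus.allowed())}
    return after_private, after_shared


if __name__ == "__main__":
    private_result, shared_result = demo()
    print("after private change:", private_result)
    print("after shared change: ", shared_result)
```

 In this model the second call confines every member of the set to CPU 5,
 which matches the top output in the original report.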
 
 -- 
 John Baldwin
>Unformatted:
