From nobody@FreeBSD.org  Sat Oct 30 23:07:12 2004
Return-Path: <nobody@FreeBSD.org>
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id AB9F616A4CE
	for <freebsd-gnats-submit@FreeBSD.org>; Sat, 30 Oct 2004 23:07:12 +0000 (GMT)
Received: from www.freebsd.org (www.freebsd.org [216.136.204.117])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 9D0B543D46
	for <freebsd-gnats-submit@FreeBSD.org>; Sat, 30 Oct 2004 23:07:12 +0000 (GMT)
	(envelope-from nobody@FreeBSD.org)
Received: from www.freebsd.org (localhost [127.0.0.1])
	by www.freebsd.org (8.12.11/8.12.11) with ESMTP id i9UN7CdW045296
	for <freebsd-gnats-submit@FreeBSD.org>; Sat, 30 Oct 2004 23:07:12 GMT
	(envelope-from nobody@www.freebsd.org)
Received: (from nobody@localhost)
	by www.freebsd.org (8.12.11/8.12.11/Submit) id i9UN7Cg0045288;
	Sat, 30 Oct 2004 23:07:12 GMT
	(envelope-from nobody)
Message-Id: <200410302307.i9UN7Cg0045288@www.freebsd.org>
Date: Sat, 30 Oct 2004 23:07:12 GMT
From: "J. Porter Clark" <jpc@drum.msfc.nasa.gov>
To: freebsd-gnats-submit@FreeBSD.org
Subject: top shows NICE as -111 on processes started by idprio
X-Send-Pr-Version: www-2.3

>Number:         73328
>Category:       kern
>Synopsis:       [patch] top(1) shows NICE as -111 on processes started by idprio
>Confidential:   no
>Severity:       non-critical
>Priority:       low
>Responsible:    edwin
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Sat Oct 30 23:10:28 GMT 2004
>Closed-Date:    
>Last-Modified:  Thu Sep 25 22:01:25 UTC 2008
>Originator:     J. Porter Clark
>Release:        5.3-STABLE
>Organization:
NASA MSFC
>Environment:
FreeBSD hammer.msfc.nasa.gov 5.3-STABLE FreeBSD 5.3-STABLE #0: Thu Oct 28 08:40:49 CDT 2004     jpc@hammer.msfc.nasa.gov:/usr/obj/usr/src/sys/HAMMER  i386     
>Description:
      The "top" program shows a NICE value of -111 for programs started by
idprio 31.  On 4.X, the "correct" value of 52 is shown.
>How-To-Repeat:
      As root, run "idprio 31 sleep 500 &".
Then run "top -Uroot".  The sleep process is listed as having a NICE value
of -111.  Other values besides 31 produce the same result.
>Fix:
      My guess is that it's probably in /usr/src/usr.bin/top/machine.c about line 747 or so, but I haven't had time to dig into it.
>Release-Note:
>Audit-Trail:

From: Bruce Evans <bde@zeta.org.au>
To: "J. Porter Clark" <jpc@drum.msfc.nasa.gov>
Cc: freebsd-gnats-submit@FreeBSD.org, freebsd-i386@FreeBSD.org
Subject: Re: i386/73328: top shows NICE as -111 on processes started by idprio
Date: Sun, 31 Oct 2004 23:15:43 +1100 (EST)

 On Sat, 30 Oct 2004, J. Porter Clark wrote:
 
 > >Description:
 >       The "top" program shows a NICE value of -111 for programs started by
 > idprio 31.  On 4.X, the "correct" value of 52 is shown.
 > >How-To-Repeat:
 >       As root, run "idprio 31 sleep 500 &".
 > Then run "top -Uroot".  The sleep process is listed as having a NICE value
 > of -111.  Other values besides 31 produce the same result.
 > >Fix:
 >       My guess is that it's probably in /usr/src/usr.bin/top/machine.c about line 747 or so, but I haven't had time to dig into it.
 
 I use this fix.  It may be out of date, and the comments about the "base"
 priority are too verbose and not quite right.
 
 %%%
 Index: machine.c
 ===================================================================
 RCS file: /home/ncvs/src/usr.bin/top/machine.c,v
 retrieving revision 1.51
 diff -u -2 -r1.51 machine.c
 --- machine.c	6 Jun 2004 19:59:06 -0000	1.51
 +++ machine.c	7 Jun 2004 04:37:20 -0000
 @@ -573,18 +573,67 @@
  	    smpmode ? smp_Proc_format : up_Proc_format,
  	    pp->ki_pid,
 -	    namelength, namelength,
 -	    (*get_userid)(pp->ki_ruid),
 -	    pp->ki_pri.pri_level - PZERO,
 -
 -	    /*
 -	     * normal time      -> nice value -20 - +20
 -	     * real time 0 - 31 -> nice value -52 - -21
 -	     * idle time 0 - 31 -> nice value +21 - +52
 +	    namelength, namelength, (*get_userid)(pp->ki_ruid),
 +  	    pp->ki_pri.pri_level - PZERO,
 +	    /*-
 +	     * Mapping from various disorganized priority schemes to ordered
 +	     * pseudo-nice values:
 +	     *
 +	     * interrupt thread        base pri  0 - 63   -> nice -180 - -117
 +	     * top half kernel thread  base pri  64 - 127 -> nice -116 -  -53
 +	     * realtime user threads   rtprio    0 - 31   -> nice  -52 -  -21
 +	     * normal user threads     nice     -20 - +20 -> nice  -20 -  +20
 +	     * idle user threads       idprio    0 - 31   -> nice  +21 -  +52
 +	     *
 +	     * The number of interest is really the "base" priority of the
 +	     * process, not the niceness of the process directly.  The base
 +	     * priority should be what is is td->td_base_pri in the kernel,
 +	     * which is ki_pri.pri_native here.  In practice, that can't
 +	     * be used directly and the workarounds are complicated because
 +	     * of the following bugs:
 +	     *     o td->td_base_pri is changed by priority propagation and
 +	     *       not even restored.  Thus it cannot be used to determine
 +	     *       the priority class.  The other priorities in k_pri can
 +	     *       be used for this, but they are set inconsistently too so
 +	     *       there is no one place that determines the correct base
 +	     *       priority.
 +	     *     o td->td_base_pri is not set to a useful value for normal
 +	     *       user threads.  It is initialized to 0 and only changed
 +	     *       by priority propagation.  Workaround: use the actual
 +	     *       nice value for the "base priority" of normal user
 +	     *       threads.
 +	     *     o kg->kg_user_pri (pri_user here) is not set to a useful
 +	     *       value for kernel threads.  It is initialized to PUSER
 +	     *       and never changed.  Something like it should be used
 +	     *       for all classes of threads to hold the previous priority
 +	     *       during priority propagation.  Then there might not need
 +	     *       to be a special variable for the user -> kernel
 +	     *       transitions (which are a type of priority propagation).
 +	     *       I think a stack of such variables is needed in general
 +	     *       though -- kg->kg_user_pri is special because it is at
 +	     *       the top.
 +	     *
 +	     * We scale the base priority so that it agrees with the
 +	     * historical nice value for normal user threads, although this
 +	     * gives negative numbers for higher priority threads.
 +	     *
 +	     * PRI_BASE() strips the fifo scheduling bit from the priority
 +	     * class.  This is not relevant for the conversion to niceness,
 +	     * but it should be shown somewhere other as a raw number in
 +	     * an abnormal ps format.  We don't use PRI_IS_REALTIME()
 +	     * because there is no corresponding classification macro for
 +	     * non-realtime priority classes and the details are too
 +	     * messy to be hidden in macros.
 +	     *
 +	     * KNF indent -ci4 is intentionally violated here.
  	     */
 -	    (pp->ki_pri.pri_class ==  PRI_TIMESHARE ?
 -	    	pp->ki_nice - NZERO :
 -	    	(PRI_IS_REALTIME(pp->ki_pri.pri_class) ?
 -		    (PRIO_MIN - 1 - (PRI_MAX_REALTIME - pp->ki_pri.pri_level)) :
 -		    (PRIO_MAX + 1 + pp->ki_pri.pri_level - PRI_MIN_IDLE))),
 +	    PRI_BASE(pp->ki_pri.pri_class) == PRI_ITHD ?
 +		PRIO_MIN + (pp->ki_pri.pri_native - PRI_MIN_TIMESHARE) :
 +	    PRI_BASE(pp->ki_pri.pri_class) == PRI_REALTIME ?
 +		PRIO_MIN + (pp->ki_pri.pri_user - PRI_MIN_TIMESHARE) :
 +	    PRI_BASE(pp->ki_pri.pri_class) == PRI_TIMESHARE ?
 +		pp->ki_nice - NZERO :
 +	    PRI_BASE(pp->ki_pri.pri_class) == PRI_IDLE ?
 +		PRIO_MAX + 1 + (pp->ki_pri.pri_user - PRI_MIN_IDLE) :
 +	    666,
  	    format_k2(PROCSIZE(pp)),
  	    format_k2(pagetok(pp->ki_rssize)),
 %%%
 
 This area is broken in ps too.  The most obvious ones are:
 
 - My ntpd process (which has realtime priority 0 and is correctly
   displayed by top as having "nice" -52) is displayed by `ps -o rtprio'
   as having priority "real:12".  The bogus 12 is just ntpd's current
   priority less PZERO.  This bug is the same as one of the ones fixed
   above.  It is that pri_level gives the current priority so it gives
   a wrong value to subtract from when the process is running at an
   elevated priority in kernel mode.  top and ps seem to get this wrong
   in RELENG_4 too.
 
 - ps.1 says that `-o rtprio' causes a display of "101" for non-rtprio
   processes, but the actual display is "normal" for normal ones and
   a "%u.%u" format for unknown ones.  The man page became inconsistent
   with the code about "101" back in 1998 in rev.1.26 of ps/print.c.
   I think unknown cases occur for at least POSIX scheduling classes.
 
 Bruce

From: "J. Porter Clark" <j.porter.clark@nasa.gov>
To: Bruce Evans <bde@zeta.org.au>
Cc: freebsd-gnats-submit@FreeBSD.org, freebsd-i386@FreeBSD.org
Subject: Re: i386/73328: top shows NICE as -111 on processes started by idprio
Date: Sun, 31 Oct 2004 13:48:45 -0600

 On Sun, Oct 31, 2004 at 11:15:43PM +1100, Bruce Evans wrote:
 > 
 > I use this fix.  It may be out of date, and the comments about the "base"
 > priority are too verbose and not quite right.
 
 Seems to work, offset by a couple hundred lines.  Thanks!
 
 > This area is broken in ps too.  The most obvious ones are:
 
 Yes, I had noticed that ps's output is somewhat goofy in these
 cases.
 
 -- 
 J. Porter Clark           j.porter.clark@nasa.gov
 NASA/MSFC Flight & Ground Computers Branch (EI31)
Responsible-Changed-From-To: freebsd-i386->freebsd-bugs 
Responsible-Changed-By: linimon 
Responsible-Changed-When: Tue Apr 12 00:27:28 GMT 2005 
Responsible-Changed-Why:  
This does not sound i386-specific. 

http://www.freebsd.org/cgi/query-pr.cgi?pr=73328 
State-Changed-From-To: open->feedback 
State-Changed-By: pav 
State-Changed-When: Sat May 12 09:35:28 UTC 2007 
State-Changed-Why:  
Can you retry with top(1) from -CURRENT? It shows 
PRI 171, NICE i31 (idprio 31) 
for my processes, which looks correct. 


Responsible-Changed-From-To: freebsd-bugs->pav 
Responsible-Changed-By: pav 
Responsible-Changed-When: Sat May 12 09:35:28 UTC 2007 
Responsible-Changed-Why:  
Take replies 

http://www.freebsd.org/cgi/query-pr.cgi?pr=73328 

From: Bruce Evans <bde@zeta.org.au>
To: Pav Lucistnik <pav@freebsd.org>
Cc: jpc@drum.msfc.nasa.gov, freebsd-gnats-submit@freebsd.org
Subject: Re: kern/73328: top shows NICE as -111 on processes started by idprio
Date: Sun, 13 May 2007 00:01:03 +1000 (EST)

 On Sat, 12 May 2007, Pav Lucistnik wrote:
 
 > State-Changed-Why:
 > Can you retry with top(1) from -CURRENT? It shows
 > PRI 171, NICE i31 (idprio 31)
 > for my processes, which looks correct.
 
 I fixed some of the bugs near here -current, but not all of them.  The
 following may still be needed for a complete fix:
 
 % Index: machine.c
 % ===================================================================
 % RCS file: /home/ncvs/src/usr.bin/top/machine.c,v
 % retrieving revision 1.78
 % diff -u -2 -r1.78 machine.c
 % --- machine.c	7 Nov 2006 10:03:10 -0000	1.78
 % +++ machine.c	18 Nov 2006 14:36:55 -0000
 % @@ -814,5 +814,26 @@
 %  		return ("-");
 %  	case PRI_REALTIME:
 % -		rtpri = pp->ki_pri.pri_level - PRI_MIN_REALTIME;
 % +		/*
 % +		 * XXX: the kernel doesn't tell us the original rtprio and
 % +		 * doesn't really know what it was, so to recover it we
 % +		 * must be more chummy with the implementation than the
 % +		 * implementation is with itself.  pri_user gives a
 % +		 * constant "base" priority, but is only initialized
 % +		 * properly for user threads.  pri_native gives what the
 % +		 * kernel calls the "base" priority, but it isn't constant
 % +		 * since it is changed by priority propagation.  pri_native
 % +		 * also isn't properly intialized for all threads, but it
 % +		 * is properly initialized for kernel realtime and idletime
 % +		 * threads.  Thus we use pri_user for the base priority of
 % +		 * user threads (it is always correct) and pri_native for
 % +		 * the base priority of kernel realtime and idletime threads
 % +		 * (there is nothing better, and it is usually correct).
 % +		 *
 % +		 * The field width and thus the buffer are too small for
 % +		 * values like "kr31F", but such values shouldn't occur,
 % +		 * and if they do then the tailing "F" is not displayed.
 % +		 */
 % +		rtpri = ((pp->ki_flag & P_KTHREAD) ? pp->ki_pri.pri_native :
 % +		    pp->ki_pri.pri_user) - PRI_MIN_REALTIME;
 %  		snprintf(nicebuf, sizeof(nicebuf), "%sr%d%s",
 %  		    kthread, rtpri, fifo);
 % @@ -824,5 +845,7 @@
 %  		break;
 %  	case PRI_IDLE:
 % -		rtpri = pp->ki_pri.pri_level - PRI_MIN_IDLE;
 % +		/* XXX: as above. */
 % +		rtpri = ((pp->ki_flag & P_KTHREAD) ? pp->ki_pri.pri_native :
 % +		    pp->ki_pri.pri_user) - PRI_MIN_IDLE;
 %  		snprintf(nicebuf, sizeof(nicebuf), "%si%d%s",
 %  		    kthread, rtpri, fifo);
 
 This hasn't been committed since it is extremely ugly and I've been hoping
 for many years that Someone would fix the kernel so that it is not needed.
 
 ps(1) has related bugs: partial output of top(1) with the above fix:
 
 %   PID USERNAME  THR PRI NICE   SIZE    RES STATE    TIME   WCPU COMMAND
 %    10 root        1 171 ki31     0K     8K RUN    324.7H 99.02% idle
 %    11 root        1 -32    -     0K     8K WAIT     4:29  0.00% swi4: clock cy
 %    26 root        1 171 ki31     0K     8K pgzero   0:10  0.00% pagezero
 %   537 root        1  44   r0   836K   480K select   0:09  0.00% ntpd
 %    27 root        1 171 ki31     0K     8K pollid   0:00  0.00% idlepoll
 
 Output of ps jx -o rtprio for the corresponding processes:
 
 % USER   PID  PPID  PGID   SID JOBC STAT  TT       TIME COMMAND           RTPRIO
 % root    10     0     0     0    0 RL    ??  19486:11.06 [idle]           idle:25
 % root    11     0     0     0    0 WL    ??    4:28.78 [swi4: clock cy  intr:52
 % root    26     0     0     0    0 DL    ??    0:09.57 [pagezero]       idle:25
 % root    27     0     0     0    0 DL    ??    0:00.30 [idlepoll]       idle:25
 % root   537     1   537   537    0 Ss    ??    0:09.36 /usr/sbin/ntpd - real:12
 
 This shows ps mangling idle priority 31 to 25 and realtime priority 0 to 12.
 
 This shows ps getting the interrupt priority more or less correct.
 Both top and ps print pri_level, which is problematic as partly explained
 above since it is not always the same as the "base" priority; top
 obfuscates all priorities by subtracting PZERO = 84, so priority 52
 becomes -32.  It isn't clear how the interrupt priority should be
 displayed in either top; top simply doesn't translate pri_level to an
 interrupt pseudo-priority class, so its PRI column gives the class as
 a priority modulo the race and the PZERO obfuscation, while ps prints
 the priority in the class column without translation.
 
 This doesn't show ps not understanding the priority class code for FIFO
 scheduling.
 
 Bruce
State-Changed-From-To: feedback->open 
State-Changed-By: pav 
State-Changed-When: Sat May 12 14:18:09 UTC 2007 
State-Changed-Why:  
Looks like there are still some loose ends 


Responsible-Changed-From-To: pav->freebsd-bugs 
Responsible-Changed-By: pav 
Responsible-Changed-When: Sat May 12 14:18:09 UTC 2007 
Responsible-Changed-Why:  
Looks like there are still some loose ends 

http://www.freebsd.org/cgi/query-pr.cgi?pr=73328 
Responsible-Changed-From-To: freebsd-bugs->edwin 
Responsible-Changed-By: edwin 
Responsible-Changed-When: Thu Sep 25 22:01:13 UTC 2008 
Responsible-Changed-Why:  
Grab it in the upgrade to 3.8b1 

http://www.freebsd.org/cgi/query-pr.cgi?pr=73328 
>Unformatted:
