From netch@lucky.net  Tue Jun 20 08:57:39 2000
Return-Path: <netch@lucky.net>
Received: from burka.carrier.kiev.ua (burka.carrier.kiev.ua [193.193.193.107])
	by hub.freebsd.org (Postfix) with ESMTP id 5FBF937BF1D
	for <FreeBSD-gnats-submit@freebsd.org>; Tue, 20 Jun 2000 08:57:23 -0700 (PDT)
	(envelope-from netch@lucky.net)
Received: from netch@localhost
	by burka.carrier.kiev.ua  id SWA63293;
	Tue, 20 Jun 2000 18:57:16 +0300 (EEST)
	(envelope-from netch)
Message-Id: <200006201557.SWA63293@burka.carrier.kiev.ua>
Date: Tue, 20 Jun 2000 18:57:16 +0300 (EEST)
From: netch@segfault.kiev.ua (Valentin Nechayev)
Sender: netch@lucky.net
Reply-To: netch@segfault.kiev.ua
To: FreeBSD-gnats-submit@freebsd.org
Subject: Signals 127 and 128 cannot be detected normally in wait4() interface
X-Send-Pr-Version: 3.2

>Number:         19402
>Category:       kern
>Synopsis:       Signals 127 and 128 cannot be detected in wait4() interface
>Confidential:   no
>Severity:       non-critical
>Priority:       low
>Responsible:    freebsd-bugs
>State:          suspended
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Tue Jun 20 09:00:01 PDT 2000
>Closed-Date:    
>Last-Modified:  Sun May  6 07:10:08 UTC 2012
>Originator:     Valentin Nechayev <netch@netch.kiev.ua>
>Release:        FreeBSD 4.0
>Organization:
Kiev
>Environment:

FreeBSD 4.0; possibly, 5.0 also

>Description:

The wait4() syscall and the libc routines wait3() and waitpid() return the
status of a terminated or stopped process, in an encoded format, through the
"status" parameter passed by pointer. The standard header <sys/wait.h>
provides macros for decoding it. Some of them are:

#define _WSTATUS(x)     (_W_INT(x) & 0177)
#define _WSTOPPED       0177            /* _WSTATUS if process is stopped */
#define WIFSTOPPED(x)   (_WSTATUS(x) == _WSTOPPED)
#define WIFSIGNALED(x)  (_WSTATUS(x) != _WSTOPPED && _WSTATUS(x) != 0)

But FreeBSD 4 and 5 have a signal numbered 127, and termination by this
signal is mistaken for stopping. Termination by signal 128 is mistaken for
a normal exit with the core dump flag set and no signal.
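
The collision can be reproduced without a FreeBSD host. The following is a
minimal sketch, not the system header: the PR_* macros are local, renamed
copies. The first four mirror the definitions quoted above; PR_WCOREFLAG,
PR_WIFEXITED, PR_WTERMSIG and PR_WCOREDUMP are assumed from the traditional
BSD <sys/wait.h>. The helpers pr_status_for_termsig() and pr_describe() are
hypothetical names introduced only for this illustration.

```c
#include <stdio.h>

/* Local copies of the decoding macros, prefixed with PR_ so they do not
 * clash with the real <sys/wait.h>. */
#define PR_WSTATUS(x)     ((x) & 0177)
#define PR_WSTOPPED       0177   /* PR_WSTATUS value if process is stopped */
#define PR_WCOREFLAG      0200   /* assumed: core dump flag bit */
#define PR_WIFSTOPPED(x)  (PR_WSTATUS(x) == PR_WSTOPPED)
#define PR_WIFSIGNALED(x) (PR_WSTATUS(x) != PR_WSTOPPED && PR_WSTATUS(x) != 0)
#define PR_WIFEXITED(x)   (PR_WSTATUS(x) == 0)
#define PR_WTERMSIG(x)    (PR_WSTATUS(x))
#define PR_WCOREDUMP(x)   (((x) & PR_WCOREFLAG) != 0)

/* Assumption: a process killed by `sig` without a core dump gets a status
 * word whose low byte is simply the signal number. */
static int pr_status_for_termsig(int sig)
{
        return sig & 0377;
}

/* Print how the macros classify the status produced by signal `sig`. */
static void pr_describe(int sig)
{
        int st = pr_status_for_termsig(sig);

        printf("sig %d: status 0x%X exited=%d stopped=%d signaled=%d "
            "termsig=%d core=%d\n", sig, st,
            PR_WIFEXITED(st), PR_WIFSTOPPED(st), PR_WIFSIGNALED(st),
            PR_WTERMSIG(st), PR_WCOREDUMP(st));
}
```

Calling pr_describe() with 126, 127 and 128 shows that only 126 is
classified as signaled; 127 lands on the "stopped" pattern and 128 leaves
the 7-bit signal field empty with only the core flag bit set.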

>How-To-Repeat:

Compile the following two test programs:

=== cut si.c ===
#include <sys/types.h>
#include <unistd.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
int main( int argc, char *argv[] )
{
        int sig;
        sig = strtol( argv[1], NULL, 0 );
        signal( sig, SIG_DFL );
        kill( getpid(), sig );
        printf( "si: trace: after kill\n" );
        return 0;
}
=== end cut ===

=== cut sic.c ===
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>     /* fork(), execl() */
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
int main( int argc, char *argv[] )
{
        int rr;
        pid_t cpid;
#if 0
        char bb[1000];
        snprintf( bb, sizeof bb, "./si %s", argv[1] );
        rr = system( bb );
#endif
        cpid = fork();
        if( cpid == -1 ) { fprintf( stderr, "fork(): failed\n" ); return 2; }
        if( cpid == 0 )
                execl( "./si", "./si", argv[1], NULL );
        else
                waitpid( cpid, &rr, 0 );
        printf( "rr==%d==0x%X\n", rr, rr );
        printf( "Exited: %s\n", WIFEXITED(rr) ? "yes" : "no" );
        printf( "Stopped: %s\n", WIFSTOPPED(rr) ? "yes" : "no" );
        printf( "Signaled: %s\n", WIFSIGNALED(rr) ? "yes" : "no" );
        printf( "Exit status: %d\n", WEXITSTATUS(rr) );
        printf( "Stop sig: %d\n", WSTOPSIG(rr) );
        printf( "Term sig: %d\n", WTERMSIG(rr) );
        printf( "Coredumped: %s\n", WCOREDUMP(rr) ? "yes" : "no" );
        return 0;
}
=== end cut ===

Compile them:
cc -o si si.c
cc -o sic sic.c

and run "sic 126", "sic 127" and "sic 128". The resulting output is attached:

=== cut result log ===
netch@ox:~/tmp>./sic 126
rr==126==0x7E
Exited: no
Stopped: no
Signaled: yes
Exit status: 0
Stop sig: 0
Term sig: 126
Coredumped: no
netch@ox:~/tmp>./sic 127
rr==127==0x7F
Exited: no
Stopped: yes
Signaled: no
Exit status: 0
Stop sig: 0
Term sig: 127
Coredumped: no
netch@ox:~/tmp>./sic 128
rr==128==0x80
Exited: yes
Stopped: no
Signaled: no
Exit status: 0
Stop sig: 0
Term sig: 0
Coredumped: yes
=== end cut ===

With signal 127, WIFSTOPPED() is true.
With signal 128, WCOREDUMP() is true and WIFEXITED() is true. ;(
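
The two misread status words are patterns that should not otherwise occur:
a "stopped" status with stop signal 0, and an "exited" status with the core
flag set. The sketch below uses that observation to recover the signal
number. It is only a heuristic written for illustration, not something the
kernel or libc provides, and pr_guess_termsig() is a hypothetical name. It
assumes the status layout shown by the macros in the Description (signal in
the low 7 bits, core flag in bit 0200, stop signal or exit code in the next
byte).

```c
/* Heuristic: return the terminating signal for a raw wait status,
 * or 0 if the process exited normally or is genuinely stopped. */
static int pr_guess_termsig(int status)
{
        int low7 = status & 0177;          /* termination signal field */
        int high = (status >> 8) & 0377;   /* stop signal or exit code */

        if (low7 == 0177 && high == 0)
                return 127;     /* "stopped by signal 0" cannot be real */
        if (low7 == 0 && (status & 0200) != 0)
                return 128;     /* a normal exit never sets the core flag */
        if (low7 != 0 && low7 != 0177)
                return low7;    /* ordinary termination by signal */
        return 0;               /* normal exit, or a real stop */
}
```

With the three status words from the log, pr_guess_termsig(0x7E),
pr_guess_termsig(0x7F) and pr_guess_termsig(0x80) give 126, 127 and 128.
Callers using only the standard macros would still be confused, so this
does not replace a fix in the interface.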

Also another test:

netch@ox:~/tmp>./si 127
[1]+  Stopped                 ./si 127
netch@ox:~/tmp>fg
./si 127

and in this case bash falls into an infinite loop on waitpid(), consuming
all available CPU. Of course, this is an ugly bash bug, but it is caused by
the kernel interface inconsistency.

Version of system on testing host:

netch@ox:~>uname -mrs
FreeBSD 4.0-STABLE i386
netch@ox:~>fgrep __FreeBSD_version /usr/include/sys/param.h
#undef __FreeBSD_version
#define __FreeBSD_version 400019        /* Master, propagated to newvers */
netch@ox:~>

>Fix:
	
As a quick-and-dirty fix, disable signals 127-128 altogether (decrease the
value of _SIG_MAXSIG by 2; the new value should be 126).

As a proper fix, change the kernel interface (the wait4() syscall).
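
Why 126 is the right ceiling for the quick fix can be checked mechanically.
The sketch below assumes the 7-bit signal field from the macros in the
Description; pr_termsig_is_unambiguous() and pr_highest_safe_signal() are
hypothetical helper names, not existing kernel or libc functions.

```c
/* Return 1 if termination by `sig` can be stored in the 7-bit signal
 * field without colliding with "exited" (0) or "stopped" (0177). */
static int pr_termsig_is_unambiguous(int sig)
{
        if (sig < 1 || sig > 0177)
                return 0;               /* does not fit in 7 bits at all */
        return (sig & 0177) != 0177;    /* 0177 is reserved for "stopped" */
}

/* Scan downwards from `maxsig` for the highest signal that is safe. */
static int pr_highest_safe_signal(int maxsig)
{
        int sig;

        for (sig = maxsig; sig >= 1; sig--)
                if (pr_termsig_is_unambiguous(sig))
                        return sig;
        return 0;
}
```

Starting from the current maximum, pr_highest_safe_signal(128) skips 128
and 127 and returns 126.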

--
NVA

>Release-Note:
>Audit-Trail:
State-Changed-From-To: open->feedback 
State-Changed-By: mike 
State-Changed-When: Sat Jul 21 21:09:54 PDT 2001 
State-Changed-Why:  

Does this problem still occur in newer versions of FreeBSD, 
such as 4.3-RELEASE? 

http://www.FreeBSD.org/cgi/query-pr.cgi?pr=19402 

From: Mike Barcroft <mike@FreeBSD.org>
To: freebsd-gnats-submit@FreeBSD.org
Cc:  
Subject: Re: kern/19402: Signals 127 and 128 cannot be detected in wait4() interface
Date: Sun, 22 Jul 2001 13:51:19 -0400

 Adding to Audit-Trail.
 
 On Sun, Jul 22, 2001 at 09:54:00AM +0300, Valentin Nechayev wrote:
 >  Sat, Jul 21, 2001 at 21:10:23, mike wrote about "Re: kern/19402: Signals 127 and 128 cannot be detected in wait4() interface": 
 > 
 > > Synopsis: Signals 127 and 128 cannot be detected in wait4() interface
 > > 
 > > State-Changed-From-To: open->feedback
 > > State-Changed-By: mike
 > > State-Changed-When: Sat Jul 21 21:09:54 PDT 2001
 > > State-Changed-Why: 
 > > 
 > > Does this problem still occur in newer versions of FreeBSD,
 > > such as 4.3-RELEASE?
 > 
 > Yes, it still occurs. Nobody has changed the macros in <sys/wait.h> to
 > resolve this conflict, neither in RELENG_4 nor in HEAD.
 > 
 > I can write a proposal (in the form of a patch) for how they should be
 > changed (it would use the fact that the wait4() status is 32 bits wide
 > but only the low 16 bits are used), but this would be an ABI change,
 > incompatible for signals 64...128 if bit shifts are used, and only for
 > signal 128 in the more expensive multiply/divide variant.
 > Yet another variant is to exclude signals 127 and 128; AFAIU this
 > variant conflicts with POSIX.
 > 
 > Another point of view is that this problem is mostly architectural and
 > should first be discussed in -arch or -hackers, not in -bugs, and that
 > the problem is too complicated to fit within the confines of the gnats
 > db. But IMO this does _not_ mean the PR should be closed, because the
 > problem remains.
 > 
 > 
 > /netch
State-Changed-From-To: feedback->suspended 
State-Changed-By: mike 
State-Changed-When: Sun Jul 22 11:03:04 PDT 2001 
State-Changed-Why:  

This is still a problem.  See the originator's comments in the 
Audit-Trail.  Awaiting fix and committer. 

http://www.FreeBSD.org/cgi/query-pr.cgi?pr=19402 

From: Jilles Tjoelker <jilles@stack.nl>
To: bug-followup@FreeBSD.org, netch@segfault.kiev.ua
Cc: bde@freebsd.org
Subject: Re: kern/19402: Signals 127 and 128 cannot be detected in wait4()
 interface
Date: Mon, 30 Apr 2012 00:46:19 +0200

 > [problems with signals 127 and 128]
 
 First, note that "clean" programs cannot use signals 127 and 128 because
 they neither have a SIG* constant nor are in the range SIGRTMIN to
 SIGRTMAX. Therefore, I think it is inappropriate to make large changes
 to make them work. It suffices if wait4() and other interfaces cannot
 cause confusion.
 
 Because sh returns exit status 128+sig for signal sig, signal 128 cannot
 be represented in an 8-bit exit status and would have to be aliased to
 another signal if it is kept.
 
 The suggestion to modify wait4() ABI seems inappropriate for that
 reason.
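
The arithmetic behind the 128+sig remark, as a minimal sketch. It assumes
the 128+sig convention described above and that only the low 8 bits of an
exit status are kept; pr_shell_exit_status() is a hypothetical helper, not
part of sh.

```c
/* Exit status a shell would report for a child killed by `sig`,
 * truncated to the 8 bits an exit status can carry. */
static int pr_shell_exit_status(int sig)
{
        return (128 + sig) & 0377;
}
```

For signal 15 this gives 143 and for signal 127 it gives 255, but for
signal 128 the sum is 256, which truncates to 0 and would look like
success.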
 
 Another option is to modify the highest signal number accepted by
 interfaces (while leaving the size of sigset_t and the like unchanged).
 This effectively removes signals 127 and 128 from the system. One
 problem results when having posix_spawn() from an old libc reset all
 signals to default (by passing posix_spawnattr_setsigdefault() a
 sigfillset()'ed sigset_t and enabling the POSIX_SPAWN_SETSIGDEF flag in
 the posix_spawnattr_t). It will then attempt to set all signals from 1
 to 128 to the default action and fail the entire spawn if sigaction()
 fails. This could be allowed by having certain calls (such as
 sigaction() with SIG_DFL) return success without doing anything for
 signals 127 and 128. This is likely to get messy.
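
For reference, the posix_spawn() usage pattern described above looks
roughly like this. It is a sketch using only standard POSIX calls;
spawn_all_default() is a hypothetical wrapper name, and the libc-internal
loop over signals 1 to 128 is not shown because it lives inside
posix_spawn() itself.

```c
#include <signal.h>
#include <spawn.h>
#include <sys/types.h>
#include <sys/wait.h>

extern char **environ;

/* Spawn argv[0] with every signal reset to its default action.
 * Returns the raw wait status, or -1 if spawning or waiting failed. */
static int spawn_all_default(char *const argv[])
{
        posix_spawnattr_t attr;
        sigset_t all;
        pid_t pid;
        int status = -1;
        int err;

        if (posix_spawnattr_init(&attr) != 0)
                return -1;
        sigfillset(&all);       /* includes the highest signal numbers */
        posix_spawnattr_setsigdefault(&attr, &all);
        posix_spawnattr_setflags(&attr, POSIX_SPAWN_SETSIGDEF);

        err = posix_spawn(&pid, argv[0], NULL, &attr, argv, environ);
        posix_spawnattr_destroy(&attr);
        if (err != 0)
                return -1;      /* e.g. if resetting a signal was rejected */
        if (waitpid(pid, &status, 0) == -1)
                return -1;
        return status;
}
```

If the kernel started rejecting sigaction() for signals 127 and 128, a call
like this made through an old libc is the one that would begin to fail.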
 
 Alternatively, the default action for signals 127 and 128 could be
 changed to ignore (like SIGCHLD, SIGURG and SIGINFO), so that no process
 may terminate because of them. Processes can still send the signals and
 set handlers for them. Apart from the obvious effect that the process
 will not terminate when it receives such a signal without handling or
 masking it, FreeBSD also discards ignored signals even when they are
 masked (POSIX permits this). This could lead to unexpected results if a
 process is using sigwait() or a similar function.
 
 Yet another approach would modify the wait4() system call, changing
 signals 127 and 128 to something that does not cause confusion. This
 seems ugly.
 
 -- 
 Jilles Tjoelker

From: Bruce Evans <brde@optusnet.com.au>
To: Jilles Tjoelker <jilles@stack.nl>
Cc: bug-followup@FreeBSD.org, netch@segfault.kiev.ua, bde@FreeBSD.org
Subject: Re: kern/19402: Signals 127 and 128 cannot be detected in wait4()
 interface
Date: Mon, 30 Apr 2012 18:24:51 +1000 (EST)

 On Mon, 30 Apr 2012, Jilles Tjoelker wrote:
 
 >> [problems with signals 127 and 128]
 >
 > First, note that "clean" programs cannot use signals 127 and 128 because
 > they neither have a SIG* constant nor are in the range SIGRTMIN to
 > SIGRTMAX. Therefore, I think it is inappropriate to make large changes
 > to make them work. It suffices if wait4() and other interfaces cannot
 > cause confusion.
 
 I agree with not making large changes, of course.
 
 I wonder if there is a technical reason why 127 and 128 were left out
 of SIGRTMAX.  In 4.4BSD, NSIG was only 32, with a comment saying that 33
 is possible (since NSIG counts signal 0).  Signal 32 would have caused
 fewer problems than signal 128 does now, but was left out.  In Linux
 (2.6.10 for x86-64, i386 and many others), NSIG is 32 and _NSIG is 64;
 apparently NSIG counts signal 0 but _NSIG doesn't, similarly to
 FreeBSD except for the spelling and value of _NSIG and all signals up
 to and including _NSIG being supported (SIGRTMIN is NSIG = 32 and
 SIGRTMAX is _NSIG = 64; FreeBSD uses the better spelling _SIG_MAXSIG
 for _NSIG); a max of 64 causes fewer technical problems and less bloat.
 
 > Because sh returns exit status 128+sig for signal sig, signal 128 cannot
 > be represented in an 8-bit exit status and would have to be aliased to
 > another signal if it is kept.
 >
 > The suggestion to modify wait4() ABI seems inappropriate for that
 > reason.
 >
 > Another option is to modify the highest signal number accepted by
 > interfaces (while leaving the size of sigset_t and the like unchanged).
 > This effectively removes signals 127 and 128 from the system. One
 > problem results when having posix_spawn() from an old libc reset all
 > signals to default (by passing posix_spawnattr_setsigdefault() a
 > sigfillset()'ed sigset_t and enabling the POSIX_SPAWN_SETSIGDEF flag in
 > the posix_spawnattr_t). It will then attempt to set all signals from 1
 > to 128 to the default action and fail the entire spawn if sigaction()
 > fails. This could be allowed by having certain calls (such as
 > sigaction() with SIG_DFL) return success without doing anything for
 > signals 127 and 128. This is likely to get messy.
 >
 > Alternatively, the default action for signals 127 and 128 could be
 > changed to ignore (like SIGCHLD, SIGURG and SIGINFO), so that no process
 > may terminate because of them. Processes can still send the signals and
 > set handlers for them. Apart from the obvious effect that the process
 > will not terminate when it receives such a signal without handling or
 > masking it, FreeBSD also discards ignored signals even when they are
 > masked (POSIX permits this). This could lead to unexpected results if a
 > process is using sigwait() or a similar function.
 > 
 > Yet another approach would modify the wait4() system call, changing
 > signals 127 and 128 to something that does not cause confusion. This
 > seems ugly.
 
 I think I prefer disallowing signal 128 and not worry about unportable
 programs using it, and not changing anything for signal 127 and not worry
 about the ambiguous wait status from this.
 
 Emulators give interesting problems with signal ranges.  FreeBSD seems
 to handle these problems mostly correctly in the Linux emulator.  First,
 it needs a host signal range larger than the target signal range.
 [0..126], [0..127] and [0..128] all exceed the Linux range of [0..64],
 so there is no problem yet.  However, for mips under Linux, _NSIG is 128,
 so the full FreeBSD range might be needed, depending on how Linux handles
 the problem with wait statuses.   FreeBSD mostly uses the Linux _NSIG
 correctly, so it gets target limits.  It also translates signal numbers
 below NSIG, so it knows a little about NSIG counting signal 0.  However,
 in linux_ioctl.c, it still uses the old FreeBSD signal number NSIG in a
 private ISSIGVALID() macro instead of using its standard macro
 LINUX_SIG_VALID() which uses _NSIG correctly.  ISSIGVALID() is only used
 for the VT_SETMODE ioctl, and FreeBSD's signal handling for this differs
 from Linux's in other ways (FreeBSD fixes up mode.frsig (but only if it
 and mode.acqsig are invalid according to the private macro), while Linux
 ignores mode.frsig).  The private macro might even be correct, with making
 it look like a standard macro just obfuscating any magic for NSIG here.
 
 Bruce

From: Valentin Nechayev <netch@netch.kiev.ua>
To: Jilles Tjoelker <jilles@stack.nl>
Cc: bug-followup@FreeBSD.org, bde@FreeBSD.org
Subject: Re: kern/19402: Signals 127 and 128 cannot be detected in wait4()
 interface
Date: Mon, 30 Apr 2012 12:04:54 +0300

 Hi,
 
  Mon, Apr 30, 2012 at 00:46:19, jilles wrote about "Re: kern/19402: Signals 127 and 128 cannot be detected in wait4() interface": 
 
 > > [problems with signals 127 and 128]
 > 
 > First, note that "clean" programs cannot use signals 127 and 128 because
 > they neither have a SIG* constant nor are in the range SIGRTMIN to
 > SIGRTMAX.
 
 You are correct now, but that was not the case when I filed the original
 request. Values for SIGRTMIN and SIGRTMAX first appeared only in
 version 1.47 (Oct 2005) and were incorrect. Revision 1.53 reduced
 SIGRTMAX from 128 to 126 precisely because of this PR. So, if we stick
 to treating 126 as the maximum signal number that doesn't break the
 existing ABI, everything seems satisfied and I suggest simply closing it
 as fixed. No further changes are needed.
 
 
 -netch-

From: Valentin Nechayev <netch@netch.kiev.ua>
To: Bruce Evans <brde@optusnet.com.au>
Cc: Jilles Tjoelker <jilles@stack.nl>, bug-followup@FreeBSD.org,
        bde@FreeBSD.org
Subject: Re: kern/19402: Signals 127 and 128 cannot be detected in wait4()
 interface
Date: Mon, 30 Apr 2012 12:22:54 +0300

  Mon, Apr 30, 2012 at 18:24:51, brde wrote about "Re: kern/19402: Signals 127 and 128 cannot be detected in wait4() interface": 
 
 > I think I prefer disallowing signal 128 and not worry about unportable
 > programs using it, and not changing anything for signal 127 and not worry
 > about the ambiguous wait status from this.
 
 Since realtime signals are already a feature of very limited use, and a
 correct program doesn't allocate them in a manner linearly dependent on
 the number of descriptors checked, I guess it's highly improbable that we
 will see a program that uses more than 10-16 realtime signals. Our current
 limit of 62 is far more than that.
 
 > However, for mips under Linux, _NSIG is 128,
 
 If they didn't change the wait*() exit status ABI for MIPS (and as far
 as I can see from the code, this ABI is platform independent), Linux has
 the same problems with signals 127 and 128, and their usage is incorrect.
 I guess it's better to raise the issue on LKML and wait for Linux's reaction.
 
 
 -netch-

From: Valentin Nechayev <netch@netch.kiev.ua>
To: bug-followup@FreeBSD.org
Cc:  
Subject: Re: kern/19402: Signals 127 and 128 cannot be detected in wait4()
 interface
Date: Sun, 6 May 2012 10:09:29 +0300

 > You are correct now, but that was not the case when I filed the original
 > request. Values for SIGRTMIN and SIGRTMAX first appeared only in
 > version 1.47 (Oct 2005) and were incorrect. Revision 1.53 reduced
 > SIGRTMAX from 128 to 126 precisely because of this PR. So, if we stick
 > to treating 126 as the maximum signal number that doesn't break the
 > existing ABI, everything seems satisfied and I suggest simply closing it
 > as fixed. No further changes are needed.
 
 I forgot to mention _SIG_MAXSIG, which should also be reduced if it is used.
 
 
 -netch-
>Unformatted:
