From nobody@FreeBSD.org  Tue Nov 25 12:24:05 2008
Return-Path: <nobody@FreeBSD.org>
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 0DA04106564A
	for <freebsd-gnats-submit@FreeBSD.org>; Tue, 25 Nov 2008 12:24:05 +0000 (UTC)
	(envelope-from nobody@FreeBSD.org)
Received: from www.freebsd.org (www.freebsd.org [IPv6:2001:4f8:fff6::21])
	by mx1.freebsd.org (Postfix) with ESMTP id F37628FC24
	for <freebsd-gnats-submit@FreeBSD.org>; Tue, 25 Nov 2008 12:24:04 +0000 (UTC)
	(envelope-from nobody@FreeBSD.org)
Received: from www.freebsd.org (localhost [127.0.0.1])
	by www.freebsd.org (8.14.3/8.14.3) with ESMTP id mAPCO4I9082748
	for <freebsd-gnats-submit@FreeBSD.org>; Tue, 25 Nov 2008 12:24:04 GMT
	(envelope-from nobody@www.freebsd.org)
Received: (from nobody@localhost)
	by www.freebsd.org (8.14.3/8.14.3/Submit) id mAPCO4ta082741;
	Tue, 25 Nov 2008 12:24:04 GMT
	(envelope-from nobody)
Message-Id: <200811251224.mAPCO4ta082741@www.freebsd.org>
Date: Tue, 25 Nov 2008 12:24:04 GMT
From: Roman <Roman.Gritsulyak@gmail.com>
To: freebsd-gnats-submit@FreeBSD.org
Subject: signals are not delivered always
X-Send-Pr-Version: www-3.1
X-GNATS-Notify:

>Number:         129172
>Category:       kern
>Synopsis:       [libc] signals are not delivered always
>Confidential:   no
>Severity:       non-critical
>Priority:       low
>Responsible:    freebsd-bugs
>State:          closed
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Tue Nov 25 12:30:09 UTC 2008
>Closed-Date:    Sat Nov 07 15:37:09 UTC 2009
>Last-Modified:  Sat Nov 07 15:37:09 UTC 2009
>Originator:     Roman
>Release:        6.3-RELEASE FreeBSD 6.3-RELEASE
>Organization:
>Environment:
FreeBSD host.myrtghost.ru 6.3-RELEASE FreeBSD 6.3-RELEASE #0: Wed Jan 16 01:43:02 UTC 2008     root@palmer.cse.buffalo.edu:/usr/obj/usr/src/sys/SMP  amd64
>Description:
It seems that signal handlers installed with sigaction(), or simply with signal(), are not invoked reliably.

The following example forks 100 children.

In the test, the handler ran for only 39 of the children.

If the semaphore protection is removed, only about 17 signals are caught.

Under Linux the signal handler ran for all 100 children, and both tests worked fine.


>How-To-Repeat:
Compile the sem kernel module and load it into the kernel.
gcc 3.4 was used.

gcc sigsem3.c -o sigsem3
./sigsem3
 - not all signals are processed.

gcc sigsem2.c -o sigsem2
./sigsem2

A core dump is obtained.

gcc sigsem1.c -o sigsem1
./sigsem1

sigsem3.c:
==cut==%<==========
#include <stdio.h>
#include <signal.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <stdlib.h>
#include <semaphore.h>

volatile int chld_count;

sem_t asem;

void handler(int sig)
{
  pid_t pid;
  sem_wait(&asem);
  chld_count ++;
  pid = wait(NULL);

  printf("Pid %d exited. count: %d\n", pid, chld_count);
  sem_post(&asem);
}

int main(void)
{
  int i;
  chld_count=0;
  struct sigaction sa;
  sem_init(&asem,0,1);

  signal(SIGCHLD, handler);

  sigemptyset(&sa.sa_mask);
  sa.sa_flags = SA_RESTART;
  sa.sa_handler = handler;

  sigaction(SIGCHLD, &sa, NULL);

  for(i=0;i<100;i++)
  {
          int ret_val;
          ret_val=fork();

          if(ret_val==-1)
          {
                  perror("fork()");
          }
          else if(!ret_val)
          {
                  printf("in child;Child pid is %d\n", getpid());
                  exit(0);
          }
          else
          {
                  printf("Parent pid is %d; child is %d;\n",
                                  getpid(),ret_val);
          }
  }

  return 0;
}

==cut==>%==========
sigsem2.c:
==cut==%<==========
#include <stdio.h>
#include <signal.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <stdlib.h>
#include <semaphore.h>
// linux: -lpthread
volatile int chld_count;

sem_t asem;

void handler(int sig)
{
  pid_t pid;
  sem_wait(&asem);
  chld_count ++;
  pid = wait(NULL);

  printf("Pid %d exited. count: %d\n", pid, chld_count);
  sem_post(&asem);
}

int main(void)
{
  int i;
  chld_count=0;

  sem_init(&asem,0,1);

  signal(SIGCHLD, handler);

  for(i=0;i<100;i++)
  {
          int ret_val;
          ret_val=fork();

          if(ret_val==-1)
          {
                  perror("fork()");
          }
          else if(!ret_val)
          {
                  printf("in child;Child pid is %d\n", getpid());
                  exit(0);
          }
          else
          {
                  printf("Parent pid is %d; child is %d;\n",
                                  getpid(),ret_val);
          }
  }

  return 0;
}
==cut==>%==========



>Fix:


>Release-Note:
>Audit-Trail:

From: "Roman Gritsulyak" <roman.gritsulyak@gmail.com>
To: bug-followup@freebsd.org, Roman.Gritsulyak@gmail.com
Cc:  
Subject: Re: kern/129172: [libc] signals are not delivered always
Date: Sat, 29 Nov 2008 14:51:25 +0300

 Update:
 Adding wait() calls in the loop for the 100 children did not solve the problem.
 Can it be solved under BSD with some specific signal flags?

From: Jilles Tjoelker <jilles@stack.nl>
To: bug-followup@FreeBSD.org, Roman.Gritsulyak@gmail.com
Cc:  
Subject: Re: kern/129172: [libc] signals are not delivered always
Date: Sat, 24 Oct 2009 20:26:29 +0200

 It seems what you are looking for is not reliable delivery of signals,
 but queuing of SIGCHLD in particular. This is implemented in FreeBSD 7.0
 and newer. In FreeBSD 6 and older, SIGCHLD from child processes is not
 queued: if another SIGCHLD signal arrives when one is already pending,
 the two signals are coalesced and the handler is only called once.
 
 Your test program should work fine if it calls waitpid(-1, NULL,
 WNOHANG) from the signal handler until it returns 0 or -1.
 
 Even when run on FreeBSD 7, the test program has some problems.
 
 Firstly, it may exit before all the child processes.
 
 Secondly, it assumes that wait() returns terminated child processes in
 the same order as SIGCHLD signals. Apparently this is the case on Linux,
 but it is not the case on FreeBSD. Then, when wait() returns status for
 a different child process than the signal was for, the signal for that
 child process is dequeued (POSIX prescribes this, and it must be that
 way to limit the number of pending SIGCHLD signals to the number of
 child processes). As a result, the zombie for the original child process
 is never removed and the signal handler is called fewer than 100 times.
 If you want to wait for one process per signal handler call, you can fix
 this by making the handler a SA_SIGINFO one, and calling waitpid() with
 si->si_pid where si is the siginfo_t pointer passed to the handler.
 Otherwise use the simpler fix I mentioned above. Note that POSIX also
 says that implementations may avoid queuing if SA_SIGINFO is not
 enabled, but this is not the case in FreeBSD.
 
 Thirdly, it uses unsafe functions with signal handlers. The use of
 sem_wait() in a signal handler is not safe (apart from data consistency
 issues with fast userspace implementations, the risk of deadlock is
 pretty high -- a signal handler is not a thread). Only sem_post() is
 async-signal-safe. It seems that the objective for the semaphore is
 already met by the automatic blocking of a signal while its handler is
 executing. printf() may also cause problems with signal handlers, and
 also with fork() (double output).
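 
 As an illustration of avoiding printf() in the handler, here is a small
 sketch that prints a pid using only write(), which POSIX lists as
 async-signal-safe. The helpers write_pid_line and pid_line_roundtrip are
 hypothetical names introduced for this example; the second one exists
 only to check the first through a pipe and is not meant to be called
 from a handler:

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: write "<pid>\n" to fd without stdio.
 * Assumes pid is not negative. Only local storage and write(2) are used,
 * so it can be called from a signal handler.
 */
static ssize_t write_pid_line(int fd, pid_t pid)
{
    char buf[24];
    size_t pos = sizeof(buf);
    unsigned long v = (unsigned long)pid;

    buf[--pos] = '\n';
    do {                        /* emit decimal digits right to left */
        buf[--pos] = (char)('0' + v % 10);
        v /= 10;
    } while (v != 0);

    return write(fd, buf + pos, sizeof(buf) - pos);
}

/* Hypothetical check: send a pid through a pipe and compare the text. */
static int pid_line_roundtrip(pid_t pid, const char *expected)
{
    int fds[2];
    char got[32];
    ssize_t n;
    int rc;

    rc = pipe(fds);
    assert(rc == 0);
    (void)rc;

    n = write_pid_line(fds[1], pid);
    close(fds[1]);
    if (n > 0)
        n = read(fds[0], got, sizeof(got) - 1);
    close(fds[0]);
    if (n < 0)
        return 0;

    got[n] = '\0';
    return strcmp(got, expected) == 0;
}
```

 In the handler, the printf() call could then be replaced with something
 like write_pid_line(STDOUT_FILENO, pid). Because write() bypasses the
 stdio buffer, it also does not add to the buffered text that a forked
 child would otherwise flush a second time.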
 
 -- 
 Jilles Tjoelker
State-Changed-From-To: open->closed 
State-Changed-By: jilles 
State-Changed-When: Sat Nov 7 15:37:09 UTC 2009 
State-Changed-Why:  
Signal queuing is in 7.0 and will not be backported to 6.x. 

http://www.freebsd.org/cgi/query-pr.cgi?pr=129172 
>Unformatted:
