From nobody@FreeBSD.ORG  Sat Oct 21 09:09:56 2000
Return-Path: <nobody@FreeBSD.ORG>
Received: by hub.freebsd.org (Postfix, from userid 32767)
	id 9375F37B4C5; Sat, 21 Oct 2000 09:09:56 -0700 (PDT)
Message-Id: <20001021160956.9375F37B4C5@hub.freebsd.org>
Date: Sat, 21 Oct 2000 09:09:56 -0700 (PDT)
From: grubba@roxen.com
Sender: nobody@FreeBSD.ORG
To: freebsd-gnats-submit@FreeBSD.org
Subject: A threaded read(2) from a socketpair(2) fd can sometimes fail with errno 19 (ENODEV)
X-Send-Pr-Version: www-1.0

>Number:         22190
>Category:       kern
>Synopsis:       A threaded read(2) from a socketpair(2) fd can sometimes fail with errno 19 (ENODEV)
>Confidential:   no
>Severity:       non-critical
>Priority:       medium
>Responsible:    freebsd-threads
>State:          closed
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Sat Oct 21 09:10:01 PDT 2000
>Closed-Date:    Sun Nov 18 08:42:23 UTC 2007
>Last-Modified:  Sun Nov 18 08:42:23 UTC 2007
>Originator:     Henrik Grubbström
>Release:        4.0-RELEASE #0
>Organization:
Roxen Internet Software
>Environment:
FreeBSD snok.idonex.se 4.0-RELEASE FreeBSD 4.0-RELEASE #0: Wed Mar 15 02:16:55 GMT 2000     jkh@monster.cdrom.com:/usr/src/sys/compile/GENERIC  i386

>Description:
In the testsuite for a threaded application, a process-spawning test
that spawns 1000 instances of /bin/cat /dev/null and waits for them
sometimes fails because read(2) returns -1 with errno set to 19 (ENODEV).
ENODEV is not a documented error code for read(2).

Stripped-down code that triggers the bug:
  {
    pid_t pid=-2;
    int control_pipe[2];	/* Used for communication with the child. */
    char buf[4];
    int e;			/* Result of the last read()/write(). */

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, control_pipe) < 0) {
      error("Failed to create child communication pipe.\n");
    }

    {
      int loop_cnt = 0;
      sigset_t new_sig, old_sig;
      sigfillset(&new_sig);
      while(sigprocmask(SIG_BLOCK, &new_sig, &old_sig))
	;

      do {

	pid=fork();
	if (pid == -1) {
	  if (errno == EAGAIN) {
	    /* Process table full or similar.
	     * Try sleeping for a bit.
	     */
	    if (loop_cnt++ < 60) {
	      /* Don't sleep for too long... */
	      poll(NULL, 0, 100);

	      /* Try again */
	      continue;
	    }
	  } else if (errno == EINTR) {
	    /* Try again */
	    continue;
	  }
	}
	break;
      } while(1);

      while(sigprocmask(SIG_SETMASK, &old_sig, 0))
	;
    }

    if(pid == -1) {
      int e = errno;
      /*
       * fork() failed
       */

      while(close(control_pipe[0]) < 0 && errno==EINTR);
      while(close(control_pipe[1]) < 0 && errno==EINTR);

      error("Process.create_process(): fork() failed. errno:%d\n",
	    e);
    } else if(pid) {
      int olderrno;

      /*
       * The parent process
       */

      /* Close our child's end of the pipe. */
      while(close(control_pipe[1]) < 0 && errno==EINTR);

      /* Wake up the child. */
      buf[0] = 0;

      while (((e = write(control_pipe[0], buf, 1)) < 0) && (errno == EINTR))
	;
      if(e!=1) {
	/* Paranoia in case close() sets errno. */
	olderrno = errno;
	while(close(control_pipe[0]) < 0 && errno==EINTR)
          ;
	error("Child process died prematurely. (e=%d errno=%d)\n",
	      e, olderrno);
      }

      /* Wait for exec or error */
      while (((e = read(control_pipe[0], buf, 3)) < 0) && (errno == EINTR))
	;
      /* Paranoia in case close() sets errno. */
      olderrno = errno;

      while(close(control_pipe[0]) < 0 && errno==EINTR)
        ;

      if (!e) {
	/* OK! */
	pop_n_elems(args);
	push_int(0);
	return;
      } else {
	/* Something went wrong. */
	switch(buf[0]) {
	  /* ... */
	case 0:
	  /* read() probably failed. */
	default:
	  /******************************************************************
           * This point is reached with buf = {0, 4, 0}, e = -1, olderrno=19.
           *****************************************************************/
	  error("Process.create_process(): "
		"Child failed: %d, %d, %d, %d, %d!\n",
		buf[0], buf[1], buf[2], e, olderrno);
	  break;
	}
      }
    }else{
      /*
       * The child process
       */
      /* Close our parent's end of the pipe. */
      while(close(control_pipe[0]) < 0 && errno==EINTR);
      /* Ensure that the pipe will be closed when the child starts. */
      if(set_close_on_exec(control_pipe[1], 1) < 0)
	PROCERROR(PROCE_CLOEXEC, 0);

      /* Wait for parent to get ready... */
      while ((( e = read(control_pipe[1], buf, 1)) < 0) && (errno == EINTR))
	;

      /* ... */
      execvp(argv[0], argv);
      PROCERROR(PROCE_EXEC, 0);
      exit(99);
    }
  }

For the full source, please check src/signal_handler.c:f_create_process() in a Pike distribution.

Testsuite report:

testsuite: Test 9406 (shift 0) (CRNL) failed.
  1: mixed a() {  for(int x=0;x<10;x++) { for(int e=0;e<100;e++) if(Process.create_process(({"/bin/cat","/dev/null"}))->wait()) return e; __signal_watchdog(); } return -1;; }
  2: mixed b() { return -1; }
Error: Process.create_process(): Child failed: 0, 4, 0, -1, 19!
__builtin.create_process: create(({"/bin/cat","/dev/null"}))
__builtin: create_process()
testsuite: Test 9406 (shift 0) (CRNL):1: a()
/tmp/autobuild/pike7.1-20001021082826.tar/bin/test_pike.pike:572: main(3,({"/tmp/autobuild/pike7.1-20001021082826.tar/bin/test_pike.pike","modules/CommonLog/module_testsuite","modules/Gdbm/module_testsuite","modules/Gettext/module_testsuite","modules/Gmp/module_testsuite",,,34}))

>How-To-Repeat:
Unfortunately, the problem is intermittent.
It may be triggered by resource exhaustion.
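
A minimal sketch of the handshake, for anyone trying to repeat this.
It is not the Pike code: the names read_retry, spawn_once and
count_read_failures are hypothetical, and the harness is single-threaded.
Since the failure has only been seen in a threaded process, it would
presumably have to be called from a thread in a program linked against
the threads library to have any chance of triggering the bug.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: read() that retries when interrupted by a signal. */
static ssize_t read_retry(int fd, void *buf, size_t len)
{
    ssize_t n;
    while ((n = read(fd, buf, len)) < 0 && errno == EINTR)
        ;
    return n;
}

/*
 * Spawn one "/bin/cat /dev/null" using the socketpair handshake.
 * Returns 0 if the parent's read() saw EOF (exec closed the child's end),
 * the errno value (> 0) if the parent's read() failed,
 * or -1 for any other failure (setup, wake-up write, exec).
 */
static int spawn_once(void)
{
    int sp[2], err = 0;
    char buf[4] = {0};
    ssize_t n = 1;
    pid_t pid;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sp) < 0)
        return -1;

    pid = fork();
    if (pid < 0) {
        close(sp[0]);
        close(sp[1]);
        return -1;
    }
    if (pid == 0) {
        /* Child: set close-on-exec, wait for the wake-up byte, then exec. */
        close(sp[0]);
        if (fcntl(sp[1], F_SETFD, FD_CLOEXEC) < 0)
            _exit(97);
        (void)read_retry(sp[1], buf, 1);
        execl("/bin/cat", "cat", "/dev/null", (char *)NULL);
        /* exec failed: send one byte so the parent does not see EOF. */
        if (write(sp[1], "E", 1) < 0)
            _exit(98);
        _exit(99);
    }

    /* Parent: wake the child, then wait for EOF (exec) or an error byte. */
    close(sp[1]);
    if (write(sp[0], buf, 1) == 1) {
        n = read_retry(sp[0], buf, 3);
        err = errno;            /* save before close() can change it */
    }
    close(sp[0]);
    while (waitpid(pid, NULL, 0) < 0 && errno == EINTR)
        ;
    if (n == 0)
        return 0;
    return n < 0 ? err : -1;
}

/* Run the handshake repeatedly and count iterations where read() failed. */
static int count_read_failures(int iterations)
{
    int failures = 0;

    assert(iterations >= 0);
    for (int i = 0; i < iterations; i++) {
        int rc = spawn_once();
        if (rc > 0) {
            fprintf(stderr, "iteration %d: read() failed with errno %d%s\n",
                    i, rc, rc == ENODEV ? " (ENODEV)" : "");
            failures++;
        }
    }
    return failures;
}
```

Calling count_read_failures(1000) mirrors the 1000 spawns of the failing
test; a nonzero return value means at least one read() on the socketpair
failed.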
>Fix:


>Release-Note:
>Audit-Trail:
Responsible-Changed-From-To: freebsd-bugs->jasone 
Responsible-Changed-By: jasone 
Responsible-Changed-When: Mon Oct 23 14:39:44 PDT 2000 
Responsible-Changed-Why:  
Over to maintainer. 
Responsible-Changed-From-To: jasone->freebsd-bugs 
Responsible-Changed-By: jasone 
Responsible-Changed-When: Sat May 11 15:23:08 PDT 2002 
Responsible-Changed-Why:  


http://www.freebsd.org/cgi/query-pr.cgi?pr=22190 
State-Changed-From-To: open->feedback 
State-Changed-By: iedowse 
State-Changed-When: Sun Aug 11 12:56:44 PDT 2002 
State-Changed-Why:  

Does this problem still occur on more recent releases? 

http://www.freebsd.org/cgi/query-pr.cgi?pr=22190 

From: Dan Nelson <dnelson@allantgroup.com>
To: freebsd-gnats-submit@FreeBSD.org, grubba@roxen.com
Cc:  
Subject: Re: misc/22190: A threaded read(2) from a socketpair(2) fd can sometimes
 fail with errno 19 (ENODEV)
Date: Fri, 30 Aug 2002 10:13:48 -0500

 Yes, it does.  The pike developers have a build farm, similar to 
 tinderbox, and my -current machine just failed the testsuite with the 
 error "read(2) failed with ENODEV!".  It seems to be very infrequent; 
 it's probably run a couple dozen builds with no problem.
 
 I'm going to add a PTHREAD_ASSERT in uthread_read.c to see if other 
 programs are also getting ENODEV but ignoring it.  I haven't been able 
 to get crashdumps working on my -current box, so I can't put a panic in 
 the kernel's read().
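
A user-level analogue of such a check might look like the following
sketch. It does not use PTHREAD_ASSERT or any libc_r internals; the name
checked_read is hypothetical, and the wrapper only illustrates the idea
of aborting as soon as read(2) reports ENODEV instead of letting the
caller ignore it.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Hypothetical wrapper: behaves like read(2), but retries on EINTR and
 * aborts the process (leaving a core dump where enabled) on ENODEV.
 */
static ssize_t checked_read(int fd, void *buf, size_t nbytes)
{
    ssize_t n;

    assert(buf != NULL);
    do {
        n = read(fd, buf, nbytes);
    } while (n < 0 && errno == EINTR);

    if (n < 0 && errno == ENODEV) {
        fprintf(stderr, "read(2) returned ENODEV on fd %d\n", fd);
        abort();
    }
    return n;
}
```

Substituting checked_read for read in a suspect program turns a silently
ignored ENODEV into an immediate abort at the call site.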
 
State-Changed-From-To: feedback->open 
State-Changed-By: ceri 
State-Changed-When: Sun Jun 8 10:57:08 PDT 2003 
State-Changed-Why:  
Feedback has been requested and received; throw this PR back open. 

http://www.freebsd.org/cgi/query-pr.cgi?pr=22190 

From: Ceri Davies <ceri@FreeBSD.org>
To: FreeBSD Gnats Submit <freebsd-gnats-submit@FreeBSD.org>
Cc:  
Subject: Re: misc/22190: A threaded read(2) from a socketpair(2) fd can sometimes fail with errno 19 (ENODEV)
Date: Mon, 9 Jun 2003 11:49:36 +0100

 Adding to audit trail:
 
 
 Date: Mon, 9 Jun 2003 12:41:32 +0200 (MET DST)
 From: Henrik Grubbström <grubba@roxen.com>
 Message-ID: <Pine.GSO.4.21.0306091233430.13083-100000@jms.roxen.com>
 
 Well, since the last followup was from August last year, I can inform you
 that the bug was last triggered on Dan's FreeBSD 5.1-BETA machine
 yesterday:
 
 Fatal error 'read(2) may not return ENODEV' at line 98 in file /usr/src/lib/libc_r/uthread/uthread_read.c (errno = 19)
 Abort trap (core dumped)
 
 Core was generated by `pike'.
 Program terminated with signal 6, Aborted.
 #0  0x2826239f in kill () at {standard input}:15
 	in {standard input}
 
 Active threads
 Current language:  auto; currently asm
 * 1 process 33497  0x2826239f in kill () at {standard input}:15
 
 Backtrace
 #0  0x2826239f in kill () at {standard input}:15
 #1  0x282c219a in abort () at /usr/src/lib/libc/stdlib/abort.c:72
 #2  0x2820f443 in _thread_exit ()
     at /usr/src/lib/libc_r/uthread/uthread_exit.c:99
 #3  0x28209d65 in _read (fd=12, buf=0xbf966fe8, nbytes=3)
     at /usr/src/lib/libc_r/uthread/uthread_read.c:98
 #4  0x28209d9b in __read (fd=12, buf=0xbf966fe8, nbytes=3)
     at /usr/src/lib/libc_r/uthread/uthread_read.c:108
 #5  0x080b725a in f_create_process (args=1)
     at /usr/tmp/xenofarm/pike-7.5/dan.emsphone.com/buildtmp/Pike7.5-20030608-215603/src/signal_handler.c:3512
 #6  0x0807073d in low_mega_apply (type=APPLY_LOW, args=1, arg1=0x85dced8, 
     arg2=0x6) at apply_low.h:195
 #7  0x08071734 in mega_apply (type=APPLY_LOW, args=1, arg1=0x85dced8, arg2=0x6)
     at /usr/tmp/xenofarm/pike-7.5/dan.emsphone.com/buildtmp/Pike7.5-20030608-215603/src/interpret.c:1702
 #8  0x080cd7a5 in call_pike_initializers (o=0x85dced8, args=1)
     at /usr/tmp/xenofarm/pike-7.5/dan.emsphone.com/buildtmp/Pike7.5-20030608-215603/src/object.c:326
 #9  0x080cd894 in debug_clone_object (p=0x5, args=1)
     at /usr/tmp/xenofarm/pike-7.5/dan.emsphone.com/buildtmp/Pike7.5-20030608-215603/src/object.c:352
 #10 0x080711fe in low_mega_apply (type=APPLY_SVALUE_STRICT, args=1, 
     arg1=0x8533554, arg2=0x0)
     at /usr/tmp/xenofarm/pike-7.5/dan.emsphone.com/buildtmp/Pike7.5-20030608-215603/src/interpret.c:1500
 #11 0x0806e77d in opcode_F_APPLY (arg1=33496) at interpret_functions.h:1873
 #12 0x08533166 in ?? ()
 #13 0x08071750 in mega_apply (type=APPLY_STACK, args=1, arg1=0x0, arg2=0x0)
     at /usr/tmp/xenofarm/pike-7.5/dan.emsphone.com/buildtmp/Pike7.5-20030608-215603/src/interpret.c:1704
 #14 0x08071874 in f_call_function (args=1)
     at /usr/tmp/xenofarm/pike-7.5/dan.emsphone.com/buildtmp/Pike7.5-20030608-215603/src/interpret.c:1769
 #15 0x080f6fed in new_thread_func (data=0xbfbff404)
     at /usr/tmp/xenofarm/pike-7.5/dan.emsphone.com/buildtmp/Pike7.5-20030608-215603/src/threads.c:788
 #16 0x28204e6d in _thread_start ()
     at /usr/src/lib/libc_r/uthread/uthread_create.c:275
 #17 0xbf91c000 in ?? ()
 
 sysname: FreeBSD
 release: 5.1-BETA
 version: FreeBSD 5.1-BETA #271: Thu May 29 16:33:28 CDT 2003
 dan@dan.emsphone.com:/usr/src/sys/i386/compile/DANSMP 
 machine: i386
 nodename: dan.emsphone.com
 testname: default
 command: make xenofarm
 clientversion: $Id: client.sh,v 1.73 2003/05/20 12:48:33 mani Exp $
 putversion: $Id: put.c,v 1.14 2003/01/12 21:14:16 ceder Exp $
 contact: dnelson@allantgroup.com
 
 Thanks,
 
 --
 Henrik Grubbström					grubba@roxen.com
 Roxen Internet Software AB
 
 
Responsible-Changed-From-To: freebsd-bugs->freebsd-threads 
Responsible-Changed-By: kris 
Responsible-Changed-When: Sat Jul 12 18:40:15 PDT 2003 
Responsible-Changed-Why:  
Assign to threads mailing list 

http://www.freebsd.org/cgi/query-pr.cgi?pr=22190 
State-Changed-From-To: open->suspended 
State-Changed-By: maxim 
State-Changed-When: Mon Apr 24 19:33:14 UTC 2006 
State-Changed-Why:  
In RELENG_5, RELENG_6 and HEAD, libc_r is deprecated in favour of 
libpthread and libthr.  Nobody is working on libc_r bugs, 
so mark this PR as suspended. 

http://www.freebsd.org/cgi/query-pr.cgi?pr=22190 
State-Changed-From-To: suspended->closed 
State-Changed-By: kmacy 
State-Changed-When: Sun Nov 18 08:42:06 UTC 2007 
State-Changed-Why:  

libc_r is no longer supported 

http://www.freebsd.org/cgi/query-pr.cgi?pr=22190 
>Unformatted:
