From nobody@FreeBSD.org  Mon Oct 10 13:05:06 2005
Return-Path: <nobody@FreeBSD.org>
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id B7DD616A41F
	for <freebsd-gnats-submit@FreeBSD.org>; Mon, 10 Oct 2005 13:05:06 +0000 (GMT)
	(envelope-from nobody@FreeBSD.org)
Received: from www.freebsd.org (www.freebsd.org [216.136.204.117])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 7347A43D46
	for <freebsd-gnats-submit@FreeBSD.org>; Mon, 10 Oct 2005 13:05:06 +0000 (GMT)
	(envelope-from nobody@FreeBSD.org)
Received: from www.freebsd.org (localhost [127.0.0.1])
	by www.freebsd.org (8.13.1/8.13.1) with ESMTP id j9AD56rA083080
	for <freebsd-gnats-submit@FreeBSD.org>; Mon, 10 Oct 2005 13:05:06 GMT
	(envelope-from nobody@www.freebsd.org)
Received: (from nobody@localhost)
	by www.freebsd.org (8.13.1/8.13.1/Submit) id j9AD561C083079;
	Mon, 10 Oct 2005 13:05:06 GMT
	(envelope-from nobody)
Message-Id: <200510101305.j9AD561C083079@www.freebsd.org>
Date: Mon, 10 Oct 2005 13:05:06 GMT
From: "Norbert P. Copones" <norbert@feu-nrmf.ph>
To: freebsd-gnats-submit@FreeBSD.org
Subject: /dev/cuad[0/1] bad file descriptor error during mgetty read
X-Send-Pr-Version: www-2.3

>Number:         87208
>Category:       kern
>Synopsis:       [patch] [regression] /dev/cuad[0/1] bad file descriptor error during mgetty read
>Confidential:   no
>Severity:       non-critical
>Priority:       medium
>Responsible:    csjp
>State:          closed
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Mon Oct 10 13:10:14 GMT 2005
>Closed-Date:    Thu Mar 23 16:26:45 GMT 2006
>Last-Modified:  Thu Mar 23 16:26:45 GMT 2006
>Originator:     Norbert P. Copones
>Release:        FreeBSD 6.0-RC1
>Organization:
FEU-NRMF
>Environment:
FreeBSD proxy.feu-nrmf.ph 6.0-RC1 FreeBSD 6.0-RC1 #0: Mon Oct 10 17:38:35 PHT 2005     norbert@feu-nrmf.ph:/usr/src/sys/i386/compile/GENERIC   i386
>Description:
The device /dev/cuad[0/1] cannot be accessed by mgetty during startup (mgetty entry in /etc/ttys). The mgetty log shows this output:

10/10 18:32:46 ad1  mgetty: interim release 1.1.33-Apr10
10/10 18:32:46 ad1  check for lockfiles
10/10 18:32:46 ad1  locking the line
10/10 18:32:47 ad1  mod: cannot make /dev/cuad1 stdin: Bad file descriptor
10/10 18:32:47 ad1  open device /dev/cuad1 failed: Bad file descriptor
10/10 18:32:47 ad1  cannot get terminal line dev=cuad1, exiting: Bad file descriptor
>How-To-Repeat:
Set "cuad1 "/usr/local/sbin/mgetty" unknown on insecure" in /etc/ttys and reboot the system.
>Fix:
I tried changing cuad1 to ttyd1 in the /etc/ttys entry and sending a HUP signal to init. The first attempt always fails (the usual bad file descriptor error); the second attempt sometimes succeeds. Switching ttyd1 back to cuad1 in /etc/ttys and sending a HUP signal then fails (cuad1: Device busy). So I killed mgetty and sent a HUP signal to init again, and this time it succeeded.
>Release-Note:
>Audit-Trail:

From: HASHI Hiroaki <hashiz@tomba.cskk-sv.co.jp>
To: bug-followup@FreeBSD.org, norbert@feu-nrmf.ph
Cc:  
Subject: Re: i386/87208: /dev/cuad[0/1] bad file descriptor error during
 mgetty read
Date: Tue, 18 Oct 2005 18:30:58 +0900 (JST)

 In this case, mgetty opens /dev/cuad? and dup(2)s it onto stdin.
 
     int fd;
     
     fd = open(devname, O_RDWR | O_NDELAY | O_NOCTTY );
 
     /* make new fd == stdin if it isn't already */
 
     if (fd > 0)
     {
         (void) close(0);
 --->    if (dup(fd) != 0)
         {
             lprintf( L_FATAL, "mod: cannot make %s stdin", devname );
             return ERROR;
         }
     }
 
 The failing dup() did not return descriptor 0.
 
 Is this a bug in dup(2)?
 (or an incompatible change?)
 
 Workaround:
   Make mgetty use dup2(2) instead of dup(2).
 
   dup2(fd, 0)
   .
   .
   dup2(0, 1)
   .
   .
   dup2(0, 2)
   .
   .

From: Dmitry Pryanishnikov <dmitry@atlantis.dp.ua>
To: bug-followup@FreeBSD.org
Cc: freebsd-stable@FreeBSD.org
Subject: Re: i386/87208 : /dev/cuad[0/1] bad file descriptor error during
Date: Fri, 11 Nov 2005 01:39:44 +0200 (EET)

 Hello!
 
   I'm CCing this follow-up to freebsd-stable because this problem can
 prevent the use of RELENG_6 machines in production (mgetty is quite a usual
 example of such use). This bug is a regression vs. RELENG_5/4.
 
   My analysis shows that it isn't only a dup() problem. File descriptor 0
 somehow gets "reserved" in RELENG_6, but only IF the process has been started
 by init via /etc/ttys! Look at this simple program:
 
 #include <unistd.h>
 #include <syslog.h>
 #include <fcntl.h>
 
 #include <stdio.h>
 #include <string.h>
 #include <stdarg.h>
 
 int main(void)
 {
      int res;
 
      while((res=open("/dev/null",O_RDONLY)) < 3)
          if (res == -1) syslog(LOG_ERR,"open(): %m");
      syslog(LOG_ERR,"Started"); sleep(10);
      if (close(0) == -1) syslog(LOG_ERR,"close(0): %m");
      if (close(2) == -1) syslog(LOG_ERR,"close(2): %m");
      if ((res=dup(1)) == -1) syslog(LOG_ERR,"dup(1): %m");
      syslog(LOG_ERR,"dup() gave %d\n",res);
      sleep(10);
      return 0;
 }
 
 One can watch the file descriptor usage at the two points where the program
 is sleeping: first after the program has opened enough files to use descriptor
 #3, and second after closing descriptors #0 and #2 and copying descriptor
 #1. So, when I start this program under 6.0-RELEASE in the usual way (./a.out),
 at the first point lsof shows me the following (I'll show only plain descriptors
 and omit cwd/rtd/txt information):
 
 At first sleep:
 
 a.out   837 root    0u  VCHR       0,70  0t77713     70 /dev/ttyv1
 a.out   837 root    1u  VCHR       0,70  0t77713     70 /dev/ttyv1
 a.out   837 root    2u  VCHR       0,70  0t77713     70 /dev/ttyv1
 a.out   837 root    3r  VCHR       0,13      0t0     13 /dev/null
 a.out   837 root    4u  unix 0xc1c7b9bc      0t0        ->0xc1bf7de8
 
 (Descriptor #4 was created by syslog().) The program logged the following:
 
 a.out: dup() gave 0
 
 At the second sleep:
 
 a.out   837 root    0u  VCHR       0,70  0t77713     70 /dev/ttyv1
 a.out   837 root    1u  VCHR       0,70  0t77713     70 /dev/ttyv1
 a.out   837 root    3r  VCHR       0,13      0t0     13 /dev/null
 a.out   837 root    4u  unix 0xc1c7b9bc      0t0        ->0xc1bf7de8
 
 So all is OK in this mode: there were 3 standard files open at the beginning
 (descr. 0-2), the program opened descr. 3 (and 4), closed 0 and 2
 successfully, and copied 1 to 0. Now let's start this program from
 /etc/ttys:
 
 cuad0  "/root/tmp/a.out"       unknown on insecure
 
 Now we have the following at the first sleep():
 
 a.out   817 root    1r  VCHR       0,13      0t0     13 /dev/null
 a.out   817 root    2r  VCHR       0,13      0t0     13 /dev/null
 a.out   817 root    3r  VCHR       0,13      0t0     13 /dev/null
 a.out   817 root    4u  unix 0xc1c7bde8      0t0        ->0xc1bf7de8
 
 Note that open() has also skipped descr. 0! Then the program tries to close it
 and gets an error:
 
 close(0): Bad file descriptor
 dup() gave 2
 
 Note that descriptor 0 isn't open: close() refuses to close it. But dup()
 doesn't "see" it and returns descr. 2 instead. At the second sleep, we
 have exactly the same open file table: descr. 0 is not in use, 1-3 point
 at /dev/null. So it seems to me that open() suffers from the same problem
 here as dup(): descriptor 0 somehow becomes "reserved".
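
 The behaviour described above can also be checked from inside a process,
 without lsof. The sketch below relies on the POSIX rule that open(2) and
 dup(2) return the lowest-numbered descriptor that is not open. The function
 names and the probe limit of 1024 are assumptions made for illustration.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical probe: return the lowest descriptor below `limit` that is
 * not open, or -1 if every slot in that range is in use.
 */
int
lowest_free_fd(int limit)
{
	for (int fd = 0; fd < limit; fd++)
		if (fcntl(fd, F_GETFD) == -1 && errno == EBADF)
			return (fd);	/* EBADF: nothing is open here */
	return (-1);
}

/*
 * Compare what dup(2) returns with the lowest free slot found by probing.
 * Returns 1 if they match, 0 if dup() skipped a lower free descriptor
 * (the symptom reported here), and -1 if the check could not be made.
 */
int
dup_returns_lowest(int src)
{
	int expected = lowest_free_fd(1024);	/* 1024 is an arbitrary bound */
	int got = dup(src);

	if (got == -1)
		return (-1);
	(void)close(got);
	if (expected == -1)
		return (-1);
	return (got == expected);
}
```

 On an unaffected system dup_returns_lowest() should return 1 for any open
 descriptor. Run from a process started via /etc/ttys on an affected kernel,
 it would be expected to return 0.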
 
 
 Sincerely, Dmitry
 -- 
 Atlantis ISP, System Administrator
 e-mail:  dmitry@atlantis.dp.ua
 nic-hdl: LYNX-RIPE
Responsible-Changed-From-To: freebsd-i386->freebsd-bugs 
Responsible-Changed-By: glebius 
Responsible-Changed-When: Tue Nov 15 14:55:49 GMT 2005 
Responsible-Changed-Why:  
Probably not i386 specific. 

http://www.freebsd.org/cgi/query-pr.cgi?pr=87208 

From: Holger Kipp <hk@alogis.com>
To: bug-followup@FreeBSD.org, norbert@feu-nrmf.ph
Cc:  
Subject: Re: kern/87208: /dev/cuad[0/1] bad file descriptor error during mgetty read
Date: Tue, 22 Nov 2005 12:24:27 +0100

 Hello,
 
 this problem is preventing production use here. Currently I
 can use /dev/cuad0 if I have the entry
 
 cuad0   "/usr/local/sbin/mgetty"        unknown on insecure
 
 twice(*) in "/etc/ttys" and issue "kill -HUP 1" after booting
 to multi-user. Having only the first entry, sending SIGHUP
 to init won't work, but with both entries, so far the first
 SIGHUP to init gets everything working.
 
 Maybe this is helpful in finding the culprit. This is on an
 ASRock CPU EX Upgrade Board (K7UPGRADE-880/A/ASR) with
 AMD Athlon(tm) XP 2800+ and 512MB Memory, running
 FreeBSD 6.0-STABLE #3: Sun Nov 20 19:50:43 CET 2005
 
 (*) one entry comes before pseudo terminal entries, the other
     afterwards.
 
 Regards,
 Holger Kipp

From: "Peter Blok" <pblok@bsd4all.org>
To: <bug-followup@FreeBSD.org>, <norbert@feu-nrmf.ph>
Cc:  
Subject: Re: kern/87208: /dev/cuad[0/1] bad file descriptor error during mgetty read
Date: Thu, 1 Dec 2005 22:30:04 +0100

 Hi,
 
 Problem is caused by sys/kern/kern_descrip.c 1.279.2.1. When the changes are
 undone, mgetty works. I am still figuring out if the kernel patch is wrong,
 or that mgetty is doing something iffy.
 
 Peter

From: Kostik Belousov <kostikbel@gmail.com>
To: bug-followup@FreeBSD.org, norbert@feu-nrmf.ph
Cc:  
Subject: Re: kern/87208: /dev/cuad[0/1] bad file descriptor error during mgetty read
Date: Mon, 19 Dec 2005 18:37:42 +0200

 Ok,
 it seems I have found the problem. Please, test the patch below:
 
 Index: sys/kern/kern_descrip.c
 ===================================================================
 RCS file: /usr/local/arch/ncvs/src/sys/kern/kern_descrip.c,v
 retrieving revision 1.289
 diff -u -r1.289 kern_descrip.c
 --- sys/kern/kern_descrip.c	30 Nov 2005 05:12:03 -0000	1.289
 +++ sys/kern/kern_descrip.c	19 Dec 2005 16:36:44 -0000
 @@ -1512,6 +1512,8 @@
  				newfdp->fd_freefile = i;
  		}
  	}
 +	if (newfdp->fd_freefile == -1)
 +		newfdp->fd_freefile = i;
  	FILEDESC_UNLOCK_FAST(fdp);
  	FILEDESC_LOCK(newfdp);
  	for (i = 0; i <= newfdp->fd_lastfile; ++i)
 @@ -1519,9 +1521,9 @@
  			fdused(newfdp, i);
  	FILEDESC_UNLOCK(newfdp);
  	FILEDESC_LOCK_FAST(fdp);
 -	if (newfdp->fd_freefile == -1)
 -		newfdp->fd_freefile = i;
  	newfdp->fd_cmask = fdp->fd_cmask;
 +	KASSERT(fd_first_free(newfdp, 0, newfdp->fd_nfiles) == newfdp->fd_freefile,
 +		("fd_first_free != fd_freefile fdp %p newfdp %p p %p", fdp, newfdp, curproc));
  	FILEDESC_UNLOCK_FAST(fdp);
  	return (newfdp);
  }
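
 The KASSERT in the patch states an invariant: the cached fd_freefile must
 equal the first free slot found by scanning from 0. The toy model below
 illustrates why a stale hint matters. It is not the kernel's code: the
 structure, the function names, and the assumption that allocation starts its
 search at the hint are all simplifications made for illustration.

```c
#include <stdbool.h>

#define TOY_NFILES 8

/* Toy descriptor table; hypothetical, not the kernel's struct filedesc. */
struct toy_fdtable {
	bool used[TOY_NFILES];	/* true if the slot holds an open file */
	int  freefile;		/* cached hint: lowest slot believed free */
};

/* Scan for the first free slot at or above `low`; TOY_NFILES if none. */
int
toy_first_free(const struct toy_fdtable *t, int low)
{
	for (int i = (low < 0 ? 0 : low); i < TOY_NFILES; i++)
		if (!t->used[i])
			return (i);
	return (TOY_NFILES);
}

/* The invariant asserted by the patch, expressed on the toy table. */
bool
toy_hint_ok(const struct toy_fdtable *t)
{
	return (toy_first_free(t, 0) == t->freefile);
}

/*
 * Allocate a slot, starting the search at the cached hint (assumed
 * behaviour). If the hint is too high, lower free slots are never seen.
 * Returns the slot number, or -1 if the table is full.
 */
int
toy_alloc(struct toy_fdtable *t)
{
	int fd = toy_first_free(t, t->freefile);

	if (fd == TOY_NFILES)
		return (-1);
	t->used[fd] = true;
	t->freefile = toy_first_free(t, fd);	/* refresh the hint */
	return (fd);
}
```

 With slot 0 free, slots 1 and 2 in use, and the hint left at 3, toy_alloc()
 hands out slot 3 and never reconsiders slot 0. With a correct hint of 0 it
 returns slot 0. Real processes in this PR saw the same kind of skipping.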

From: "Norbert P. Copones" <norbert@feu-nrmf.ph>
To: bug-followup@FreeBSD.org
Cc:  
Subject: Re: kern/87208: /dev/cuad[0/1] bad file descriptor error during 
     mgetty read
Date: Tue, 20 Dec 2005 01:43:10 +0800 (PHT)

 It seems the workaround has already been committed to the ports tree. It makes
 mgetty use dup2(2) instead of dup(2). mgetty works fine now.
 
 

From: Kostik Belousov <kostikbel@gmail.com>
To: bug-followup@FreeBSD.org, norbert@feu-nrmf.ph
Cc:  
Subject: Re: kern/87208: /dev/cuad[0/1] bad file descriptor error during mgetty read
Date: Tue, 20 Dec 2005 11:42:05 +0200

 Yes,
 the workaround just hides the real kernel bug, which I'm trying to fix in the
 submitted patch.
Responsible-Changed-From-To: freebsd-bugs->des 
Responsible-Changed-By: glebius 
Responsible-Changed-When: Fri Dec 23 10:15:22 UTC 2005 
Responsible-Changed-Why:  
Dag-Erling, please handle this. Looks like you have introduced the problem. 

Responsible-Changed-From-To: des->csjp 
Responsible-Changed-By: csjp 
Responsible-Changed-When: Sun Mar 19 21:47:30 UTC 2006 
Responsible-Changed-Why:  
I will take ownership of this PR as I am working on a fix. 

State-Changed-From-To: open->patched 
State-Changed-By: csjp 
State-Changed-When: Mon Mar 20 05:48:02 UTC 2006 
State-Changed-Why:  
An experimental fix has been committed to -CURRENT. Once its testing
period expires, we will merge it into RELENG_6.

State-Changed-From-To: patched->closed 
State-Changed-By: csjp 
State-Changed-When: Thu Mar 23 16:26:25 UTC 2006 
State-Changed-Why:  
Merged to RELENG_6 

>Unformatted:
