From netch@segfault.kiev.ua  Thu Aug  6 10:53:43 2009
Return-Path: <netch@segfault.kiev.ua>
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 784CA1065673
	for <FreeBSD-gnats-submit@freebsd.org>; Thu,  6 Aug 2009 10:53:43 +0000 (UTC)
	(envelope-from netch@segfault.kiev.ua)
Received: from segfault.kiev.ua (segfault.kiev.ua [193.193.193.4])
	by mx1.freebsd.org (Postfix) with ESMTP id EDCC18FC1A
	for <FreeBSD-gnats-submit@freebsd.org>; Thu,  6 Aug 2009 10:53:42 +0000 (UTC)
Received: from segfault.kiev.ua (localhost.segfault.kiev.ua [127.0.0.1])
	by segfault.kiev.ua (8.14.3/8.14.3/8.Who.Cares) with ESMTP id n76Arbm7097959;
	Thu, 6 Aug 2009 13:53:37 +0300 (EEST)
	(envelope-from netch@segfault.kiev.ua)
Received: (from netch@localhost)
	by segfault.kiev.ua (8.14.3/8.14.3/Submit) id n76Arbfh097956;
	Thu, 6 Aug 2009 13:53:37 +0300 (EEST)
	(envelope-from netch)
Message-Id: <200908061053.n76Arbfh097956@segfault.kiev.ua>
Date: Thu, 6 Aug 2009 13:53:37 +0300 (EEST)
From: Valentin Nechayev <netch@segfault.kiev.ua>
Reply-To: Valentin Nechayev <netch@segfault.kiev.ua>
To: FreeBSD-gnats-submit@freebsd.org
Cc:
Subject: script(1) doesn't read tail of program output, pty is suspected
X-Send-Pr-Version: 3.113
X-GNATS-Notify:

>Number:         137482
>Category:       kern
>Synopsis:       script(1) doesn't read tail of program output; pty(4) is suspected
>Confidential:   no
>Severity:       non-critical
>Priority:       medium
>Responsible:    ed
>State:          closed
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Thu Aug 06 11:00:14 UTC 2009
>Closed-Date:    Mon Sep 20 19:33:48 UTC 2010
>Last-Modified:  Mon Sep 20 19:33:48 UTC 2010
>Originator:     Valentin Nechayev
>Release:        FreeBSD 7.2-RELEASE-p1 i386
>Organization:
private
>Environment:
System: FreeBSD segfault.kiev.ua 7.2-RELEASE-p1 FreeBSD 7.2-RELEASE-p1 #1: Sun Jun 14 10:32:33 EEST 2009 root@segfault.kiev.ua:/usr/BSD/obj/usr/BSD/src/sys/sf72 i386

>Description:

script(1) doesn't read tail of program output when program in child process
finishes. On very quick & short output, this leads to empty output. In
other cases, this can give partial data loss.

I treat this as kernel issue, not userland, due to reasons explained below,
that's why category is "kern".

>How-To-Repeat:

Enter a directory with very few files (less than 10 and short named).
Run a few times:

script s01 ls

In most cases, script output is empty. Ktrace shows that ls(1) prints
its output, but script(1) ignores it.

Adding explicit timeout:

script s01 sh -c 'ls; sleep 1'

causes ls(1) output to be recorded in 100% runs.

With more long and complex cases, e.g.:

script s01 ls -lR /usr/lib

output contents are unstable and change from one run to another.

All this means race condition is reached.

But, detailed ktrace analysis shows this seems not to be happen in script(1),
but in pty(4). Essential parts of kdump:

 97747 script   CALL  open(0xbfbfdea1,O_RDWR|O_NOCTTY,<unused>0x280897b8)
 97747 script   NAMI  "/dev/ptyqa"
 97747 script   RET   open 4
 97747 script   CALL  fstat(0x4,0xbfbfddec)
 97747 script   STRU  struct stat {dev=83951360, ino=186, mode=crw-rw-rw- , nlin
k=1, uid=0, gid=0, rdev=186, atime=1249554087, stime=1249511994, ctime=124951199
4, birthtime=-1, size=0, blksize=4096, blocks=0, flags=0x0 }
 97747 script   RET   fstat 0
[...]
 97749 ls       CALL  write(0x1,0x28216000,0x1d)
 97749 ls       GIO   fd 1 wrote 29 bytes
       "file1           file2           ktrace.out      s01
       "
 97749 ls       RET   write 29/0x1d
 97749 ls       CALL  sigprocmask(SIG_BLOCK,0x28074600,0xbfbfe12c)
 97749 ls       RET   sigprocmask 0
 97749 ls       CALL  sigprocmask(SIG_SETMASK,0x28074610,0)
 97749 ls       RET   sigprocmask 0
 97749 ls       CALL  sigprocmask(SIG_BLOCK,0x28074600,0xbfbfe0fc)
 97749 ls       RET   sigprocmask 0
 97749 ls       CALL  sigprocmask(SIG_SETMASK,0x28074610,0)
 97749 ls       RET   sigprocmask 0
 97749 ls       CALL  exit(0)
 97747 script   RET   select 1
 97747 script   CALL  read(0x4,0xbfbfe39c,0x400)
 97747 script   GIO   fd 4 read 0 bytes
       ""
 97747 script   RET   read 0
 97747 script   CALL  wait4(0xffffffff,0xbfbfded8,WNOHANG,0)
 97747 script   RET   wait4 97749/0x17dd5
 97747 script   CALL  wait4(0xffffffff,0xbfbfded8,WNOHANG,0)
 97747 script   RET   wait4 -1 errno 10 No child processes

So, read() from /dev/ptyqa gives 0 when data are written to slave side but
seems to be dropped because process terminated and pty is closed.
Earlier, they were kept in pty state until master side is closed.
>Fix:
>Release-Note:
>Audit-Trail:
State-Changed-From-To: open->feedback 
State-Changed-By: arundel 
State-Changed-When: Mon Sep 20 16:04:44 UTC 2010 
State-Changed-Why:  
We need to check out if this PR is still an issue after r181905 (new tty layer). 


Responsible-Changed-From-To: freebsd-bugs->ed 
Responsible-Changed-By: arundel 
Responsible-Changed-When: Mon Sep 20 16:04:44 UTC 2010 
Responsible-Changed-Why:  
Hi Ed. Could you have a look at this PR, since it seems to be pty(4) related 
and might have been resolved with the introduction of your new tty layer? 
I wasn't able to reproduce it running HEAD (r212773; amd64). 
If you don't have time atm or think that this is not actually related to your 
tty work, please assign it back into the pool. ;) 

http://www.freebsd.org/cgi/query-pr.cgi?pr=137482 
State-Changed-From-To: feedback->closed 
State-Changed-By: ed 
State-Changed-When: Mon Sep 20 19:33:47 UTC 2010 
State-Changed-Why:  
This bug has been fixed and MFCd quite some time ago. 

http://www.freebsd.org/cgi/query-pr.cgi?pr=137482 
>Unformatted:
