From nobody@FreeBSD.org  Tue Nov 27 17:59:51 2007
Return-Path: <nobody@FreeBSD.org>
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id A1F0B16A421
	for <freebsd-gnats-submit@FreeBSD.org>; Tue, 27 Nov 2007 17:59:51 +0000 (UTC)
	(envelope-from nobody@FreeBSD.org)
Received: from www.freebsd.org (www.freebsd.org [IPv6:2001:4f8:fff6::21])
	by mx1.freebsd.org (Postfix) with ESMTP id 901BC13C4CE
	for <freebsd-gnats-submit@FreeBSD.org>; Tue, 27 Nov 2007 17:59:51 +0000 (UTC)
	(envelope-from nobody@FreeBSD.org)
Received: from www.freebsd.org (localhost [127.0.0.1])
	by www.freebsd.org (8.14.2/8.14.2) with ESMTP id lARHxjCx064228
	for <freebsd-gnats-submit@FreeBSD.org>; Tue, 27 Nov 2007 17:59:45 GMT
	(envelope-from nobody@www.freebsd.org)
Received: (from nobody@localhost)
	by www.freebsd.org (8.14.2/8.14.1/Submit) id lARHxi16064227;
	Tue, 27 Nov 2007 17:59:44 GMT
	(envelope-from nobody)
Message-Id: <200711271759.lARHxi16064227@www.freebsd.org>
Date: Tue, 27 Nov 2007 17:59:44 GMT
From: Charles Hardin <chardin@2wire.com>
To: freebsd-gnats-submit@FreeBSD.org
Subject: [PATCH] tty write is not always atomic for small lines
X-Send-Pr-Version: www-3.1
X-GNATS-Notify:

>Number:         118287
>Category:       kern
>Synopsis:       [kernel] [patch] tty write is not always atomic for small lines
>Confidential:   no
>Severity:       non-critical
>Priority:       low
>Responsible:    ed
>State:          closed
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Tue Nov 27 18:00:02 UTC 2007
>Closed-Date:    Mon Apr 13 11:21:22 UTC 2009
>Last-Modified:  Mon Apr 13 11:21:22 UTC 2009
>Originator:     Charles Hardin
>Release:        FreeBSD-CURRENT
>Organization:
2Wire Inc.
>Environment:
>Description:
During test automation there have been some unexpected failures because output from different processes has been interleaved on the tty.
>How-To-Repeat:
Use the following test code:

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int
main(int argc, char **argv)
{
    char buf[512];
    int res, len;

    if (daemon(0, 1) != 0) {
        return 1;
    }

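    /* Format one short line; it is far smaller than the 512-byte buffer. */
    len = snprintf(buf, sizeof(buf), "<This is writerd pid %d.>\n", (int)getpid());
    /* snprintf() returns a negative value on error; check before casting. */
    if (len < 0 || (size_t)len >= sizeof(buf)) {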
        return 2;
    }
    while (1) {
        res = write(1, buf, len);
        if (res != len) {
            return 3;
        }
    }
}

When a couple of copies of this run at the same time, the output sometimes looks like this:

    <This is writerd pid 28.><This is writerd pid 30.>
    <This is writerd pid 30.>
    ...more lines like that...
    <This is writerd pid 30.>

    <This is writerd pid 28.>

In other words, several writes from pid 30 happen in the middle of a write from
pid 28.
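
To spot this kind of damage in a captured log automatically, a small checker
along the following lines may help. It is only a sketch: the function names
is_intact_line() and count_damaged() are made up for illustration, and it
assumes the exact line format printed by the test program above.

```c
#include <stdio.h>

/*
 * Return 1 if `line` holds exactly one "<This is writerd pid N.>" record,
 * optionally followed by a single newline, and 0 otherwise.
 * (Hypothetical helper; not part of the original report.)
 */
int
is_intact_line(const char *line)
{
    int pid, end = 0;

    /* %n is stored only if the literal ".>" after the pid matched too. */
    if (sscanf(line, "<This is writerd pid %d.>%n", &pid, &end) != 1)
        return 0;
    if (end == 0)
        return 0;
    /* Accept one trailing newline and nothing else after the record. */
    if (line[end] == '\n')
        end++;
    return line[end] == '\0';
}

/* Count the lines on `fp` that are not a single intact record. */
long
count_damaged(FILE *fp)
{
    char line[1024];
    long bad = 0;

    while (fgets(line, sizeof(line), fp) != NULL) {
        if (!is_intact_line(line))
            bad++;
    }
    return bad;
}
```

Both the doubled line and the stray empty line in the sample output above
would be counted as damaged by this check.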

>Fix:
The attached diff modifies ttwrite() to try to move at least OBUFSIZ characters into the clist at a time, so that lines shorter than OBUFSIZ should never be intermingled. This solves 99% of the cases, and it is easy enough for us to fix the user code to keep it solved.

Patch attached with submission follows:

Index: tty.c
===================================================================
RCS file: /home/ncvs/src/sys/kern/tty.c,v
retrieving revision 1.273
diff -u -r1.273 tty.c
--- tty.c	20 Jul 2007 09:41:54 -0000	1.273
+++ tty.c	27 Nov 2007 17:49:13 -0000
@@ -2046,13 +2046,16 @@
 	int i, hiwat, cnt, error, s;
 	char obuf[OBUFSIZ];
 
-	hiwat = tp->t_ohiwat;
 	cnt = uio->uio_resid;
 	error = 0;
 	cc = 0;
 	td = curthread;
 	p = td->td_proc;
 loop:
+	/* set a slightly different hi water mark to transfer at least
+	 * OBUFSIZ characters from the user write at a time.
+	 */
+	hiwat = imax(tp->t_ohiwat - OBUFSIZ, tp->t_olowat);
 	s = spltty();
 	if (ISSET(tp->t_state, TS_ZOMBIE)) {
 		splx(s);
@@ -2165,7 +2168,7 @@
 					cp++;
 					cc--;
 					if (ISSET(tp->t_lflag, FLUSHO) ||
-					    tp->t_outq.c_cc > hiwat)
+					    tp->t_outq.c_cc > tp->t_ohiwat)
 						goto ovhiwat;
 					continue;
 				}
@@ -2198,7 +2201,7 @@
 				goto loop;
 			}
 			if (ISSET(tp->t_lflag, FLUSHO) ||
-			    tp->t_outq.c_cc > hiwat)
+			    tp->t_outq.c_cc > tp->t_ohiwat)
 				break;
 		}
 		ttstart(tp);
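
The arithmetic behind the adjusted mark in the patch can be sketched in
isolation. This is only an illustration: transfer_hiwat() is a hypothetical
name, and the numbers in the usage note are made-up values, not the real
watermarks of any tty.

```c
/*
 * Compute the lowered mark used to admit a writer, mirroring
 *     hiwat = imax(tp->t_ohiwat - OBUFSIZ, tp->t_olowat);
 * from the patch. Lowering the mark by one buffer's worth leaves room for
 * at least `obufsiz` characters below the real high watermark, but the
 * result is never allowed to drop below the low watermark.
 */
int
transfer_hiwat(int ohiwat, int olowat, int obufsiz)
{
    int lowered = ohiwat - obufsiz;

    return (lowered > olowat) ? lowered : olowat;
}
```

With hypothetical values ohiwat = 2000, olowat = 500 and obufsiz = 100 the
result is 1900; with ohiwat = 550 the result is clamped to 500.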


>Release-Note:
>Audit-Trail:

From: Ed Schouten <ed@80386.nl>
To: bug-followup@FreeBSD.org, chardin@2wire.com
Cc: rwatson@FreeBSD.org
Subject: Re: kern/118287: [kernel] [patch] tty write is not always atomic
	for small lines
Date: Tue, 10 Feb 2009 19:32:17 +0100

 Hello Charles,
 
 Robert Watson poked me to take a look at this PR. Unfortunately the new
 TTY layer still has the same `guarantees', so this problem still exists.
 The fix you proposed sounds pretty good in theory, but as you mentioned,
 it only solves a certain class of races. The problem is that output
 processing may cause data to grow 8 times, because of tab expansion.
 
 What's your opinion on this patch?
 
 	http://80386.nl/pub/tty-sync.diff
 
 It adds synchronisation to the call to ttydisc_write(), which should be
 pretty solid. Because I want to keep non-blocking writes as they are
 now, they can still cause data to get mangled, but I guess this is
 already pretty good.
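
The growth from tab expansion mentioned above can be sketched in userland.
This is not the kernel's output-processing code: expanded_len() is a
hypothetical helper, and it assumes tab stops every 8 columns with tabs
replaced by spaces.

```c
#include <stddef.h>

#define TAB_WIDTH 8	/* assumed tab stop interval */

/*
 * Return how many characters `buf` occupies once every tab is expanded to
 * spaces up to the next tab stop. `col` is the output column at which the
 * buffer starts.
 */
size_t
expanded_len(const char *buf, size_t len, size_t col)
{
    size_t out = 0;

    for (size_t i = 0; i < len; i++) {
        if (buf[i] == '\t') {
            /* A tab becomes 1..8 spaces depending on the column. */
            size_t pad = TAB_WIDTH - (col % TAB_WIDTH);

            out += pad;
            col += pad;
        } else if (buf[i] == '\n') {
            out++;
            col = 0;	/* a newline returns to column 0 */
        } else {
            out++;
            col++;
        }
    }
    return out;
}
```

A buffer made up only of tabs starting at column 0 is the worst case: every
input byte turns into 8 output characters, so a bound based on the size of
the user's write does not bound what ends up in the output queue.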
 
 -- 
  Ed Schouten <ed@80386.nl>
  WWW: http://80386.nl/
 
Responsible-Changed-From-To: freebsd-bugs->ed 
Responsible-Changed-By: rwatson 
Responsible-Changed-When: Tue Feb 10 18:42:31 UTC 2009 
Responsible-Changed-Why:  
Assign to Ed Schouten, who is maintaining our TTY layer. 

http://www.freebsd.org/cgi/query-pr.cgi?pr=118287 

From: dfilter@FreeBSD.ORG (dfilter service)
To: bug-followup@FreeBSD.org
Cc:  
Subject: Re: kern/118287: commit references a PR
Date: Wed, 11 Feb 2009 16:29:08 +0000 (UTC)

 Author: ed
 Date: Wed Feb 11 16:28:49 2009
 New Revision: 188487
 URL: http://svn.freebsd.org/changeset/base/188487
 
 Log:
   Serialize write() calls on TTYs.
   
   Just like the old TTY layer, the current MPSAFE TTY layer does not make
   any attempt to serialize calls of write(). Data is copied into the
   kernel in 256 (TTY_STACKBUF) byte chunks. If a write() call occurs at
   the same time, the data may interleave. This is especially likely when
   the TTY starts blocking, because the output queue reaches the high
   watermark.
   
   I've implemented this by adding a new flag, TTY_BUSY_OUT, which is used
   to mark a TTY as having a thread stuck in write(). Because I don't want
   non-blocking processes to be possibly blocked by a sleeping thread, I'm
   still allowing it to bypass the protection. According to this message,
   the Linux kernel returns EAGAIN in such cases, but I think that's a
   little too restrictive:
   
   	http://kerneltrap.org/index.php?q=mailarchive/linux-kernel/2007/5/2/85418/thread
   
   PR:		kern/118287
 
 Modified:
   head/sys/kern/tty.c
   head/sys/sys/tty.h
   head/usr.sbin/pstat/pstat.8
   head/usr.sbin/pstat/pstat.c
 
 Modified: head/sys/kern/tty.c
 ==============================================================================
 --- head/sys/kern/tty.c	Wed Feb 11 15:55:01 2009	(r188486)
 +++ head/sys/kern/tty.c	Wed Feb 11 16:28:49 2009	(r188487)
 @@ -438,15 +438,28 @@ ttydev_write(struct cdev *dev, struct ui
  
  	if (tp->t_termios.c_lflag & TOSTOP) {
  		error = tty_wait_background(tp, curthread, SIGTTOU);
 -		if (error) {
 -			tty_unlock(tp);
 -			return (error);
 -		}
 +		if (error)
 +			goto done;
  	}
  
 -	error = ttydisc_write(tp, uio, ioflag);
 -	tty_unlock(tp);
 +	if (ioflag & IO_NDELAY && tp->t_flags & TF_BUSY_OUT) {
 +		/* Allow non-blocking writes to bypass serialization. */
 +		error = ttydisc_write(tp, uio, ioflag);
 +	} else {
 +		/* Serialize write() calls. */
 +		while (tp->t_flags & TF_BUSY_OUT) {
 +			error = tty_wait(tp, &tp->t_bgwait);
 +			if (error)
 +				goto done;
 +		}
 + 
 + 		tp->t_flags |= TF_BUSY_OUT;
 +		error = ttydisc_write(tp, uio, ioflag);
 + 		tp->t_flags &= ~TF_BUSY_OUT;
 +		cv_broadcast(&tp->t_bgwait);
 +	}
  
 +done:	tty_unlock(tp);
  	return (error);
  }
  
 @@ -1880,6 +1893,11 @@ static struct {
  	{ TF_ZOMBIE,		'Z' },
  	{ TF_HOOK,		's' },
  
 +	/* Keep these together -> 'bi' and 'bo'. */
 +	{ TF_BUSY,		'b' },
 +	{ TF_BUSY_IN,		'i' },
 +	{ TF_BUSY_OUT,		'o' },
 +
  	{ 0,			'\0'},
  };
  
 
 Modified: head/sys/sys/tty.h
 ==============================================================================
 --- head/sys/sys/tty.h	Wed Feb 11 15:55:01 2009	(r188486)
 +++ head/sys/sys/tty.h	Wed Feb 11 16:28:49 2009	(r188487)
 @@ -83,6 +83,9 @@ struct tty {
  #define	TF_BYPASS	0x04000	/* Optimized input path. */
  #define	TF_ZOMBIE	0x08000	/* Modem disconnect received. */
  #define	TF_HOOK		0x10000	/* TTY has hook attached. */
 +#define	TF_BUSY_IN	0x20000	/* Process busy in read() -- not supported. */
 +#define	TF_BUSY_OUT	0x40000	/* Process busy in write(). */
 +#define	TF_BUSY		(TF_BUSY_IN|TF_BUSY_OUT)
  	unsigned int	t_revokecnt;	/* (t) revoke() count. */
  
  	/* Buffering mechanisms. */
 
 Modified: head/usr.sbin/pstat/pstat.8
 ==============================================================================
 --- head/usr.sbin/pstat/pstat.8	Wed Feb 11 15:55:01 2009	(r188486)
 +++ head/usr.sbin/pstat/pstat.8	Wed Feb 11 16:28:49 2009	(r188487)
 @@ -206,6 +206,11 @@ block mode input routine in use
  connection lost
  .It s
  i/o being snooped
 +.It b
 +busy in
 +.Xr read 2
 +or
 +.Xr write 2
  .El
  .Pp
  The
 
 Modified: head/usr.sbin/pstat/pstat.c
 ==============================================================================
 --- head/usr.sbin/pstat/pstat.c	Wed Feb 11 15:55:01 2009	(r188486)
 +++ head/usr.sbin/pstat/pstat.c	Wed Feb 11 16:28:49 2009	(r188487)
 @@ -315,6 +315,11 @@ static struct {
  	{ TF_ZOMBIE,		'Z' },
  	{ TF_HOOK,		's' },
  
 +	/* Keep these together -> 'bi' and 'bo'. */
 +	{ TF_BUSY,		'b' },
 +	{ TF_BUSY_IN,		'i' },
 +	{ TF_BUSY_OUT,		'o' },
 +
  	{ 0,			'\0'},
  };
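
The busy-flag scheme from the commit log can be mirrored in userland to see
how the pieces fit together. The sketch below is an analogue only, not the
kernel code: struct out_gate, gate_init() and gate_write() are hypothetical
names, a pthread mutex and condition variable stand in for the tty lock and
t_bgwait, and, unlike the kernel version, the mutex is dropped around the
actual write().

```c
#include <pthread.h>
#include <unistd.h>

/* Hypothetical userland stand-in for the per-TTY state. */
struct out_gate {
    pthread_mutex_t lock;	/* stands in for the tty lock */
    pthread_cond_t  idle;	/* stands in for t_bgwait */
    int             busy;	/* plays the role of TF_BUSY_OUT */
};

void
gate_init(struct out_gate *g)
{
    pthread_mutex_init(&g->lock, NULL);
    pthread_cond_init(&g->idle, NULL);
    g->busy = 0;
}

/*
 * Write `len` bytes to `fd`. Blocking callers are serialized; a
 * non-blocking caller (nonblock != 0) that finds the gate busy writes
 * straight away instead of sleeping, so its data may still interleave.
 */
ssize_t
gate_write(struct out_gate *g, int fd, const void *buf, size_t len,
    int nonblock)
{
    ssize_t n;

    pthread_mutex_lock(&g->lock);
    if (nonblock && g->busy) {
        /* Bypass serialization rather than wait behind a sleeper. */
        pthread_mutex_unlock(&g->lock);
        return write(fd, buf, len);
    }
    /* Wait until no other caller is in the middle of a write. */
    while (g->busy)
        pthread_cond_wait(&g->idle, &g->lock);
    g->busy = 1;
    pthread_mutex_unlock(&g->lock);

    n = write(fd, buf, len);

    /* Clear the flag and wake every waiter so one of them can proceed. */
    pthread_mutex_lock(&g->lock);
    g->busy = 0;
    pthread_cond_broadcast(&g->idle);
    pthread_mutex_unlock(&g->lock);
    return n;
}
```

As in the commit, a non-blocking caller only bypasses the gate when another
write is already in progress; when the gate is free it takes the flag like
any other caller.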
  
 
State-Changed-From-To: open->patched 
State-Changed-By: ed 
State-Changed-When: Wed Feb 11 16:47:07 UTC 2009 
State-Changed-Why:  
Patch committed to HEAD, but does not apply to RELENG_*. 


From: Charles Hardin <chardin@2wire.com>
To: Ed Schouten <ed@80386.nl>,
	<bug-followup@FreeBSD.org>
Cc: <rwatson@FreeBSD.org>
Subject: Re: kern/118287: [kernel] [patch] tty write is not always atomic
	for small lines
Date: Wed, 11 Feb 2009 11:17:58 -0800

 I repatched our code to a similar setup (we are still using the legacy tty
 layer), and this does appear to be a much better design choice than the
 1 day hack :)
 
 The async cases getting mangled is fine with me. Our concern was the
 blocking cases getting mangled, since we need predictable output for expect
 matches on the test runs and the user code isn't doing async writes during
 these tests.
 
 Thanks for taking the time on this,
 Charles Hardin
 
 
State-Changed-From-To: patched->closed 
State-Changed-By: ed 
State-Changed-When: Mon Apr 13 11:21:21 UTC 2009 
State-Changed-Why:  
I guess we should just leave RELENG_*'s TTY code alone, which is why I'm 
closing this PR. 

>Unformatted:
