From sa@mail.svzserv.kemerovo.su  Mon Oct 29 21:36:25 2001
Return-Path: <sa@mail.svzserv.kemerovo.su>
Received: from mail.svzserv.kemerovo.su (mail.svzserv.kemerovo.su [213.184.65.66])
	by hub.freebsd.org (Postfix) with ESMTP id 6781137B408
	for <FreeBSD-gnats-submit@freebsd.org>; Mon, 29 Oct 2001 21:36:22 -0800 (PST)
Received: (from root@localhost)
	by mail.svzserv.kemerovo.su (8.11.6/8.11.6) id f9U5aJR41685;
	Tue, 30 Oct 2001 12:36:19 +0700 (NKZ)
	(envelope-from sa)
Message-Id: <200110300536.f9U5aJR41685@mail.svzserv.kemerovo.su>
Date: Tue, 30 Oct 2001 12:36:19 +0700 (NKZ)
From: eugen@grosbein.pp.ru
To: FreeBSD-gnats-submit@freebsd.org
Cc:
Subject: /bin/sh's handling of some characters is wrong - loss of data
X-Send-Pr-Version: 3.113
X-GNATS-Notify:

>Number:         31627
>Category:       bin
>Synopsis:       /bin/sh's handling of some characters is wrong - loss of data
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    freebsd-bugs
>State:          closed
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Mon Oct 29 21:40:00 PST 2001
>Closed-Date:    Tue Nov 6 11:53:51 PST 2001
>Last-Modified:  Tue Nov  6 20:40:01 PST 2001
>Originator:     Eugene Grosbein
>Release:        FreeBSD 4.4-STABLE i386
>Organization:
ISP Svyaz Service
>Environment:
System: FreeBSD mail.svzserv.kemerovo.su 4.4-STABLE FreeBSD 4.4-STABLE #0: Thu Oct 11 13:22:30 NKZS 2001 sa@mail.svzserv.kemerovo.su:/usr/obj/usr/src/sys/MAIL i386

>Description:
	/bin/sh 'eats' some characters, resulting in loss of data.

>How-To-Repeat:
	
	run this script using sh -x:

	#!/bin/sh -x
	string=`printf "test\201string"`
	echo $string | hd

	You will see that the character with code dec 129 (hex 0x81,
	oct 0201) is missing from echo's argument, and hd confirms this.

	This also makes it impossible for a shell script to process
	a file whose name contains this character if the file was
	created by another program.

>Fix:

	Unknown to me.
>Release-Note:
>Audit-Trail:

From: Thomas Quinot <thomas@cuivre.fr.eu.org>
To: Eugene Grosbein <eugen@svzserv.kemerovo.su>
Cc: stable@freebsd.org, freebsd-gnats-submit@freebsd.org
Subject: Re: bin/31627 sh(1) is broken - loss of data!
Date: Tue, 6 Nov 2001 18:18:34 +0100

 On 2001-11-06, Eugene Grosbein wrote:
 
 > #!/bin/sh
 > string=`printf "\21"`
 > echo $string | hd
  
 > Replace 21 with 201 and rerun. You see:
 > 00000000  0a                                                |.|
 > 00000001
 
 Can't reproduce here for the value \201, but for the other values
 you mention it looks like perfectly normal and expected behaviour
 from sh(1). It is not surprising at all that some characters "disappear"
 here: since $string appears unquoted, any character which is whitespace
 w.r.t. shell parsing rules won't be passed to echo.
 Try to quote your string:
   echo "$string" | hd
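 
 The effect of quoting can be seen with a value that contains ordinary
 whitespace. The following is only a sketch: the sample string and the
 variable names are made up for illustration, and od(1) with tr(1)
 stands in for hd so that the result fits on one line.

```shell
#!/bin/sh
# string holds "a", a tab, two spaces, and "b" (5 bytes).
string=`printf 'a\t  b'`

# Unquoted: the shell splits the value on IFS whitespace and echo
# rejoins the fields with single spaces, so the tab and the extra
# spaces never reach echo.
unquoted=`echo $string | od -An -tx1 | tr -d ' \n'`

# Quoted: the value is passed to echo as one unmodified argument.
quoted=`echo "$string" | od -An -tx1 | tr -d ' \n'`

echo "unquoted: $unquoted"
echo "quoted:   $quoted"
```

 The unquoted form yields 6120620a ("a b" plus echo's newline), while
 the quoted form yields 61092020620a, with every byte of the value intact.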
 
 In your other example, you use the 'read' builtin to get characters
 from jot, but read is /also/ defined to apply shell field splitting
 rules. 
 
 A correct version of your test follows:
 
 #!/bin/sh -x
 
 for n in `jot 256 0`
 do
   c="`jot -c 1 $n`"
   echo "$c" | wc -c | grep -v 2 && echo "$n"
 done
 
 which correctly produces the following output:
 
        1
 0
        1
 10
 
 because a shell variable cannot contain a null character (which is
 a string end marker), and backquote expansion is defined to remove
 trailing newlines.
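 
 The trailing-newline rule can be checked in isolation. This is a
 minimal sketch with hypothetical variable names; the sentinel trick in
 the second half is a common shell idiom, not something proposed in
 this PR.

```shell
#!/bin/sh
# Backquote expansion strips every trailing newline from the output.
stripped=`printf 'x\n\n\n'`
stripped_len=`printf '%s' "$stripped" | wc -c | tr -d ' '`

# Workaround: emit a sentinel character after the data so the newlines
# are no longer trailing, then remove the sentinel afterwards.
kept=`printf 'x\n\n\n'; echo .`
kept=${kept%.}
kept_len=`printf '%s' "$kept" | wc -c | tr -d ' '`

echo "stripped: $stripped_len byte(s), kept: $kept_len byte(s)"
```

 The first expansion keeps only the "x" (1 byte); the second keeps the
 "x" and all three newlines (4 bytes).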
 
 This is legal and expected behaviour, not a bug.
 
 Thomas.
 
 -- 
     Thomas.Quinot@Cuivre.FR.EU.ORG

From: Eugene Grosbein <eugen@grosbein.pp.ru>
To: Thomas Quinot <thomas@cuivre.fr.eu.org>
Cc: Eugene Grosbein <eugen@svzserv.kemerovo.su>, stable@freebsd.org,
	freebsd-gnats-submit@freebsd.org
Subject: Re: bin/31627 sh(1) is broken - loss of data!
Date: Wed, 7 Nov 2001 01:00:20 +0700

 On Tue, Nov 06, 2001 at 06:18:34PM +0100, Thomas Quinot wrote:
 
 > > #!/bin/sh
 > > string=`printf "\21"`
 > > echo $string | hd
 >  
 > > Replace 21 with 201 and rerun. You see:
 > > 00000000  0a                                                |.|
 > > 00000001
 > 
 > Can't reproduce here for the value \201, but for the other values
 > you mention it looks like perfectly normal and expected behaviour
 > from sh(1). It is not surprising at all that some characters "disappear"
 > here: since $string appears unquoted, any character which is whitespace
 > w.r.t. shell parsing rules won't be passed to echo.
 > Try to quote your string:
 >   echo "$string" | hd
 
 I still get unexpected results:
 
 #!/bin/sh
 string=`printf "\210"`
 echo "$string" | hd
 
 gives me:
 00000000  0a                                                |.|
 00000001
 
 The same happens with \12 and \201. Other codes are OK; thank you for
 the explanation. I see that \12 is removed by backquotes, but I wonder
 what happens to \201 and \210.
 
 Eugene Grosbein
State-Changed-From-To: open->closed 
State-Changed-By: dwmalone 
State-Changed-When: Tue Nov 6 11:53:51 PST 2001 
State-Changed-Why:  
Fixed by tegge in -current and RELENG_4. 

http://www.FreeBSD.org/cgi/query-pr.cgi?pr=31627 

From: Thomas Quinot <thomas@cuivre.fr.eu.org>
To: Eugene Grosbein <eugen@grosbein.pp.ru>
Cc: Thomas Quinot <thomas@cuivre.fr.eu.org>,
	Eugene Grosbein <eugen@svzserv.kemerovo.su>, stable@freebsd.org,
	freebsd-gnats-submit@freebsd.org
Subject: Re: bin/31627 sh(1) is broken - loss of data!
Date: Tue, 6 Nov 2001 20:51:13 +0100

 On 2001-11-06, Eugene Grosbein wrote:
 
 > I still get unexpected results:
 
 You are absolutely right. My tests succeeded because I tried your
 script on -CURRENT, where this bug was fixed a few weeks ago.
 The fix to -STABLE was MFC'd last week:
 
 Revision 1.31.2.3
 Branch: RELENG_4
 
 MFC: BASESYNTAX, DQSYNTAX, SQSYNTAX and ARISYNTAX handles negative
 indexes.
      Allow those to be used to properly quote characters in the shell
      control character range.
 
 PR:		31627
 
 so updating your /bin/sh with the latest -STABLE version should resolve
 your problem.
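 
 After updating, a quick check along these lines should show whether
 the two bytes reported in this PR survive backquote expansion. It is
 only a sketch: the roundtrip function is a hypothetical helper, and
 od(1) with tr(1) replaces hd to keep the output on one line.

```shell
#!/bin/sh
# roundtrip OCTAL: capture one byte through backquote expansion, then
# print in hex what a quoted expansion passes on (plus a newline).
roundtrip() {
    fmt="\\$1"                  # builds a printf escape such as \201
    value=`printf "$fmt"`
    printf '%s\n' "$value" | od -An -tx1 | tr -d ' \n'
}

for octal in 201 210; do
    printf 'octal %s -> %s\n' "$octal" "`roundtrip $octal`"
done
```

 A fixed sh should print 810a and 880a. An affected sh drops the byte
 and prints only 0a, matching the hd output shown earlier in this PR.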
 
 Thomas.
 
 -- 
     Thomas.Quinot@Cuivre.FR.EU.ORG

From: Eugene Grosbein <eugen@grosbein.pp.ru>
To: Thomas Quinot <thomas@cuivre.fr.eu.org>
Cc: stable@FreeBSD.org, freebsd-gnats-submit@FreeBSD.org
Subject: Re: bin/31627 sh(1) is broken - loss of data!
Date: Wed, 7 Nov 2001 11:22:02 +0700

 On Tue, Nov 06, 2001 at 08:51:13PM +0100, Thomas Quinot wrote:
 
 > > I still get unexpected results:
 > You are absolutely right. My tests succeeded because I tried your
 > script on -CURRENT, where this bug was fixed a few weeks ago.
 > The fix to -STABLE was MFC'd last week:
 > 
 > Revision 1.31.2.3
 > Branch: RELENG_4
 > 
 > MFC: BASESYNTAX, DQSYNTAX, SQSYNTAX and ARISYNTAX handles negative
 > indexes.
 >      Allow those to be used to properly quote characters in the shell
 >      control character range.
 > 
 > PR:		31627
 > 
 > so updating your /bin/sh with the latest -STABLE version should resolve
 > your problem.
 
 I've updated to -STABLE and it now works as expected.
 Thank you very much. The PR can be closed now.
 
 Eugene Grosbein
 
>Unformatted:
