From jin@adv-pc-1.lbl.gov  Fri Apr 25 12:04:46 1997
Received: from adv-pc-1.lbl.gov (adv-pc-1.lbl.gov [128.3.196.189])
          by hub.freebsd.org (8.8.5/8.8.5) with ESMTP id MAA04984
          for <FreeBSD-gnats-submit@freebsd.org>; Fri, 25 Apr 1997 12:04:45 -0700 (PDT)
Received: (from jin@localhost)
	by adv-pc-1.lbl.gov (8.8.5/8.8.5) id MAA00477;
	Fri, 25 Apr 1997 12:04:36 -0700 (PDT)
Message-Id: <199704251904.MAA00477@adv-pc-1.lbl.gov>
Date: Fri, 25 Apr 1997 12:04:36 -0700 (PDT)
From: "Jin Guojun[ITG]" <jin@adv-pc-1.lbl.gov>
Reply-To: jin@adv-pc-1.lbl.gov
To: FreeBSD-gnats-submit@freebsd.org
Subject: sh mis-interpret the file name / awk failure
X-Send-Pr-Version: 3.2

>Number:         3387
>Category:       bin
>Synopsis:       sh mis-interpret the file name / awk failure
>Confidential:   no
>Severity:       serious
>Priority:       high
>Responsible:    steve
>State:          closed
>Quarter:
>Keywords:
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Fri Apr 25 12:10:01 PDT 1997
>Closed-Date:    Sun Apr 27 20:58:59 PDT 1997
>Last-Modified:  Tue Apr 29 01:40:01 PDT 1997
>Originator:     Jin Guojun[ITG]
>Release:        All FreeBSD RELEASE i386
>Organization:
/-------------- Jin Guojun ------------ v -- Internet: j_guojun@lbl.gov ---\
|	Imaging & Distributed Computing | Usenet: ucbvax!j_guojun@lbl.gov  |
|	Lawrence Berkeley Laboratory	| Bitnet:	--		   |
|	50B-2239, Berkeley, CA 94720	-  jin%george.lbl.gov@Csa3.LBL.Gov |
\--Ph#:(510) 486-7531 + Fax: 486-6363 --^--http://www-itg.lbl.gov/ITG.html-/
>Environment:

	All FreeBSD RELEASEs

>Description:

	In a script, if a variable holding a file name has white space at
	the end, sh treats it differently when the variable is used as the
	target of an I/O redirection than when it is used as a command
	argument (where the command opens the file with fopen()). The white
	space is treated as part of the file name when I/O redirections such
	as '<', '<<', '>>', and '>' are used. This is a problem in scripts.
	The FreeBSD awk generates such white space:

		MyArch=`uname -v | awk -F/ '{printf $NF}'`
		echo	"=$MyArch="
		=GENERIC =

	I am not sure which one is the problem, maybe both.

>How-To-Repeat:

	For awk problem, see Description above.
	Below is the sh issue:

	% more abc
	file abc is a testing file

	% more test.script
	FILE_NAME="abc "
	more < $FILE_NAME
	echo	"=$FILE_NAME="
	more $FILE_NAME

	% test.script
test.script: cannot open abc : no such file
=abc =
file abc is a testing file


>Fix:
	
	One of them has to be fixed, but I do not know which (-:
	Maybe both?

>Release-Note:
>Audit-Trail:

From: John-Mark Gurney <jmg@hydrogen.nike.efn.org>
To: jin@adv-pc-1.lbl.gov
Cc: FreeBSD-gnats-submit@freebsd.org
Subject: Re: bin/3387: sh mis-interpret the file name / awk failure
Date: Sat, 26 Apr 1997 04:51:39 -0700

 Jin Guojun[ITG] scribbled this message on Apr 25:
 > >Description:
 > 
 > 	In a script, if the variable for a file name had a white space at
 > 	the end, the sh will treat it differently when the file name variable
 > 	is used at redirected I/O, from used at fopen(). The white space is
 > 	interpreted as part of the file name when I/O redirections, such as,
 > 	'<', '<<', '>>', and '>' are used. This is a problem for using in
 > 	a script.
 > 	The FreeBSD awk generates such withe space:
 > 
 > 		MyArch=`uname -v | awk -F/ '{printf $NF}`
 > 		echo	"=$MyArch="
 > 		=GENERIC =
 > 
 > 	I am not sure which one is the problem, maybe both.
 
 but of course...  when you set MyArch... it will store EXACTLY what awk
 output... and that is with a \n at the end... one simple fix is:
 MyArch=`uname -v | awk -F/ '{printf("%s", $NF)}'`
 MyArch=`echo $MyArch`
 echo "=$MyArch="
 
 the second setting of MyArch eliminates the \n at the end...  I had a
 similar problem with a script I was writing...  it's just that the Bourne
 shell does EXACTLY what you tell it... :)
 
 > >How-To-Repeat:
 > 
 > 	For awk problem, see Description above.
 > 	Below is the sh issue:
 > 
 > 	% more abc
 > 	file abc is a testing file
 > 
 > 	% more test.script
 > 	FILE_NAME="abc "
 > 	more < $FILE_NAME
 > 	echo	"=$FILE_NAME="
 > 	more $FILE_NAME
 > 
 > 	% test.script
 > test.script: cannot open abc : no such file
 > =abc =
 > file abc is a testing file
 
 this is just as easy...  basically sh does the variable expansion.. then
 it does the "parsing" of the command line options... if you replaced
 the second more with:
 more "$FILE_NAME"
 you would have the same problem...
 
 if you don't have any objections (as this is the Bourne shell's expected
 behavior), I'll close this.. ttyl..
 
 -- 
   John-Mark
   Cu Networking                             Modem/FAX: +1 541 683 6954
 
   Live in Peace, destroy Micro$oft, support free software, run FreeBSD
State-Changed-From-To: open->closed 
State-Changed-By: steve 
State-Changed-When: Sun Apr 27 20:58:59 PDT 1997 
State-Changed-Why:  
This is not a bug, rather a misfeature that ksh also 
exhibits.  Maybe our sh(1) could be smarter, but POSIX 
backs ksh with all its inequities. :(
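
The quoting behavior at issue can be sketched as follows. This is only an illustration, not part of the original report: the helper show_args is hypothetical, and the sketch assumes the default IFS. It counts the arguments a command receives when the variable is expanded unquoted and quoted.

```shell
#!/bin/sh
# Hypothetical helper: print the argument count, then each argument
# in brackets so that trailing spaces are visible.
show_args() {
    printf '%d:' "$#"
    for arg in "$@"; do
        printf '[%s]' "$arg"
    done
    printf '\n'
}

FILE_NAME="abc "

# Unquoted: the expansion is field-split on IFS, so the trailing
# space is dropped before the command ever sees the argument.
unquoted=$(show_args $FILE_NAME)

# Quoted: the value is passed as a single field, space included.
quoted=$(show_args "$FILE_NAME")

echo "unquoted: $unquoted"
echo "quoted:   $quoted"
```

The unquoted case corresponds to the `more $FILE_NAME' line in the report, which is why that command found the file.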

From: "Jin Guojun[ITG]" <jin@george.lbl.gov>
To: gurney_j@resnet.uoregon.edu
Cc: FreeBSD-gnats-submit@freebsd.org, bugs@freebsd.org
Subject: Re: bin/3387: sh mis-interpret the file name / awk failure
Date: Mon, 28 Apr 1997 12:39:08 -0700

 } > >Description:
 } > 
 } >       In a script, if the variable for a file name had a white space at
 } >       the end, the sh will treat it differently when the file name variable
 } >       is used at redirected I/O, from used at fopen(). The white space is
 } >       interpreted as part of the file name when I/O redirections, such as,
 } >       '<', '<<', '>>', and '>' are used. This is a problem for using in
 } >       a script.
 } >       The FreeBSD awk generates such withe space:
 } > 
 } >               MyArch=`uname -v | awk -F/ '{printf $NF}`
 } >               echo    "=$MyArch="
 } >               =GENERIC =
 } > 
 } >       I am not sure which one is the problem, maybe both.
 } 
 } but of course...  when you do the MyArch... it will store EXACTLY what awk
 } outputed... and that is with a \n at the end... one simple fix is:
 } MyArch=`uname -v | awk -F/ '{printf("%s", $NF)}'`
 } MyArch=`echo $MyArch`
 } echo "=$MyArch="
 } 
 } the second setting of MyArch eliminates the \n at the end...  I had a
 } simlar problem with a script I was writing...  it's just that borne shell
 } does EXACTLY what you tell it... :)
 
 It is not true. The awk on all other platforms does not output such a \n.
 Maybe it is GNU awk's problem. At least, we need to make things consistent
 around the world. Why does a tool with the same name behave differently
 across platforms? Also, I doubt GNU awk outputs a \n. If you do this:
 
 	cat something > $MyArch
 
 The file created will be "GENERIC " instead of "GENERIC\n".
 
 } > >How-To-Repeat:
 } > 
 } >       For awk problem, see Description above.
 } >       Below is the sh issue:
 } > 
 } >       % more abc
 } >       file abc is a testing file
 } > 
 } >       % more test.script
 } >       FILE_NAME="abc "
 } >       more < $FILE_NAME
 } >       echo    "=$FILE_NAME="
 } >       more $FILE_NAME
 } > 
 } >       % test.script
 } > test.script: cannot open abc : no such file
 } > =abc =
 } > file abc is a testing file
 } 
 } this is just as easy...  basicly sh does the variable expansion.. then
 } it does the "parsing" of the command line options... if you replaced
 } the second more with:
 } more "$FILE_NAME"
 } you would have the same problem...
 
 I do NOT understand what you mean --
 	replacing the second more with:
 	more "$FILE_NAME"
 
 the second more is -- more "$FILE_NAME" -- if it is replaced by itself,
 what difference does it make? Also, the second more has no problem at all.
 Why does the first one have one?
 
 } if you don't have any objections (as this is borne shell's expected
 } behavior), I'll close this.. ttyl..
 
 If you think that using standard I/O should be different from using fopen(),
 then you may close the case. Both problems just make the whole world
 inconsistent.
 
 Some original specs may have flaws. That does not mean a defect is good.
 That is why new technology always replaces old technology, and things
 move forward. DOS is expected to behave like DOS, but why do people try to
 get rid of it? The Intel 286 is expected to behave like a 286, so why the
 Pentium, or even the Pentium-II?  Why NOT the Z80 or 8086?
 If you consider this my objection, please take it into consideration.
 Otherwise, please ignore it.
 
 Thanks,
 
 -Jin
 

From: John-Mark Gurney <jmg@hydrogen.nike.efn.org>
To: "Jin Guojun[ITG]" <jin@george.lbl.gov>
Cc: FreeBSD-gnats-submit@freebsd.org, bugs@freebsd.org
Subject: Re: bin/3387: sh mis-interpret the file name / awk failure
Date: Mon, 28 Apr 1997 14:51:32 -0700

 Jin Guojun[ITG] scribbled this message on Apr 28:
 > } the second setting of MyArch eliminates the \n at the end...  I had a
 > } simlar problem with a script I was writing...  it's just that borne shell
 > } does EXACTLY what you tell it... :)
 > 
 > It is not true. The awk for all other platforms does not output such \n.
 > May it is GNU awk's problem. At least, we need make things consistant around
 > the world. Why the tool with same name behaves differnetly cross platforms?
 > Also, I doubt GNU awk output \n. If you do this:
 > 
 > 	cat something > $MyArch
 > 
 > The file created will be "GENERIC " instead of "GENERIC\n".
 
 yeh... this does look like an awk bug...  I just did some more testing
 and you are right:
 hydrogen,ttyp2,~,501$uname -v | awk -F/ '{printf("%s", $NF)}'|hexdump
 0000000 7968 7264 676f 6e65 0020
 
 sorry.. I was confusing an awk bug for a Bourne shell bug.. :)
 
 > } > >How-To-Repeat:
 > } > 
 > } >       For awk problem, see Description above.
 > } >       Below is the sh issue:
 > } > 
 > } >       % more abc
 > } >       file abc is a testing file
 > } > 
 > } >       % more test.script
 > } >       FILE_NAME="abc "
 > } >       more < $FILE_NAME
 > } >       echo    "=$FILE_NAME="
 > } >       more $FILE_NAME
 > } > 
 > } >       % test.script
 > } > test.script: cannot open abc : no such file
 > } > =abc =
 > } > file abc is a testing file
 > } 
 > } this is just as easy...  basicly sh does the variable expansion.. then
 > } it does the "parsing" of the command line options... if you replaced
 > } the second more with:
 > } more "$FILE_NAME"
 > } you would have the same problem...
 > 
 > I do NOT understand what you mean --
 > 	replacing the second more with:
 > 	more "$FILE_NAME"
 
 see.. the second more you have (the last one) is not `more "$FILE_NAME"'
 but `more $FILE_NAME'... so the Bourne shell first does variable expansion
 and you end up with `more "abc "' and `more abc '... then it does the
 parsing for arguments and gets: more, "abc " for the first one, and for the
 second: more, abc... see the difference?
 
 > the second more is -- more "$FILE_NAME" -- if it is repleaced by itself,
 > what makes different? Also, the second more has no problem at all. Why first
 > one has?
 
 the difference is the double quotes around it... quoting changes MANY
 things...  the shell tries to preserve spaces when you quote stuff...
 
 > } if you don't have any objections (as this is borne shell's expected
 > } behavior), I'll close this.. ttyl..
 > 
 > If you think that using standard I/O should be different from using fopen(),
 > then, you may close the case. The both problems just make whole world not
 > consistant.
 
 well... I just investigated the Bourne shell redirection problem...  basically
 I went to a solaris box with ksh on it...  and ksh behaves the same way
 (in the second problem) as Bourne does...  as our Bourne shell is trying to
 be "POSIX" standard, if we changed behavior we would no longer have that
 adherence...
 
 h,pts75,W,4$FILE_NAME="abc "
 h,pts75,W,5$more <$FILE_NAME
 ksh: abc : cannot open
 h,pts75,W,6$echo "=$FILE_NAME="
 =abc =
 h,pts75,W,7$more $FILE_NAME
 test file
 
 I would recommend you file a separate bug report on the awk problem..
 and send mail to the GNU awk people and find out what the fix is...
 it might just be a simple upgrade...
 
 > Some original spec./stuff may have flaw. This does not mean defective is good.
 > That is why the new technology always replaces the old technology. Then, things
 > are moving forward. DOS expected behavior is like DOS, but why people try to
 > get rid of it? Intel 286 is expected to behave like 286, why Pentium? or
 > even Pentium-II?  Why NOT Z80 or 8086?
 
 yes... but the whole x86 line has some major flaws... so why haven't we gone
 to another chip that does the same work but better?? because it's available
 and inexpensive...
 
 > If you think this is my objections, please take a consideration. Otherwise,
 > please ignore it.
 
 I agree that there is an awk bug... but that is the only bug here...  if
 you object to maintaining POSIX compliance... talk with Joerg Wunsch
 (joerg_wunsch@uriah.heep.sax.de) and try to argue him out of it.. :)
 
 hope this message clarifies it...
 
 -- 
   John-Mark
   Cu Networking                             Modem/FAX: +1 541 683 6954
 
   Live in Peace, destroy Micro$oft, support free software, run FreeBSD
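
For scripts that cannot avoid a value with trailing spaces, one possible workaround is to trim the value before using it as a redirection target. The sketch below is an illustration under stated assumptions, not something proposed in the thread: it strips only trailing spaces (not tabs) with POSIX parameter expansion, works in a scratch directory created with mktemp, and quotes the redirection targets so the result does not depend on which shell runs it.

```shell
#!/bin/sh
# Scratch directory so the example leaves nothing behind.
workdir=$(mktemp -d) || exit 1
echo "file abc is a testing file" > "$workdir/abc"

FILE_NAME="abc "

# The inner expansion yields the run of trailing spaces; the outer
# expansion removes exactly that run from the end of the value.
trimmed=${FILE_NAME%"${FILE_NAME##*[! ]}"}

# Untrimmed target: the shell looks for a file named "abc " and fails.
if { cat < "$workdir/$FILE_NAME"; } >/dev/null 2>&1; then
    untrimmed_status=opened
else
    untrimmed_status=failed
fi

# Trimmed target: the redirection opens the real file.
contents=$(cat < "$workdir/$trimmed")

echo "untrimmed: $untrimmed_status"
echo "trimmed:   $contents"

rm -rf "$workdir"
```

Trimming at the point of assignment keeps the rest of the script free of special cases, whichever way a given shell treats unquoted redirection targets.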

From: "Jin Guojun[ITG]" <jin@george.lbl.gov>
To: gurney_j@resnet.uoregon.edu
Cc: FreeBSD-gnats-submit@freebsd.org, bugs@freebsd.org
Subject: Re: bin/3387: sh mis-interpret the file name / awk failure
Date: Mon, 28 Apr 1997 15:29:47 -0700

 John-Mark Gurney (gurney_j@resnet.uoregon.edu) states that:
 > I agree that there is a awk bug... but that is the only bug here...  if
 > you object to maintaining POSIX compliance... talk with Joerg Wunsch
 > (joerg_wunsch@uriah.heep.sax.de) and try to argue him out of it.. :)
 > 
 > hope this message clarifies it...
 
 Thank you, you did clarify most of it.
 
 One thing left is "maintaining POSIX compliance". Other people and I have
 argued with Joerg Wunsch before about another sh problem and got the same
 answer:
 	"It is POSIX compliance..."
 By the same reasoning, "compliant" does NOT mean "correct", right?
 Otherwise, the standard would never need to be revised, only extended.
 I will not argue this issue (compliance) any more. As time passes,
 things evolve, but not everything does, like the dinosaurs.
 
 Staying healthy is good for everyone, but especially for the speaker.
 If one is not healthy, how can one expect others to be healthy?
 NT may be POSIX compliant, but do we care about that? We make FreeBSD healthy,
 and let NT be POSIX compliant with a worm inside. Is anything wrong with that? :-)
 
 -Jin
 
 P.S.	I will file another report for awk bug. Thanks!
 

From: Bill Fenner <fenner@parc.xerox.com>
To: "Jin Guojun[ITG]" <jin@george.lbl.gov>
Cc: gurney_j@resnet.uoregon.edu, FreeBSD-gnats-submit@freebsd.org,
        bugs@freebsd.org
Subject: Re: bin/3387: sh mis-interpret the file name / awk failure 
Date: Mon, 28 Apr 1997 15:55:20 PDT

 "Jin Guojun[ITG]" <jin@george.lbl.gov> wrote:
 >} but of course...  when you do the MyArch... it will store EXACTLY what awk
 >} outputed... and that is with a \n at the end...
 >
 >It is not true.
 
 That's correct.  What's really going on is that "uname -v" outputs a
 space at the end.
 
 % echo ">>`uname -v`<<"
 >>FreeBSD 2.2-RELEASE #0: Mon Mar 24 11:03:31 GMT 1997     root@sundae.parc.xerox.com:/usr/src/sys/compile/SUNDAE <<
 
 so if you split on slashes and get the last piece, it is indeed "SUNDAE ".
 awk is performing as you ask it to.
 
 Just as a point of interest, both awk and sh on SunOS 4.1.4 behave the
 same way as the FreeBSD ones.
 
   Bill
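
Bill's point can be reproduced without depending on a particular kernel build. The following is a minimal sketch: the sample string is a hypothetical stand-in for the tail of the `uname -v` output quoted above, with its trailing space.

```shell
#!/bin/sh
# Hypothetical stand-in for the end of `uname -v` output, including
# the trailing space that the real command printed.
sample='root@sundae.parc.xerox.com:/usr/src/sys/compile/SUNDAE '

# awk splits on "/" only, so the last field keeps the trailing space.
last=$(printf '%s\n' "$sample" | awk -F/ '{ printf "%s", $NF }')

# Using print adds a newline, but command substitution strips trailing
# newlines (and only newlines), so the captured value is the same.
last_nl=$(printf '%s\n' "$sample" | awk -F/ '{ print $NF }')

echo "=$last="
echo "=$last_nl="
```

Both captured values end in a space, which matches the "=GENERIC =" output in the original report.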

From: "Jin Guojun[ITG]" <jin@george.lbl.gov>
To: fenner@parc.xerox.com
Cc: FreeBSD-gnats-submit@freebsd.org, bugs@freebsd.org,
        gurney_j@resnet.uoregon.edu
Subject: Re: bin/3387: sh mis-interpret the file name / awk failure
Date: Mon, 28 Apr 1997 21:48:46 -0700

 } That's correct.  What's really going on is that "uname -v" outputs a
 } space at the end.
 } 
 } % echo ">>`uname -v`<<"
 } >>FreeBSD 2.2-RELEASE #0: Mon Mar 24 11:03:31 GMT 1997     root@sundae.parc.xerox.com:/usr/src/sys/compile/SUNDAE <<
 } 
 } so if you split on slashes and get the last piece, it is indeed "SUNDAE ".
 } awk is performing as you ask it to.
 
 Thanks for pointing it out. This shows that GNU awk is innocent.
 Then the further question is "Can we change uname so that it prints no space
 at the end?" I am happy to fix it, but everyone has to agree to do so. Comments?
 
 -Jin
 

From: John-Mark Gurney <jmg@hydrogen.nike.efn.org>
To: "Jin Guojun[ITG]" <jin@george.lbl.gov>
Cc: fenner@parc.xerox.com, FreeBSD-gnats-submit@freebsd.org, bugs@freebsd.org
Subject: Re: bin/3387: sh mis-interpret the file name / awk failure
Date: Mon, 28 Apr 1997 22:29:53 -0700

 Jin Guojun[ITG] scribbled this message on Apr 28:
 > } That's correct.  What's really going on is that "uname -v" outputs a
 > } space at the end.
 > } 
 > } % echo ">>`uname -v`<<"
 > } >>FreeBSD 2.2-RELEASE #0: Mon Mar 24 11:03:31 GMT 1997     root@sundae.parc.xerox.com:/usr/src/sys/compile/SUNDAE <<
 > } 
 > } so if you split on slashes and get the last piece, it is indeed "SUNDAE ".
 > } awk is performing as you ask it to.
 > 
 > Thanks for pointing it out. This tells that GNU awk is innocent.
 > Then, the futher question is "Can we change uname to make no space at the end?"
 > I am happy to fix it, but every one have to agree to do so. Comments?
 
 well.. I just checked with Solaris' awk.. and it doesn't do it like
 this...
 
 echo "sdlfk/sdfkj " | awk -F/ '{printf $NF }'
 
 so the real fix is to stop uname from including that extra space... also..
 changing awk would fundamentally break it...  with awk you expect to get
 everything EXACTLY between the field separators (except for the special
 case the man page mentions)...  there is no extra field separator between
 the end of the text and the newline...  so the space is part of the field...
 
 I personally would object to this modification of awk.. and hope that
 this crazy idea doesn't ever get into your mind again.. :)
 
 what I wouldn't object to is another flag that strips the white space
 from around the fields...
 
 -- 
   John-Mark
   Cu Networking                             Modem/FAX: +1 541 683 6954
 
   Live in Peace, destroy Micro$oft, support free software, run FreeBSD

From: Bill Fenner <fenner@parc.xerox.com>
To: John-Mark Gurney <gurney_j@resnet.uoregon.edu>
Cc: "Jin Guojun[ITG]" <jin@george.lbl.gov>, fenner@parc.xerox.com,
        FreeBSD-gnats-submit@freebsd.org
Subject: Re: bin/3387: sh mis-interpret the file name / awk failure 
Date: Tue, 29 Apr 1997 01:29:48 PDT

 John-Mark Gurney <jmg@hydrogen.nike.efn.org> wrote:
 >what I wouldn't object to is another flag that strips that white space
 >from around the fields...
 
 Like "-F'[ /]'" as Joerg suggested?
 
 sundae% uname -v | awk -F'[ /]' '{printf ">>" $(NF-1) "<<"}'
 >>SUNDAE<<sundae% 
 
 Of course, this is non-portable due to the $(NF-1) requirement.
 
   Bill
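
The two ways of dropping the trailing space can be compared side by side. This is a sketch under assumptions: the sample string is a hypothetical stand-in for `uname -v` output, and the second variant, which uses sub(), is an alternative not mentioned in the thread; sub() is available in POSIX awk but not in the oldest awk implementations.

```shell
#!/bin/sh
# Hypothetical stand-in for `uname -v` output with a trailing space.
sample='FreeBSD 2.2-RELEASE #0: root@sundae.parc.xerox.com:/usr/src/sys/compile/SUNDAE '

# Variant 1: treat both space and slash as separators. The trailing
# space then ends an empty last field, so the name is in $(NF-1).
by_fs=$(printf '%s\n' "$sample" | awk -F'[ /]' '{ printf "%s", $(NF-1) }')

# Variant 2: split on "/" only, then strip trailing spaces from a
# copy of the last field.
by_sub=$(printf '%s\n' "$sample" |
    awk -F/ '{ name = $NF; sub(/ +$/, "", name); printf "%s", name }')

echo ">>$by_fs<<"
echo ">>$by_sub<<"
```

Variant 1 relies on the input ending in exactly one space; variant 2 gives the same result whether or not any trailing spaces are present.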
>Unformatted:
