From tom@uniserve.com  Tue Dec  9 18:02:04 1997
Received: from shell.uniserve.com (shell.uniserve.com [204.244.210.252])
          by hub.freebsd.org (8.8.7/8.8.7) with ESMTP id SAA00871
          for <FreeBSD-gnats-submit@freebsd.org>; Tue, 9 Dec 1997 18:02:03 -0800 (PST)
          (envelope-from tom@uniserve.com)
Received: (from tom@localhost)
	by shell.uniserve.com (8.8.5/8.8.5) id SAA14215;
	Tue, 9 Dec 1997 18:01:52 -0800 (PST)
Message-Id: <199712100201.SAA14215@shell.uniserve.com>
Date: Tue, 9 Dec 1997 18:01:52 -0800 (PST)
From: tom@sdf.com
Reply-To: tom@sdf.com
To: FreeBSD-gnats-submit@freebsd.org
Subject: sh bug (with example)
X-Send-Pr-Version: 3.2

>Number:         5263
>Category:       bin
>Synopsis:       sh bug (with example)
>Confidential:   no
>Severity:       non-critical
>Priority:       high
>Responsible:    cracauer
>State:          closed
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Tue Dec  9 18:10:01 PST 1997
>Closed-Date:    Wed Aug 16 12:48:05 MEST 2000
>Last-Modified:  Wed Aug 16 12:50:28 MEST 2000
>Originator:     Tom
>Release:        FreeBSD 2.2.5-STABLE i386
>Organization:
SDF Systems
>Environment:

FreeBSD 2.2.5-STABLE i386

>Description:

sh has a problem with joining lists within a "for x in list1:list2" construct.
Basically, the last element of list1 gets attached to the remaining elements of
list2, and the combined string gets returned as a single item.

>How-To-Repeat:

  Sample script:

#! /bin/sh

PATH=/bin:/usr/bin:/sbin

# The whitespace below is a space followed by a tab
IFS="${IFS= 	}"; ac_save_ifs="$IFS"; IFS="${IFS}:"
for ac_dir in $PATH:/usr/local/bin$ac_dummy; do
        echo "ac_dir is $ac_dir"
done


  When run under /bin/sh this script produces:

ac_dir is /bin
ac_dir is /usr/bin
ac_dir is /sbin:/usr/local/bin

  When run under bash this script produces:

ac_dir is /bin
ac_dir is /usr/bin
ac_dir is /sbin
ac_dir is /usr/local/bin


  This is a big problem for ports, as autoconf configure scripts often use for
loops like this to scan for certain binaries.

>Fix:
	

>Release-Note:
>Audit-Trail:

From: Martin Cracauer <cracauer@cons.org>
To: tom@sdf.com
Cc: FreeBSD-gnats-submit@FreeBSD.ORG
Subject: Re: bin/5263: sh bug (with example)
Date: Tue, 24 Mar 1998 11:14:32 +0100

 In <199712100201.SAA14215@shell.uniserve.com>, tom@sdf.com wrote: 
 
 > sh has a problem with joining lists within a "for x in list1:list2"
 > construct.  Basically, the last element of list1 gets attached to
 > the remaining elements of list2, and this thing gets returned as
 > single item.
 
 > #! /bin/sh
 > 
 > PATH=/bin:/usr/bin:/sbin
 > 
 > # The whitespace below is a space followed by a tab
 > IFS="${IFS= 	}"; ac_save_ifs="$IFS"; IFS="${IFS}:"
 > for ac_dir in $PATH:/usr/local/bin$ac_dummy; do
 >         echo "ac_dir is $ac_dir"
 > done
 > 
 > 
 >   When run under /bin/sh this script produces:
 > 
 > ac_dir is /bin
 > ac_dir is /usr/bin
 > ac_dir is /sbin:/usr/local/bin
 > 
 >   When run under bash this script produces:
 > 
 > ac_dir is /bin
 > ac_dir is /usr/bin
 > ac_dir is /sbin
 > ac_dir is /usr/local/bin
  
 The appended diff fixes this particular problem. It is an ugly one; I
 didn't even attempt to understand sh's regular IFS handling.
 
 I'm not going to commit it (unless we officially give up clean fixing
 of /bin/sh :-] ).
 
 BUT I welcome feedback. I'm not much of a /bin/sh (script)
 programmer, so I'd like to know what other cases are broken by this
 hack. In particular, I'd like to get example scripts where $IFS is used
 in places where normal words are expected (like in this for loop
 construct).
 
 So please give it a try and keep me updated, I want to gain better
 understanding of sh syntax issues.
 
 >   This is a big problem for ports, as auto-conf configure scripts
 > often use for loops like this to scan for certain binaries.
 
 In the particular case of GNU configure, I strongly recommend using
 an intermediate variable:
 
   foobar=$PATH:/usr/local/bin$ac_dummy
   for ac_dir in $foobar; do
           echo "ac_dir is $ac_dir"
   done
 
 This works fine in our sh.
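 
 A minimal sketch of that workaround as a self-contained script, assuming
 a fixed search_path in place of $PATH and splitting on ':' only. The
 names search_path, all_dirs, save_ifs and list_dirs are hypothetical,
 chosen for illustration, and are not taken from autoconf:

```shell
#!/bin/sh
# Sketch only: search_path stands in for $PATH so the result is predictable.
search_path=/bin:/usr/bin:/sbin

list_dirs() {
    save_ifs=$IFS       # remember the caller's IFS (assumes IFS is set)
    IFS=:               # split on ':' only, for brevity
    # Assignments are never field-split, so the joined list is stored intact.
    all_dirs=$search_path:/usr/local/bin
    # The whole word list is now the result of one parameter expansion,
    # so every ':' in it is subject to field splitting.
    for ac_dir in $all_dirs; do
        echo "ac_dir is $ac_dir"
    done
    IFS=$save_ifs       # restore IFS for the rest of the script
}

list_dirs
```

 With a POSIX-conforming sh this should print four lines, ending with
 "ac_dir is /sbin" and "ac_dir is /usr/local/bin" as separate items.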
 
 pdksh has the same problem as our sh, and I'm sure there are more sh
 variants that do. It is easy to assume that the word list of a
 for-loop statement is not affected by IFS.
 
 I think a highly portable, reusable package like autoconf should go to
 the extra effort, although the problems are not its fault.
 
 Tom, are you in any way associated with the autoconf maintainers and
 could approach them for an opinion?
 
 Martin
 -- 
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
 Martin Cracauer <cracauer@cons.org> http://www.cons.org/cracauer
 BSD User Group Hamburg, Germany     http://www.bsdhh.org/
 
 diff -r -c bin/sh.current.original/parser.c bin/sh.deleteme/parser.c
 *** bin/sh.current.original/parser.c	Tue Aug 26 11:11:10 1997
 --- bin/sh.deleteme/parser.c	Tue Mar 24 10:38:36 1998
 ***************
 *** 211,216 ****
 --- 211,218 ----
   				pungetc();		/* push back EOF on input */
   			return n1;
   		default:
 + 
 + 
   			if (nlflag)
   				synexpect(-1);
   			tokpushback++;
 ***************
 *** 916,924 ****
   					setprompt(0);
   				c = pgetc();
   				goto loop;		/* continue outer loop */
 - 			case CWORD:
 - 				USTPUTC(c, out);
 - 				break;
   			case CCTL:
   				if (eofmark == NULL || dblquote)
   					USTPUTC(CTLESC, out);
 --- 918,923 ----
 ***************
 *** 1004,1009 ****
 --- 1003,1018 ----
   				break;
   			case CEOF:
   				goto endword;		/* exit outer loop */
 + 			case CWORD:
 + 				if (strchr(ifsval(),c) &&
 + 					syntax == BASESYNTAX) {
 + 					if (varnest == 0) {
 + 						c = pgetc_macro();
 + 						goto endword;  /* exit outer loop */
 + 					}
 + 				}
 + 				USTPUTC(c, out);
 + 				break;
   			default:
   				if (varnest == 0)
   					goto endword;	/* exit outer loop */

From: Martin Cracauer <cracauer@cons.org>
To: FreeBSD-gnats-submit@FreeBSD.ORG
Cc:
Subject: Re: bin/5263: sh bug (with example)
Date: Tue, 12 May 1998 17:51:00 +0200

 [repeated for inclusion into gnats...]
 
 To follow up on this one: I think that my fix, bash and autoconf are
 wrong and that the behaviour of the original ash (FreeBSD's /bin/sh)
 and pdksh is right. I sent an explanation as a followup to PR 6557.
 
 You can see it online at
 http://www.freebsd.org/cgi/query-pr.cgi?pr=6557
 
 Martin
 -- 
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
 Martin Cracauer <cracauer@cons.org> http://www.cons.org/cracauer
 BSD User Group Hamburg, Germany     http://www.bsdhh.org/
Responsible-Changed-From-To: freebsd-bugs->cracauer 
Responsible-Changed-By: nbm 
Responsible-Changed-When: Thu Jul 13 04:08:01 PDT 2000 
Responsible-Changed-Why:  
I would normally close duplicates, but the sh maintainer can decide what 
to do in this case 

http://www.freebsd.org/cgi/query-pr.cgi?pr=5263 
State-Changed-From-To: open->closed 
State-Changed-By: cracauer 
State-Changed-When: Wed Aug 16 12:48:05 MEST 2000 
State-Changed-Why:  
Field splitting does not happen on simple command lines, only on the 
portions of the fields generated by tilde expansion, parameter 
expansion, command substitution and arithmetic expansion, as explained 
by Tor in the discussion. 

bash and pdksh now work the same way as our sh. 
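
The rule above can be illustrated with a short sketch that counts the
fields the shell produces. It assumes IFS is set to ':' alone; dirs
stands in for $PATH, and count_fields, literal_only, expansion_only and
mixed are hypothetical helper names, not part of sh:

```shell
#!/bin/sh
# Sketch only: report how many fields the shell passed as arguments.
count_fields() { echo "$#"; }

dirs=/bin:/usr/bin:/sbin    # stand-in for $PATH

# Each helper runs in a subshell so the IFS change does not leak out.

# Literal text is never field-split: one field.
literal_only() ( IFS=:; count_fields /bin:/usr/bin )

# The result of a parameter expansion is split: three fields.
expansion_only() ( IFS=:; count_fields $dirs )

# Only the expanded part is split; the literal ":/usr/local/bin" stays
# attached to the last field, so this is still three fields.
mixed() ( IFS=:; count_fields $dirs:/usr/local/bin )

literal_only
expansion_only
mixed
```

The third case is the for-loop word list from this report: the last
field is "/sbin:/usr/local/bin", which matches the /bin/sh output shown
in How-To-Repeat.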


http://www.freebsd.org/cgi/query-pr.cgi?pr=5263 
>Unformatted:
