From mark@thuvia.org  Fri Oct  3 17:37:20 2003
Return-Path: <mark@thuvia.org>
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 13EA716A4B3
	for <FreeBSD-gnats-submit@freebsd.org>; Fri,  3 Oct 2003 17:37:20 -0700 (PDT)
Received: from colossus.systems.pipex.net (colossus.systems.pipex.net [62.241.160.73])
	by mx1.FreeBSD.org (Postfix) with ESMTP id C359143FB1
	for <FreeBSD-gnats-submit@freebsd.org>; Fri,  3 Oct 2003 17:37:18 -0700 (PDT)
	(envelope-from mark@thuvia.org)
Received: from dotar.thuvia.org (81-86-228-29.dsl.pipex.com [81.86.228.29])
	by colossus.systems.pipex.net (Postfix) with ESMTP id C482A16000103
	for <FreeBSD-gnats-submit@freebsd.org>; Sat,  4 Oct 2003 01:37:16 +0100 (BST)
Received: from dotar.thuvia.org (localhost [127.0.0.1])
	by dotar.thuvia.org (8.12.9/8.12.9) with ESMTP id h940bGPA077985
	for <FreeBSD-gnats-submit@freebsd.org>; Sat, 4 Oct 2003 01:37:16 +0100 (BST)
	(envelope-from mark@dotar.thuvia.org)
Received: (from mark@localhost)
	by dotar.thuvia.org (8.12.9/8.12.9/Submit) id h940bGPL077984;
	Sat, 4 Oct 2003 01:37:16 +0100 (BST)
	(envelope-from mark)
Message-Id: <200310040037.h940bGPL077984@dotar.thuvia.org>
Date: Sat, 4 Oct 2003 01:37:16 +0100 (BST)
From: Mark Valentine <mark@thuvia.org>
Reply-To: Mark Valentine <mark@thuvia.org>
To: FreeBSD-gnats-submit@freebsd.org
Cc:
Subject: sh(1) incorrect handling of quoted parameter expansion
X-Send-Pr-Version: 3.113
X-GNATS-Notify:

>Number:         57554
>Category:       bin
>Synopsis:       sh(1) incorrect handling of quoted parameter expansion
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    jilles
>State:          closed
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Fri Oct 03 17:40:14 PDT 2003
>Closed-Date:    Fri May 06 21:34:47 UTC 2011
>Last-Modified:  Fri May 06 21:34:47 UTC 2011
>Originator:     Mark Valentine
>Release:        FreeBSD 4.8-STABLE i386
>Organization:
>Environment:
System: FreeBSD dotar.thuvia.org 4.8-STABLE FreeBSD 4.8-STABLE #6: Wed Jun 11 15:04:41 BST 2003 root@dotar.thuvia.org:/usr/obj/usr/src/sys/DOTAR i386


>Description:
	sh(1) incorrectly quotes the pattern word in parameter expansions
	involving prefix/suffix removal when the whole expansion is enclosed
	in double quotes:

	    $ foo='\{foo'
	    $ echo ${foo#\{}
	    \{foo			# correct (1) - pattern is {
	    $ echo ${foo#"\{"}
	    foo				# correct (2) - pattern is \{
	    $ echo "${foo#\{}"
	    foo				# WRONG - should be same as (1)

	IEEE Std 1003.1-2001 states that "Enclosing the full parameter
	expansion string in double-quotes shall not cause the [...] pattern
	characters to be quoted, whereas quoting characters within the braces
	shall have this effect."

	NOTE: there seems to be a related problem with our handling of
	${foo#{} - bash and ksh seem to expect a further matching } -
	but I haven't managed to figure out chapter and verse on that
	one!

	I initially came across this bug in the context of foo='{foo' and
	"${foo#\{}" doing the wrong thing (real-life scenario: a FreeBSD-hosted
	NetBSD build falls over in distrib/utils/sysinst/msg_xlat.sh), but the
	above example is a clearer illustration of the part I did manage to
	find chapter and verse on.
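
	The three cases above can be replayed as a small script.  This is a
	sketch rather than part of the original report: the names r1..r3
	are introduced here for illustration, and printf is used instead of
	echo so that backslashes are printed verbatim.  The third result
	depends on the shell under test, so no fixed value is claimed for it.

```shell
foo='\{foo'

# (1) Unquoted expansion: the pattern is a literal {, which does not
# match the leading backslash, so nothing is removed.
r1=${foo#\{}

# (2) Quotes inside the braces: the pattern is the two characters \{,
# so that prefix is removed.
r2=${foo#"\{"}

# (3) Whole expansion double-quoted: per the POSIX text quoted above
# this should behave like (1); the sh(1) in this report behaved like (2).
r3="${foo#\{}"

printf '(1) %s\n(2) %s\n(3) %s\n' "$r1" "$r2" "$r3"
```

	Assignments are used because they are not subject to field
	splitting, so (1) and (2) can stay unquoted without side effects.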
>How-To-Repeat:
	
>Fix:

	NetBSD's sh(1) handles this correctly, so the fix could be ported
	if the relevant code change can be identified.  (NOTE: their
	parser.c revisions 1.51 and 1.52 are possibly the place to look;
	my brain's shut down for the day.)
>Release-Note:
>Audit-Trail:

From: Mark Valentine <mark@valentine.me.uk>
To: FreeBSD-gnats-submit@freebsd.org
Cc:  
Subject: Re: bin/57554: [PATCH] sh(1) incorrect handling of quoted parameter expansion
Date: Mon, 6 Oct 2003 20:15:38 +0000

 The NetBSD revisions I mentioned do seem to fix this.  Here's a patch
 which works for FreeBSD 4.8-STABLE; I just started a buildworld on my
 system (which is running an older 4.8-STABLE from June), and I'll do
 another installworld/buildworld cycle after that and follow up if any
 problems arise.
 
 The patch also applies OK to -CURRENT, but it'll take me a bit longer
 to get around to testing that.
 
 Index: mksyntax.c
 ===================================================================
 RCS file: /usr/cvs/src/bin/sh/mksyntax.c,v
 retrieving revision 1.14.2.3
 diff -u -r1.14.2.3 mksyntax.c
 --- mksyntax.c	19 Jul 2002 04:38:51 -0000	1.14.2.3
 +++ mksyntax.c	6 Oct 2003 18:21:50 -0000
 @@ -70,7 +70,6 @@
  	{ "CBACK",	"a backslash character" },
  	{ "CSQUOTE",	"single quote" },
  	{ "CDQUOTE",	"double quote" },
 -	{ "CENDQUOTE",	"a terminating quote" },
  	{ "CBQUOTE",	"backwards single quote" },
  	{ "CVAR",	"a dollar sign" },
  	{ "CENDVAR",	"a '}' character" },
 @@ -220,7 +219,7 @@
  	fputs("\n/* syntax table used when in double quotes */\n", cfile);
  	add("\n", "CNL");
  	add("\\", "CBACK");
 -	add("\"", "CENDQUOTE");
 +	add("\"", "CDQUOTE");
  	add("`", "CBQUOTE");
  	add("$", "CVAR");
  	add("}", "CENDVAR");
 @@ -230,7 +229,7 @@
  	init();
  	fputs("\n/* syntax table used when in single quotes */\n", cfile);
  	add("\n", "CNL");
 -	add("'", "CENDQUOTE");
 +	add("'", "CSQUOTE");
  	/* ':/' for tilde expansion, '-' for [a\-x] pattern ranges */
  	add("!*?[=~:/-", "CCTL");
  	print("sqsyntax");
 Index: parser.c
 ===================================================================
 RCS file: /usr/cvs/src/bin/sh/parser.c,v
 retrieving revision 1.29.2.10
 diff -u -r1.29.2.10 parser.c
 --- parser.c	22 Jul 2003 13:11:26 -0000	1.29.2.10
 +++ parser.c	6 Oct 2003 18:39:24 -0000
 @@ -73,6 +73,8 @@
  /* values returned by readtoken */
  #include "token.h"
  
 +#define OPENBRACE '{'
 +#define CLOSEBRACE '}'
  
  
  struct heredoc {
 @@ -885,6 +887,28 @@
  #define PARSEBACKQNEW()	{oldstyle = 0; goto parsebackq; parsebackq_newreturn:;}
  #define	PARSEARITH()	{goto parsearith; parsearith_return:;}
  
 +/*
 + * Keep track of nested doublequotes in dblquote and doublequotep.
 + * We use dblquote for the first 32 levels, and we expand to a malloc'ed
 + * region for levels above that. Usually we never need to malloc.
 + * This code assumes that an int is 32 bits. We don't use uint32_t,
 + * because the rest of the code does not.
 + */
 +#define ISDBLQUOTE() ((varnest < 32) ? (dblquote & (1 << varnest)) : \
 +    (dblquotep[(varnest / 32) - 1] & (1 << (varnest % 32))))
 +
 +#define SETDBLQUOTE() \
 +    if (varnest < 32) \
 +	dblquote |= (1 << varnest); \
 +    else \
 +	dblquotep[(varnest / 32) - 1] |= (1 << (varnest % 32))
 +
 +#define CLRDBLQUOTE() \
 +    if (varnest < 32) \
 +	dblquote &= ~(1 << varnest); \
 +    else \
 +	dblquotep[(varnest / 32) - 1] &= ~(1 << (varnest % 32))
 +
  STATIC int
  readtoken1(int firstc, char const *syntax, char *eofmark, int striptabs)
  {
 @@ -894,6 +918,8 @@
  	char line[EOFMARKLEN + 1];
  	struct nodelist *bqlist;
  	int quotef;
 +	int *dblquotep = NULL;
 +	size_t maxnest = 32;
  	int dblquote;
  	int varnest;	/* levels of variables expansion */
  	int arinest;	/* levels of arithmetic expansion */
 @@ -903,6 +929,8 @@
  	int synentry;
  #if __GNUC__
  	/* Avoid longjmp clobbering */
 +	(void) &maxnest;
 +	(void) &dblquotep;
  	(void) &out;
  	(void) &quotef;
  	(void) &dblquote;
 @@ -917,11 +945,12 @@
  
  	startlinno = plinno;
  	dblquote = 0;
 -	if (syntax == DQSYNTAX)
 -		dblquote = 1;
 +	varnest = 0;
 +	if (syntax == DQSYNTAX) {
 +		SETDBLQUOTE();
 +	}
  	quotef = 0;
  	bqlist = NULL;
 -	varnest = 0;
  	arinest = 0;
  	parenlevel = 0;
  
 @@ -959,7 +988,7 @@
  				USTPUTC(c, out);
  				break;
  			case CCTL:
 -				if (eofmark == NULL || dblquote)
 +				if (eofmark == NULL || ISDBLQUOTE())
  					USTPUTC(CTLESC, out);
  				USTPUTC(c, out);
  				break;
 @@ -974,7 +1003,7 @@
  					else
  						setprompt(0);
  				} else {
 -					if (dblquote && c != '\\' &&
 +					if (ISDBLQUOTE() && c != '\\' &&
  					    c != '`' && c != '$' &&
  					    (c != '"' || eofmark != NULL))
  						USTPUTC('\\', out);
 @@ -987,27 +1016,36 @@
  				}
  				break;
  			case CSQUOTE:
 -				if (eofmark == NULL)
 -					USTPUTC(CTLQUOTEMARK, out);
 -				syntax = SQSYNTAX;
 -				break;
 +				if (syntax != SQSYNTAX) {
 +				    if (eofmark == NULL)
 +					    USTPUTC(CTLQUOTEMARK, out);
 +				    syntax = SQSYNTAX;
 +				    break;
 +				}
 +				/* FALLTHROUGH */
  			case CDQUOTE:
 -				if (eofmark == NULL)
 -					USTPUTC(CTLQUOTEMARK, out);
 -				syntax = DQSYNTAX;
 -				dblquote = 1;
 -				break;
 -			case CENDQUOTE:
  				if (eofmark != NULL && arinest == 0 &&
  				    varnest == 0) {
  					USTPUTC(c, out);
  				} else {
  					if (arinest) {
 -						syntax = ARISYNTAX;
 -						dblquote = 0;
 +						if (c != '"' || ISDBLQUOTE()) {
 +							syntax = ARISYNTAX;
 +							CLRDBLQUOTE();
 +						} else {
 +							syntax = DQSYNTAX;
 +							SETDBLQUOTE();
 +							USTPUTC(CTLQUOTEMARK, out);
 +						}
  					} else if (eofmark == NULL) {
 -						syntax = BASESYNTAX;
 -						dblquote = 0;
 +						if (c != '"' || ISDBLQUOTE()) {
 +							syntax = BASESYNTAX;
 +							CLRDBLQUOTE();
 +						} else {
 +							syntax = DQSYNTAX;
 +							SETDBLQUOTE();
 +							USTPUTC(CTLQUOTEMARK, out);
 +						}
  					}
  					quotef++;
  				}
 @@ -1015,8 +1053,8 @@
  			case CVAR:	/* '$' */
  				PARSESUB();		/* parse substitution */
  				break;
 -			case CENDVAR:	/* '}' */
 -				if (varnest > 0) {
 +			case CENDVAR:	/* CLOSEBRACE */
 +				if (varnest > 0 && !ISDBLQUOTE()) {
  					varnest--;
  					USTPUTC(CTLENDVAR, out);
  				} else {
 @@ -1037,9 +1075,9 @@
  							USTPUTC(CTLENDARI, out);
  							syntax = prevsyntax;
  							if (syntax == DQSYNTAX)
 -								dblquote = 1;
 +								SETDBLQUOTE();
  							else
 -								dblquote = 0;
 +								CLRDBLQUOTE();
  						} else
  							USTPUTC(')', out);
  					} else {
 @@ -1092,6 +1130,8 @@
  	backquotelist = bqlist;
  	grabstackblock(len);
  	wordtext = out;
 +	if (dblquotep != NULL)
 +	    ckfree(dblquotep);
  	return lasttoken = TWORD;
  /* end of readtoken routine */
  
 @@ -1202,7 +1242,7 @@
         int bracketed_name = 0; /* used to handle ${[0-9]*} variables */
  
  	c = pgetc();
 -	if (c != '(' && c != '{' && !is_name(c) && !is_special(c)) {
 +	if (c != '(' && c != OPENBRACE && !is_name(c) && !is_special(c)) {
  		USTPUTC('$', out);
  		pungetc();
  	} else if (c == '(') {	/* $(command) or $((arith)) */
 @@ -1217,11 +1257,11 @@
  		typeloc = out - stackblock();
  		USTPUTC(VSNORMAL, out);
  		subtype = VSNORMAL;
 -		if (c == '{') {
 +		if (c == OPENBRACE) {
  			bracketed_name = 1;
  			c = pgetc();
  			if (c == '#') {
 -				if ((c = pgetc()) == '}')
 +				if ((c = pgetc()) == CLOSEBRACE)
  					c = '#';
  				else
  					subtype = VSLENGTH;
 @@ -1281,11 +1321,17 @@
  		} else {
  			pungetc();
  		}
 -		if (subtype != VSLENGTH && (dblquote || arinest))
 +		if (subtype != VSLENGTH && (ISDBLQUOTE() || arinest))
  			flags |= VSQUOTE;
  		*(stackblock() + typeloc) = subtype | flags;
 -		if (subtype != VSNORMAL)
 +		if (subtype != VSNORMAL) {
  			varnest++;
 + 			if (varnest >= maxnest) {
 + 				dblquotep = ckrealloc(dblquotep, maxnest / 8);
 + 				dblquotep[(maxnest / 32) - 1] = 0;
 + 				maxnest += 32;
 + 			}
 +		}
  	}
  	goto parsesub_return;
  }
 @@ -1366,7 +1412,7 @@
  					continue;
  				}
                                  if (c != '\\' && c != '`' && c != '$'
 -                                    && (!dblquote || c != '"'))
 +                                    && (!ISDBLQUOTE() || c != '"'))
                                          STPUTC('\\', out);
  				break;
  
 @@ -1437,7 +1483,7 @@
  	}
  	parsebackquote = savepbq;
  	handler = savehandler;
 -	if (arinest || dblquote)
 +	if (arinest || ISDBLQUOTE())
  		USTPUTC(CTLBACKQ | CTLQUOTE, out);
  	else
  		USTPUTC(CTLBACKQ, out);
 @@ -1456,7 +1502,7 @@
  		prevsyntax = syntax;
  		syntax = ARISYNTAX;
  		USTPUTC(CTLARI, out);
 -		if (dblquote)
 +		if (ISDBLQUOTE())
  			USTPUTC('"',out);
  		else
  			USTPUTC(' ',out);
 
 -- 
 "Tigers will do ANYTHING for a tuna fish sandwich."
 "We're kind of stupid that way."   *munch* *munch*
   -- <http://www.calvinandhobbes.com>

From: Jilles Tjoelker <jilles@stack.nl>
To: bug-followup@FreeBSD.org, mark@thuvia.org
Cc:  
Subject: Re: bin/57554: sh(1) incorrect handling of quoted parameter
	expansion
Date: Thu, 10 Sep 2009 22:31:18 +0200

 Sorry for taking so long to get to this.
 
 Your patch seems to work after fixing the conflicts (fairly easy).
 
 However, it (and also NetBSD /bin/sh) has a memory leak if there is a
 syntax error or SIGINT within 32 or more levels of variable expansion.
 Some possible fixes:
 - remove the dynamic allocation and just use the old broken way for
   level 32 and higher
 - add an exception handler when allocating dblquotep for the first time
   (not for all readtoken1 calls, that would probably be rather slow)
 - link it to a list pointed to by a global variable so it can be cleaned
   up eventually (note that things like ${X-$(printf %x ${Y-${Z}})} are
   possible, so a single global pointer is not enough); this is somewhat
   similar to memalloc.c's "stack", which probably cannot be used here
   because it is already used for assembling the resulting word
 
 -- 
 Jilles Tjoelker

From: dfilter@FreeBSD.ORG (dfilter service)
To: bug-followup@FreeBSD.org
Cc:  
Subject: Re: bin/57554: commit references a PR
Date: Sat,  3 Apr 2010 20:56:11 +0000 (UTC)

 Author: jilles
 Date: Sat Apr  3 20:55:56 2010
 New Revision: 206145
 URL: http://svn.freebsd.org/changeset/base/206145
 
 Log:
   sh: Fix various things about expansions:
   * remove the backslash from \} inside double quotes inside +-=?
     substitutions, e.g. "${$+\}a}"
   * maintain separate double-quote state for ${v#...} and ${v%...};
     single and double quotes are special inside, even in a double-quoted
     string or here document
   * keep track of correct order of substitutions and arithmetic
   
   This is different from dash's approach, which does not track individual
   double quotes in the parser, trying to fix this up during expansion.
   This treats single quotes inside "${v#...}" incorrectly, however.
   
   This is similar to NetBSD's approach (as submitted in PR bin/57554), but
   recognizes the difference between +-=? and #% substitutions hinted at in
   POSIX and is more refined for arithmetic expansion and here documents.
   
   PR:		bin/57554
   Exp-run done by:	erwin (with some other sh(1) changes)
 
 Added:
   head/tools/regression/bin/sh/expansion/plus-minus2.0   (contents, props changed)
   head/tools/regression/bin/sh/parser/heredoc2.0   (contents, props changed)
 Modified:
   head/bin/sh/parser.c
 
 Modified: head/bin/sh/parser.c
 ==============================================================================
 --- head/bin/sh/parser.c	Sat Apr  3 20:35:39 2010	(r206144)
 +++ head/bin/sh/parser.c	Sat Apr  3 20:55:56 2010	(r206145)
 @@ -79,6 +79,10 @@ struct heredoc {
  	int striptabs;		/* if set, strip leading tabs */
  };
  
 +struct parser_temp {
 +	struct parser_temp *next;
 +	void *data;
 +};
  
  
  STATIC struct heredoc *heredoclist;	/* list of here documents to read */
 @@ -94,6 +98,7 @@ STATIC struct heredoc *heredoc;
  STATIC int quoteflag;		/* set if (part of) last token was quoted */
  STATIC int startlinno;		/* line # where last token started */
  STATIC int funclinno;		/* line # where the current function started */
 +STATIC struct parser_temp *parser_temp;
  
  /* XXX When 'noaliases' is set to one, no alias expansion takes place. */
  static int noaliases = 0;
 @@ -117,6 +122,73 @@ STATIC void synerror(const char *);
  STATIC void setprompt(int);
  
  
 +STATIC void *
 +parser_temp_alloc(size_t len)
 +{
 +	struct parser_temp *t;
 +
 +	INTOFF;
 +	t = ckmalloc(sizeof(*t));
 +	t->data = NULL;
 +	t->next = parser_temp;
 +	parser_temp = t;
 +	t->data = ckmalloc(len);
 +	INTON;
 +	return t->data;
 +}
 +
 +
 +STATIC void *
 +parser_temp_realloc(void *ptr, size_t len)
 +{
 +	struct parser_temp *t;
 +
 +	INTOFF;
 +	t = parser_temp;
 +	if (ptr != t->data)
 +		error("bug: parser_temp_realloc misused");
 +	t->data = ckrealloc(t->data, len);
 +	INTON;
 +	return t->data;
 +}
 +
 +
 +STATIC void
 +parser_temp_free_upto(void *ptr)
 +{
 +	struct parser_temp *t;
 +	int done = 0;
 +
 +	INTOFF;
 +	while (parser_temp != NULL && !done) {
 +		t = parser_temp;
 +		parser_temp = t->next;
 +		done = t->data == ptr;
 +		ckfree(t->data);
 +		ckfree(t);
 +	}
 +	INTON;
 +	if (!done)
 +		error("bug: parser_temp_free_upto misused");
 +}
 +
 +
 +STATIC void
 +parser_temp_free_all(void)
 +{
 +	struct parser_temp *t;
 +
 +	INTOFF;
 +	while (parser_temp != NULL) {
 +		t = parser_temp;
 +		parser_temp = t->next;
 +		ckfree(t->data);
 +		ckfree(t);
 +	}
 +	INTON;
 +}
 +
 +
  /*
   * Read and parse a command.  Returns NEOF on end of file.  (NULL is a
   * valid parse tree indicating a blank line.)
 @@ -127,6 +199,11 @@ parsecmd(int interact)
  {
  	int t;
  
 +	/* This assumes the parser is not re-entered,
 +	 * which could happen if we add command substitution on PS1/PS2.
 +	 */
 +	parser_temp_free_all();
 +
  	tokpushback = 0;
  	doprompt = interact;
  	if (doprompt)
 @@ -863,6 +940,21 @@ breakloop:
  }
  
  
 +#define MAXNEST_STATIC 8
 +struct tokenstate
 +{
 +	const char *syntax; /* *SYNTAX */
 +	int parenlevel; /* levels of parentheses in arithmetic */
 +	enum tokenstate_category
 +	{
 +		TSTATE_TOP,
 +		TSTATE_VAR_OLD, /* ${var+-=?}, inherits dquotes */
 +		TSTATE_VAR_NEW, /* other ${var...}, own dquote state */
 +		TSTATE_ARITH
 +	} category;
 +};
 +
 +
  /*
   * Called to parse command substitutions.
   */
 @@ -1040,7 +1132,7 @@ done:
  #define	PARSEARITH()	{goto parsearith; parsearith_return:;}
  
  STATIC int
 -readtoken1(int firstc, char const *syntax, char *eofmark, int striptabs)
 +readtoken1(int firstc, char const *initialsyntax, char *eofmark, int striptabs)
  {
  	int c = firstc;
  	char *out;
 @@ -1048,22 +1140,21 @@ readtoken1(int firstc, char const *synta
  	char line[EOFMARKLEN + 1];
  	struct nodelist *bqlist;
  	int quotef;
 -	int dblquote;
 -	int varnest;	/* levels of variables expansion */
 -	int arinest;	/* levels of arithmetic expansion */
 -	int parenlevel;	/* levels of parens in arithmetic */
 -	char const *prevsyntax;	/* syntax before arithmetic */
 +	int newvarnest;
 +	int level;
  	int synentry;
 +	struct tokenstate state_static[MAXNEST_STATIC];
 +	int maxnest = MAXNEST_STATIC;
 +	struct tokenstate *state = state_static;
  
  	startlinno = plinno;
 -	dblquote = 0;
 -	if (syntax == DQSYNTAX)
 -		dblquote = 1;
  	quotef = 0;
  	bqlist = NULL;
 -	varnest = 0;
 -	arinest = 0;
 -	parenlevel = 0;
 +	newvarnest = 0;
 +	level = 0;
 +	state[level].syntax = initialsyntax;
 +	state[level].parenlevel = 0;
 +	state[level].category = TSTATE_TOP;
  
  	STARTSTACKSTR(out);
  	loop: {	/* for each line, until end of word */
 @@ -1071,11 +1162,11 @@ readtoken1(int firstc, char const *synta
  		for (;;) {	/* until end of line or end of word */
  			CHECKSTRSPACE(3, out);	/* permit 3 calls to USTPUTC */
  
 -			synentry = syntax[c];
 +			synentry = state[level].syntax[c];
  
  			switch(synentry) {
  			case CNL:	/* '\n' */
 -				if (syntax == BASESYNTAX)
 +				if (state[level].syntax == BASESYNTAX)
  					goto endword;	/* exit outer loop */
  				USTPUTC(c, out);
  				plinno++;
 @@ -1089,7 +1180,7 @@ readtoken1(int firstc, char const *synta
  				USTPUTC(c, out);
  				break;
  			case CCTL:
 -				if (eofmark == NULL || dblquote)
 +				if (eofmark == NULL || initialsyntax != SQSYNTAX)
  					USTPUTC(CTLESC, out);
  				USTPUTC(c, out);
  				break;
 @@ -1105,41 +1196,37 @@ readtoken1(int firstc, char const *synta
  					else
  						setprompt(0);
  				} else {
 -					if (dblquote && c != '\\' &&
 -					    c != '`' && c != '$' &&
 -					    (c != '"' || eofmark != NULL))
 +					if (state[level].syntax == DQSYNTAX &&
 +					    c != '\\' && c != '`' && c != '$' &&
 +					    (c != '"' || (eofmark != NULL &&
 +						newvarnest == 0)) &&
 +					    (c != '}' || state[level].category != TSTATE_VAR_OLD))
  						USTPUTC('\\', out);
  					if (SQSYNTAX[c] == CCTL)
  						USTPUTC(CTLESC, out);
 -					else if (eofmark == NULL)
 +					else if (eofmark == NULL ||
 +					    newvarnest > 0)
  						USTPUTC(CTLQUOTEMARK, out);
  					USTPUTC(c, out);
  					quotef++;
  				}
  				break;
  			case CSQUOTE:
 -				if (eofmark == NULL)
 -					USTPUTC(CTLQUOTEMARK, out);
 -				syntax = SQSYNTAX;
 +				USTPUTC(CTLQUOTEMARK, out);
 +				state[level].syntax = SQSYNTAX;
  				break;
  			case CDQUOTE:
 -				if (eofmark == NULL)
 -					USTPUTC(CTLQUOTEMARK, out);
 -				syntax = DQSYNTAX;
 -				dblquote = 1;
 +				USTPUTC(CTLQUOTEMARK, out);
 +				state[level].syntax = DQSYNTAX;
  				break;
  			case CENDQUOTE:
 -				if (eofmark != NULL && arinest == 0 &&
 -				    varnest == 0) {
 +				if (eofmark != NULL && newvarnest == 0)
  					USTPUTC(c, out);
 -				} else {
 -					if (arinest) {
 -						syntax = ARISYNTAX;
 -						dblquote = 0;
 -					} else if (eofmark == NULL) {
 -						syntax = BASESYNTAX;
 -						dblquote = 0;
 -					}
 +				else {
 +					if (state[level].category == TSTATE_ARITH)
 +						state[level].syntax = ARISYNTAX;
 +					else
 +						state[level].syntax = BASESYNTAX;
  					quotef++;
  				}
  				break;
 @@ -1147,30 +1234,33 @@ readtoken1(int firstc, char const *synta
  				PARSESUB();		/* parse substitution */
  				break;
  			case CENDVAR:	/* '}' */
 -				if (varnest > 0) {
 -					varnest--;
 +				if (level > 0 &&
 +				    (state[level].category == TSTATE_VAR_OLD ||
 +				    state[level].category == TSTATE_VAR_NEW)) {
 +					if (state[level].category == TSTATE_VAR_OLD)
 +						state[level - 1].syntax = state[level].syntax;
 +					else
 +						newvarnest--;
 +					level--;
  					USTPUTC(CTLENDVAR, out);
  				} else {
  					USTPUTC(c, out);
  				}
  				break;
  			case CLP:	/* '(' in arithmetic */
 -				parenlevel++;
 +				state[level].parenlevel++;
  				USTPUTC(c, out);
  				break;
  			case CRP:	/* ')' in arithmetic */
 -				if (parenlevel > 0) {
 +				if (state[level].parenlevel > 0) {
  					USTPUTC(c, out);
 -					--parenlevel;
 +					--state[level].parenlevel;
  				} else {
  					if (pgetc() == ')') {
 -						if (--arinest == 0) {
 +						if (level > 0 &&
 +						    state[level].category == TSTATE_ARITH) {
 +							level--;
  							USTPUTC(CTLENDARI, out);
 -							syntax = prevsyntax;
 -							if (syntax == DQSYNTAX)
 -								dblquote = 1;
 -							else
 -								dblquote = 0;
  						} else
  							USTPUTC(')', out);
  					} else {
 @@ -1184,13 +1274,15 @@ readtoken1(int firstc, char const *synta
  				}
  				break;
  			case CBQUOTE:	/* '`' */
 -				out = parsebackq(out, &bqlist, 1, dblquote,
 -						arinest || dblquote);
 +				out = parsebackq(out, &bqlist, 1,
 +				    state[level].syntax == DQSYNTAX &&
 +				    (eofmark == NULL || newvarnest > 0),
 +				    state[level].syntax == DQSYNTAX || state[level].syntax == ARISYNTAX);
  				break;
  			case CEOF:
  				goto endword;		/* exit outer loop */
  			default:
 -				if (varnest == 0)
 +				if (level == 0)
  					goto endword;	/* exit outer loop */
  				USTPUTC(c, out);
  			}
 @@ -1198,14 +1290,17 @@ readtoken1(int firstc, char const *synta
  		}
  	}
  endword:
 -	if (syntax == ARISYNTAX)
 +	if (state[level].syntax == ARISYNTAX)
  		synerror("Missing '))'");
 -	if (syntax != BASESYNTAX && eofmark == NULL)
 +	if (state[level].syntax != BASESYNTAX && eofmark == NULL)
  		synerror("Unterminated quoted string");
 -	if (varnest != 0) {
 +	if (state[level].category == TSTATE_VAR_OLD ||
 +	    state[level].category == TSTATE_VAR_NEW) {
  		startlinno = plinno;
  		synerror("Missing '}'");
  	}
 +	if (state != state_static)
 +		parser_temp_free_upto(state);
  	USTPUTC('\0', out);
  	len = out - stackblock();
  	out = stackblock();
 @@ -1228,7 +1323,6 @@ endword:
  /* end of readtoken routine */
  
  
 -
  /*
   * Check to see whether we are at the end of the here document.  When this
   * is called, c is set to the first character of the next input line.  If
 @@ -1345,8 +1439,11 @@ parsesub: {
  			PARSEARITH();
  		} else {
  			pungetc();
 -			out = parsebackq(out, &bqlist, 0, dblquote,
 -					arinest || dblquote);
 +			out = parsebackq(out, &bqlist, 0,
 +			    state[level].syntax == DQSYNTAX &&
 +			    (eofmark == NULL || newvarnest > 0),
 +			    state[level].syntax == DQSYNTAX ||
 +			    state[level].syntax == ARISYNTAX);
  		}
  	} else {
  		USTPUTC(CTLVAR, out);
 @@ -1446,11 +1543,44 @@ parsesub: {
  			pungetc();
  		}
  		STPUTC('=', out);
 -		if (subtype != VSLENGTH && (dblquote || arinest))
 +		if (subtype != VSLENGTH && (state[level].syntax == DQSYNTAX ||
 +		    state[level].syntax == ARISYNTAX))
  			flags |= VSQUOTE;
  		*(stackblock() + typeloc) = subtype | flags;
 -		if (subtype != VSNORMAL)
 -			varnest++;
 +		if (subtype != VSNORMAL) {
 +			if (level + 1 >= maxnest) {
 +				maxnest *= 2;
 +				if (state == state_static) {
 +					state = parser_temp_alloc(
 +					    maxnest * sizeof(*state));
 +					memcpy(state, state_static,
 +					    MAXNEST_STATIC * sizeof(*state));
 +				} else
 +					state = parser_temp_realloc(state,
 +					    maxnest * sizeof(*state));
 +			}
 +			level++;
 +			state[level].parenlevel = 0;
 +			if (subtype == VSMINUS || subtype == VSPLUS ||
 +			    subtype == VSQUESTION || subtype == VSASSIGN) {
 +				/*
 +				 * For operators that were in the Bourne shell,
 +				 * inherit the double-quote state.
 +				 */
 +				state[level].syntax = state[level - 1].syntax;
 +				state[level].category = TSTATE_VAR_OLD;
 +			} else {
 +				/*
 +				 * The other operators take a pattern,
 +				 * so go to BASESYNTAX.
 +				 * Also, ' and " are now special, even
 +				 * in here documents.
 +				 */
 +				state[level].syntax = BASESYNTAX;
 +				state[level].category = TSTATE_VAR_NEW;
 +				newvarnest++;
 +			}
 +		}
  	}
  	goto parsesub_return;
  }
 @@ -1461,21 +1591,26 @@ parsesub: {
   */
  parsearith: {
  
 -	if (++arinest == 1) {
 -		prevsyntax = syntax;
 -		syntax = ARISYNTAX;
 -		USTPUTC(CTLARI, out);
 -		if (dblquote)
 -			USTPUTC('"',out);
 -		else
 -			USTPUTC(' ',out);
 -	} else {
 -		/*
 -		 * we collapse embedded arithmetic expansion to
 -		 * parenthesis, which should be equivalent
 -		 */
 -		USTPUTC('(', out);
 +	if (level + 1 >= maxnest) {
 +		maxnest *= 2;
 +		if (state == state_static) {
 +			state = parser_temp_alloc(
 +			    maxnest * sizeof(*state));
 +			memcpy(state, state_static,
 +			    MAXNEST_STATIC * sizeof(*state));
 +		} else
 +			state = parser_temp_realloc(state,
 +			    maxnest * sizeof(*state));
  	}
 +	level++;
 +	state[level].syntax = ARISYNTAX;
 +	state[level].parenlevel = 0;
 +	state[level].category = TSTATE_ARITH;
 +	USTPUTC(CTLARI, out);
 +	if (state[level - 1].syntax == DQSYNTAX)
 +		USTPUTC('"',out);
 +	else
 +		USTPUTC(' ',out);
  	goto parsearith_return;
  }
  
 
 Added: head/tools/regression/bin/sh/expansion/plus-minus2.0
 ==============================================================================
 --- /dev/null	00:00:00 1970	(empty, because file is newly added)
 +++ head/tools/regression/bin/sh/expansion/plus-minus2.0	Sat Apr  3 20:55:56 2010	(r206145)
 @@ -0,0 +1,4 @@
 +# $FreeBSD$
 +
 +e=
 +test "${e:-\}}" = '}'
 
 Added: head/tools/regression/bin/sh/parser/heredoc2.0
 ==============================================================================
 --- /dev/null	00:00:00 1970	(empty, because file is newly added)
 +++ head/tools/regression/bin/sh/parser/heredoc2.0	Sat Apr  3 20:55:56 2010	(r206145)
 @@ -0,0 +1,44 @@
 +# $FreeBSD$
 +
 +failures=0
 +
 +check() {
 +	if ! eval "[ $* ]"; then
 +		echo "Failed: $*"
 +		: $((failures += 1))
 +	fi
 +}
 +
 +s='ast*que?non' sq=\' dq=\"
 +
 +check '"$(cat <<EOF
 +${s}
 +EOF
 +)" = "ast*que?non"'
 +
 +check '"$(cat <<EOF
 +${s+"x"}
 +EOF
 +)" = ${dq}x${dq}'
 +
 +check '"$(cat <<EOF
 +${s+'$sq'x'$sq'}
 +EOF
 +)" = ${sq}x${sq}'
 +
 +check '"$(cat <<EOF
 +${s#ast}
 +EOF
 +)" = "*que?non"'
 +
 +check '"$(cat <<EOF
 +${s##"ast"}
 +EOF
 +)" = "*que?non"'
 +
 +check '"$(cat <<EOF
 +${s##'$sq'ast'$sq'}
 +EOF
 +)" = "*que?non"'
 +
 +exit $((failures != 0))
 _______________________________________________
 svn-src-all@freebsd.org mailing list
 http://lists.freebsd.org/mailman/listinfo/svn-src-all
 To unsubscribe, send any mail to "svn-src-all-unsubscribe@freebsd.org"
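
The r206145 log above separates the Bourne-era operators (+ - = ?),
which inherit the enclosing double-quote state, from the pattern
operators (# %), which get a quoting state of their own.  A minimal
sketch of what that means for a script, modelled on the regression
tests added in that commit; the names r1 and r2 are hypothetical, and
the behaviour described in the comments is what those tests expect of
the fixed shell:

```shell
e=
s='ast*que?non'

# ${e:-word} inside double quotes: word stays double-quoted, and \} is
# an escaped brace, leaving a single } (cf. plus-minus2.0).
r1="${e:-\}}"

# ${s#pattern} inside double quotes: the pattern has its own quoting
# state, so the inner "..." quotes the pattern text ast
# (heredoc2.0 checks a similar form inside a here document).
r2="${s#"ast"}"

printf '%s\n%s\n' "$r1" "$r2"
```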
 
Responsible-Changed-From-To: freebsd-bugs->jilles 
Responsible-Changed-By: jilles 
Responsible-Changed-When: Wed Apr 21 23:21:04 UTC 2010 
Responsible-Changed-Why:  
Take. This is mostly done but there is still an issue with ${var#"}"}. 
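
The remaining case can be sketched as follows.  The input and the
expected result mirror the trim5.0 regression test added later in
r214490; r1 and r2 are hypothetical names used only here.  A shell
without that fix may end the expansion at the quoted } instead.

```shell
b='{{(#)}}'

# The double-quoted } is pattern text, not the end of the expansion,
# so exactly one trailing } is removed in both forms.
r1=${b%"}"}
r2="${b%"}"}"

printf '%s\n%s\n' "$r1" "$r2"
```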

http://www.freebsd.org/cgi/query-pr.cgi?pr=57554 

From: dfilter@FreeBSD.ORG (dfilter service)
To: bug-followup@FreeBSD.org
Cc:  
Subject: Re: bin/57554: commit references a PR
Date: Thu, 28 Oct 2010 21:51:22 +0000 (UTC)

 Author: jilles
 Date: Thu Oct 28 21:51:14 2010
 New Revision: 214490
 URL: http://svn.freebsd.org/changeset/base/214490
 
 Log:
   sh: Make double-quotes quote a '}' inside ${v#...} and ${v%...}.
   
   Exp-run done by:	pav (with some other sh(1) changes)
   PR:			bin/57554
 
 Added:
   head/tools/regression/bin/sh/expansion/trim5.0   (contents, props changed)
 Modified:
   head/bin/sh/parser.c
 
 Modified: head/bin/sh/parser.c
 ==============================================================================
 --- head/bin/sh/parser.c	Thu Oct 28 20:18:26 2010	(r214489)
 +++ head/bin/sh/parser.c	Thu Oct 28 21:51:14 2010	(r214490)
 @@ -1234,7 +1234,8 @@ readtoken1(int firstc, char const *initi
  			case CENDVAR:	/* '}' */
  				if (level > 0 &&
  				    (state[level].category == TSTATE_VAR_OLD ||
 -				    state[level].category == TSTATE_VAR_NEW)) {
 +				    (state[level].category == TSTATE_VAR_NEW &&
 +				     state[level].syntax == BASESYNTAX))) {
  					if (state[level].category == TSTATE_VAR_OLD)
  						state[level - 1].syntax = state[level].syntax;
  					else
 
 Added: head/tools/regression/bin/sh/expansion/trim5.0
 ==============================================================================
 --- /dev/null	00:00:00 1970	(empty, because file is newly added)
 +++ head/tools/regression/bin/sh/expansion/trim5.0	Thu Oct 28 21:51:14 2010	(r214490)
 @@ -0,0 +1,28 @@
 +# $FreeBSD$
 +
 +e= q='?' a='*' t=texttext s='ast*que?non' p='/et[c]/' w='a b c' b='{{(#)}}'
 +h='##'
 +failures=''
 +ok=''
 +
 +testcase() {
 +	code="$1"
 +	expected="$2"
 +	oIFS="$IFS"
 +	eval "$code"
 +	IFS='|'
 +	result="$#|$*"
 +	IFS="$oIFS"
 +	if [ "x$result" = "x$expected" ]; then
 +		ok=x$ok
 +	else
 +		failures=x$failures
 +		echo "For $code, expected $expected actual $result"
 +	fi
 +}
 +
 +testcase 'set -- "${b%'\'}\''}"'		'1|{{(#)}'
 +testcase 'set -- ${b%"}"}'			'1|{{(#)}'
 +testcase 'set -- "${b%"}"}"'			'1|{{(#)}'
 +
 +test "x$failures" = x
 
State-Changed-From-To: open->patched 
State-Changed-By: jilles 
State-Changed-When: Fri Oct 29 22:54:18 UTC 2010 
State-Changed-Why:  
Fixed in 9-CURRENT. This might be MFCed if existing code turns out to 
rely on the new behaviour, but that is not very likely, because the 
change risks breaking scripts that depend on the old behaviour. 

State-Changed-From-To: patched->closed 
State-Changed-By: jilles 
State-Changed-When: Fri May 6 21:31:29 UTC 2011 
State-Changed-Why:  
Although this appears to work well in 9-CURRENT, I consider this change 
too risky for a -STABLE branch. Existing scripts might rely on the old 
behaviour. 

>Unformatted:
