From eugen@eg.sd.rdtc.ru  Fri Nov 11 09:08:58 2011
Return-Path: <eugen@eg.sd.rdtc.ru>
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 18B8F1065674
	for <FreeBSD-gnats-submit@freebsd.org>; Fri, 11 Nov 2011 09:08:58 +0000 (UTC)
	(envelope-from eugen@eg.sd.rdtc.ru)
Received: from eg.sd.rdtc.ru (eg.sd.rdtc.ru [IPv6:2a03:3100:c:13::5])
	by mx1.freebsd.org (Postfix) with ESMTP id 7C4108FC15
	for <FreeBSD-gnats-submit@freebsd.org>; Fri, 11 Nov 2011 09:08:56 +0000 (UTC)
Received: from eg.sd.rdtc.ru (localhost [127.0.0.1])
	by eg.sd.rdtc.ru (8.14.5/8.14.5) with ESMTP id pAB98sIQ006170
	for <FreeBSD-gnats-submit@freebsd.org>; Fri, 11 Nov 2011 16:08:54 +0700 (NOVT)
	(envelope-from eugen@eg.sd.rdtc.ru)
Received: (from eugen@localhost)
	by eg.sd.rdtc.ru (8.14.5/8.14.5/Submit) id pAB98nYZ006169;
	Fri, 11 Nov 2011 16:08:49 +0700 (NOVT)
	(envelope-from eugen)
Message-Id: <201111110908.pAB98nYZ006169@eg.sd.rdtc.ru>
Date: Fri, 11 Nov 2011 16:08:49 +0700 (NOVT)
From: Eugene Grosbein <egrosbein@rdtc.ru>
To: FreeBSD-gnats-submit@freebsd.org
Cc:
Subject: expr(1) false syntax errors
X-Send-Pr-Version: 3.113
X-GNATS-Notify:

>Number:         162468
>Category:       bin
>Synopsis:       expr(1) false syntax errors
>Confidential:   no
>Severity:       non-critical
>Priority:       low
>Responsible:    freebsd-bugs
>State:          closed
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Fri Nov 11 09:10:06 UTC 2011
>Closed-Date:    Fri Nov 25 09:22:35 UTC 2011
>Last-Modified:  Fri Nov 25 09:22:35 UTC 2011
>Originator:     Eugene Grosbein
>Release:        FreeBSD 8.2-STABLE i386
>Organization:
RDTC JSC
>Environment:
System: FreeBSD eg.sd.rdtc.ru 8.2-STABLE FreeBSD 8.2-STABLE #35: Thu Sep 29 14:35:55 NOVT 2011 root@eg.sd.rdtc.ru:/usr/local/obj/usr/local/src/sys/EG i386

>Description:

	The following command fails:

# expr '>' : '.*'
expr: syntax error

	It should return 1 (the length of the string '>') instead.
	Note that Linux's expr does the right thing.

>How-To-Repeat:

	See above.

>Fix:

	Unknown. It seems our expr treats '>' as an operator
	and not as an operand for the ':' operator.
>Release-Note:
>Audit-Trail:

From: Jilles Tjoelker <jilles@stack.nl>
To: bug-followup@FreeBSD.org, egrosbein@rdtc.ru
Cc:  
Subject: Re: bin/162468: expr(1) false syntax errors
Date: Fri, 11 Nov 2011 16:44:55 +0100

 > [expr treats any string that looks like an operator as an operator,
 > for example, expr '>' : '.*' fails]
 
 The current behaviour of expr is allowed by POSIX (SUSv4, XCU 4
 Utilities, expr). If the application passes '>', this is not a string
 operand but an operator, even if that results in an invalid expression.
 This is also documented in the man page.
 
 It would be a valid extension to allow such expressions but it is not
 immediately clear how it would work. For example, should
   expr \( = \)
 compare two strings ("0") or return a single string ("=")? And should
   expr \( + \)
 return "+" or raise an error?
 
 The test utility is different in that POSIX specifies how a similar
 ambiguity shall be resolved (for a limited set of cases).
 
 Oh, and if you want to find a string length in a shell script, why don't
 you just use
   ${#VAR}
 (given that the string is in $VAR)? If you must use expr(1), do
   expr \( "x$VAR" : '.*' \) - 1
 as described in the man page.
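A minimal sketch of the two approaches just mentioned, assuming a POSIX shell; the variable names len_sh and len_expr are illustrative only:

```shell
# A string that expr(1) would otherwise take for an operator.
VAR='>'

# Preferred: POSIX parameter expansion, no external process needed.
len_sh=${#VAR}

# With expr(1): prefix the string with "x" so it can never be mistaken
# for an operator, then subtract the extra character from the count.
len_expr=$(expr \( "x$VAR" : '.*' \) - 1)

printf '%s %s\n' "$len_sh" "$len_expr"
```

Note that expr exits with status 1 when the result is 0 (an empty $VAR here), which matters in scripts that run under set -e.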
 
 -- 
 Jilles Tjoelker

From: Eugene Grosbein <egrosbein@rdtc.ru>
To: Jilles Tjoelker <jilles@stack.nl>
Cc: bug-followup@FreeBSD.org
Subject: Re: bin/162468: expr(1) false syntax errors
Date: Sat, 12 Nov 2011 01:58:55 +0700

 11.11.2011 22:44, Jilles Tjoelker wrote:
 >> [expr treats any string that looks like an operator as an operator,
 >> for example, expr '>' : '.*' fails]
 > 
 > The current behaviour of expr is allowed by POSIX (SUSv4, XCU 4
 > Utilities, expr). If the application passes '>', this is not a string
 > operand but an operator, even if that results in an invalid expression.
 > This is also documented in the man page.
 
 Yes. But I have reports that NetBSD's and Linux's expr(1)
 both work as expected.
 
 > It would be a valid extension to allow such expressions but it is not
 > immediately clear how it would work. For example, should
 >   expr \( = \)
 > compare two strings ("0") or return a single string ("=")? And should
 >   expr \( + \)
 > return "+" or raise an error?
 
 It would be wise to take a look at more robust expr(1) implementations
 and try to keep compatibility.
 
 > The test utility is different in that POSIX specifies how a similar
 > ambiguity shall be resolved (for a limited set of cases).
 > 
 > Oh, and if you want to find a string length in a shell script, why don't
 > you just use
 >   ${#VAR}
 > (given that the string is in $VAR)? If you must use expr(1), do
 >   expr \( "x$VAR" : '.*' \) - 1
 > as described in the man page.
 
 That's just a simple test case. In fact, I do not need the string length;
 I need to evaluate a regexp that has ()'s:
 
 read string < file
 expr -- "$string" : 'Key: \(.*\)'
 
 When $string starts with '>', this fails (and $string may start with '>').
 I've found a workaround: expr -- "x$string" : 'xKey: \(.*\)'
 But that's only a workaround, not a good solution.
 
 Eugene Grosbein
 

From: Jilles Tjoelker <jilles@stack.nl>
To: Eugene Grosbein <egrosbein@rdtc.ru>
Cc: bug-followup@FreeBSD.org
Subject: Re: bin/162468: expr(1) false syntax errors
Date: Sat, 12 Nov 2011 00:52:59 +0100

 On Sat, Nov 12, 2011 at 01:58:55AM +0700, Eugene Grosbein wrote:
 > 11.11.2011 22:44, Jilles Tjoelker wrote:
 > >> [expr treats any string that looks like an operator as an operator,
 > >> for example, expr '>' : '.*' fails]
 
 > > The current behaviour of expr is allowed by POSIX (SUSv4, XCU 4
 > > Utilities, expr). If the application passes '>', this is not a string
 > > operand but an operator, even if that results in an invalid expression.
 > > This is also documented in the man page.
 
 > Yes. But I have reports that NetBSD's and Linux's expr(1)
 > both work as expected.
 
 > > It would be a valid extension to allow such expressions but it is not
 > > immediately clear how it would work. For example, should
 > >   expr \( = \)
 > > compare two strings ("0") or return a single string ("=")? And should
 > >   expr \( + \)
 > > return "+" or raise an error?
 
 > It would be wise to take a look at more robust expr(1) implementations
 > and try to keep compatibility.
 
 For '<', your example may work. The expr from GNU coreutils 7.4
 definitely fails your example for '(', ')' and '+'. In the case of '+',
 they added a unary plus operator that takes the next argument as a
 literal even if it looks like an operator so "fixing" it would be ugly.
 GNU expr also has "match", "substr", "index" and "length" operators.
 Trying some more, GNU expr appears inconsistent and unpredictable: it
 will accept strings that have the form of an operator as strings in some
 cases but not all and it is unclear why.
 
 NetBSD's expr supports the "length" operator that we do not, but not
 "match", "substr" or "index". It appears to try fairly hard to make
 wrong input work anyway. For example, it will treat an initial "--" as a
 string (rather than an end-of-options marker) if the next argument is
 not an operator. It also gives yacc the alternative to treat any
 operator except parentheses as a string instead. Because of the
 one-token lookahead of a yacc parser, this does not, however, allow it
 to recognize all possible expressions with such operators as strings.
 For example, if the first two tokens are "length" "<", it may be
 necessary to read all input to decide which of the two is an operator
 (consider the case where the subsequent tokens are zero or more colons).
 
 NetBSD's approach will lead to inconsistent results if we ever need to
 extend expr (such as with GNU's named operators). The extension will
 change the meaning of some expressions in an unpredictable way. One way
 to handle this is to add the GNU cruft; it is unlikely that expr's
 syntax will be extended ever again given that it is mostly a legacy
 tool. The GNU extensions are ugly, though.
 
 If it is accepted that parentheses are always special (which GNU and
 NetBSD expr appear to do, and which is one way to resolve expr \( = \)
 ambiguity) and that there are no named operators or GNU unary "+", then
 there are only binary operators and the first, third, fifth, ...
 arguments excluding parentheses must be operands while the second,
 fourth, sixth, ... must be operators.
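The positional rule above can be sketched as follows. This is only an illustration under the stated assumptions (parentheses always special, binary operators only); classify_args is a hypothetical helper and not part of any existing expr(1):

```shell
# classify_args: hypothetical helper that labels each argument by position.
# Parentheses are always special and do not count toward the position;
# every other argument is an operand at odd positions (1st, 3rd, ...)
# and an operator at even positions (2nd, 4th, ...).
classify_args() {
	pos=0
	for arg in "$@"; do
		case $arg in
		'('|')')
			printf 'paren: %s\n' "$arg"
			continue ;;
		esac
		pos=$((pos + 1))
		if [ $((pos % 2)) -eq 1 ]; then
			printf 'operand: %s\n' "$arg"
		else
			printf 'operator: %s\n' "$arg"
		fi
	done
}

classify_args '>' : '.*'
```

Under this rule the '>' in the original report lands in an operand position, and the "=" in expr \( = \) would likewise be read as a string.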
 
 > > The test utility is different in that POSIX specifies how a similar
 > > ambiguity shall be resolved (for a limited set of cases).
 
 A similar approach could be applied to expr (e.g. if there are three
 arguments and the second is ":" then it is defined to be a matching
 expression without going into the grammar). The assumption is that
 expressions written without care for strings that look like operators
 will be very simple (one operator only).
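As a rough sketch of that three-argument rule, written as a shell wrapper around the existing expr rather than a change to expr itself. The name match3 is hypothetical, and the wrapper assumes the pattern does not begin with a character such as '*' or '^' whose meaning would change once an "x" is put in front of it:

```shell
# match3: hypothetical wrapper, not an existing utility.
match3() {
	if [ "$#" -ne 3 ] || [ "$2" != ":" ]; then
		# Anything else goes through the normal expr grammar.
		expr "$@"
		return
	fi
	case $3 in
	*'\('*)
		# Pattern captures a subexpression: expr prints the capture.
		expr "x$1" : "x$3" ;;
	*)
		# No capture: expr prints the match length, which counts the
		# extra leading "x" whenever the match succeeds.
		n=$(expr "x$1" : "x$3") || true
		if [ "$n" -gt 0 ]; then n=$((n - 1)); fi
		printf '%s\n' "$n"
		# Mirror expr's exit status: non-zero when the result is 0.
		[ "$n" -ne 0 ] ;;
	esac
}

match3 '>' : '.*'	# prints 1
```

The check for '\(' in the pattern is a simplification; it would misjudge a pattern that contains an escaped backslash followed by a literal parenthesis.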
 
 > > Oh, and if you want to find a string length in a shell script, why don't
 > > you just use
 > >   ${#VAR}
 > > (given that the string is in $VAR)? If you must use expr(1), do
 > >   expr \( "x$VAR" : '.*' \) - 1
 > > as described in the man page.
 
 > That's just a simple test case. In fact, I do not need the string length;
 > I need to evaluate a regexp that has ()'s:
 
 > read string < file
 > expr -- "$string" : 'Key: \(.*\)'
 
 read string < file
 case $string in
 "Key: "*)
 	printf '%s\n' "${string#Key: }" ;;
 *)
 	echo
 	false ;;
 esac
 
 (Of course, all the printf and false mess is likely unnecessary in a
 real script, but this matches your command very closely.)
 
 A limitation is that the case command and the #/##/%/%% substitutions
 work with shell patterns which are weaker than even basic regular
 expressions.
 
 > When $string starts with '>', this fails (and $string may start with '>').
 
 It should only fail if $string is exactly '>' or '>='.
 
 > I've found a workaround: expr -- "x$string" : 'xKey: \(.*\)'
 > But that's only a workaround, not a good solution.
 
 This is not really a workaround, it is the proper way to use expr. So
 poor is the design of expr.
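A small sketch of the prefix idiom applied to the "Key: " example from this thread. The function name get_key is an assumption made for illustration:

```shell
# get_key: hypothetical helper that extracts the text after "Key: ".
# Both the string and the pattern get a leading "x", so the first
# operand can never be exactly '>', '>=', '(' or any other operator.
# The "--" is not needed because the operand never starts with '-'.
get_key() {
	expr "x$1" : 'xKey: \(.*\)'
}

# A value that looks like an operator is extracted without trouble.
value=$(get_key 'Key: >=')

# An input that is exactly '>' no longer causes a syntax error; the
# match simply fails, expr prints an empty line and exits with status 1.
empty=$(get_key '>') || true

printf '[%s] [%s]\n' "$value" "$empty"
```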
 
 -- 
 Jilles Tjoelker
State-Changed-From-To: open->closed 
State-Changed-By: jh 
State-Changed-When: Fri Nov 25 09:18:53 UTC 2011 
State-Changed-Why:  
This is a request to implement GNU extensions in expr(1) rather than a 
bug report. As the extensions seem to be controversial and there are no 
patches available I think it is better to close this PR. 

If patches become available later, it's probably better to discuss them
on the mailing lists first.

http://www.freebsd.org/cgi/query-pr.cgi?pr=162468 
>Unformatted:
