From nobody@FreeBSD.org  Wed Apr 11 22:59:55 2012
Return-Path: <nobody@FreeBSD.org>
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id A40BE106564A
	for <freebsd-gnats-submit@FreeBSD.org>; Wed, 11 Apr 2012 22:59:55 +0000 (UTC)
	(envelope-from nobody@FreeBSD.org)
Received: from red.freebsd.org (red.freebsd.org [IPv6:2001:4f8:fff6::22])
	by mx1.freebsd.org (Postfix) with ESMTP id 7538E8FC08
	for <freebsd-gnats-submit@FreeBSD.org>; Wed, 11 Apr 2012 22:59:55 +0000 (UTC)
Received: from red.freebsd.org (localhost [127.0.0.1])
	by red.freebsd.org (8.14.4/8.14.4) with ESMTP id q3BMxt8c052975
	for <freebsd-gnats-submit@FreeBSD.org>; Wed, 11 Apr 2012 22:59:55 GMT
	(envelope-from nobody@red.freebsd.org)
Received: (from nobody@localhost)
	by red.freebsd.org (8.14.4/8.14.4/Submit) id q3BMxt4O052971;
	Wed, 11 Apr 2012 22:59:55 GMT
	(envelope-from nobody)
Message-Id: <201204112259.q3BMxt4O052971@red.freebsd.org>
Date: Wed, 11 Apr 2012 22:59:55 GMT
From: Jim Pryor <dubiousjim@gmail.com>
To: freebsd-gnats-submit@FreeBSD.org
Subject: bsdgrep -E and sed handle invalid {} constructs strangely
X-Send-Pr-Version: www-3.1
X-GNATS-Notify:

>Number:         166861
>Category:       bin
>Synopsis:       bsdgrep(1)/sed(1): bsdgrep -E and sed handle invalid {} constructs strangely
>Confidential:   no
>Severity:       non-critical
>Priority:       low
>Responsible:    freebsd-bugs
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          change-request
>Submitter-Id:   current-users
>Arrival-Date:   Wed Apr 11 23:00:28 UTC 2012
>Closed-Date:    
>Last-Modified:  Thu Apr 12 00:24:29 UTC 2012
>Originator:     Jim Pryor
>Release:        9.0-PRERELEASE
>Organization:
>Environment:
FreeBSD vaio.jimpryor.net 9.0-PRERELEASE FreeBSD 9.0-PRERELEASE #0: Tue Nov 29 02:45:33 EST 2011 root@vaio.jimpryor.net:/usr/obj/usr/src/sys/MINE amd64
>Description:
grep version line:
/* $FreeBSD: src/usr.bin/grep/grep.c,v 1.11.2.3 2011/10/20 16:08:11 gabor Exp $

sed version line:
$FreeBSD: src/usr.bin/sed/main.c,v 1.45.2.1 2011/09/23 00:51:37 kensmith Exp $

(1) Without -E, FreeBSD grep rejects an unmatched '\{' as an invalid pattern but treats an unmatched '\}' as a literal '}'. So far, so good: GNU grep and BusyBox grep handle these the same way, and POSIX-2008 doesn't specify what to do here.

BusyBox's egrep sticks to the same pattern. But FreeBSD's egrep diverges: it treats an unmatched { and an unmatched } both as literals. These are perverse patterns and no one should be relying on this behavior; still, FreeBSD's change of behavior here seems unmotivated. Admittedly, GNU egrep does the same as FreeBSD.

(2) FreeBSD grep without -E follows the other greps in rejecting 'a\{1,2,3\}b' as an invalid pattern. The other egreps likewise reject 'a{1,2,3}b'. FreeBSD egrep, however, accepts 'a{1,2,3}b' and will even match it against the text "a{1,2,3}b", though the match is zero-length. Again, this is a perverse pattern whose interpretation no one should be relying on, but FreeBSD's handling of it seems strange.
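
The grep cases in (1) and (2) can be checked on any system with a small probe along the following lines. This is only a sketch: the function names probe_grep and report are made up for illustration, the unmatched-brace patterns are given a leading 'a' as an assumption (the report does not spell out the exact patterns used), and any exit status above 1 is taken to mean the pattern was rejected. It prints whatever the local grep does and asserts nothing about what the answer should be.

```shell
#!/bin/sh
# Hypothetical helper: classify how the local grep treats PATTERN.
# Usage: probe_grep [-E] PATTERN TEXT
# Prints "match", "no-match", or "rejected" (exit status > 1, assumed to
# indicate a pattern error).
probe_grep() {
    flag=
    if [ "$1" = "-E" ]; then flag=-E; shift; fi
    status=0
    printf '%s\n' "$2" | grep $flag -e "$1" >/dev/null 2>&1 || status=$?
    case $status in
        0) echo match ;;
        1) echo no-match ;;
        *) echo rejected ;;
    esac
}

# Sample text containing every construct literally, so that a "literal"
# interpretation of a pattern has something to match.
text='a{ a} a{} a{2,1} a{1,2,3}b'

# Hypothetical helper: print one labelled result line.
# Usage: report LABEL [-E] PATTERN
report() {
    label=$1; shift
    printf '%-24s %s\n' "$label" "$(probe_grep "$@" "$text")"
}

report 'grep    a\{'          'a\{'
report 'grep    a\}'          'a\}'
report 'grep -E a{'        -E 'a{'
report 'grep -E a}'        -E 'a}'
report 'grep    a\{1,2,3\}b'  'a\{1,2,3\}b'
report 'grep -E a{1,2,3}b' -E 'a{1,2,3}b'
```

Running the same script under each grep being compared gives one line per pattern, which makes the divergences easy to diff. It does not show match length, so the zero-length match in (2) would need a separate check.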

(3) The pattern among other sed implementations is:
     without -r: reject unmatched \{ as error, accept unmatched \} as literal
                 reject \{\}, \{2,1\}, and \{1,2,3\}
        with -r: reject unmatched { as error, accept unmatched } as literal
                 reject {}, {2,1}, and {1,2,3}

However, FreeBSD sed without -r diverges from the pattern in rejecting unmatched \} as an error.

(4) Also, FreeBSD sed with -r diverges from the pattern in accepting {} as those two literal characters.
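
The sed cases in (3) and (4) can be probed in the same spirit. Again this is a sketch under assumptions: probe_sed and report_sed are illustrative names, the patterns get a leading 'a' so the interval has something to apply to (the report does not give the exact patterns), a non-zero exit status from sed is taken to mean the pattern was rejected, and the pattern is embedded in a /.../ address, so it must not contain a slash.

```shell
#!/bin/sh
# Hypothetical helper: classify how the local sed treats PATTERN.
# Usage: probe_sed [-r|-E] PATTERN TEXT
# Prints "match", "no-match", or "rejected" (sed exited non-zero, assumed
# to indicate a pattern error). PATTERN must not contain '/'.
probe_sed() {
    flag=
    case $1 in
        -r|-E) flag=$1; shift ;;
    esac
    status=0
    out=$(printf '%s\n' "$2" | sed $flag -n "/$1/p" 2>/dev/null) || status=$?
    if [ "$status" -ne 0 ]; then
        echo rejected
    elif [ -n "$out" ]; then
        echo match
    else
        echo no-match
    fi
}

# Sample text containing every construct literally.
text='a{ a} a{} a{2,1} a{1,2,3}'

# Hypothetical helper: print one labelled result line.
# Usage: report_sed LABEL [-r] PATTERN
report_sed() {
    label=$1; shift
    printf '%-22s %s\n' "$label" "$(probe_sed "$@" "$text")"
}

# Without -r: basic regular expressions.
report_sed 'sed    a\{'         'a\{'
report_sed 'sed    a\}'         'a\}'
report_sed 'sed    a\{\}'       'a\{\}'
report_sed 'sed    a\{2,1\}'    'a\{2,1\}'
report_sed 'sed    a\{1,2,3\}'  'a\{1,2,3\}'

# With -r: extended regular expressions.
report_sed 'sed -r a{'       -r 'a{'
report_sed 'sed -r a}'       -r 'a}'
report_sed 'sed -r a{}'      -r 'a{}'
report_sed 'sed -r a{2,1}'   -r 'a{2,1}'
report_sed 'sed -r a{1,2,3}' -r 'a{1,2,3}'
```

The script only prints what the local sed does, so it can be run unchanged against each implementation and the outputs compared.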

>How-To-Repeat:
See above.
>Fix:


>Release-Note:
>Audit-Trail:

From: Jim Pryor <dubiousjim@gmail.com>
To: bug-followup@FreeBSD.org
Cc:  
Subject: Re: bin/166861: bsdgrep -E and sed handle invalid {} constructs
 strangely
Date: Wed, 11 Apr 2012 19:45:57 -0400

 (5) FreeBSD sed without -r, like the other sed implementations,
 rejects unmatched \( and unmatched \) both as invalid patterns. So too
 do all these grep implementations without -E.
 
 Similarly, GNU sed with -r rejects unmatched ( and unmatched ) both as
 invalid patterns. And so too do all these egrep implementations.
 
 Like everyone else, FreeBSD sed with -r rejects unmatched ( as invalid.
 But it diverges in treating unmatched ) as a literal.
 -- 
 dubiousjim@gmail.com
 
>Unformatted:
