From nobody@FreeBSD.org  Wed Apr 11 13:49:56 2012
Return-Path: <nobody@FreeBSD.org>
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
	by hub.freebsd.org (Postfix) with ESMTP id AE0C2106564A
	for <freebsd-gnats-submit@FreeBSD.org>; Wed, 11 Apr 2012 13:49:56 +0000 (UTC)
	(envelope-from nobody@FreeBSD.org)
Received: from red.freebsd.org (red.freebsd.org [IPv6:2001:4f8:fff6::22])
	by mx1.freebsd.org (Postfix) with ESMTP id 99C018FC12
	for <freebsd-gnats-submit@FreeBSD.org>; Wed, 11 Apr 2012 13:49:56 +0000 (UTC)
Received: from red.freebsd.org (localhost [127.0.0.1])
	by red.freebsd.org (8.14.4/8.14.4) with ESMTP id q3BDnuJ4085524
	for <freebsd-gnats-submit@FreeBSD.org>; Wed, 11 Apr 2012 13:49:56 GMT
	(envelope-from nobody@red.freebsd.org)
Received: (from nobody@localhost)
	by red.freebsd.org (8.14.4/8.14.4/Submit) id q3BDnuFU085523;
	Wed, 11 Apr 2012 13:49:56 GMT
	(envelope-from nobody)
Message-Id: <201204111349.q3BDnuFU085523@red.freebsd.org>
Date: Wed, 11 Apr 2012 13:49:56 GMT
From: Jim Pryor <dubiousjim@gmail.com>
To: freebsd-gnats-submit@FreeBSD.org
Subject: bsdgrep inconsistently handles ^ in non-anchoring positions
X-Send-Pr-Version: www-3.1
X-GNATS-Notify:

>Number:         166842
>Category:       bin
>Synopsis:       bsdgrep(1) inconsistently handles ^ in non-anchoring positions
>Confidential:   no
>Severity:       non-critical
>Priority:       low
>Responsible:    freebsd-bugs
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Wed Apr 11 13:50:12 UTC 2012
>Closed-Date:    
>Last-Modified:  Thu Apr 12 04:10:10 UTC 2012
>Originator:     Jim Pryor
>Release:        9.0-PRERELEASE
>Organization:
>Environment:
FreeBSD vaio.jimpryor.net 9.0-PRERELEASE FreeBSD 9.0-PRERELEASE #0: Tue Nov 29 02:45:33 EST 2011     root@vaio.jimpryor.net:/usr/obj/usr/src/sys/MINE  amd64
>Description:
version line:
/*      $FreeBSD: src/usr.bin/grep/grep.c,v 1.11.2.3 2011/10/20 16:08:11 gabor Exp $

According to the POSIX-2008 standard, "^" and "$" should be ordinary characters in BREs (basic regular expressions) when they are not in anchoring positions (in contrast to EREs, where they are always anchors). Hence:

$ printf 'a^b$c' | grep -o 'a^b'

should match, and it does with GNU grep (on Linux) and with BusyBox grep (also on Linux, built against uClibc). But it does not match with the version of FreeBSD grep described above. Curiously, though:

$ printf 'a^b$c' | grep -o '[a]^b'

will match. And so too will 'b$c'.

One can't portably rely on '\^' here to specify the literal '^', because POSIX-2008 says that '^' in non-anchoring positions is not special in BREs, and that the combination of '\' and a non-special character is undefined. Of course, neither can one use '[^]'.

>How-To-Repeat:
See above.
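
The cases from the description can be collected into a small script. This is only a sketch: the try helper and the GREP variable are names made up for illustration and are not part of grep. The expected values are the results that the description says a POSIX-conforming BRE implementation should give.

```shell
#!/bin/sh
# GREP and try are illustrative names, not part of grep itself.
# Set GREP to test another implementation, e.g. GREP=/usr/local/bin/ggrep.
GREP=${GREP:-grep}

# try INPUT PATTERN EXPECTED
# Feed INPUT (no trailing newline) to "$GREP -o" and compare with EXPECTED.
try() {
    # grep exits non-zero when nothing matches; treat that as a result.
    actual=$(printf '%s' "$1" | "$GREP" -o -e "$2") || :
    if [ "$actual" = "$3" ]; then
        printf 'ok: %s\n' "$2"
    else
        printf 'MISMATCH: %s (expected "%s", got "%s")\n' "$2" "$3" "$actual"
    fi
}

try 'a^b$c' 'a^b'   'a^b'
try 'a^b$c' '[a]^b' 'a^b'
try 'a^b$c' 'b$c'   'b$c'
```

A grep that treats a non-anchoring ^ as an ordinary character should print three "ok" lines. The grep version described above would be expected to report a mismatch for the first pattern only.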
>Fix:


>Release-Note:
>Audit-Trail:

From: Jim Pryor <dubiousjim@gmail.com>
To: bug-followup@FreeBSD.org
Cc:  
Subject: Re: bin/166842: bsdgrep(1) inconsistently handles ^ in non-anchoring
 positions
Date: Wed, 11 Apr 2012 15:21:01 -0400

 I've noticed some more issues with the same version of grep. I don't
 know whether they're related, but I'll append them here for now.
 
 $ printf abc | grep -o '^[a-c]'
 
 should print just 'a', but instead gives three hits, one for each letter
 of the input. The same issue occurs when handling multiline
 buffers:
 
 $ printf 'abc\ndef' | grep -o --null '^[a-f]'
 
 incorrectly matches 6 times.
 
 $ printf 'abc\ndef' | grep -o --null '[a-f]$'
 
 correctly only matches 'c' and 'f'.
 
 
 $ printf 'abc\ndef' | grep -o --null '\`[a-f]'
 
 has the same issue as ^, whereas:
 
 $ printf 'abc\ndef' | grep -o --null '[a-f]\'\'
 
 matches 'c' and 'f'. For \` to be consistent with that behavior of \',
 it should match only the 'a' and 'd'. In fact, though, each of these
 should match only a single character: 'a' for \` and 'f' for \'.
 That's the specified behavior of these GNU extensions, and how they
 behave in the GNU grep and BusyBox grep implementations I'm testing
 against. If that behavior isn't going to be provided, then wouldn't it
 be better for these extensions not even to pretend to be present, and so
 just match against a literal ` or '?
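
The hit counts mentioned above can be checked with a short script. This is a sketch under assumptions: hits and GREP are illustrative names, not part of grep, and the --null option from the commands above is left out so that only the effect of -o with a line anchor is counted. The counts in the messages are the ones this follow-up describes as correct.

```shell
#!/bin/sh
# GREP and hits are illustrative names, not part of grep itself.
GREP=${GREP:-grep}

# hits INPUT PATTERN
# Print the number of matches that "$GREP -o" reports for PATTERN.
# INPUT may contain \n escapes; printf '%b' expands them.
hits() {
    # grep -o prints one match per line, so counting lines counts matches.
    # tr strips the padding that some wc implementations add.
    printf '%b' "$1" | "$GREP" -o -e "$2" | wc -l | tr -d ' '
}

echo "^[a-c] on abc:       $(hits 'abc' '^[a-c]') hit(s), 1 expected"
echo "^[a-f] on abc, def:  $(hits 'abc\ndef' '^[a-f]') hit(s), 2 expected"
echo "[a-f]\$ on abc, def:  $(hits 'abc\ndef' '[a-f]$') hit(s), 2 expected"
```

With the behavior reported here, the first two lines would show 3 and 6 hits, while the third would show the expected 2.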
 -- 
 dubiousjim@gmail.com
 

From: Jim Pryor <dubiousjim@gmail.com>
To: bug-followup@FreeBSD.org
Cc:  
Subject: Re: bin/166842: bsdgrep(1) inconsistently handles ^ in non-anchoring
 positions
Date: Thu, 12 Apr 2012 00:00:46 -0400

 On Wed, Apr 11, 2012, at 03:21 PM, Jim Pryor wrote:
 > I've noticed some more issues with the same version of grep. I don't
 > know whether they're related, but I'll append them here for now.
 > 
 > $ printf abc | grep -o '^[a-c]'
 
 Some more observations that seem related:
 
 $ printf 'abc def' | grep -o '^[a-z]'
 
 will match against each of the letters in 'abc', but not against any of
 the letters in 'def'.
 
 On the other hand:
 
 $ printf 'abc def' | grep -o '\b[a-z]'
 $ printf 'abc def' | grep -o '\<[a-z]'
 
 will each match against all six of the letters.
 
 Matching against the patterns:
   '[a-z]\b'
   '[a-z]\>'
   '[a-z]$'
 gives correct results.
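
The six patterns from this follow-up can be run side by side. Again this is only a sketch: try and GREP are illustrative names, not part of grep, and the expected values assume the GNU semantics for \b, \< and \> (a match only at a word boundary), which is the behavior treated as correct here.

```shell
#!/bin/sh
# GREP and try are illustrative names, not part of grep itself.
GREP=${GREP:-grep}

# try PATTERN EXPECTED
# Match PATTERN against the text 'abc def' with -o.
# EXPECTED may use \n to separate the matches, one per output line.
try() {
    expected=$(printf '%b' "$2")
    # grep exits non-zero when nothing matches; treat that as a result.
    actual=$(printf 'abc def' | "$GREP" -o -e "$1") || :
    if [ "$actual" = "$expected" ]; then
        printf 'ok: %s\n' "$1"
    else
        printf 'MISMATCH: %s\n' "$1"
    fi
}

try '^[a-z]'  'a'
try '\b[a-z]' 'a\nd'
try '\<[a-z]' 'a\nd'
try '[a-z]\b' 'c\nf'
try '[a-z]\>' 'c\nf'
try '[a-z]$'  'f'
```

Given the results reported above, the first three patterns would be expected to show a mismatch and the last three to pass.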
 -- 
 dubiousjim@gmail.com
 
>Unformatted:
