From nobody@FreeBSD.org  Fri Jul  5 17:17:43 2013
Return-Path: <nobody@FreeBSD.org>
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
	by hub.freebsd.org (Postfix) with ESMTP id 23E72CEC
	for <freebsd-gnats-submit@FreeBSD.org>; Fri,  5 Jul 2013 17:17:43 +0000 (UTC)
	(envelope-from nobody@FreeBSD.org)
Received: from oldred.freebsd.org (oldred.freebsd.org [8.8.178.121])
	by mx1.freebsd.org (Postfix) with ESMTP id 16366187F
	for <freebsd-gnats-submit@FreeBSD.org>; Fri,  5 Jul 2013 17:17:43 +0000 (UTC)
Received: from oldred.freebsd.org ([127.0.1.6])
	by oldred.freebsd.org (8.14.5/8.14.7) with ESMTP id r65HHgiT020242
	for <freebsd-gnats-submit@FreeBSD.org>; Fri, 5 Jul 2013 17:17:42 GMT
	(envelope-from nobody@oldred.freebsd.org)
Received: (from nobody@localhost)
	by oldred.freebsd.org (8.14.5/8.14.5/Submit) id r65HHgLu020241;
	Fri, 5 Jul 2013 17:17:42 GMT
	(envelope-from nobody)
Message-Id: <201307051717.r65HHgLu020241@oldred.freebsd.org>
Date: Fri, 5 Jul 2013 17:17:42 GMT
From: Steffen <sdaoden@gmail.com>
To: freebsd-gnats-submit@FreeBSD.org
Subject: awk(1) fails to treat var as integer
X-Send-Pr-Version: www-3.1
X-GNATS-Notify:

>Number:         180328
>Category:       bin
>Synopsis:       awk(1) fails to treat var as integer
>Confidential:   no
>Severity:       non-critical
>Priority:       low
>Responsible:    freebsd-bugs
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Fri Jul 05 17:20:01 UTC 2013
>Closed-Date:    
>Last-Modified:  Wed Jul 10 09:10:00 UTC 2013
>Originator:     Steffen
>Release:        10
>Organization:
>Environment:
FreeBSD fbsd10 10.0-CURRENT FreeBSD 10.0-CURRENT #0: Sun Jun 23 02:55:37 UTC 2013     root@snap.freebsd.org:/usr/obj/usr/src/sys/GENERIC  amd64
>Description:
Note first that this problem also occurs on Mac OS X Snow Leopard and NetBSD current.  I have not yet tested GNU awk.
I use awk(1) to generate test data from Unicode text files.
I think it is best to show the code:

## Input producers

io_unicode_data() {
   < unicode/UnicodeData.txt ${TAWK} '
      BEGIN {FS = ";" ; OFS = ";"}
      # There are no comments in this, but..
      /^[[:space:]]*[^#]+$/ {
         i = $2
         # Ranges must become unrolled, otherwise step on
         if (i !~ /, First>/) {
            $2 = ""
            print
            next
         }

         r1 = sprintf("%d", "0x" $1)
         getline
         r2 = sprintf("%d", "0x" $1)
         $2 = ""
         # This gets around a bug in at least "awk version 20070501" as found
         # on Slow Leopard: there the range F0000-FFFFD, and only that one,
         # will *not* be evaluated unless we do this (once property test came)
         # XXX presumably the type system is a bit weird; check other AWKs!
         sprintf("%X %X", r1, r2)
[
This is the point: UnicodeData.txt contains multiple ranges, but only this one is "omitted" without the sprintf() call; otherwise the while() loop simply does not execute.
]
         while (r1 <= r2) {
            $1 = sprintf("%X", r1)
            printf "%s\n", $0
            ++r1
         }
      }
   '
}
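
Why only the range F0000-FFFFD is affected can be illustrated with a small sketch.  It assumes, as David Holland notes in the audit trail, that sprintf() returns a string, so that `r1 <= r2' is a string comparison.  The helper name `compare_ranges' and the `AWK' variable are made up for this illustration, and the endpoints are given in decimal so that the result does not depend on whether the awk(1) at hand parses "0x" strings.

```shell
#!/bin/sh
# Hypothetical helper: compares range endpoints the way the loop above does.
# AWK defaults to "awk"; set it to try another implementation.
compare_ranges() {
   printf '%s\n' 'E000-F8FF 57344 63743' \
                 'F0000-FFFFD 983040 1048573' \
                 '100000-10FFFD 1048576 1114109' |
   ${AWK:-awk} '{
      # sprintf() yields strings, as in the original script
      r1 = sprintf("%d", $2)
      r2 = sprintf("%d", $3)
      s = (r1 <= r2) ? "runs" : "skipped"          # string comparison
      n = (r1 + 0 <= r2 + 0) ? "runs" : "skipped"  # numeric comparison
      printf "%s string=%s numeric=%s\n", $1, s, n
   }'
}

compare_ranges
```

Of these three ranges, only F0000-FFFFD has endpoints with a different number of decimal digits (983040 versus 1048573), so only there does the string comparison, which sees "9" sorting after "1", disagree with the numeric one.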

>How-To-Repeat:
Clone my S-CText repository with git and run `make ucd' with and without the line `sprintf("%X %X", r1, r2)', then compare the resulting `test/sa/t_props.dat' files.
>Fix:
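No fix for awk(1) itself is proposed here.  As a workaround in the script, and the one mentioned later in the audit trail, adding zero forces the sprintf() results to be treated as numbers.  The following is only a sketch: `unroll_range' is a hypothetical stand-in for the loop in io_unicode_data(), and it reads decimal endpoints so that it does not rely on awk(1) parsing "0x" strings.

```shell
#!/bin/sh
# Hypothetical stand-in for the unrolling loop; reads "first last" in decimal.
# AWK defaults to "awk"; set it to try another implementation.
unroll_range() {
   ${AWK:-awk} '{
      # "+ 0" coerces the strings returned by sprintf() to numbers,
      # so the loop condition below is a numeric comparison
      r1 = sprintf("%d", $1) + 0
      r2 = sprintf("%d", $2) + 0
      while (r1 <= r2) {
         printf "%X\n", r1
         ++r1
      }
   }'
}

# The endpoints differ in their number of decimal digits on purpose
echo '999998 1000001' | unroll_range
```

With the coercion in place this prints the four code points F423E through F4241.  Without the two `+ 0' the loop condition would compare the strings "999998" and "1000001" and the loop body would not run at all.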


>Release-Note:
>Audit-Trail:

From: Mark Linimon <linimon@lonesome.com>
To: bug-followup@FreeBSD.org
Cc:  
Subject: Re: bin/180328: awk(1) fails to treat var as integer
Date: Fri, 5 Jul 2013 17:02:01 -0500

 ----- Forwarded message from Steffen Daode Nurpmeso <sdaoden@gmail.com> -----
 
 Date: Fri, 05 Jul 2013 23:52:45 +0200
 From: Steffen Daode Nurpmeso <sdaoden@gmail.com>
 To: freebsd-bugs@FreeBSD.org
 Subject: Re: bin/180328: awk(1) fails to treat var as integer
 User-Agent: s-nail s-nail-14.3.2-20-g1f64075
 
 Hello.
 uwe@netbsd prodded me to dig a bit deeper, so here is the
 problem narrowed down a bit.  Sorry.
 
  | Please, can you minimize the test case?  As far as I understand it
  | should be reducible to the script and to a single line of input that
  | triggers the problem.
 
 Hmmm.
 
   cat > test.sh <<\!
   printf '1 '; printf "F0000\n" |
   awk '{r2 = r1 = sprintf("%d", "0x" $1); while (r1 <= r2) {print r1; ++r1}}'
   printf '2 '; printf "F0000\n" |
   awk '{r1 = sprintf("%d", "0x" $1); r2 = r1; while (r1 <= r2) {print r1; ++r1}}'
   printf '3 '; printf "F0000\n" |
   awk '{r1 = sprintf("%d", "0x" $1); while (r1 <= 983040) {print r1; ++r1}}'
   printf '4 '; printf "F0000\n" |
   awk '{r1 = sprintf("%d", "0x" $1); r2 = sprintf("%d", "0x" $1); while (r1 <= r2) {print r1; ++r1}}'
   printf '5 '; printf "F0000\n" |
   awk '{r1 = sprintf("%d", "0x" $1); r2 = sprintf("%d", "0x" $1); while (r1 <= r2) {print r1; ++r1}}'
   printf '6 '; printf "F0000 F0001\n" |
   awk '{r1 = sprintf("%d", "0x" $1); r2 = sprintf("%d", "0x" $1); while (r1 < r2) {print r1; ++r1}}'
   !
   sh ./test.sh
 
 results in
 
   1 983040
   2 983040
   3 983040
   4 983040
   5 983040
   6
 
 So -- indeed.  Sorry.
 
  | -uwe
 
 --steffen
 
 But
 
   $ make ucd; ll test/sa/t_props.dat; make ucd-clean;\
   sed -e 40d -i '' tools/t-base.t; make ucd; ll test/sa/t_props.dat
 
 yields (with all the other messages stripped)
 
   ucd: ok
   4956 -rw-rw-r--  1 steffen  staff  5071362  5 Jul 23:40 test/sa/t_props.dat
   ucd-clean: ok
   ...
   ucd: ok
   4188 -rw-rw-r--  1 steffen  staff  4284954  5 Jul 23:40 test/sa/t_props.dat
 
 
 ----- End forwarded message -----

From: Steffen "Daode" Nurpmeso <sdaoden@gmail.com>
To: bug-followup@FreeBSD.org
Cc:  
Subject: Re: bin/180328: awk(1) fails to treat var as integer
Date: Wed, 10 Jul 2013 11:01:34 +0200

 Hello, i'm forwarding one more.  (This time to bug-followup@ --
 hello, Mark Linimon!)
 
 -------- Original Message --------
 Date: Wed, 10 Jul 2013 10:53:13 +0200
 From: Steffen "Daode" Nurpmeso <sdaoden@gmail.com>
 To: gnats-bugs@NetBSD.org
 Subject: Re: bin/48017: awk(1) fails to treat var as integer (may be related
  to #47840)
 
 David Holland <dholland-bugs@netbsd.org> wrote:
  | sprintf with %d doesn't produce a number value; it produces a
  | string value, which you have to coerce to a number by adding zero to
  | it to get it to behave like a number.
 
 (Adding +0 was my final solution, too, because GNU awk(1), unlike
 all the other tested awk(1)s, was not helped by the (presumably
 also more expensive) sprintf("%X") call.)
 
 So there is a problem with the implicit type conversion, since
 
   echo f001 f00d |\
   awk '{ a=sprintf("%d", "0x" $1); b=sprintf("%d", "0x" $2); while (a < b) { print a; a++; }}'
 
 works just fine?!?  I think the relevant parts from POSIX are
 
   the value of an expression shall be implicitly converted to the
   type needed for the context in which it is used.
   [.]
   A numeric value that is exactly equal to the value of an integer
   (see Concepts Derived from the ISO C Standard) shall be converted
   to a string by the equivalent of a call to the sprintf function
   (see String Functions) with the string "%d" as the fmt argument
   and the numeric value being converted as the first and only expr
   argument.
   [.]
   This volume of POSIX.1-2008 specifies no explicit conversions
   between numbers and strings. An application can force an
   expression to be treated as a number by adding zero to it, or can
   force it to be treated as a string by concatenating the null
   string ( "" ) to it.
   [.]
   A string value shall be considered a numeric string if it comes
   from one of the following:
     [.]
     1. Field variables
     [.]
     8. Variable assignment from another numeric string variable
   [...]
   and an implementation-dependent condition corresponding to either
   case (a) or (b) below is met.
     [.]
     b. After all the following conversions have been applied, the
     resulting string would lexically be recognized as a NUMBER
     token as described by the lexical conventions in Grammar :
     [.]
   Whether or not a string is a numeric string shall be relevant only
   in contexts where that term is used in this section.
 
 And because the `Table: Expressions in Decreasing Precedence in awk'
 contains the line
 
   expr < expr   Less than   Numeric   None
 
 i believe it's a bug.  (One that hopefully gets fixed by someone who
 already has some experience with the awk codebase.)
 
  | David A. Holland
  | dholland@netbsd.org
 
 --steffen
>Unformatted:
