From nobody@FreeBSD.org  Wed Oct  6 02:18:27 2004
Return-Path: <nobody@FreeBSD.org>
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 943C816A4D0
	for <freebsd-gnats-submit@FreeBSD.org>; Wed,  6 Oct 2004 02:18:27 +0000 (GMT)
Received: from www.freebsd.org (www.freebsd.org [216.136.204.117])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 71CF843D55
	for <freebsd-gnats-submit@FreeBSD.org>; Wed,  6 Oct 2004 02:18:27 +0000 (GMT)
	(envelope-from nobody@FreeBSD.org)
Received: from www.freebsd.org (localhost [127.0.0.1])
	by www.freebsd.org (8.12.11/8.12.11) with ESMTP id i962IR6p002418
	for <freebsd-gnats-submit@FreeBSD.org>; Wed, 6 Oct 2004 02:18:27 GMT
	(envelope-from nobody@www.freebsd.org)
Received: (from nobody@localhost)
	by www.freebsd.org (8.12.11/8.12.11/Submit) id i962IR3Q002417;
	Wed, 6 Oct 2004 02:18:27 GMT
	(envelope-from nobody)
Message-Id: <200410060218.i962IR3Q002417@www.freebsd.org>
Date: Wed, 6 Oct 2004 02:18:27 GMT
From: Joseph Koshy <jkoshy@FreeBSD.org>
To: freebsd-gnats-submit@FreeBSD.org
Subject: awk in -current dumps core
X-Send-Pr-Version: www-2.3

>Number:         72370
>Category:       bin
>Synopsis:       awk in -current dumps core
>Confidential:   no
>Severity:       serious
>Priority:       low
>Responsible:    ru
>State:          closed
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Wed Oct 06 02:20:20 GMT 2004
>Closed-Date:    Fri Sep 15 13:43:01 GMT 2006
>Last-Modified:  Fri Sep 15 13:43:01 GMT 2006
>Originator:     Joseph Koshy
>Release:        5-current
>Organization:
The FreeBSD Project
>Environment:
FreeBSD orthanc-5 6.0-CURRENT FreeBSD 6.0-CURRENT #4: Sat Sep 25 10:52:35 UTC 2004     root@orthanc-5:/home/obj-current/home/fcpi/src/sys/FCPI  i386
>Description:
awk in 5-current dumps core if asked to dereference a positional
parameter at a large positive index.  There also seems to be
numeric overflow occurring behind the scenes.  The following
examples show the difference between GNU awk in 4-STABLE and
the awk in 5-current.

$ echo | /4/usr/bin/awk '{ x = 2147483648; print $x }'
awk: cmd. line:1: (FILENAME=- FNR=1) fatal: attempt to access field -2147483648

$ echo | /4/usr/bin/awk '{ x = 2147483647; print $x }'
*blank line*

$ echo | /5/usr/bin/awk '{ x = 2147483648; print $x }'
/5/usr/bin/awk: trying to access field -2147483648
input record number 1, file
source line number 1

$ echo | /5/usr/bin/awk '{ x = 2147483647; print $x }'
*core dump*

>How-To-Repeat:
      
>Fix:
      
>Release-Note:
>Audit-Trail:

From: Giorgos Keramidas <keramida@freebsd.org>
To: Joseph Koshy <jkoshy@freebsd.org>
Cc: "David O'Brien" <obrien@freebsd.org>, bug-followup@freebsd.org
Subject: Re: bin/72370: awk in -current dumps core
Date: Wed, 6 Oct 2004 06:06:26 +0300

 On 2004-10-06 02:18, Joseph Koshy <jkoshy@freebsd.org> wrote:
 > awk in 5-current dumps core if asked to dereference a positional
 > parameter at a large positive index.  There also seems to be numeric
 > overflow occurring behind the scenes.  The following examples show the
 > difference between GNU awk in 4-STABLE and the awk in 5-current.
 
 Others have reported awk allocating huge amounts of memory if a program
 references a variable with a huge index, which seems to be related to this.
 
 > $ echo | /4/usr/bin/awk '{ x = 2147483648; print $x }'
 > awk: cmd. line:1: (FILENAME=- FNR=1) fatal: attempt to access field -2147483648
 
 Looking at the sources of contrib/one-true-awk I can see several places where
 an overflow/truncation of values can occur.  One example is the code of
 indirect() in run.c which calls getfval() with:
 
 : Awkfloat getfval(Cell *);
 : Cell *indirect(Node **a, int n) /* $( a[0] ) */
 : {
 : 	Cell *x;
 : 	int m;
 :
 : 	x = execute(a[0]);
 : 	m = (int) getfval(x);	/* Awkfloat narrowed to int, no range check */
 : 	...
 
 There is no guarantee that a plain `int' can hold all the values of an
 Awkfloat (a double), so this cast is a truncation waiting to happen.
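
 A minimal sketch of the kind of range check that avoids this, assuming
 Awkfloat is a double and that INT_MAX + 1.0 is exactly representable
 (true for 32-bit int).  The helper name fits_in_int() is hypothetical
 and does not exist in one-true-awk:

```c
#include <limits.h>
#include <stdbool.h>

/*
 * Hypothetical helper (not part of one-true-awk): report whether
 * converting v to int stays inside the range of int.  The conversion
 * truncates toward zero, so anything strictly between INT_MIN - 1 and
 * INT_MAX + 1 is safe.
 */
static bool fits_in_int(double v)
{
	if (v != v)		/* NaN compares unequal to itself */
		return false;
	return v > (double)INT_MIN - 1.0 && v < (double)INT_MAX + 1.0;
}
```

 With a 32-bit int, fits_in_int(2147483647.0) is true and
 fits_in_int(2147483648.0) is false, which matches the two field numbers
 used in the report.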
 
 The excessive memory allocation is probably caused by the code in lib.c which,
 in the body of the fldbld() function, fails to check the field counter for
 overflow; a plain `int' again:
 
     void fldbld(void)       /* create fields from current record */
     {
     ...
             int i, j, n;
     ...
                     for (i = 0; ; ) {
     ...
                             i++;
                             if (i > nfields)
                                     growfldtab(i);
 
 There's no check for an overflow of `i' here, so all sorts of funny things can
 happen if one asks for a large field number.
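
 One possible shape for such a check, as a sketch only.  The helper
 checked_increment() is a hypothetical name and is not taken from the
 one-true-awk sources; a caller would report a fatal error when it
 returns false instead of continuing to grow the table:

```c
#include <limits.h>
#include <stdbool.h>

/*
 * Hypothetical helper: advance *counter by one unless that would
 * overflow int.  On failure the counter is left unchanged.
 */
static bool checked_increment(int *counter)
{
	if (*counter == INT_MAX)
		return false;
	(*counter)++;
	return true;
}
```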
 
 What you see below:
 
 > $ echo | /4/usr/bin/awk '{ x = 2147483647; print $x }'
 > *blank line*
 > $ echo | /5/usr/bin/awk '{ x = 2147483648; print $x }'
 > /5/usr/bin/awk: trying to access field -2147483648
 > input record number 1, file
 > source line number 1
 
 is a result of the fieldadr() function in lib.c, which does:
 
     Cell *fieldadr(int n)   /* get nth field */
     {
             if (n < 0)
                     FATAL("trying to access field %d", n);
             if (n > nfields)        /* fields after NF are empty */
                     growfldtab(n);  /* but does not increase NF */
             return(fldtab[n]);
     }
 
 so negative field numbers are a fatal error, but field numbers greater than
 the number of existing fields silently grow the field table and evaluate to
 empty strings.
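
 For comparison, a toy model of a read-only lookup that returns an empty
 string for fields past NF without allocating anything.  This is not how
 one-true-awk is written; struct record and field_text() are hypothetical
 names used only for illustration, and the sketch ignores assignment to
 fields beyond NF:

```c
#include <stddef.h>

/* Toy model only; these are not the one-true-awk data structures. */
struct record {
	const char **fields;	/* fields[0..nf]; fields[0] is the whole record */
	int nf;			/* number of fields actually present */
};

/*
 * Return the text of field n for reading.  Negative indexes yield NULL
 * so the caller can report the error; indexes past nf yield "" and do
 * not touch the table, however large n is.
 */
static const char *field_text(const struct record *r, int n)
{
	if (n < 0)
		return NULL;
	if (n > r->nf)
		return "";
	return r->fields[n];
}
```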
 
 David O'Brien is the one who imported this version of awk in our tree, so he's
 the right person to decide if we can make changes to one-true-awk to fix the
 problems it has or do something else and what that 'something else' should be.

From: Ruslan Ermilov <ru@freebsd.org>
To: Joseph Koshy <jkoshy@freebsd.org>
Cc: bug-followup@freebsd.org
Subject: Re: bin/72370: awk in -current dumps core
Date: Wed, 6 Oct 2004 13:17:57 +0300

 On Wed, Oct 06, 2004 at 02:18:27AM +0000, Joseph Koshy wrote:
 > 
 > awk in 5-current dumps core if asked to dereference a positional
 > parameter at a large positive index.  There also seems to be
 > numeric overflow occurring behind the scenes.  The following
 > examples show the difference between GNU awk in 4-STABLE and
 > the awk in 5-current.
 > 
 > $ echo | /5/usr/bin/awk '{ x = 2147483647; print $x }'
 > *core dump*
 > 
 There's no bounds checking done when growing the "field table".
 What happens here is that the size computation wraps around, so
 realloc() is given "0" as the second argument, and later the code
 assumes that enough data has been allocated when in fact it was not.
 The patch below should check for all possible overflows by doing the
 reverse arithmetic.
 
 %%%
 Index: lib.c
 ===================================================================
 RCS file: /home/ncvs/src/contrib/one-true-awk/lib.c,v
 retrieving revision 1.1.1.3
 diff -u -p -r1.1.1.3 lib.c
 --- lib.c	17 Mar 2003 07:59:58 -0000	1.1.1.3
 +++ lib.c	6 Oct 2004 07:55:36 -0000
 @@ -387,10 +387,15 @@ Cell *fieldadr(int n)	/* get nth field *
  void growfldtab(int n)	/* make new fields up to at least $n */
  {
  	int nf = 2 * nfields;
 +	size_t s;
  
  	if (n > nf)
  		nf = n;
 -	fldtab = (Cell **) realloc(fldtab, (nf+1) * (sizeof (struct Cell *)));
 +	s = (nf+1) * (sizeof (struct Cell *));
 +	if (s / (sizeof (struct Cell *)) - 1 == nf)
 +		fldtab = (Cell **) realloc(fldtab, s);
 +	else
 +		xfree(fldtab);
  	if (fldtab == NULL)
  		FATAL("out of space creating %d fields", nf);
  	makefields(nfields+1, nf);
 %%%
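
 The wraparound and an alternative way of guarding against it can be
 sketched as follows.  Both helpers are hypothetical and are not part of
 the patch above: wrapped_bytes_32() emulates the 32-bit size_t
 arithmetic of i386, and table_bytes() does the size computation in
 size_t with a check before the multiplication, assuming size_t is at
 least as wide as int.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Emulate (nf + 1) * sizeof(struct Cell *) on a 32-bit platform.
 * The caller passes the already-incremented count; for nf == INT_MAX
 * that value is, in practice, INT32_MIN.
 */
static uint32_t wrapped_bytes_32(int32_t nf_plus_one, uint32_t elem_size)
{
	return (uint32_t)nf_plus_one * elem_size;	/* reduced modulo 2^32 */
}

/*
 * Compute (nf + 1) * elem_size into *out.  Returns false, leaving *out
 * untouched, if nf is negative or the product does not fit in size_t.
 */
static bool table_bytes(int nf, size_t elem_size, size_t *out)
{
	size_t count;

	if (nf < 0 || elem_size == 0)
		return false;
	count = (size_t)nf + 1;		/* unsigned, so no signed overflow */
	if (count > SIZE_MAX / elem_size)
		return false;
	*out = count * elem_size;
	return true;
}
```

 With 4-byte pointers, wrapped_bytes_32(INT32_MIN, 4) evaluates to 0,
 which is the zero-sized realloc() request described above.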
 
 
 Cheers,
 -- 
 Ruslan Ermilov
 ru@FreeBSD.org
 FreeBSD committer

From: Giorgos Keramidas <keramida@freebsd.org>
To: Joseph Koshy <jkoshy@freebsd.org>
Cc: "David O'Brien" <obrien@freebsd.org>, bug-followup@freebsd.org
Subject: Re: bin/72370: awk in -current dumps core
Date: Wed, 6 Oct 2004 13:22:26 +0300

 On 2004-10-06 06:06, Giorgos Keramidas <keramida@freebsd.org> wrote:
 > What you see below:
 > > $ echo | /4/usr/bin/awk '{ x = 2147483647; print $x }'
 > > *blank line*
 > > $ echo | /5/usr/bin/awk '{ x = 2147483648; print $x }'
 > > /5/usr/bin/awk: trying to access field -2147483648
 > > input record number 1, file
 > > source line number 1
 >
 > is a result of the fieldadr() function in lib.c, which does:
 >
 >     378 Cell *fieldadr(int n)   /* get nth field */
 >     379 {
 >     380         if (n < 0)
 >     381                 FATAL("trying to access field %d", n);
 >     382         if (n > nfields)        /* fields after NF are empty */
 >     383                 growfldtab(n);  /* but does not increase NF */
 >     384         return(fldtab[n]);
 >     385 }
 >
 > so negative field numbers are warned about but field numbers greater than the
 > existing fields are silently converted to empty strings.
 
 The overflow shown above can be fixed with this minor patch:
 
 : Index: run.c
 : ===================================================================
 : RCS file: /home/ncvs/src/contrib/one-true-awk/run.c,v
 : retrieving revision 1.1.1.7
 : diff -u -u -r1.1.1.7 run.c
 : --- run.c       8 Feb 2004 21:32:21 -0000       1.1.1.7
 : +++ run.c       6 Oct 2004 10:18:17 -0000
 : @@ -26,6 +26,7 @@
 :  #include <stdio.h>
 :  #include <ctype.h>
 :  #include <setjmp.h>
 : +#include <limits.h>
 :  #include <math.h>
 :  #include <string.h>
 :  #include <stdlib.h>
 : @@ -705,12 +706,16 @@
 :
 :  Cell *indirect(Node **a, int n)        /* $( a[0] ) */
 :  {
 : +       Awkfloat val;
 :         Cell *x;
 :         int m;
 :         char *s;
 :
 :         x = execute(a[0]);
 : -       m = (int) getfval(x);
 : +       val = getfval(x);
 : +       if ((Awkfloat)INT_MAX < val)
 : +               FATAL("trying to access field %s", x->nval);
 : +       m = (int) val;
 :         if (m == 0 && !is_number(s = getsval(x)))       /* suspicion! */
 :                 FATAL("illegal field $(%s), name \"%s\"", s, x->nval);
 :                 /* BUG: can x->nval ever be null??? */
 
 I'm still investigating if something can be done about the other places
 where nawk might start accessing field numbers way beyond the limits of
 INT_MAX.  Its source is fairly complicated for my limited C knowledge
 though, so don't hold your breath.
 
 - Giorgos
 
Responsible-Changed-From-To: freebsd-bugs->obrien 
Responsible-Changed-By: obrien 
Responsible-Changed-When: Sat Dec 4 08:31:08 GMT 2004 
Responsible-Changed-Why:  
Speaking to BWK about this. 

http://www.freebsd.org/cgi/query-pr.cgi?pr=72370 
State-Changed-From-To: open->closed 
State-Changed-By: ru 
State-Changed-When: Fri Sep 15 13:40:12 UTC 2006 
State-Changed-Why:  
Brian integrated my patch on Dec 31, 2004, but attributed it 
to another person. 

: Dec 31, 2004: 
:         prevent overflow of -f array in main, head off potential error in  
:         call of SYNTAX(), test malloc return in lib.c, all with thanks to  
:         todd miller. 


Responsible-Changed-From-To: obrien->ru 
Responsible-Changed-By: ru 
Responsible-Changed-When: Fri Sep 15 13:40:12 UTC 2006 
Responsible-Changed-Why:  

http://www.freebsd.org/cgi/query-pr.cgi?pr=72370 
>Unformatted:
