From simon@comsys.ntu-kpi.kiev.ua  Mon Mar 12 11:18:30 2012
Return-Path: <simon@comsys.ntu-kpi.kiev.ua>
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
	by hub.freebsd.org (Postfix) with ESMTP id 63FD5106566C
	for <FreeBSD-gnats-submit@freebsd.org>; Mon, 12 Mar 2012 11:18:30 +0000 (UTC)
	(envelope-from simon@comsys.ntu-kpi.kiev.ua)
Received: from comsys.kpi.ua (comsys.kpi.ua [77.47.192.42])
	by mx1.freebsd.org (Postfix) with ESMTP id E43168FC1B
	for <FreeBSD-gnats-submit@freebsd.org>; Mon, 12 Mar 2012 11:18:29 +0000 (UTC)
Received: from pm513-1.comsys.kpi.ua ([10.18.52.101] helo=pm513-1.comsys.ntu-kpi.kiev.ua)
	by comsys.kpi.ua with esmtpsa (TLSv1:AES256-SHA:256)
	(Exim 4.63)
	(envelope-from <simon@comsys.ntu-kpi.kiev.ua>)
	id 1S73Gm-0006Qy-WD
	for FreeBSD-gnats-submit@freebsd.org; Mon, 12 Mar 2012 13:18:29 +0200
Received: by pm513-1.comsys.ntu-kpi.kiev.ua (Postfix, from userid 1001)
	id DEFFF1CC36; Mon, 12 Mar 2012 13:18:28 +0200 (EET)
Message-Id: <20120312111828.GA1818@pm513-1.comsys.ntu-kpi.kiev.ua>
Date: Mon, 12 Mar 2012 13:18:28 +0200
From: Andrey Simonenko <simon@comsys.ntu-kpi.kiev.ua>
To: FreeBSD-gnats-submit@freebsd.org
Subject: libc: strtonum() optimization

>Number:         165970
>Category:       kern
>Synopsis:       [libc] [patch] strtonum() optimization
>Confidential:   no
>Severity:       non-critical
>Priority:       low
>Responsible:    freebsd-bugs
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          update
>Submitter-Id:   current-users
>Arrival-Date:   Mon Mar 12 11:20:11 UTC 2012
>Closed-Date:    
>Last-Modified:  Thu Nov 08 20:54:40 UTC 2012
>Originator:     Andrey Simonenko
>Release:        FreeBSD 10.0-CURRENT amd64
>Organization:
>Environment:
>Description:

The strtonum() function can be optimized:

1. Array with messages and errno values should not be created on
   each strtonum() call.

2. If the errno value is not changed, then it can be cached, instead
   of calling corresponding function each time to get its value.

3. Some code paths can be made a bit faster.

Also if implementation guaranties that strtoll() does not change errno
if successful, then comparisons for LLONG_MAX and LLONG_MIN can be removed.

>How-To-Repeat:
>Fix:
--- strtonum.c.orig	2006-03-14 21:53:03.000000000 +0200
+++ strtonum.c	2012-03-12 13:05:40.000000000 +0200
@@ -24,45 +24,51 @@ __FBSDID("$FreeBSD: src/lib/libc/stdlib/
 #include <limits.h>
 #include <stdlib.h>
 
-#define INVALID 	1
-#define TOOSMALL 	2
-#define TOOLARGE 	3
+#define INVALID 	0
+#define TOOSMALL 	1
+#define TOOLARGE 	2
 
 long long
 strtonum(const char *numstr, long long minval, long long maxval,
     const char **errstrp)
 {
-	long long ll = 0;
-	char *ep;
-	int error = 0;
-	struct errval {
-		const char *errstr;
-		int err;
-	} ev[4] = {
-		{ NULL,		0 },
-		{ "invalid",	EINVAL },
-		{ "too small",	ERANGE },
-		{ "too large",	ERANGE },
+	static const struct {
+		const char	*errstr;
+		int		errnum;
+	} ev[] = {
+		[INVALID]	= { "invalid",		EINVAL },
+		[TOOSMALL]	= { "too small",	ERANGE },
+		[TOOLARGE]	= { "too large",	ERANGE }
 	};
 
-	ev[0].err = errno;
-	errno = 0;
+	int error;
+
 	if (minval > maxval)
 		error = INVALID;
 	else {
+		long long ll;
+		char *ep;
+		int errno_save;
+
+		errno_save = errno;
+		errno = 0;
 		ll = strtoll(numstr, &ep, 10);
-		if (errno == EINVAL || numstr == ep || *ep != '\0')
+		error = errno;
+		if (error == EINVAL || numstr == ep || *ep != '\0')
 			error = INVALID;
-		else if ((ll == LLONG_MIN && errno == ERANGE) || ll < minval)
+		else if ((ll == LLONG_MIN && error == ERANGE) || ll < minval)
 			error = TOOSMALL;
-		else if ((ll == LLONG_MAX && errno == ERANGE) || ll > maxval)
+		else if ((ll == LLONG_MAX && error == ERANGE) || ll > maxval)
 			error = TOOLARGE;
+		else {
+			errno = errno_save;
+			if (errstrp != NULL)
+				*errstrp = NULL;
+			return (ll);
+		}
 	}
 	if (errstrp != NULL)
 		*errstrp = ev[error].errstr;
-	errno = ev[error].err;
-	if (error)
-		ll = 0;
-
-	return (ll);
+	errno = ev[error].errnum;
+	return (0);
 }

>Release-Note:
>Audit-Trail:
Responsible-Changed-From-To: gnats-admin->freebsd-bugs 
Responsible-Changed-By: linimon 
Responsible-Changed-When: Mon Mar 12 11:34:41 UTC 2012 
Responsible-Changed-Why:  
Rescue this PR from the 'pending' category. 

http://www.freebsd.org/cgi/query-pr.cgi?pr=165970 

From: Andrey Simonenko <simon@comsys.ntu-kpi.kiev.ua>
To: bug-followup@FreeBSD.org
Cc:  
Subject: Re: kern/165970: [libc] [patch] strtonum() optimization
Date: Tue, 13 Mar 2012 11:56:11 +0200

 Please ignore the statement about LLONG_MAX and LLONG_MIN.
Responsible-Changed-From-To: freebsd-bugs->eadler 
Responsible-Changed-By: eadler 
Responsible-Changed-When: Wed Mar 14 03:18:10 UTC 2012 
Responsible-Changed-Why:  
This PR looking interesting. I have to do some benchmarking to ensure 
this code is faster && some testing to ensure it is still correct.  I 
may not get to this soon, but I'll try to work on it 

http://www.freebsd.org/cgi/query-pr.cgi?pr=165970 


From: Eitan Adler <eadler@freebsd.org>
To: bug-followup@freebsd.org
Cc:  
Subject: Re: kern/165970: [libc] [patch] strtonum() optimization
Date: Sat, 28 Apr 2012 16:31:42 -0400

 ---------- Forwarded message ----------
 From: Bruce Evans <brde@optusnet.com.au>
 Date: 14 March 2012 05:55
 Subject: Re: kern/165970: [libc] [patch] strtonum() optimization
 To: eadler@freebsd.org
 
 
 > Responsible-Changed-From-To: freebsd-bugs->eadler
 > Responsible-Changed-By: eadler
 > Responsible-Changed-When: Wed Mar 14 03:18:10 UTC 2012
 > Responsible-Changed-Why:
 > This PR looking interesting. I have to do some benchmarking to ensure
 > this code is faster && some testing to ensure it is still correct. =C2=A0=
 I
 > may not get to this soon, but I'll try to work on it
 
 
 I have a low opinion of strtonum() and might have pointed out its bugs
 before. =C2=A0There is another PR about the brokenness of atoi() and how
 replacements for it can be even worse, by providing non-replacements
 that are even harder to use. =C2=A0strtonum() is basically a step sideways
 for this. =C2=A0It tries to be easier to use than strtol(), but error handl=
 ing
 for it is still painful, and it has a bad API that was fixed for strtol()
 by expanding to to strto[u]imax(). =C2=A0strtonum() doesn't even have an
 unsigned version, and steals the generic name 'num' for its very
 non-generic API. =C2=A0The generic version should also expand on strtod(),
 but now keeping it easy to use is even harder.
 
 The above was only about its design errors. =C2=A0Its implementation errors
 are a little smaller.
 
 From the PR:
 
 % >Description:
 % % The strtonum() function can be optimized:
 % % 1. Array with messages and errno values should not be created on
 % =C2=A0 =C2=A0each strtonum() call.
 
 Anyone who programs this pessimization should be rewarded with lots
 of programming in K&R where it wasn't possible. =C2=A0Compilers should
 optimize it better, but I haven't seen one doing it perfectly. =C2=A0Most
 like to make a static copy of the data like the programmer should
 have written originally and then copy this to the stack on every call.
 Perhaps declaring it as const prevents the copying. =C2=A0But IIRC,
 strtonum() wants to modify the data as part of its dubious error
 handling, so the data can't be const.
 
 % % 2. If the errno value is not changed, then it can be cached, instead
 % =C2=A0 =C2=A0of calling corresponding function each time to get its value=
 
 % % 3. Some code paths can be made a bit faster.
 
 Possibly.
 
 % % Also if implementation guaranties that strtoll() does not change errno
 % if successful, then comparisons for LLONG_MAX and LLONG_MIN can be remove=
 d.
 
 Such comparisons don't even work, since LLONG_MAX and LLONG_MIN are in-band=
 
 The errno method must be used. =C2=A0It us not just an optimizations.
 
 % --- strtonum.c.orig =C2=A0 2006-03-14 21:53:03.000000000 +0200
 % +++ strtonum.c =C2=A0 =C2=A0 =C2=A0 =C2=A02012-03-12 13:05:40.000000000 +=
 0200
 % @@ -24,45 +24,51 @@ __FBSDID("$FreeBSD: src/lib/libc/stdlib/
 % =C2=A0#include <limits.h>
 % =C2=A0#include <stdlib.h>
 % % -#define INVALID =C2=A0 =C2=A01
 % -#define TOOSMALL =C2=A0 =C2=A0 2
 % -#define TOOLARGE =C2=A0 =C2=A0 3
 % +#define INVALID =C2=A0 =C2=A0 =C2=A00
 % +#define TOOSMALL =C2=A0 =C2=A0 1
 % +#define TOOLARGE =C2=A0 =C2=A0 2
 
 Of course code of this quality misformats the tab after @define.
 
 % % =C2=A0long long
 % =C2=A0strtonum(const char *numstr, long long minval, long long maxval,
 % =C2=A0 =C2=A0 =C2=A0const char **errstrp)
 % =C2=A0{
 % - =C2=A0 =C2=A0 long long ll =3D 0;
 % - =C2=A0 =C2=A0 char *ep;
 % - =C2=A0 =C2=A0 int error =3D 0;
 % - =C2=A0 =C2=A0 struct errval {
 % - =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 const char *errstr;
 % - =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 int err;
 % - =C2=A0 =C2=A0 } ev[4] =3D {
 % - =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 { NULL, =C2=A0 =C2=A0 =C2=A0 =
 =C2=A0 0 },
 % - =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 { "invalid", =C2=A0 =C2=A0EIN=
 VAL },
 % - =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 { "too small", =C2=A0ERANGE }=
 ,
 % - =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 { "too large", =C2=A0ERANGE }=
 ,
 % + =C2=A0 =C2=A0 static const struct {
 % + =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 const char =C2=A0 =C2=A0 =C2=
 =A0*errstr;
 % + =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 int =C2=A0 =C2=A0 =C2=A0 =C2=
 =A0 =C2=A0 =C2=A0 errnum;
 % + =C2=A0 =C2=A0 } ev[] =3D {
 % + =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 [INVALID] =C2=A0 =C2=A0 =C2=
 =A0 =3D { "invalid", =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0EINVAL },
 % + =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 [TOOSMALL] =C2=A0 =C2=A0 =C2=
 =A0=3D { "too small", =C2=A0 =C2=A0 =C2=A0 =C2=A0ERANGE },
 % + =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 [TOOLARGE] =C2=A0 =C2=A0 =C2=
 =A0=3D { "too large", =C2=A0 =C2=A0 =C2=A0 =C2=A0ERANGE }
 % =C2=A0 =C2=A0 =C2=A0 };
 
 Good not try to get the pessimization of copying on evey call.
 
 This also improves the formatting of the structs, but I think the whole
 implementation of the error handling is worse than its design. =C2=A0In
 K&R, you couldn't use the [] initialization, and it is overkill for
 a small array. =C2=A0It is most useful for unstructured structs.
 
 % % - =C2=A0 ev[0].err =3D errno;
 % - =C2=A0 =C2=A0 errno =3D 0;
 % + =C2=A0 =C2=A0 int error;
 % +
 % =C2=A0 =C2=A0 =C2=A0 if (minval > maxval)
 % =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 error =3D INVALID;
 % =C2=A0 =C2=A0 =C2=A0 else {
 % + =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 long long ll;
 % + =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 char *ep;
 % + =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 int errno_save;
 % +
 
 This adds some style bugs (nested declarations).
 
 % + =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 errno_save =3D errno;
 % + =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 errno =3D 0;
 % =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 ll =3D strtoll(numstr, &=
 ep, 10);
 
 The largest design bug is not the limitation to long longs, but the
 limitation to base 10. =C2=A0This limitation saves passing just the base
 parameter of strtol(). =C2=A0It's good to simplify like that, but a
 generic function should accept any base.
 
 % - =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 if (errno =3D=3D EINVAL || nu=
 mstr =3D=3D ep || *ep !=3D '\0')
 % + =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 error =3D errno;
 % + =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 if (error =3D=3D EINVAL || nu=
 mstr =3D=3D ep || *ep !=3D '\0')
 % =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=
 =A0 error =3D INVALID;
 
 I don't like using the POSIX (?) extension of strto*() returning EINVAL,
 but if we do then we shouldn't check for EINVAL. =C2=A0I think numstr =3D=
 =3D ep
 is unreachable since conditions that cause it also cause error =3D=3D EINVA=
 L.
 I prefer to omit the error =3D=3D EINVAL check instead.
 
 % - =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 else if ((ll =3D=3D LLONG_MIN=
  && errno =3D=3D ERANGE) || ll < minval)
 % + =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 else if ((ll =3D=3D LLONG_MIN=
  && error =3D=3D ERANGE) || ll < minval)
 % =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=
 =A0 error =3D TOOSMALL;
 % - =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 else if ((ll =3D=3D LLONG_MAX=
  && errno =3D=3D ERANGE) || ll > maxval)
 % + =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 else if ((ll =3D=3D LLONG_MAX=
  && error =3D=3D ERANGE) || ll > maxval)
 % =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=
 =A0 error =3D TOOLARGE;
 
 Unfortunately there is a problem here. =C2=A0We want to distinguish TOOSMAL=
 L from
 TOOLARGE, but ERANGE only tells us that one of these occurred. =C2=A0So che=
 cking
 the limits seems to be necessary.
 
 If anyone cared about optimizing this, then they would check error =3D=3D E=
 RANGE
 before checking ll, since error is smaller and usually !=3D ERANGE.
 
 I keep mistyping `error' as errno... =C2=A0Copying errno is good, but why
 doesn't the compiler cache it for you? =C2=A0It is per-thread, so it is
 even less volatile than a general global. =C2=A0Maybe its declaration is
 missing a __pure2. =C2=A0I looked at it, and it indeed doesn't have a
 __pure2. =C2=A0I was surprised to find that it was a function call even in
 old versions of FreeBSD (I thought it was ifdefed to just a global in
 the non-threaded case). =C2=A0After adding a __pure2, the result of the
 function call is cached. =C2=A0But the accesses through the pointer are not
 cached by gcc-3.3. =C2=A0Neither are the stores for 'x =3D errno; x =3D err=
 no;'.
 x is global, and there is no volatile in sight. =C2=A0But gcc-4.2 reduces
 this to just the function call to obtain the pointer, then a single
 load-store.
 
 % + =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 else {
 % + =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 e=
 rrno =3D errno_save;
 % + =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 i=
 f (errstrp !=3D NULL)
 % + =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =
 =C2=A0 =C2=A0 =C2=A0 =C2=A0 *errstrp =3D NULL;
 % + =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 r=
 eturn (ll);
 % + =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 }
 % =C2=A0 =C2=A0 =C2=A0 }
 % =C2=A0 =C2=A0 =C2=A0 if (errstrp !=3D NULL)
 % =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 *errstrp =3D ev[error].e=
 rrstr;
 % - =C2=A0 =C2=A0 errno =3D ev[error].err;
 % - =C2=A0 =C2=A0 if (error)
 % - =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 ll =3D 0;
 % -
 % - =C2=A0 =C2=A0 return (ll);
 % + =C2=A0 =C2=A0 errno =3D ev[error].errnum;
 % + =C2=A0 =C2=A0 return (0);
 % =C2=A0}
 
 Mostly better. =C2=A0The old version does something confusing with the non-=
 const
 data structure, to do little more than avoid an else clause.
 
 I tried writing this without any data structure or internal error codes,
 but didn't like the results. =C2=A0It gave lots of large else clauses, wher=
 e
 the above maps all error cases to an error code and then looks up this
 code in the table, which is certainly better for a large number of cases.
 So both the original and the above are looking better -- the orginal
 goes too far in mapping the non-error case to the table lookup, and the
 above fixes that.
 
 I would have designed the API so that it doesn't try to be more explicit
 than standard strto*() in its description of errors. =C2=A0It should just
 return errno =3D ERANGE (or maybe EINVAL) and not add the complication
 of an only slightly more descriptive string error message. =C2=A0I just
 remembered that the latter was one of the main points discussed in the
 old PR. =C2=A0Even moldy programs that use atoi() often have context-sensit=
 ive
 error messages that give more useful detail than a generic function can.
 .. Just found the old PR -- it is 142911. =C2=A0One example from it:
 
 @ @@ -226,9 +223,10 @@
 @ =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=
 =A0 break;
 @ =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 case 'n':
 @ =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=
 =A0 nflag =3D 1;
 @ - =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 m=
 axshowdevs =3D (int)strtonum(optarg, 0, 1000, &errstr);
 @ - =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 i=
 f (errstr !=3D NULL)
 @ - =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =
 =C2=A0 =C2=A0 =C2=A0 =C2=A0 errx(1, "-n %s: %s", optarg, errstr);
 @ + =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =
 =C2=A0 maxshowdevs =3D atoi(optarg);
 @ + =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 i=
 f (maxshowdevs < 0)
 @ + =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =
 =C2=A0 =C2=A0 =C2=A0 =C2=A0 errx(1, "number of devices %d is < 0",
 @ + =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =
 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0maxshowdevs);
 @ =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=
 =A0 break;
 
 Note that the patch is backwards. =C2=A0This change, proposed by the submit=
 ter,
 shows how unusable strtonum() and especially its error string are. =C2=A0We=
  had
 to do extra work to call it. =C2=A0It gives better error checking. =C2=A0Bu=
 t then
 using its error string is hard. =C2=A0Well, not very hard, but the submitte=
 d
 just printed it, which replaces "number of devices" by the cryptic -n.
 The correct translation seems to be
 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=
 =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0errx(1, "number of devices %s is %s",
 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=
 =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0optarg, errstr);
 (The %s for optarg should also be quoted.).
 
 Bruce
 
 
 
 --=20
 Eitan Adler
 Source & Ports committer
 X11, Bugbusting teams

From: Andrey Simonenko <simon@comsys.ntu-kpi.kiev.ua>
To: bug-followup@freebsd.org
Cc: Eitan Adler <eadler@freebsd.org>
Subject: kern/165970: [libc] [patch] strtonum() optimization
Date: Tue, 17 Jul 2012 15:40:29 +0300

 I was not sent bug-followup, so I noticed that reply only today.
 
 > % >Description:
 > % % The strtonum() function can be optimized:
 > % % 1. Array with messages and errno values should not be created on
 > %    each strtonum() call.
 >
 > Anyone who programs this pessimization should be rewarded with lots
 > of programming in K&R where it wasn't possible.  Compilers should
 > optimize it better, but I haven't seen one doing it perfectly.  Most
 > like to make a static copy of the data like the programmer should
 > have written originally and then copy this to the stack on every call.
 > Perhaps declaring it as const prevents the copying.  But IIRC,
 > strtonum() wants to modify the data as part of its dubious error
 > handling, so the data can't be const.
 
 Original strtonum() version modifies this array, but updated version
 does not modify it.  I've just verified how gcc-4.2.1 and clang-3.1 on
 9.0-STABLE generate code for this function, just "const" is enough.
 Since updated version does not modify this array, it should not be created
 on each function call and even if some version needs to modify this array,
 I would prefer to declare it as "static", so not modified elements of
 an array will be ready without filling.
 
 > % % Also if implementation guaranties that strtoll() does not change errno
 > % if successful, then comparisons for LLONG_MAX and LLONG_MIN can be removed.
 > 
 > Such comparisons don't even work, since LLONG_MAX and LLONG_MIN are in-band
 > The errno method must be used.  It us not just an optimizations.
 
 First of all I wrote in bug-folloup that this my statement is wrong and
 should be ignored.  Second, they are needed to distinguish "too small" and
 "too large" errors.
 
 > % -#define TOOSMALL     2
 > % -#define TOOLARGE     3
 > % +#define INVALID      0
 > % +#define TOOSMALL     1
 > % +#define TOOLARGE     2
 >
 > Of course code of this quality misformats the tab after @define.
 
 I copied these lines from the original code.  Corrected.
 
 > % -             if (errno == EINVAL || numstr == ep || *ep != '\0')
 > % +             error = errno;
 > % +             if (error == EINVAL || numstr == ep || *ep != '\0')
 > %                       error = INVALID;
 > 
 > I don't like using the POSIX (?) extension of strto*() returning EINVAL,
 > but if we do then we shouldn't check for EINVAL.  I think numstr == ep
 > is unreachable since conditions that cause it also cause error == EINVAL.
 > I prefer to omit the error == EINVAL check instead.
 
 Looks correct to me.
 
 > % -             else if ((ll == LLONG_MAX && errno == ERANGE) || ll > maxval)
 > % +             else if ((ll == LLONG_MAX && error == ERANGE) || ll > maxval)
 > %                       error = TOOLARGE;
 > 
 > Unfortunately there is a problem here.  We want to distinguish TOOSMALL from
 > TOOLARGE, but ERANGE only tells us that one of these occurred.  So checking
 > the limits seems to be necessary.
 
 Yes.
 
 > If anyone cared about optimizing this, then they would check error == ERANGE
 > before checking ll, since error is smaller and usually != ERANGE.
 
 Agree.
 
 Thank you for comments.
 
 New changes for strtonum():
 
 --- strtonum.c.orig	2006-03-14 21:53:03.000000000 +0200
 +++ strtonum.c	2012-07-17 15:34:38.000000000 +0300
 @@ -24,45 +24,48 @@
  #include <limits.h>
  #include <stdlib.h>
  
 -#define INVALID 	1
 -#define TOOSMALL 	2
 -#define TOOLARGE 	3
 +#define INVALID		0
 +#define TOOSMALL	1
 +#define TOOLARGE	2
  
  long long
  strtonum(const char *numstr, long long minval, long long maxval,
      const char **errstrp)
  {
 -	long long ll = 0;
 -	char *ep;
 -	int error = 0;
 -	struct errval {
 -		const char *errstr;
 -		int err;
 -	} ev[4] = {
 -		{ NULL,		0 },
 -		{ "invalid",	EINVAL },
 -		{ "too small",	ERANGE },
 -		{ "too large",	ERANGE },
 +	const struct {
 +		const char	*errstr;
 +		int		errnum;
 +	} ev[3] = {
 +		[INVALID]	= { "invalid",		EINVAL },
 +		[TOOSMALL]	= { "too small",	ERANGE },
 +		[TOOLARGE]	= { "too large",	ERANGE }
  	};
 +	long long ll;
 +	char *ep;
 +	int error, errno_save;
  
 -	ev[0].err = errno;
 -	errno = 0;
  	if (minval > maxval)
  		error = INVALID;
  	else {
 +		errno_save = errno;
 +		errno = 0;
  		ll = strtoll(numstr, &ep, 10);
 -		if (errno == EINVAL || numstr == ep || *ep != '\0')
 +		error = errno;
 +		if (numstr == ep || *ep != '\0')
  			error = INVALID;
 -		else if ((ll == LLONG_MIN && errno == ERANGE) || ll < minval)
 +		else if ((error == ERANGE && ll == LLONG_MIN) || ll < minval)
  			error = TOOSMALL;
 -		else if ((ll == LLONG_MAX && errno == ERANGE) || ll > maxval)
 +		else if ((error == ERANGE && ll == LLONG_MAX) || ll > maxval)
  			error = TOOLARGE;
 +		else {
 +			errno = errno_save;
 +			if (errstrp != NULL)
 +				*errstrp = NULL;
 +			return (ll);
 +		}
  	}
  	if (errstrp != NULL)
  		*errstrp = ev[error].errstr;
 -	errno = ev[error].err;
 -	if (error)
 -		ll = 0;
 -
 -	return (ll);
 +	errno = ev[error].errnum;
 +	return (0);
  }
Responsible-Changed-From-To: eadler->freebsd-bugs 
Responsible-Changed-By: eadler 
Responsible-Changed-When: Thu Nov 8 20:54:40 UTC 2012 
Responsible-Changed-Why:  
I won't be dealing with this PR for some time, so give it back to the 
pool 

http://www.freebsd.org/cgi/query-pr.cgi?pr=165970 
>Unformatted:
