From nobody@FreeBSD.org  Fri Apr 10 22:36:05 2009
Return-Path: <nobody@FreeBSD.org>
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 42DAD10656CE
	for <freebsd-gnats-submit@FreeBSD.org>; Fri, 10 Apr 2009 22:36:05 +0000 (UTC)
	(envelope-from nobody@FreeBSD.org)
Received: from www.freebsd.org (www.freebsd.org [IPv6:2001:4f8:fff6::21])
	by mx1.freebsd.org (Postfix) with ESMTP id 15AA28FC0C
	for <freebsd-gnats-submit@FreeBSD.org>; Fri, 10 Apr 2009 22:36:05 +0000 (UTC)
	(envelope-from nobody@FreeBSD.org)
Received: from www.freebsd.org (localhost [127.0.0.1])
	by www.freebsd.org (8.14.3/8.14.3) with ESMTP id n3AMa4B5080329
	for <freebsd-gnats-submit@FreeBSD.org>; Fri, 10 Apr 2009 22:36:04 GMT
	(envelope-from nobody@www.freebsd.org)
Received: (from nobody@localhost)
	by www.freebsd.org (8.14.3/8.14.3/Submit) id n3AMa4tT080326;
	Fri, 10 Apr 2009 22:36:04 GMT
	(envelope-from nobody)
Message-Id: <200904102236.n3AMa4tT080326@www.freebsd.org>
Date: Fri, 10 Apr 2009 22:36:04 GMT
From: Abramo Bagnara <abramo.bagnara@gmail.com>
To: freebsd-gnats-submit@FreeBSD.org
Subject: fma does not respect rounding mode using extended precision
X-Send-Pr-Version: www-3.1
X-GNATS-Notify:

>Number:         133583
>Category:       kern
>Synopsis:       [libm] fma(3) does not respect rounding mode using extended precision
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    freebsd-bugs
>State:          closed
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Fri Apr 10 22:40:01 UTC 2009
>Closed-Date:    Fri Dec 03 07:02:04 UTC 2010
>Last-Modified:  Fri Dec  3 07:10:09 UTC 2010
>Originator:     Abramo Bagnara
>Release:        7.1
>Organization:
>Environment:
FreeBSD freebsd.homenet.telecomitalia.it 7.1-RELEASE FreeBSD 7.1-RELEASE #0: Thu Jan  1 14:37:25 UTC 2009     root@logan.cse.buffalo.edu:/usr/obj/usr/src/sys/GENERIC  i386
>Description:
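After calling fpsetprec(FP_PE), fma() from libm does not respect the
rounding mode: with extended precision set, both the FE_DOWNWARD and the
FE_UPWARD results fall below the exact value (see the output below).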
>How-To-Repeat:
$ gcc bug.c -lm
$ ./a.out
Exact result: 9973859.79298831405913006165064871311187744140625
Default precision, DOWNWARD  +* : 9973859.79298831336200237274169921875
Default precision, UPWARD    +* : 9973859.79298831522464752197265625
Default precision, DOWNWARD  fma: 9973859.79298831336200237274169921875
Default precision, UPWARD    fma: 9973859.79298831522464752197265625
Extended precision, DOWNWARD  +* : 9973859.79298831336200237274169921875
Extended precision, UPWARD    +* : 9973859.79298831522464752197265625
Extended precision, DOWNWARD  fma: 9973859.7929883114993572235107421875
Extended precision, UPWARD    fma: 9973859.79298831336200237274169921875
>Fix:


Patch attached with submission follows:

#include <math.h>
#include <fenv.h>
#include <stdio.h>

#ifdef __FreeBSD__
#include <ieeefp.h>
#endif

double to = 8248384;
double x = 2871;
double y = 601.00166944908187360852025449275970458984375;


// Exact result is: 9973859.79298831405913006165064871311187744140625
int main() {
  printf("Exact result: 9973859.79298831405913006165064871311187744140625\n");

  fesetround(FE_DOWNWARD);
  printf("Default precision, DOWNWARD  +* : %.1000g\n", to + x * y);
  fesetround(FE_UPWARD);
  printf("Default precision, UPWARD    +* : %.1000g\n", to + x * y);

  fesetround(FE_DOWNWARD);
  printf("Default precision, DOWNWARD  fma: %.1000g\n", fma(x, y, to));
  fesetround(FE_UPWARD);
  printf("Default precision, UPWARD    fma: %.1000g\n", fma(x, y, to));

#ifdef __FreeBSD__
  fpsetprec(FP_PE);

  fesetround(FE_DOWNWARD);
  printf("Extended precision, DOWNWARD  +* : %.1000g\n", to + x * y);
  fesetround(FE_UPWARD);
  printf("Extended precision, UPWARD    +* : %.1000g\n", to + x * y);

  fesetround(FE_DOWNWARD);
  printf("Extended precision, DOWNWARD  fma: %.1000g\n", fma(x, y, to));
  fesetround(FE_UPWARD);
  printf("Extended precision, UPWARD    fma: %.1000g\n", fma(x, y, to));
#endif
}


>Release-Note:
>Audit-Trail:

From: Abramo Bagnara <abramo.bagnara@gmail.com>
To: bug-followup@FreeBSD.org, abramo.bagnara@gmail.com
Cc:  
Subject: Re: i386/133583: fma does not respect rounding mode using extended
 precision
Date: Sat, 11 Apr 2009 09:21:54 +0200

 Unfortunately I've misread the problem report form, and the attached
 file is not the patch, but the C source that shows the bug.
 
Responsible-Changed-From-To: freebsd-i386->freebsd-bugs 
Responsible-Changed-By: linimon 
Responsible-Changed-When: Mon Apr 13 05:28:29 UTC 2009 
Responsible-Changed-Why:  
reclassify. 

http://www.freebsd.org/cgi/query-pr.cgi?pr=133583 

From: Alexander Best <arundel@freebsd.org>
To: bug-followup@freebsd.org
Cc:  
Subject: Re: kern/133583: [libm] fma(3) does not respect rounding mode using extended precision
Date: Tue, 23 Nov 2010 23:40:18 +0000

 This is what Bruce Evans wrote concerning this issue.
 
 cheers.
 alex
 
 ----- Forwarded message from Bruce Evans <brde@optusnet.com.au> -----
 
 Date: Tue, 16 Nov 2010 05:44:03 +1100 (EST)
 From: Bruce Evans <brde@optusnet.com.au>
 To: Alexander Best <arundel@FreeBSD.org>
 cc: Bruce Evans <brde@optusnet.com.au>, Ulrich Spoerlein <uqs@FreeBSD.org>,
         das@freebsfd.org
 Subject: Re: svn commit: r215237 - head/lib/msun/src
 
 [Cc trimmed]
 
 On Mon, 15 Nov 2010, Alexander Best wrote:
 
 >if you are interested in solving two more msun mysteries, you might want to
 >have a look at #PR kern/133583 and standards/143358.
 
 Here is a quick fix for #133583.
 
 % Index: s_fma.c
 % ===================================================================
 % RCS file: /home/ncvs/src/lib/msun/src/s_fma.c,v
 % retrieving revision 1.5
 % diff -u -2 -r1.5 s_fma.c
 % --- s_fma.c	3 Apr 2008 06:14:51 -0000	1.5
 % +++ s_fma.c	15 Nov 2010 17:44:48 -0000
 % @@ -170,5 +170,5 @@
 %  	zs = ldexp(zs, -spread);
 %  	r = c + zs;
 % -	s = r - c;
 % +	*(volatile double *)&s = r - c;
 %  	rr = (c - (r - s)) + (zs - s) + cc;
 %
 
 The basic problem is that s_fma.c uses Dekker's algorithm, which assumes
 working floating point, i.e., not the floating point given by gcc on
 i387.  FreeBSD defaults to rounding to double precision so as to mostly
 avoid this problem for doubles, and msun tries to and mostly succeeds
 in avoiding it using direct methods for floats (using STRICT_ASSIGN()
 and sometimes a more direct volatile cast or variable), but setting
 extended precision as the test program does exposes all double precision
 functions to it.  This is one reason why the regression tests shouldn't
 configure extended precision for double precision tests (unless they
 want to test this, and we're not really ready for this since many other
 double precision functions are known to be broken when extended precision
 is configured).
 
 The problem was obvious since it went away with -O0 and with -ffloat-store.
 The above patch is the result of sprinkling volatiles around until I
 found one that worked and then removing the ones that made no difference.
 Perhaps many more are needed.  In particular, the splitting into high
 and low parts won't work as intended.  The test in the PR somehow
 worked without fixing this, probably just because it uses special args that
 already have enough low bits.
 
 Note that STRICT_ASSIGN() doesn't work for doubles (is just null),
 since it is intentionally optimized for the default FreeBSD rounding
 precision, and using it (when it is not null) is a large pessimization.
 The following functions in msun/src (and few or none elsewhere) use it:
 
 % e_rem_pio2.c:	    STRICT_ASSIGN(double,fn,x*invpio2+0x1.8p52);
 % e_rem_pio2f.c:	    STRICT_ASSIGN(double,fn,x*invpio2+0x1.8p52);
 % k_rem_pio2.c:		STRICT_ASSIGN(double,fw,fw);
 % s_exp2.c:	STRICT_ASSIGN(double, t, x + redux);
 % s_exp2f.c:	STRICT_ASSIGN(float, t, x + redux);
 % s_log1p.c:		STRICT_ASSIGN(double,u,1.0+x);
 % s_log1pf.c:		STRICT_ASSIGN(float,u,(float)1.0+x);
 % s_rint.c:	        STRICT_ASSIGN(double,w,TWO52[sx]+x);
 % s_rint.c:	STRICT_ASSIGN(double,w,TWO52[sx]+x);
 % s_rintf.c:		STRICT_ASSIGN(float,w,TWO23[sx]+x);
 % s_rintf.c:	    STRICT_ASSIGN(float,w,TWO23[sx]+x);
 
 All the double-precision uses here are null and are just clones of the
 float precision code where the use avoids a problem that occurs in the
 usual case.  Since the double precision uses are null, all the functions
 with such uses are broken if someone configures extended precision.
 This includes all trig functions (via *_rem_pio2.c).  I have fixed
 this locally only for the trig functions.  In k_rem_pio2.c, I essentially
 just use a variant of STRICT_ASSIGN(), but a better way is to add and
 subtract a bias (something like 0x1.8p52, with 52 = DBL_MANT_DIG - 1, to
 push the unwanted bits to oblivion).  I use such a bias to avoid the
 STRICT_ASSIGN() in s_exp2f.c:
 
 % --- s_exp2f.c	Fri Feb 22 14:46:43 2008
 % +++ z22/s_exp2f.c	Fri Feb 22 14:48:37 2008
 % @@ -33,14 +33,50 @@
 %  #include "math_private.h"
 % 
 % +/* To be moved nearer to <float.h>. */
 % +#if FLT_EVAL_METHOD == 0
 % +#define	FLT_T_MANT_DIG	FLT_MANT_DIG
 % +#else
 % +/*
 % + * XXX this hack works for all supported arches.  There is some doubt that
 % + * float_t actually is double on arm and powerpc as claimed in <math.h>,
 % + * but it is certainly not long double, and the reduction only needs
 % + * float_t to be at least as large as the evaluation precision.
 % + */
 % +#define	FLT_T_MANT_DIG	DBL_MANT_DIG
 % +#endif
 % +
 %  #define	TBLBITS	4
 %  #define	TBLSIZE	(1 << TBLBITS)
 % 
 % +/*
 % + * Ensure that redux fits in a float.  float_t would always work, but
 % + * float works on all supported arches and is more efficient on some.
 % + */
 % +#if FLT_T_MANT_DIG >= FLT_MAX_EXP + TBLBITS
 % +#error "Unsupported type for float_t"
 % +#endif
 % +
 %  static const float
 %      huge    = 0x1p100f,
 % -    redux   = 0x1.8p23f / TBLSIZE,
 % +    redux   = __CONCAT(0x1.8p, FLT_T_MANT_DIG) / 2 / TBLSIZE,
 % +/*
 % + * Domain [-0.03125, 0.03125], range ~[-6.015e-11, 5.960e-11]
 % + * |exp2(x) - p(x)| < 2**-33.95
 % + */
 %      P1	    = 0x1.62e430p-1f,
 %      P2	    = 0x1.ebfbe0p-3f,
 %      P3	    = 0x1.c6b348p-5f,
 %      P4	    = 0x1.3b2c9cp-7f;
 % +#if 0
 % +/*
 % + * Domain [-0.03125, 0.03125], range ~[-2.4881e-12, 2.4881e-12]:
 % + * |exp2(x) - p(x)| < 2**-38.54
 % + */
 % +static const double
 % +    P1 =  6.9314718016256749e-1,	/*  0x162e42fec39c72.0p-53 */
 % +    P2 =  2.4022650689211292e-1,	/*  0x1ebfbdff5df1dd.0p-55 */
 % +    P3 =  5.5505736316289703e-2,	/*  0x1c6b3f74700eea.0p-57 */
 % +    P4 =  9.6183434022792860e-3;	/*  0x13b2c832d5fd3b.0p-59 */
 % +#endif
 % 
 %  static volatile float twom100 = 0x1p-100f;
 % @@ -69,4 +105,8 @@
 %   *
 %   * Accuracy: Peak error < 0.501 ulp; location of peak: -0.030110927.
 % +#if 0
 % + * Accuracy: Peak error < 0.50004 ulp; location of peak: -0.00918653142
 % + * 5168 cases out of 2**32 are incorrectly rounded in round-to-nearest mode.
 % +#endif
 %   *
 %   * Method: (equally-spaced tables)
 % @@ -95,5 +135,5 @@
 %  {
 %  	double tv, twopk, u, z;
 % -	float t;
 % +	float_t t;
 %  	uint32_t hx, ix, i0;
 %  	int32_t k;
 % @@ -118,12 +158,18 @@
 % 
 %  	/* Reduce x, computing z, i0, and k. */
 % -	STRICT_ASSIGN(float, t, x + redux);
 % +	t = x + redux;
 % +#if FLT_T_MANT_DIG == 24
 %  	GET_FLOAT_WORD(i0, t);
 % +#elif FLT_T_MANT_DIG == 53
 % +	GET_LOW_WORD(i0, t);
 % +#else
 % +#error "Unsupported type for float_t"
 % +#endif
 % ...
 
 This also has complications to use float_t.  float_t will normally be
 double or long double, and the bias (redux here) depends on the number
 of bits in the type used in the redux.  The basic idea is to use the
 natural type for operations (float_t, not float here), so that adding
 and subtracting biases just works and the correct bias can be known
 at compile time.  This is painful to configure (see above), and is broken
 by anyone configuring extra precision (e.g., the 53 in the above
 depends on the precision remaining at double).
 
 My uncommitted logl() uses a modified subset of Dekker's algorithm without
 hitting the problems with extra precision or needing to use STRICT_ASSIGN()
 to avoid them (any extra precision just gives extra precision in the result).
 I think the additive part doesn't really care about extra precision.
 The multiplicative part cares since it needs the hi*hi multiplications to be
 exact, so the hi parts must not have extra bits.
 
 Bruce
 
 ----- End forwarded message -----
 
 -- 
 a13x
State-Changed-From-To: open->closed 
State-Changed-By: das 
State-Changed-When: Fri Dec 3 07:01:28 UTC 2010 
State-Changed-Why:  
Thanks for the report! This limitation is described in the source for 
fma(), and unfortunately, it is unlikely to ever change. There are 
several reasons: 

- We are a long way from having the necessary compiler support to make  
dynamic precision changes work as expected. 
- Dynamic FPU precision changes aren't officially supported, and 
fpsetprec() has been documented as deprecated for many years.  
- The only supported architecture that can have this problem due to 
dynamic precision changes is i386, and even then only for non-SSE2 
builds. 
- The cost and complexity associated with making every function in 
libm detect and adapt to dynamic precision changes is prohibitive. 

I have updated the manpage for fpsetprec() to explain that changing 
the FPU precision isn't supported by the compiler or libraries. 

From: dfilter@FreeBSD.ORG (dfilter service)
To: bug-followup@FreeBSD.org
Cc:  
Subject: Re: kern/133583: commit references a PR
Date: Fri,  3 Dec 2010 07:01:14 +0000 (UTC)

 Author: das
 Date: Fri Dec  3 07:01:07 2010
 New Revision: 216142
 URL: http://svn.freebsd.org/changeset/base/216142
 
 Log:
   Explain some of the reasons that fpsetprec() is unlikely to work as
   one might expect.  (These functions have already been deprecated for
   many years.)
   
   PR:		133583
 
 Modified:
   head/share/man/man3/fpgetround.3
 
 Modified: head/share/man/man3/fpgetround.3
 ==============================================================================
 --- head/share/man/man3/fpgetround.3	Fri Dec  3 04:39:48 2010	(r216141)
 +++ head/share/man/man3/fpgetround.3	Fri Dec  3 07:01:07 2010	(r216142)
 @@ -32,7 +32,7 @@
  .\"     @(#)fpgetround.3	1.0 (Berkeley) 9/23/93
  .\" $FreeBSD$
  .\"
 -.Dd August 23, 1993
 +.Dd December 3, 2010
  .Dt FPGETROUND 3
  .Os
  .Sh NAME
 @@ -164,6 +164,10 @@ and
  .Fn fpsetprec
  functions provide functionality unavailable on many platforms.
  At present, they are implemented only on the i386 and amd64 platforms.
 +Changing precision isn't a supported feature:
 +it may be ineffective when code is compiled to take advantage of SSE,
 +and many library functions and compiler optimizations depend upon the
 +default precision for correct behavior.
  .Sh SEE ALSO
  .Xr fenv 3 ,
  .Xr isnan 3
 
>Unformatted:
