# This is a shell archive. Remove anything before this line, # then unpack it by saving it in a file and typing "sh file". # # Wrapped by dgh!dgh on Mon Nov 16 14:54:01 PST 1987 # Contents: ansi.c echo x - ansi.c sed 's/^@//' > "ansi.c" <<'@//E*O*F ansi.c//' @.\" @(#)ansi.c 1.7 87/11/16 @.\" The following is a draft to be submitted to ANSI after a @.\" short review. After talking about writing up these comments @.\" for a year, I have finally acted and @.\" now of course want to get them out as quickly as possible. @.\" I am interested in your reactions if they're more timely than mine; @.\" please send them to @.\" sun!dhough @.\" or @.\" dhough@sun.com @.\" Let me know if you want to be listed among those "in substantial agreement." @.\" Those whose agreement is partial or none should send their own @.\" commentaries to @.\" ANSI X3J11 Secretariat @.\" CBEMA, Suite 500 @.\" 311 First St NW @.\" Washington, DC 20001 @.\" The ANSI X3J11 draft is said to be available from Global Engineering Documents @.\" at 800-854-7179. @.\" ?roff source is available for Sun engineering machines in @.\" teak:/home/teak/dgh/ansi/ansi.c @.\" and will be transmitted to others in shar format on request. @.\" Format via @.\" tbl ansi.c | ?roff -ms @.de p @.PP @.. @.de ts @.ds CH \\$1 @.ds RH % @.ds LH D. Hough @.ds LF \\$3 @.ds CF Not for Publication @.ds RF \\$2 @.TL \\$1 @.AU David Hough - dhough@sun.com @.. @.ts "Comments on Proposed ANSI C Standard" "87/11/16" "ansi.c 1.7" @.AB The proposed C standard suffers numerical shortcomings - many inherited from its precursors - in the areas of expression evaluation and floating-point exception handling, particularly in the library of elementary transcendental functions. The following comments are included: @.TS @.center ; l l.
Comment #1, Section 3.2.1.4:	round conversions between floating types
Comment #2, Section 4.7:	SIGFPE means floating point
Comment #3, Section 4.9.6.2:	scanf requires pushback > 1
Comment #4, Section 4.10.2:	require two random number generators
Comment #5, Section 3.5.2.4:	facilitate forcing rounding
Comment #6, Section 3.3:	opaque parentheses by default
Comment #7, Section 3.4:	opaque parentheses defer constant expression evaluation
Comment #8, Section 3.3.3.3:	respect opaque parentheses in expression evaluation
Comment #9, Section 2.2.4.2:	<float.h> has too many names, not enough information
Comment #10, Section 4.9.6:	printf/scanf duality for non-model numbers
Comment #11, Section 4.10.1.4:	strtod/atof are mathematical functions
Comment #12, Section 4.13.2:	standard functions are predefined generic operators
Comment #13, Section 4.5:	make numerical exception handling uniform
Comment #14, Section 4.5:	specific mathematical library functions
@.TE @.AE @.PP The following comments are based upon @.ul Draft Proposed American National Standard for Information Systems - Programming Language C, document number X3J11/86-151, dated 1 October 1986. Differences between that draft and later ones are presumably insubstantial in the areas of interest. [Let me know of exceptions.] @.p The following comments are personal opinions of the author, and should neither be construed as wholly original nor as representing the position of any organization or other person. A number of colleagues helped formulate and clarify them. Those listed below, while not necessarily agreeing in every detail, have expressed agreement with the main points: @.TS @.center ; l l .
Zhishun Alex Liu	zliu%hobbes@berkeley.edu
Jim Valerio	ogcvax!omepd!radix!jimv
[ Your name here?
] @.TE @.SH Introduction @.PP C was not particularly designed to facilitate numerical computation. In recent years it has come to be increasingly used for implementing portable systems and applications, including numerical ones. This is more a tribute to the good judgment embodied in some other aspects of C's design than to its numerical facilities. It is easier to get a usable C compiler and library working than a comparable Fortran compiler and library. @.p In its drafts the ANSI committee has removed some of traditional C's numerical weaknesses, such as requiring double precision expression evaluation and parameter passing, and overspecifying error response in the elementary transcendental function library. With a little more effort the remaining numerical stumbling blocks could be removed and C would be as convenient for the numerical parts of applications as for the other parts. @.p Comments follow, approximately in order of increasing significance, complexity, and priority. Most of what follows is more a critique of existing C implementations than of the X3J11 committee's work. For brevity the principles are exposed rather than the details; details can be worked out for proposals accepted in principle. @.SH Comment #1, Section 3.2.1.4: round conversions between floating types @.PP Conversion of a floating-point value to fit in a floating-point format of less precision or exponent range should be accomplished by a rounding operation of the same sort as is used for floating-point addition or multiplication. To require otherwise is quite inconsistent with the IEEE standard. The Rationale contains an incorrect assertion that so doing would be inefficient, because rounding modes would have to be frequently changed, since rounding toward zero is required for the conversion of floating-point values to integer formats. @.p Actually the efficiency of conversion from floating-point formats to integer formats is not a major factor in overall performance of realistic applications. Even so, better IEEE implementations recognize the prevalence of language standards that specify conversion to integer by rounding toward zero, and provide an additional operation beyond those mandated by the IEEE standard. That additional operation (fintrz on Motorola MC68881) converts floating-point values to integers, rounding toward zero regardless of the current IEEE rounding mode that governs other floating-point operations. So there need be no efficiency lost by requiring conversion to integer to round toward zero, even though that differs from the rounding mode for operations with floating-point results. @.p @.ul Existing practice: I believe that all existing C implementations for workstations that are based upon IEEE arithmetic round floating-point results to nearest by default, but always round integer results toward zero. No efficiency is lost thereby. @.p @.ul Implicit conversions: Appendix A.5 mentions a warning for implicit conversions to narrower formats. Better to forbid such implicit conversions outright since they so often lead to elusive bugs. Careless but correct existing code is easily remedied by explicit casts. @.SH Comment #2, Section 4.7: SIGFPE means floating point @.PP SIGFPE typically encompasses many signals besides its namesake floating-point exceptions.
Because the handling possibilities for these other types of signals are often rather different than for floating point, it would be better to reserve another SIG name for integer arithmetic exceptions and specify that only exceptions arising from floating-point arithmetic are to generate SIGFPE signals. @.SH Comment #3, Section 4.9.6.2: scanf requires pushback > 1 @.PP The Rationale states that pushback of one suffices to back up scans that turn out to be improperly formatted for the conversion specification. I can't see how scanf can distinguish between a valid numeric field "\-.1" and an invalid field "\-.x" without scanning three characters - necessitating a pushback of several characters in the latter case. "1e\-x" is another example. @.p Recognizing this, John Gilmore suggested that "the ungetc description should be changed to specify that one character of pushback is only guaranteed if it hasn't already been used, e.g. by scanf." @.SH Comment #4, Section 4.10.2: require two random number generators @.PP The Rationale makes two valid points about random number generators, that repeatability is highly desirable on different implementations, and that different algorithms may be efficient on different machines, but the proposed standard fails to draw the obvious inference: two random number generators should be mandated, one which is a specified good algorithm that can be implemented to produce identical results with tolerable efficiency on all machines, and another of similar calling sequence that may implement any other algorithm of maximal efficiency on a particular machine. @.SH Comment #5, Section 3.5.2.4: facilitate forcing rounding @.PP It is not easy to force a rounding from double to float to occur; the function @.nf float round_to_float(x) double x; { return x; } @.fi won't cause any rounding to occur on many existing C implementations, and need not cause any rounding to occur under the ANSI C proposal. However it appears to me that the functions @.nf float round_to_float(x) double x; { float f = x; return *(&f); } float round_to_float(x) double x; { volatile float f = x; return f; } @.fi will always cause the desired rounding to occur under the ANSI C proposal, at the cost of gratuitous memory operations. @.p Far better, to avoid unnecessary memory traffic and to deal with traditional implementations that conform to the ANSI standard, would be to extend cast syntax to include an optional "force" adjective: thus (force float)(x) guarantees that x will always be rounded to float format regardless of the context or the style of arithmetic expression evaluation - but without requiring any actual storage to occur. Reserved words could be conserved, at the cost of clarity, by exploiting the existing keyword "volatile" in this context; thus (volatile float)(x) would have the intended effect. @.SH Comment #6, Section 3.3: opaque parentheses by default @.PP The draft ANSI C standard currently incorporates notions that could be called @.ul transparent parentheses, denoted by the traditional ( ) , and @.ul opaque parentheses, denoted by the unusual notation +( ) . These parentheses allow different optimizations to be performed on the enclosed expressions, and thus affect numerical results. @.p I support the principle of two types of parentheses, "predictable" and "unrestricted", but I think the syntax choices were unfortunate. The standard should mandate the traditional ( ) notation for predictable or opaque parentheses, and a new syntax for unrestricted or transparent ones. 
The exact syntax is not important as long as it is not burdensome or misleading. Here are some possibilities; there may be better ones: @.nf \`( ) \`( )\` \~\~( ) (< >) @.fi @.SH Comment #7, Section 3.4: opaque parentheses defer constant expression evaluation @.PP The C standard committee recognized some of the problems associated with constant expression evaluations involving floating point, when it mandated that compile-time constant expression evaluation must be at least as accurate as the run-time environment. Even so there are occasions when, for clarity, one would like to write a constant expression while deferring its evaluation to run time. Thus @.nf z = \-1.0/0.0 ; @.fi may be intended to generate an infinity and division-by-zero exception at run time (perhaps within an implementation of log). The exception might be lost if the expression were evaluated at compile time; some compilers might refuse to compile that expression until it was suitably disguised. Such disguises do nothing to improve code readability; it would be helpful if the language included a mechanism that precluded compile-time expression evaluation unless it were guaranteed free of side effects. In this context compile-time expression evaluation includes statements like @.nf double z = 1.0e999 ; and a += 1.0/3.0; @.fi which generate exceptions on some run-time systems. @.p Transparent parentheses should indicate that the enclosed expression should be evaluated to the maximal extent at compile time: @.nf a += \`( 1.0/3.0 ); @.fi while opaque parentheses or none should indicate that constants should only be combined at compile time if the combination could be done exactly and without side effects. Thus most expressions involving only small integer constants could be evaluated at compile time under any circumstances, as at present. @.SH Comment #8, Section 3.3.3.3: respect opaque parentheses in expression evaluation @.PP C's greatest numerical astonishment lurks in its disregard of conventional parenthesized expressions. Centuries of mathematical convention and almost every other widely-used language for scientific computation permit ((a+b)+c) to be evaluated only in the order that the parentheses indicate. If the compiler is free to disregard that order then explicit variables must be used in intermediate expressions to force the order of evaluation, at the cost of obscuring the source code and inhibiting optimization. (Forcing a particular order of evaluation among associative floating-point operands is necessary at times to ensure that roundoff occurs where it is harmless). @.p C might have been fixed long ago in this respect had not parentheses also been used in macro definitions to direct their expansion. That syntax led to a common expectation that parentheses would be disregarded during compile-time constant evaluation (mostly of integer expressions), so now many people believe that their code would suffer performance losses if the parentheses were respected, causing some expressions like (x + 3) + 1 to be evaluated at run time with two additions rather than one. Even if parentheses were generally respected, however, implementations should be at liberty to re-associate freely when there are no possible side effects, which in the case of (x+3)+1 is true if the implementation of integer overflow is such that (x+3)+1 and (x+4) and (x+1)+3 always have exactly the same net effect.
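@.p
To see concretely what is at stake (the example is mine, not the draft's; the values assume IEEE 754 double precision), the two associations of a sum can today be forced only through explicit temporaries:
@.ne 8
@.nf
double a = 1.0e16, b = -1.0e16, c = 1.0;
double t, left, right;

t = a + b;        /* exactly 0.0 */
left = t + c;     /* (a+b)+c yields 1.0 */

t = b + c;        /* rounds to -1.0e16 under round-to-nearest */
right = a + t;    /* a+(b+c) yields 0.0 */
@.fi
A compiler free to re-associate may silently turn either form into the other.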
@.p I doubt that detectable performance loss due to respecting parentheses would be widespread, but the ANSI committee decided to avoid it by adopting a kluge: using the unary + operator to indicate opaque parentheses. While this is preferable to the chaotic C tradition, it is hardly aesthetically pleasing. @.p A solution that better respects expectations from the non-C world would be to mandate a non-conventional syntax for @.ul transparent parentheses that imply nothing about execution order. Then ordinary @.ul opaque parentheses and unary + would have only their conventional significance. @.p @.ul 0+E and 0\-E: These expressions are used to define the meaning of unary + and \-. In IEEE arithmetic, in the presence of signed zeros and signaling NaNs, 0+E and 0\-E have semantic meaning other than that which the ANSI committee had in mind. Better to simply state the meaning of unary + and \- in English without using those formulas. @.SH Comment #9, Section 2.2.4.2: <float.h> has too many names, not enough information @.PP The Rationale states that the constants in <float.h> were derived from a similar section of the Fortran 8X proposal. The C committee may not have been aware that the corresponding Fortran proposals are not universally accepted. @.p I believe that the following would prove more widely useful than the proposed <float.h>: @.br @.ne 17 @.TS @.center box ; cb s l l .
macro definitions: {FLT,DBL,LDBL}_...
RADIX	base of model numbers
SIGNIFICANCE	number of base-RADIX "digits" in model number significands
MIN_EXP	minimum exponent of model numbers
MAX_EXP	maximum exponent of model numbers
MAX	the largest positive representable number
MIN_DIGITS	the greatest decimal input precision no more precise than the internal
MAX_DIGITS	the least decimal output precision no less precise than the internal
MIN_POWER	the smallest power of ten within the range of positive model numbers
MAX_POWER	the largest power of ten within the range of positive model numbers
@.TE @.TS @.center box ; cb s l .
function definitions:
the next machine representable number from x in the direction y
double nextafter( double x, y);
float nextafterf( float x, y);
long double nextafterl( long double x, y );
the next model number from x in the direction y
double nextmodel(double x,y);
float nextmodelf( float x, y);
long double nextmodell( long double x, y );
@.TE @.p "Model numbers" represent the least common denominator subset of all popular floating-point implementations. They include zero and numbers fitting a normalized floating-point model like that in the ANSI C draft, which is equivalent to @.nf
J * (b**e)
where J is an integer satisfying b**(p-1) <= abs(J) <= (b**p)-1
and e is an integer satisfying emin <= (e+p-1) <= emax
@.fi
For double precision, b, p, emin, and emax are called DBL_RADIX, DBL_SIGNIFICANCE, DBL_MIN_EXP, and DBL_MAX_EXP in the table above. Note that "model numbers" exclude IEEE NaNs, infinities, -0, and subnormal numbers and may at the non-IEEE implementer's discretion also exclude other representable numbers whose arithmetic participation is unsatisfactory, as described by W. S. Brown (ACM TOMS 12/81). @.p MIN_POWER and MAX_POWER provide convenient estimates of the range of model numbers for such purposes as allocating widths for printed exponent fields. @.p MAX might represent infinity on IEEE and other systems, and thereby differ from the largest model number.
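@.p
To make the proposal concrete, the following sketch (my reading, not part of the proposal; the names are those proposed above, the values assume IEEE 754 double precision) shows what some of the DBL_... macros would be:
@.ne 9
@.nf
#define DBL_RADIX         2        /* binary */
#define DBL_SIGNIFICANCE  53       /* 53-bit significands */
#define DBL_MIN_DIGITS    15       /* decimal (scanf|printf) round trip */
#define DBL_MAX_DIGITS    17       /* decimal (printf|scanf) round trip */
#define DBL_MIN_POWER     (-307)   /* 1e-307 is the smallest power of ten
                                      that is a positive model number */
#define DBL_MAX_POWER     308      /* 1e308 is the largest */
@.fi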
@.p Nextafter(0.0,1.0) yields the smallest positive @.ul subnormal number in IEEE arithmetic, while nextmodel(0.0,1.0) yields the smallest positive @.ul normalized number. Nextafter() and nextmodel() are more generally applicable than constants such as EPSILON. Such constants need be determined only once at the start of a computation, so any performance difference between a constant and a function evaluation is not significant. Since nextafter() and nextmodel() are often called with constant arguments anyway, they could be evaluated at compile time if standard numerical functions were considered to be operators. Operator treatment would also eliminate the need for distinct reserved names for each floating-point type such as nextafterf, nextafterl, etc. Nextmodel(x,x) is defined as a nearest model number to x, namely x itself if x is a model number. @.p @.ul MIN_DIGITS and MAX_DIGITS are claims about an implementation's internal floating-point format and about the conversion sequences (scanf|printf) and (printf|scanf) for model numbers. A decimal string of no more than MIN_DIGITS significant figures, that may be read by scanf to produce a model number without overflow or underflow, may then be printf'd with MIN_DIGITS significant figures to produce a string of unchanged numerical value, although the format may differ. A string of no more than MIN_DIGITS significant figures will be reproduced if printed in the same format, unaltered by the limited internal precision of the variable in which the converted value was stored. @.p A model number printed with at least MAX_DIGITS significant figures will be reproduced unchanged by scanf. Thus printing a numerical value with MAX_DIGITS significant figures specifies the internal value unambiguously. @.p For IEEE 754 arithmetic, FLT_MIN_DIGITS is 6, FLT_MAX_DIGITS is 9, DBL_MIN_DIGITS is 15, and DBL_MAX_DIGITS is 17. @.p Functions like the following might be used to test an implementation's DBL_MIN_DIGITS and DBL_MAX_DIGITS: @.ne 10 @.nf
#include <float.h>

double randommodel()
{
	/*
	 * computes a "random" model number chosen from the whole space of
	 * model numbers
	 */
}
@.fi
@.ne 13 @.nf
testmax()
{
	double d1, d2;
	char s1[40], s2[40];

	/* loop until some model number fails to survive a printf|scanf
	   round trip at MAX_DIGITS significant figures */
	do {
		d1 = randommodel();
		sprintf(s1, "%*.*e", DBL_MAX_DIGITS + 10, DBL_MAX_DIGITS - 1, d1);
		sscanf(s1, "%lf", &d2);
	} while (d1 == d2);
	printf(" DBL_MAX_DIGITS too low! ");
}
@.fi
@.ne 13 @.nf
testmin()
{
	double d1, d2;
	char s1[40], s2[40];

	/* loop until a MIN_DIGITS-figure string is not reproduced by a
	   scanf|printf sequence */
	do {
		d1 = randommodel();
		sprintf(s1, "%*.*e", DBL_MIN_DIGITS + 10, DBL_MIN_DIGITS - 1, d1);
		sscanf(s1, "%lf", &d2);
		sprintf(s2, "%*.*e", DBL_MIN_DIGITS + 10, DBL_MIN_DIGITS - 1, d2);
	} while (strcmp(s1, s2) == 0);
	printf(" DBL_MIN_DIGITS too high! ");
}
@.fi
@.p @.ul FLT_ROUNDS: The proposed definition is too vague to be useful. The variation of floating-point rounding algorithms on non-IEEE machines is too great to encapsulate in one definition. Better to require applications to fabricate the specific test that matters to them, such as @.ne 9 @.nf
double eps, oneminus, nextafter();
eps = 1.0 - nextafter( 1.0, 0.0) ;
oneminus = (force double)(1.0 - 0.5 * eps) ;
if (oneminus == 1.0)
	{ looks like some kind of rounding }
else
	{ looks like some kind of chopping }
@.fi
@.ul (force double) was described in a previous comment; its use here anticipates that oneminus might be allocated to an extended-precision floating-point register. @.p @.ul "significand" is a more apt term than "mantissa" for the part of a floating-point number that holds significant digits (page 14 line 10).
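@.p
For concreteness, here is one possible sketch of the randommodel() helper left unspecified in the tests above; it is my illustration, not part of the proposal, it assumes IEEE 754 double precision, and for brevity it produces positive values only:
@.ne 20
@.nf
#include <stdlib.h>             /* rand() */
#include <math.h>               /* ldexp() */

double randommodel()
{
	double significand;
	int i, e;

	/* 52 random low-order bits behind an implicit leading bit,
	   so that 2**52 <= significand <= 2**53 - 1, held exactly */
	significand = 1.0;
	for (i = 0; i < 52; i++)
		significand = 2.0 * significand + (rand() & 1);

	/* scale by a random power of two that keeps the result a
	   normalized, and hence model, number */
	e = -1074 + rand() % 2046;      /* -1074 <= e <= 971 */
	return ldexp(significand, e);
}
@.fi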
@.SH Comment #10, Section 4.9.6: printf/scanf duality for non-model numbers @.PP The values of ...MIN_DIGITS and ...MAX_DIGITS specified in <float.h> constitute a specification on printf and scanf; under appropriate conditions, model numbers are guaranteed to reproduce their values under input|output and output|input sequences. @.p Non-numbers, such as IEEE infinities and NaNs, also should be printable with printf and scannable with scanf in the same way as numeric tokens. Therefore the C standard, although not specifying the printable format of non-numbers, should require that non-model numbers, when printf'ed in an output format that accommodates ...MAX_DIGITS significant digits for model numbers, be readable with scanf when read with an input format that reproduces model numbers. Such an output|input sequence should map model numbers to themselves, non-model numbers to themselves, and non-numbers to non-numbers. That guarantees that output formats sufficient to preserve numeric information also preserve non-numericness. @.SH Comment #11, Section 4.10.1.4: strtod/atof are mathematical functions @.PP Error handling for strtod() and atof() should be like that for numerical functions specified below. They might as well be generic functions to avoid having more identifiers for float and long double versions, and to allow reliable error handling indications to be specified with *perr as described in a later section. @.p When no error handling is specified, the numerical results of underflow or overflow or invalidly-formed input strings should be undefined. When error handling is specified, the treatment of underflow and overflow should be like that of multiplication, and the treatment of an invalid input string should be like that of 0.0/0.0. @.SH Comment #12, Section 4.13.2: standard functions are predefined generic operators @.PP When Fortran was being invented, both division and square root were commonly implemented as subroutines - sometimes rather similar ones. However the first Fortran character set, apparently a subset of a commercial typewriter's, had a / character available but not a mathematical square root symbol. So Fortran evolved using / for division and sqrt() for square root, and in this respect has been followed by other algebraic languages. But division and square root are both operators, and rather similar in some respects - the use of operator or functional notation being a historical accident. Other cases are similar, such as unary \- and fabs(). In many modern hardware implementations sqrt is just an atomic operation like division; likewise IEEE arithmetic was designed so that both unary \- and fabs() may be accomplished by one-bit operations. On an MC68881, all the following C "functions" can be implemented in one or two instructions, disregarding errno exception reporting requirements: fabs, fmod, ldexp, exp, log, log10, sin, cos, tan, asin, acos, atan, sinh, cosh, tanh. @.p The point relevant to C is that many of its standard numerical functions should really be thought of as operators (designated, like @.ul sizeof, with words rather than special typographical symbols) rather than external functions (although some might be implemented with invisible external functions). Of course, that means that a program can't pass a pointer directly to fabs or sqrt, but nobody laments that it is tedious to create a function that performs float division in order to create and pass a pointer to it.
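For instance (an illustration of mine, not from the draft), a program that wants to pass "float division" around already has to supply its own wrapper:
@.ne 6
@.nf
/* hand-written wrapper for the / operator, existing only so that
   a pointer to float division can be created and passed */
float fdivide(float x, float y)
{
	return x / y;
}
@.fi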
Similarly there is no great loss in requiring anyone wanting to create a pointer to a function that computes sqrt to go to the trouble of creating his own function that calls the predefined sqrt, then passing a pointer to that function. @.p Unlike fully reserved identifiers, operators designated by predefined names could be undefined by the compiler upon encountering a declaration for that name. The compiler should issue a warning however at this point since such redeclarations are not generally commendable programming practice. @.p A consequence of generic treatment is that there is no need to reserve additional function names fabsf, fabsl, sqrtf, sqrtl, etc., as contemplated in section 4.13. It also means that, being generic functions known to the compiler, constant expressions involving them could be evaluated at compile time. An additional consequence of generic treatment is that these functions could take an optional extra argument which is useful in providing better exception handling, as described next. @.SH Comment #13, Section 4.5: make numerical exception handling uniform @.PP The traditional C scheme for exception handling has little to recommend it: there is no guarantee that numerical exceptions are even detectable for the most common floating-point operators, yet the standard would enforce the unreliable global errno mechanism for the comparatively less frequently used elementary transcendental functions. Such an approach promotes tradition rather than safety or efficiency. @.p A language standard contemplating implementation only on modern machines equipped with IEEE arithmetic might require that all floating-point exceptions relating to finite representation (inexact, underflow, overflow), singularity (division by zero), and domain violations would be detected and treated uniformly whether they arose in atomic hardware operations or software subroutines. Such architectures permit efficient exception detection and treatment. @.p Old architectures can't simultaneously accommodate both efficiency and safety, however, and a language standard that attempts to accommodate those architectures must pay the price in duplication of facilities. @.p @.ul Efficiency: The standard should not specify behavior in the face of exceptional conditions, neither for the conventional operators +\-*/ nor for operators like fabs, sqrt, and the elementary transcendental functions. The standard would not proscribe traditional errno-based implementations, but would point out their limitations by prescribing alternative operators that are reliable and machine-independent - at a cost of efficiency. My guess is that such alternative operators would be used seldom and in limited circumstances, but would be much appreciated by programmers who choose or are constrained to produce reliable code to run on older architectures. @.p @.ul Safety: For each potentially exceptional floating-point operator +\-*/, fabs, sqrt, etc., alternative operators are required to be implemented that return an exception status as well as a numerical value. The notation for such operators would by its necessary awkwardness guarantee that those operators would only be used when essential. Thus @.nf double e_add(x,y,perr) double x, y; int *perr; @.fi suggests an operator with a functional notation that returns the double sum x+y with an error indication in *perr, which could be thought of as a local errno. 
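By way of illustration (my sketch, not part of the proposal), a caller might write:
@.ne 8
@.nf
double x = 1.0e308, y = 1.0e308, sum;
int err = 0;                    /* a local errno for this one operation */

sum = e_add(x, y, &err);        /* overflows in IEEE double precision */
if (err != 0) {
	/* the overflow is reported here, not through a global errno */
}
@.fi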
The error indication could be ERANGE or EDOM as before, but might better be more detailed: SVID prescribes DOMAIN, SING, OVERFLOW, and UNDERFLOW indications, and that list might well be extended to include an indication that an implementation limitation was exceeded as discussed later. @.p For operators that are denoted functionally the difference would be less striking: @.nf double e_sqrt(x,perr) double x; int *perr; @.fi although the meaning would be similar. In this respect it would again be desirable that "sqrt" had a generic operator meaning; then, instead of reserving an additional identifier e_sqrt, the compiler could recognize that sqrt invoked with two arguments referred to the version that returned a meaningful error indication in *perr. @.p @.ul e_sqrt(x) calls matherr() instead? A less flexible alternative to an explicit error indication on each operation would be to follow SVID's example and require each of the e_... operators to call matherr() on encountering an exceptional situation. Then invocations of e_... operators need not refer to a *perr. Unfortunately there must be a unique global matherr() within a program, which is not so different from the traditional situation of a unique global errno. @.p @.ul Wait for C++ instead? C++ allows overloading operators and might therefore be a more appropriate vehicle than C for the safety-oriented extensions described above. Then the C standard should anticipate such extensions by not defining exception handling at all for the mathematical functions, consistent with its treatment of the common operators. @.p @.ul The conclusion: don't overspecify! Some of the mathematical functions traditionally included with C are overspecified with respect to exceptional conditions, in the sense that results are specified for conditions (such as fmod(x,0.0)) that are unlikely to arise in practice except in programs that are primarily intended to test compliance with the specification. A consequence of such an approach is that efficiency in the normal case is lost by checking for unlikely exceptions for the purpose of providing unreliable exception handling, such as a global errno. This slight payoff has a disproportionate cost; a modern high-performance hardware fmod implementation may lose half its performance in checking for exceptional arguments and setting errno. That modern implementation will be safer, too, by returning a NaN or reserved value instead of a numerical result. @.p Most designers of portable software desire that systems return reasonable results and continue when exceptional conditions occur that the designers anticipated; the appropriate response for unanticipated exceptions might be termination with debugging output. Thus my recommendation that exception responses be unspecified is painful but appropriate given the improbability that a C standard would specify default exception handling for ordinary operators like * or /. @.SH Comment #14, Section 4.5: specific mathematical library functions @.PP Certain functions in the mathematical library have misguided specifications. The general issue of how to deal with exceptional conditions is treated above; two classes of operators are described, one whose behavior is undefined in exceptional circumstances, and another whose numerical results and error indications are completely defined. The following comments apply to specific cases in which numerical results and error indications are to be returned.
@.p @.ul Non-model arguments: The introduction should explain that the specifications for these functions only apply to model number arguments. There are good examples like MC68881 or 4.3 BSD to follow for extending to IEEE non-model numbers, but other architectures are far too various to specify. @.p @.ul ERANGE: Overflow and underflow are exceptional conditions in the sense that they signal that rounding errors have occurred that are "larger than normal" and may not have been properly accounted for in the design of the program. The error indication should distinguish between overflow and underflow. If a numerical result is to be specified at all, it should be the same as that for a multiplication or division producing a similar overflowed or underflowed value, thereby allowing some uniformity in treatment. @.p @.ul EDOM: "Domain errors" include those for which a function of a real variable has no defined value or has a singularity. The error indication should distinguish between these cases. Signed infinity is an appropriate numerical result in the latter case, NaN in the former; on pre-IEEE architectures, the numerical results should correspond to 1.0/0.0 and 0.0/0.0. @.p @.ul EIMPL: "Implementation errors" are a way of indicating shortcomings in a particular implementation without confusing them with limitations inherent in a particular floating-point format or intrinsic to a particular mathematical function. They seem to be common in implementations of fmod and trigonometric functions. If the C standard is to countenance such shortcomings at all, they should be distinctively labeled. @.PP @.ul fmod arguments: z = fmod(x,y) for positive finite x and y is appropriately, uniquely, and exactly defined as that z of minimum magnitude which has the same sign as x and differs from x by an integral multiple of y. The phrase "integral multiple" is chosen to emphasize that that multiple need not fit in an int, a long, or any other storage format, for it need not be explicitly computed. Implementations that choose to compute fmod(x,y) by a @.ul formula like @.nf x - y * (int)(x/y) @.fi should return the EIMPL error indication in cases when the formula does not implement the definition, such as when x/y overflows int format. @.p Since fmod(x,y), like (x % y), is almost always invoked for a specific constant y like 10, 24, or \(*p/2, and never for y==0, the definition of fmod(x,0) is of no great consequence and is best left undefined. @.PP @.ul Trigonometric arguments: The proper treatment of large trigonometric arguments has long been a source of confusion. One satisfactory approach is to always perform a "correctly rounded" argument reduction by generating a variable-precision approximation to \(*p that is large enough to reduce a particular argument correctly (4.3 BSD VAX version); another is to reduce all arguments with an approximation to \(*p that is fixed to a precision greater than or equal to the precision of the largest supported floating-point format (4.3 BSD IEEE version; MC68881). @.p An unsatisfactory approach is to compute a reduced argument as fmod(x,\(*p/2) using a defective fmod of the type mentioned above. Such algorithms have a catastrophic internal boundary at which x/(\(*p/2) overflows when converted to integer format; SVID signals such results as "total" or "partial" loss of significance but since the real issue is a particular implementation, EIMPL as described above is a more appropriate response. 
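In concrete terms (an illustration of mine, assuming 32-bit ints and IEEE double precision), such a reduction amounts to:
@.ne 7
@.nf
/* the unsatisfactory reduction: the cast overflows, and the result
   is meaningless, once x/y reaches 2**31 */
double bad_reduce(x, y)
double x, y;                    /* y is a constant such as pi/2 */
{
	return x - y * (int)(x / y);
}
@.fi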
The satisfactory algorithms mentioned in the previous paragraph don't contain any such internal boundary; the fixed-precision \(*p algorithm gradually loses accuracy for increasing arguments, but that loss is practically undetectable, except by comparison with a variable-precision \(*p algorithm, because all essential identities and relationships continue to hold for large arguments as well as small. @.p @.ul Inverse trigonometric argument ranges: The functions acos, asin, and atan are multi-valued and therefore a particular branch needs to be specified for computation. However the proposed standard inadvertently goes too far and in effect specifies tight bounds on rounding errors at the endpoints of the ranges of those functions. In contrast, nothing in the standard requires that 1+1 be close to 2! @.p Appropriate wording for the standard is to refer to the @.ul approximate ranges [0,\(*p] or [\-\(*p/2,+\(*p/2] in order to indicate which branch is intended without setting any inappropriate expectation that the end points will be strictly less in magnitude than \(*p or \(*p/2. The approximate range for atan2 is [\-\(*p,+\(*p]. @.p @.ul atan2(0.0,0.0): atan2 is generally used in conversion from rectangular coordinates to the polar coordinate argument \(*h. The \(*h associated with the origin is of little consequence. Conformal mapping applications can treat edges symmetrically if atan2(\(+-0.0,\(+-0.0) is allowed to take any of the values \(+-0 or \(+-\(*p. Such methods aren't available on non-IEEE machines, so the best thing for the C standard is to restrict atan2(0.0,0.0) to [\-\(*p,+\(*p] but otherwise undefined. @.p @.ul What about hypot? Atan2(y,x) and hypot(x,y) are closely related and should either both be standardized or neither. @.p @.ul Log(0.0) and log10(0.0): The proper interpretation of these values is that of a singularity. No overflow or underflow is involved. For a logarithm of a real variable 0.0, \-\(if is the suitable result, which implies that the standard can't specify a universal result. @.p @.ul pow(0.0,0.0): There are good arguments that pow(x,0.0) should be 1.0 for all x, without any exceptional indication; 1.0 is the value most useful, most often. These arguments are not universally accepted, so if the ANSI C committee is unprepared to accept them, it would be appropriate for the standard not to specify pow(0.0,0.0). @.p @.ul pow(0.0,negative integral value): 0.0**-1.0 is just 1.0/0.0, and if an exception is to be specified it should be like that of 1.0/0.0. @.p @.ul "integral values" in floor() and ceil(): Ambiguous language in Fortran-77's definition of aint() and anint(), much like the draft C standard's definition of ceil() and floor(), has misled some implementers into thinking that such functions could be adequately implemented by converting the argument to a long or int and then converting back to double. A more explicit specification might read thus: @.in +5 @.p The ceil function returns the smallest integral value in double format greater than or equal to x, even though that integral value might not fit in a variable of type int or long. Thus ceil(x) == x for all x sufficiently large in magnitude; for IEEE 754 double precision, that includes all x such that fabs(x) >= 2**52. @.in -5 @.p @.ul frexp and modf: The Rationale mentions that these functions almost went the way of ecvt et al. 
Too bad they didn't, since frexp and modf are not essential for a mathematical library implementation and modf in particular is a source of confusion because it has been implemented in a variety of ways in the past. @.p @.ul ldexp might be more useful if its name were more clearly related to its function; @.ul scalen() indicates its common use and its relation to the @.ul scalb() function defined in the IEEE 754 appendix. @//E*O*F ansi.c// chmod u=r,g=r,o=r ansi.c exit 0