-
-
-
- MATH(3M) MATH(3M)
-
-
-
- NAME
- math - introduction to mathematical library functions
-
- DESCRIPTION
- These functions constitute the C math library libm. There are four
- versions of the math library: libm.a, libmx.a, libm43.a and libfastm.a.
-
- The first, libm.a, contains routines newly implemented (1994) using
- algorithms which take advantage of the MIPS architecture and includes
- many routines for the float data type.
-
- For the -64 and -n32 versions of libm.a, a second version of the math
- library, libmx.a, contains functions which give identical results to
- those in libm.a, but which use System V error handling.
- See matherr(3M) for a description of error handling for libmx.a
- functions.
-
- The third version of the math library, libm43.a, contains routines all
- based on the original codes in the 4.3BSD release. The difference
- between the error bounds for libm.a and libm43.a is typically around 1
- unit in the last place, whereas the performance difference may be a
- factor of two or more.
-
- The link editor searches these libraries under the "-lm", "-lmx", or
- "-lm43" option, respectively. Declarations for these functions may be
- obtained from the include file <math.h>.
-
- The fourth library, libfastm.a, contains faster, lower-precision versions
- of various routines from libm.a.
-
- LIST OF FUNCTIONS
- Error bounds listed below apply only to the -64 and -n32 versions of
- libm.a and libmx.a. The error bound sometimes applies only to the primary
- range.
-
-
-                                                            Error Bound (ULPs)
- Name      Appears on Page  Description                     libm.a   libm43.a
- acos      sin(3M)          inverse trigonometric function  2        3
- acosf     sin(3M)          inverse trigonometric function  1
- acosh     asinh(3M)        inverse hyperbolic function     3        3
- asin      sin(3M)          inverse trigonometric function  2        3
- asinf     sin(3M)          inverse trigonometric function  1
- asinh     asinh(3M)        inverse hyperbolic function     3        3
- atan      sin(3M)          inverse trigonometric function  1.5      1
- atanf     sin(3M)          inverse trigonometric function  1
- atanh     asinh(3M)        inverse hyperbolic function     3        3
- atan2     sin(3M)          inverse trigonometric function  2        2
- atan2f    sin(3M)          inverse trigonometric function  1
- cabs      hypot(3M)        complex absolute value          1        1
- cabsf     hypot(3M)        complex absolute value          1
- cbrt      sqrt(3M)         cube root                       1        1
- ceil      floor(3M)        integer no less than            0        0
- ceilf     floor(3M)        integer no less than            0        0
- copysign  ieee(3M)         copy sign bit                   0        0
- cos       sin(3M)          trigonometric function          2        1
- cosf      sin(3M)          trigonometric function          1
- cosh      sinh(3M)         hyperbolic function             2        3
- coshf     sinh(3M)         hyperbolic function             1
- drem      ieee(3M)         remainder                       0        0
- erf       erf(3M)          error function                  ?        ?
- erfc      erf(3M)          complementary error function    ?        ?
- exp       exp(3M)          exponential                     1        1
- expf      exp(3M)          exponential                     1
- expm1     exp(3M)          exp(x)-1                        1        1
- expm1f    exp(3M)          exp(x)-1                        1
- fabs      floor(3M)        absolute value                  0        0
- fabsf     floor(3M)        absolute value                  0        0
- finite    ieee(3M)         floating point arithmetic       (N/A)
- floor     floor(3M)        integer no greater than         0        0
- floorf    floor(3M)        integer no greater than         0        0
- fmod      floor(3M)        remainder function              0
- fmodf     floor(3M)        remainder function              0
- hypot     hypot(3M)        Euclidean distance              1        1
- hypotf    hypot(3M)        Euclidean distance              1        1
- j0        j0(3M)           Bessel function                 ?        ?
- j1        j0(3M)           Bessel function                 ?        ?
- jn        j0(3M)           Bessel function                 ?        ?
- lgamma    lgamma(3M)       log gamma function              ?        ?
- log       exp(3M)          natural logarithm               1        1
- logf      exp(3M)          natural logarithm               1
- logb      ieee(3M)         exponent extraction             0        0
- log10     exp(3M)          logarithm to base 10            2        3
- log10f    exp(3M)          logarithm to base 10            1.5
- log1p     exp(3M)          log(1+x)                        1        1
- log1pf    exp(3M)          log(1+x)                        1        1
- pow       exp(3M)          exponential x**y                2        60-500
- powf      exp(3M)          exponential x**y                1
- rint      floor(3M)        round to nearest integer        0        0
- sin       sin(3M)          trigonometric function          2        1
- sinf      sin(3M)          trigonometric function          1
- sinh      sinh(3M)         hyperbolic function             2        3
- sinhf     sinh(3M)         hyperbolic function             1
- sqrt      sqrt(3M)         square root                     1        1
- sqrtf     sqrt(3M)         square root                     1
- tan       sin(3M)          trigonometric function          2        3
- tanf      sin(3M)          trigonometric function          1
- tanh      sinh(3M)         hyperbolic function             2        3
- tanhf     sinh(3M)         hyperbolic function             1
- trunc     floor(3M)        truncate to whole number        0        0
- truncf    floor(3M)        truncate to whole number        0        0
- y0        j0(3M)           Bessel function                 ?        ?
- y1        j0(3M)           Bessel function                 ?        ?
- yn        j0(3M)           Bessel function                 ?        ?
-
- VECTOR INTRINSICS
- Beginning with IRIX 6.2, libm now supports the following vector
- intrinsics:
-
- /* single precision vector routines */
-
-
- vacosf( float *x, float *y, long count, long stridex, long stridey )
- vasinf( float *x, float *y, long count, long stridex, long stridey )
- vatanf( float *x, float *y, long count, long stridex, long stridey )
- vcosf( float *x, float *y, long count, long stridex, long stridey )
- vexpf( float *x, float *y, long count, long stridex, long stridey )
- vlogf( float *x, float *y, long count, long stridex, long stridey )
- vlog10f( float *x, float *y, long count, long stridex, long stridey )
- vsinf( float *x, float *y, long count, long stridex, long stridey )
- vsqrtf( float *x, float *y, long count, long stridex, long stridey )
- vtanf( float *x, float *y, long count, long stridex, long stridey )
-
- /* double precision vector routines */
-
-
- vacos( double *x, double *y, long count, long stridex, long stridey )
- vasin( double *x, double *y, long count, long stridex, long stridey )
- vatan( double *x, double *y, long count, long stridex, long stridey )
- vcos( double *x, double *y, long count, long stridex, long stridey )
- vexp( double *x, double *y, long count, long stridex, long stridey )
- vlog( double *x, double *y, long count, long stridex, long stridey )
- vlog10( double *x, double *y, long count, long stridex, long stridey )
- vsin( double *x, double *y, long count, long stridex, long stridey )
- vsqrt( double *x, double *y, long count, long stridex, long stridey )
- vtan( double *x, double *y, long count, long stridex, long stridey )
-
- Input and output arrays for the above routines should either be identical
- or non-overlapping.
-
- On Mips4 processors, these routines are software pipelined to take
- advantage of the multiple execution units. On that machine, throughput
- is up to several times greater than one gets by calling the scalar
- intrinsics repeatedly. On processors other than the Mips4, these
- routines are still available; although not software pipelined on those
- processors, they still eliminate considerable call overhead when they can
- be used. Note that the vector routines do not support denormals on the
- Mips4 processors.
-
- The single precision vector routines can also be called by the names
- vfacos, vfasin, etc.
-
- Semantics of these routines:
-
- i=0, 1, ..., count-1: y[i*stridey] = f(x[i*stridex])
-
- Example:
-
- double x[10000], y[10000];
-
-
- for (i=0; i<1000; i++ ) y[2*i] = sin(x[3*i]);
-
- Transform (by hand) into
-
- vsin(x, y, 1000, 3, 2);
-
-
- Results of the vector and scalar routines may differ slightly; however,
- none of the results differs from the mathematically correct result by
- more than 2 ulps (units in the last place). Note that the vector square
- root routines are less accurate than the hardware versions; vsqrt and
- vsqrtf use the reciprocal square root instruction and lose up to about
- 2 bits of accuracy. vsqrt and vfsqrt give correct answers for zero and
- infinite arguments.
-
- LONG DOUBLE ARITHMETIC
- Long double arithmetic is supported by the MIPSpro compiler. The
- representation used is not IEEE compliant; long doubles are represented
- on this system as the sum or difference of two doubles, normalized so
- that the smaller double is <= .5 ulp of the larger. This is equivalent
- to a 107 bit mantissa with an 11 bit biased exponent (bias = 1023), and 1
- sign bit. In terms of decimal precision, this is approximately 34
- decimal digits.
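-
- To make the "sum of two doubles" representation concrete, the sketch
- below uses Knuth's TwoSum, a standard textbook technique that is not
- necessarily what the MIPSpro run-time uses. It produces a normalized
- pair in which the small part is at most half an ulp of the large part.
- The type dd_pair and the function two_sum are hypothetical names, and
- the code assumes round to nearest mode with doubles evaluated in double
- precision.
-

```c
/* A value held as the unevaluated sum hi + lo of two doubles. */
typedef struct {
    double hi;   /* leading part: the sum rounded to double */
    double lo;   /* trailing part: the rounding error of that sum */
} dd_pair;

/* Knuth's TwoSum: returns hi = fl(a + b) and lo such that
 * hi + lo equals a + b exactly (barring overflow).
 * volatile keeps each intermediate rounded to double and stops the
 * compiler from simplifying the expressions algebraically. */
dd_pair two_sum(double a, double b)
{
    dd_pair r;
    volatile double s = a + b;
    volatile double bb = s - a;
    volatile double err = (a - (s - bb)) + (b - bb);

    r.hi = s;
    r.lo = err;
    return r;
}
```

-
- For example, two_sum(1.0, 1e-20) keeps the 1e-20 in the trailing part
- instead of losing it, which a single double sum would do.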
-
- Long double constants are coded as double precision constants followed by
- the letter 'l' (upper or lower case). The largest (finite) long double
- constant is 1.797693134862315807937289714053023e308L.
- The smallest positive long double constant is
- 4.940656458412465441765687928682213e-324L. Long doubles smaller in
- magnitude than 1.805194375864829576069262081173746e-276L
- may require a double denormal in their representation and therefore
- contain fewer than 107 bits of precision.
-
- Long double NaNs and (signed) infinities are supported by the MIPSpro
- compiler. Long double infinity is represented as the sum of a double
- infinity and a double zero; similarly for NaNs.
-
- In Fortran, long doubles are denoted by the term REAL *16.
-
- In general, long double arithmetic operations (+, -, *, /) are not
- precisely rounded, but are accurate to approximately 3 ulps.
-
- Note that long double arithmetic operations are done in software by
- MIPSpro compilers; results of these operations may vary slightly from
- release to release due to improvements in the algorithms which implement
- them.
-
- Long double operations on this system are only supported in round to
- nearest rounding mode (the default). The system must be in round to
- nearest rounding mode when issuing long double arithmetic operations or
- calling any of the long double functions, or incorrect answers will
- result.
-
- DIFFERENCES BETWEEN -o32, -n32, -64
- For the IRIX 6.2 release, faster and more accurate algorithms were
- implemented, and vector functions were added to the math library. In
- order to maintain numerical compatibility with older releases, these
- changes were made only in the -n32 and -64 versions of the library and
- not in the -o32 version. (Where there are differences in accuracy, this
- document describes the behavior of the -n32 and -64 versions of the
- library.)
-
- To take advantage of the new functions and algorithms, you need to
- compile and link using either the -n32 or the -64 option.
-
- Note, however, that the -o32 version of libmx contains all routines
- present in the -n32 and -64 versions of libmx except the quad precision
- and vector routines, and gives results identical to the -n32 and -64
- versions.
-
- NOTES
- Users concerned with portability to other computer systems should note
- that the long double and float versions of these functions are optional
- according to the ANSI C Programming Language Specification ISO/IEC 9899 :
- 1990 (E).
-
- Long double functions have been renamed to comply with the ANSI C
- standard; for backward compatibility, however, they may still be called
- by the double precision function name prefixed with a q. (Exceptions:
- functions fabsl and fmodl may be called by the names qabs and qmod,
- respectively.)
-
- In 4.3BSD, distributed from the University of California in late 1985,
- most of the foregoing functions come in two versions, one for the
- double-precision "D" format in the DEC VAX-11 family of computers,
- another for double-precision arithmetic conforming to the IEEE Standard
- 754 for Binary Floating-point Arithmetic. The two versions behave very
- similarly, as should be expected from programs more accurate and robust
- than was the norm when UNIX was born. For instance, the programs are
- accurate to within the numbers of ulps tabulated above; an ulp is one
- Unit in the Last Place. And the programs have been cured of anomalies
- that afflicted the older math library libm in which incidents like the
- following had been reported:
- sqrt(-1.0) = 0.0 and log(-1.0) = -1.7e38.
- cos(1.0e-11) > cos(0.0) > 1.0.
- pow(x,1.0) != x when x = 2.0, 3.0, 4.0, ..., 9.0.
- pow(-1.0,1.0e10) trapped on Integer Overflow.
- sqrt(1.0e30) and sqrt(1.0e-30) were very slow.
- This machine conforms to the IEEE Standard 754 for Binary Floating-point
- Arithmetic; only the notes that apply to IEEE floating-point are
- included here.
- (See, however, the notes regarding long double precision above.)
-
- IEEE STANDARD 754 Floating-point Arithmetic:
-
- This standard is on its way to becoming more widely adopted than any
- other design for computer arithmetic.
-
- Properties of IEEE 754 Double-precision:
- Wordsize: 64 bits, 8 bytes. Radix: Binary.
- Precision: 53 sig. bits, roughly 16 sig. decimals.
- If x and x' are consecutive positive Double-precision numbers
- (they differ by 1 ulp), then
- 1.1e-16 < 0.5**53 < (x'-x)/x <= 0.5**52 < 2.3e-16.
- Range: Overflow threshold = 2.0**1024 = 1.8e308
- Underflow threshold = 0.5**1022 = 2.2e-308
- Overflow goes by default to a signed Infinity.
- Underflow is Gradual, rounding to the nearest integer multiple
- of 0.5**1074 = 4.9e-324.
- Zero is represented ambiguously as +0 or -0.
- Its sign transforms correctly through multiplication or
- division, and is preserved by addition of zeros with like
- signs; but x-x yields +0 for every finite x. The only
- operations that reveal zero's sign are division by zero and
- copysign(x,±0). In particular, comparison (x > y, x >= y, etc.)
- cannot be affected by the sign of zero; but if finite x = y
- then Infinity = 1/(x-y) != -1/(y-x) = -Infinity.
- Infinity is signed.
- It persists when added to itself or to any finite number. Its
- sign transforms correctly through multiplication and division,
- and (finite)/±Infinity = ±0, (nonzero)/0 = ±Infinity. But
- Infinity-Infinity, Infinity*0 and Infinity/Infinity are, like
- 0/0 and sqrt(-3), invalid operations that produce NaN.
- Reserved operands:
- There are 2**53-2 of them, all called NaN (Not a Number).
- Some, called Signaling NaNs, trap any floating-point operation
- performed upon them; they could be used to mark missing or
- uninitialized values, or nonexistent elements of arrays. The
- rest are Quiet NaNs; they are the default results of Invalid
- Operations, and propagate through subsequent arithmetic
- operations. If x != x then x is NaN; every other predicate (x
- > y, x = y, x < y, ...) is FALSE if NaN is involved.
- NOTE: Trichotomy is violated by NaN.
- Besides being FALSE, predicates that entail ordered
- comparison, rather than mere (in)equality, signal Invalid
- Operation when NaN is involved.
- Rounding:
- Every algebraic operation (+, -, *, /, sqrt) is rounded by
- default to within half an ulp, and when the rounding error is
- exactly half an ulp then the rounded value's least significant
- bit is zero. This kind of rounding is usually the best kind,
- sometimes provably so; for instance, for every x = 1.0, 2.0,
- 3.0, 4.0, ..., 2.0**52, we find (x/3.0)*3.0 == x and
- (x/10.0)*10.0 == x and ... despite that both the quotients and
- the products have been rounded. Only rounding like IEEE 754
- can do that. But no single kind of rounding can be proved best
- for every circumstance, so IEEE 754 provides rounding towards
- zero or towards +Infinity or towards -Infinity at the
- programmer's option.
- Exceptions:
- IEEE 754 recognizes five kinds of floating-point exceptions,
- listed below in declining order of probable importance.
- Exception             Default Result
- ---------             --------------
- Invalid Operation     NaN, or FALSE
- Overflow              ±Infinity
- Divide by Zero        ±Infinity
- Underflow             Gradual Underflow
- Inexact               Rounded value
- NOTE: An Exception is not an Error unless handled badly. What
- makes a class of exceptions exceptional is that no single
- default response can be satisfactory in every instance. On the
- other hand, if a default response will serve most instances
- satisfactorily, the unsatisfactory instances cannot justify
- aborting computation every time the exception occurs.
-
- For each kind of floating-point exception, IEEE 754 provides a Flag
- that is raised each time its exception is signaled, and stays raised
- until the program resets it. Programs may also test, save and
- restore a flag. Thus, IEEE 754 provides three ways by which
- programs may cope with exceptions for which the default result might
- be unsatisfactory:
-
- 1) Test for a condition that might cause an exception later, and
- branch to avoid the exception.
-
- 2) Test a flag to see whether an exception has occurred since the
- program last reset its flag.
-
- 3) Test a result to see whether it is a value that only an
- exception could have produced.
- CAUTION: The only reliable ways to discover whether Underflow
- has occurred are to test whether products or quotients lie
- closer to zero than the underflow threshold, or to test the
- Underflow flag. (Sums and differences cannot underflow in IEEE
- 754; if x != y then x-y is correct to full precision and
- certainly nonzero regardless of how tiny it may be.) Products
- and quotients that underflow gradually can lose accuracy
- gradually without vanishing, so comparing them with zero (as one
- might on a VAX) will not reveal the loss. Fortunately, if a
- gradually underflowed value is destined to be added to something
- bigger than the underflow threshold, as is almost always the
- case, digits lost to gradual underflow will not be missed
- because they would have been rounded off anyway. So gradual
- underflows are usually provably ignorable. The same cannot be
- said of underflows flushed to 0.
-
- At the option of an implementor conforming to IEEE 754, other ways
- to cope with exceptions may be provided:
-
- 4) ABORT. This mechanism classifies an exception in advance as an
- incident to be handled by means traditionally associated with
- error-handling statements like "ON ERROR GO TO ...". Different
- languages offer different forms of this statement, but most
- share the following characteristics:
-
- - No means is provided to substitute a value for the offending
- operation's result and resume computation from what may be the
- middle of an expression. An exceptional result is abandoned.
-
- - In a subprogram that lacks an error-handling statement, an
- exception causes the subprogram to abort within whatever program
- called it, and so on back up the chain of calling subprograms
- until an error-handling statement is encountered or the whole
- task is aborted and memory is dumped.
-
- 5) STOP. This mechanism, requiring an interactive debugging
- environment, is more for the programmer than the program. It
- classifies an exception in advance as a symptom of a
- programmer's error; the exception suspends execution as near as
- it can to the offending operation so that the programmer can
- look around to see how it happened. Quite often the first
- several exceptions turn out to be quite unexceptionable, so the
- programmer ought ideally to be able to resume execution after
- each one as if execution had not been stopped.
-
- 6) ... Other ways lie beyond the scope of this document.
-
- The crucial problem for exception handling is the problem of Scope, and
- the problem's solution is understood, but not enough manpower was
- available to implement it fully in time to be distributed in 4.3BSD's
- libm. Ideally, each elementary function should act as if it were
- indivisible, or atomic, in the sense that ...
-
- i) No exception should be signaled that is not deserved by the data
- supplied to that function.
-
- ii) Any exception signaled should be identified with that function
- rather than with one of its subroutines.
-
- iii) The internal behavior of an atomic function should not be disrupted
- when a calling program changes from one to another of the five or
- so ways of handling exceptions listed above, although the
- definition of the function may be correlated intentionally with
- exception handling.
-
- Ideally, every programmer should be able conveniently to turn a debugged
- subprogram into one that appears atomic to its users. But simulating all
- three characteristics of an atomic function is still a tedious affair,
- entailing hosts of tests and saves/restores; work is under way to
- ameliorate the inconvenience.
-
- Meanwhile, the functions in libm are only approximately atomic. They
- signal no inappropriate exception except possibly ...
- Over/Underflow
- when a result, if properly computed, might have lain barely
- within range, and
- Inexact in cabs, cbrt, hypot, log10 and pow
- when it happens to be exact, thanks to fortuitous cancellation
- of errors.
- Otherwise, ...
- Invalid Operation is signaled only when
- any result but NaN would probably be misleading.
- Overflow is signaled only when
- the exact result would be finite but beyond the overflow
- threshold.
- Divide-by-Zero is signaled only when
- a function takes exactly infinite values at finite operands.
- Underflow is signaled only when
- the exact result would be nonzero but tinier than the underflow
- threshold.
- Inexact is signaled only when
- greater range or precision would be needed to represent the
- exact result.
-
- Exceptions on this machine:
- The exception enables and the flags that are raised when an
- exception occurs (as well as the rounding mode) are in the
- floating-point control and status register. This register can be
- read or written by the routines described on the man page fpc(3C).
- This register's layout is described in the file <sys/fpu.h>.
-
- A useful set of ``user trap handlers'' is available. See the man
- page sigfpe(3C).
-
- The raw interface to the hardware registers is only intended to be
- used by the code to implement IEEE user trap handlers. IEEE
- floating-point exceptions are enabled by setting the enable bit for
- that exception in the floating-point control and status register.
- If an exception then occurs the UNIX signal SIGFPE is sent to the
- process. It is up to the signal handler to determine the
- instruction that caused the exception and to take the action
- specified by the user. The instruction that caused the exception is
- in one of two places. If the floating-point board is used (the
- floating-point implementation revision register indicates this in
- its implementation field) then the instruction that caused the
- exception is in the floating-point exception instruction register.
- In all other implementations the instruction that caused the
- exception is at the address of the program counter as modified by
- the branch delay bit in the cause register. Both the program
- counter and cause register are in the sigcontext structure passed to
- the signal handler (see signal(2)). If the program is to be
- continued past the instruction that caused the exception the program
- counter in the signal context must be advanced. If the instruction
- is in a branch delay slot then the branch must be emulated to
- determine if the branch is taken and then the resulting program
- counter can be calculated (see emulate_branch(3X) and signal(2)).
- Note however, that on systems using the R8000 processor, floating
- point exceptions are generally fatal when trapped unless the process
- is being run in precise exception mode.
-
-
- PLATFORM SPECIFIC LIBRARIES
- For programs compiled -n32 or -64, each processor has specially tuned,
- hardware-specific versions of libm and libfastm, which the run-time
- linker uses by default whenever they are available.
-
- The R10000 tuned libraries are found in the directories:
- /usr/lib32/mips4/r10000/
- /usr/lib64/mips4/r10000/
-
- The R8000 tuned libraries are found in the directories:
- /usr/lib32/mips4/r8000/
- /usr/lib64/mips4/r8000/
-
- The R5000 tuned libraries are found in the directories:
- /usr/lib32/mips4/
- /usr/lib64/mips4/
-
- And the R4000 tuned libraries are found in the directories:
- /usr/lib32/mips3/
- /usr/lib64/mips3/
-
- At runtime, each program automatically uses the "best" library for the
- system on which it is executing. For example, if the executing program is
- a mips3 program designed to run on an R4000 processor, it will still use
- the mips4 R10000-tuned math library when running on an R10000 system.
-
-
- BUGS
- When signals are appropriate, they are emitted by certain operations
- within the codes, so a subroutine-trace may be needed to identify the
- function with its signal in case method 5) above is in use. And the
- codes all take the IEEE 754 defaults for granted; this means that a
- decision to trap all divisions by zero could disrupt a code that would
- otherwise get correct results despite division by zero.
-
- SEE ALSO
- signal(2), fpc(3C), emulate_branch(3X), sigfpe(3C), matherr(3M)
- R2010 Floating Point Coprocessor Architecture
- R2360 Floating Point Board Product Description
- An explanation of IEEE 754 and its proposed extension p854 was published
- in the IEEE magazine MICRO in August 1984 under the title "A Proposed
- Radix- and Word-length-independent Standard for Floating-point
- Arithmetic" by W. J. Cody et al. Articles in the IEEE magazine COMPUTER
- vol. 14 no. 3 (Mar. 1981), and in the ACM SIGNUM Newsletter Special
- Issue of Oct. 1979, may be helpful although they pertain to superseded
- drafts of the standard.
-
- AUTHOR
- W. Kahan, with the help of Z-S. Alex Liu, Stuart I. McDonald, Dr.
- Kwok-Choi Ng, Peter Tang.