IEEE 854-1987 is the IEEE Standard for Radix-Independent Floating-Point Arithmetic.
The goal of the IEEE 854 standardization effort was to define a standard for floating point arithmetic without the radix and word length dependencies of the much better known IEEE Standard for Binary Floating-Point Arithmetic (IEEE 754-1985).
Although the actual IEEE 854 standard only specifies binary and decimal floating point arithmetic, it implicitly provides considerable guidance to anyone contemplating the implementation of floating point using other bases (e.g. base-16).
Like IEEE 754, the IEEE 854 standard defines four floating point precisions:
- single precision
- double precision
- single-extended precision
- double-extended precision
Each of the precisions supported by a particular implementation represents values of the form
(-1)^sign × significand × b^exponent
The sign parameter is a binary digit (i.e. 0 or 1) indicating the sign of the value.
The value b is the radix, and the exponent is adjusted such that the significand has one digit to the left of the radix-point (i.e. decimal point or binary point, depending on the value of b).
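As a rough illustration of how the sign, radix, exponent and significand combine into a value, here is a minimal Python sketch. The function name `value` and the digit-list representation are assumptions made for this example; the standard itself does not define any such interface.

```python
from fractions import Fraction

def value(sign, digits, exponent, b):
    """Exact value of (-1)^sign * (d0.d1d2...) * b^exponent.

    `digits` lists the significand digits d0, d1, ..., each in range(b);
    d0 is the single digit to the left of the radix point.
    """
    if sign not in (0, 1):
        raise ValueError("sign must be 0 or 1")
    if any(not 0 <= d < b for d in digits):
        raise ValueError("every digit must lie in range(b)")
    # Digit i carries the weight b^-i, so d0 has weight 1.
    significand = sum(Fraction(d, b ** i) for i, d in enumerate(digits))
    return (-1) ** sign * significand * Fraction(b) ** exponent

print(value(0, [1, 2, 5], 1, b=10))   # 1.25 * 10^1, i.e. 25/2
print(value(1, [1, 0, 1], -2, b=2))   # -(binary 1.01) * 2^-2, i.e. -5/16
```

Using `Fraction` keeps the result exact for either radix, which makes it easy to compare a binary and a decimal encoding of the same number.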
The four precisions are themselves parameterized by four integer values:
- a radix b
- the number of base-b digits, p, in the significand
- the minimum exponent Emin
- the maximum exponent Emax
Precisions specified by these four parameters must satisfy the following constraints:
- b must be either 2 or 10 and must be the same for all supported precisions (i.e. an implementation which supports both radixes is, strictly speaking, two separate implementations)
- (Emax - Emin)/p must exceed 5 and should exceed 10 (i.e. a precision must be capable of representing a minimally useful range of values)
- b^(p-1) must be at least 10^5 (i.e. a precision must provide a minimally useful amount of precision)
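The constraints above are simple enough to check mechanically. The following sketch is one possible way to do so; the function name `check_precision` and its return shape are assumptions for illustration, and it checks a single precision in isolation (it cannot verify that b is the same across all supported precisions).

```python
def check_precision(b, p, emin, emax):
    """Check (b, p, Emin, Emax) against the three constraints listed above.

    Returns (violations, advisories): broken "must" rules and unmet
    "should" rules respectively.
    """
    if p < 1:
        raise ValueError("p must be a positive digit count")
    violations, advisories = [], []
    if b not in (2, 10):
        violations.append("b must be either 2 or 10")
    # Integer arithmetic: (Emax - Emin)/p > 5  <=>  Emax - Emin > 5*p
    if emax - emin <= 5 * p:
        violations.append("(Emax - Emin)/p must exceed 5")
    elif emax - emin <= 10 * p:
        advisories.append("(Emax - Emin)/p should exceed 10")
    if b ** (p - 1) < 10 ** 5:
        violations.append("b^(p-1) must be at least 10^5")
    return violations, advisories

# IEEE 754 single precision: b=2, p=24, Emin=-126, Emax=127
print(check_precision(2, 24, -126, 127))   # ([], [])
# A hypothetical 5-digit decimal format with a narrow exponent range
print(check_precision(10, 5, -10, 10))     # two violations, no advisories
```

The IEEE 754 single precision parameters pass all three checks, which is consistent with the observation that a conforming IEEE 754 implementation largely conforms to IEEE 854 as well.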
Unlike the IEEE 754 standard, the IEEE 854 standard doesn't specify parameter values which would specifically characterize any of the four defined precisions and it does not specify how values represented in each of the precisions are actually stored in memory.
What it does do is place certain constraints on the parameter values to which a conforming implementation must adhere.
The single precision parameters are not constrained beyond the basic constraints that all precision-defining parameters must adhere to (see above).
The parameter values defined by a particular implementation for single precision values are referred to by the names Emaxs, Emins and ps.
A conforming implementation is only required to implement single precision.
If double precision is implemented, the double precision parameters are specified by Emaxd, Emind and pd.
These parameters are subject to constraints which effectively require that double precision provide at least double the precision and eight times the range of the single precision format.
The extended precision variants are, apparently, intended to provide alternative precisions for applications which require them.
The relationship of the single-extended precision to single precision is roughly equivalent to the relationship of double precision to single precision (i.e. provides at least roughly twice the precision and eight times the range).
The relationship of double-extended to double precision is defined to be the same as the relationship of single-extended to single precision.
With quite minor exceptions, the remainder of the IEEE 854 standard reads almost identically to the IEEE 754 standard.
It defines four equivalent rounding modes, an equivalent set of operations and an equivalent set of exceptions and exception trapping rules.
It also recommends the same set of functions and predicates (e.g. CopySign, Logb, Scalb), although it does suggest a couple of additional functions and predicates and extends the suggested definitions of some of the other primitives.
It even specifies signed infinities, signed zeros and NaNs with behaviour rules equivalent to the corresponding IEEE 754 entities.
In fact, the semi-official introduction to the standard states that the committee which developed the IEEE 854 standard believes that a conforming implementation of IEEE 754 binary floating point would also, with the possible exception of a couple of fairly minor points, conform with the IEEE 854 standard.
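To get a feel for the four rounding modes in a decimal setting, the sketch below uses Python's standard `decimal` module. That module follows the General Decimal Arithmetic specification rather than IEEE 854 itself, so the mapping of its rounding constants onto the four modes (to nearest with ties to even, toward +infinity, toward -infinity, toward zero) is an assumption made for illustration. The helper `round_to_digits` is likewise hypothetical.

```python
from decimal import (Context, Decimal,
                     ROUND_HALF_EVEN, ROUND_CEILING, ROUND_FLOOR, ROUND_DOWN)

# Assumed correspondence between the four modes and decimal's constants.
MODES = {
    "to nearest": ROUND_HALF_EVEN,
    "toward +infinity": ROUND_CEILING,
    "toward -infinity": ROUND_FLOOR,
    "toward zero": ROUND_DOWN,
}

def round_to_digits(text, p, mode):
    """Round the decimal literal `text` to p significant digits."""
    # create_decimal applies the context's precision and rounding mode.
    return Context(prec=p, rounding=mode).create_decimal(text)

for name, mode in MODES.items():
    print(name, round_to_digits("1.25", 2, mode), round_to_digits("-1.25", 2, mode))
```

The inputs 1.25 and -1.25 sit exactly halfway between two 2-digit values, so each mode's treatment of ties and of the sign is visible in the output.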
Sidebar: what's decimal floating point?
Decimal floating point is floating point arithmetic performed using decimal digits, as opposed to binary floating point, which is performed using binary digits.
The distinction between the two essentially boils down to two points:
- In a binary floating point format, the least significant digit represented by the format is a single bit which is rounded (i.e. set to 0 or 1) depending on the rounding mode currently in effect and the binary digits (i.e. bits) which follow the bit being rounded.
In a decimal floating point format, the least significant digit represented by the format is a single decimal digit which is rounded (i.e. set to a value between 0 and 9 inclusive) depending on the rounding mode currently in effect and the decimal digits which follow the digit being rounded.
- The choice of radix determines how the significand is affected by non-zero exponent values.
In a binary system the significand is adjusted by the power of two indicated by the exponent whereas in a decimal system the significand is adjusted by a power of 10.
The result is most obvious when one considers decimal fractions like 0.1.
In a binary system, the decimal value 0.1 is equal to the repeating binary fraction 0.11001100110011... multiplied by 2^-3.
Since the repeating fraction 0.11001100110011... cannot be exactly represented in a fixed length significand, the decimal value 0.1 can't be exactly represented in a binary floating point system.
In a decimal system, the decimal value 0.1 is exactly represented by the decimal fraction 0.1 multiplied by 10^0 (i.e. the decimal value is trivially and exactly represented in a decimal floating point system).
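The repeating binary expansion of 0.1 can be reproduced with a few lines of Python. This is only a sketch: the helper `binary_fraction_digits` is a hypothetical name, and the comparison assumes that Python's `float` is an IEEE 754 double, as it is on virtually all current platforms.

```python
from decimal import Decimal
from fractions import Fraction

def binary_fraction_digits(x, n):
    """First n binary digits after the radix point of a Fraction 0 <= x < 1."""
    digits = []
    for _ in range(n):
        x *= 2                 # shift the next binary digit left of the point
        bit = int(x >= 1)
        digits.append(str(bit))
        x -= bit
    return "".join(digits)

# 0.1 = 0.11001100110011..._2 * 2^-3, so scale by 2^3 before expanding.
print(binary_fraction_digits(Fraction(1, 10) * 2 ** 3, 14))  # 11001100110011

binary_tenth = Fraction(0.1)               # exact value of the stored double
decimal_tenth = Fraction(Decimal("0.1"))   # exact value of the decimal number
print(binary_tenth == Fraction(1, 10))     # False
print(decimal_tenth == Fraction(1, 10))    # True
```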
The inability to exactly represent "convenient" decimal fractions like 0.1 can prove to be quite a challenge when trying to perform exact calculations on decimal fractions in a binary floating point system.
Switching to a decimal floating point system makes the problem simply vanish.
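The practical effect shows up as soon as such values are added together. The sketch below again uses Python's `float` (assumed to be an IEEE 754 double) for the binary case and the standard `decimal` module as a stand-in for a decimal floating point system.

```python
from decimal import Decimal

# Binary floating point: each 0.1 carries a tiny representation error.
print(0.1 + 0.2 == 0.3)                                        # False
float_total = 0.0
for _ in range(10):
    float_total += 0.1
print(float_total == 1.0)                                      # False

# Decimal floating point: the same sums are exact.
print(Decimal("0.1") + Decimal("0.2") == Decimal("0.3"))       # True
decimal_total = Decimal(0)
for _ in range(10):
    decimal_total += Decimal("0.1")
print(decimal_total == 1)                                      # True
```

The loops add one term at a time on purpose, so that the result reflects plain left-to-right floating point addition.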
There are other differences, although they aren't noticeable to the vast majority of software developers (of course, most software developers are probably unaware of the fundamental differences listed above - did YOU know that the decimal value 0.1 couldn't be represented exactly in IEEE 754 floating point?).
For example, the IEEE 754 binary floating point arithmetic standard avoids storing the first bit in the significand in those cases where it is sure to be a 1 (and for reasons that I won't go into here, it is almost always sure to be a 1).
This frees up storage for one more bit of precision.
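The implied leading bit can be observed directly by pulling apart the bits of a double. This sketch assumes that Python's `float` is stored as an IEEE 754 double (1 sign bit, 11 exponent bits, 52 stored significand bits); `stored_fields` is a hypothetical helper name.

```python
import struct
import sys

def stored_fields(x):
    """Split a Python float (assumed IEEE 754 double) into its stored fields."""
    (bits,) = struct.unpack(">Q", struct.pack(">d", x))
    sign = bits >> 63
    biased_exponent = (bits >> 52) & 0x7FF
    fraction = bits & ((1 << 52) - 1)    # only 52 significand bits are stored
    return sign, biased_exponent, fraction

sign, biased_exponent, fraction = stored_fields(1.5)
# 1.5 is binary 1.1: the leading 1 is implied, only the ".1" is stored.
print(sign, biased_exponent, bin(fraction))
print(sys.float_info.mant_dig)           # 53: one more bit than is stored
```

For 1.5 the stored fraction field holds a single set bit (the ".1"), while `sys.float_info.mant_dig` reports 53 bits of precision even though only 52 are physically stored.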
It's also much easier to implement really high performance binary floating point as the logic for dealing with binary digits and performing binary rounding is considerably simpler than the logic (i.e. circuitry) for dealing with decimal digits and performing decimal rounding.
- IEEE Standard for Binary Floating-Point Arithmetic (IEEE 754-1985)
- IEEE Standard for Radix-Independent Floating-Point Arithmetic (IEEE 854-1987)