notes:ieee_754-1985

Differences

This shows you the differences between two versions of the page.

notes:ieee_754-1985 [2013/02/20 16:19]
andy
notes:ieee_754-1985 [2013/02/24 00:11]
andy [Normalised Values]
Line 5: Line 5:
 Briefly, floating point numbers are a way to represent a wide numeric range in a limited set of bits by storing a limited number of significant digits. As the absolute size of the number increases, the absolute precision decreases.
  
-The principle is similar to exponential notation for numbers (e.g. **3.523 10<sup>6</sup>**) except that IEEE floating point values use powers of 2 instead of 10.
+The principle is similar to exponential notation for numbers (e.g. $3.523 \times 10^6$) except that IEEE floating point values use powers of 2 instead of 10.
  
  
Line 45: Line 45:
 The most common form of IEEE floating point numbers is the normalised form --- this is where the exponent has a value in the valid range (once the bias has been subtracted from the unsigned value stored). As explained above, the significand stores only the digits after the leading **1**, which is implicit.
  
-If the standard C library is available, the ''[[man>frexp|frexp()]]'' function normalises a floating point value such that the fractional part will be in the range **0.5 ≤ x < 1.0**. Multiplying this value by **2** and reducing the exponent by **1** yields a value in the desired range **1.0 ≤ x < 2.0**. At this point the leading digit can then be discarded as the implicit leading **1** (see the [[#Significand]] section for details).
+If the standard C library is available, the ''[[man>frexp|frexp()]]'' function normalises a floating point value such that the fractional part will be in the range $0.5 \le x < 1.0$. Multiplying this value by **2** and reducing the exponent by **1** yields a value in the desired range $1.0 \le x < 2.0$. At this point the leading digit can then be discarded as the implicit leading **1** (see the [[#Significand]] section for details).
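The adjustment described above can be sketched as follows. This is an illustration only, assuming a finite, non-zero input; the helper name ''normalise_1_2'' is hypothetical, and the sign is dropped with ''fabs()'' on the assumption that it is stored separately in the sign bit.

```c
#include <math.h>

/* Hypothetical helper: split a finite, non-zero value into a significand
 * in the range 1.0 <= x < 2.0 and the matching power-of-two exponent. */
double normalise_1_2(double value, int *exponent)
{
    int e;
    /* frexp() yields 0.5 <= frac < 1.0 with fabs(value) == frac * 2^e. */
    double frac = frexp(fabs(value), &e);

    /* Doubling the fraction and reducing the exponent by one leaves the
     * represented value unchanged but moves the fraction into [1.0, 2.0). */
    *exponent = e - 1;
    return frac * 2.0;
}
```

For example, 6.0 comes back from ''frexp()'' as 0.75 with exponent 3, which this helper turns into 1.5 with exponent 2; subtracting the leading 1 then leaves the 0.5 that would be stored in the significand.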
  
 If ''frexp()'' is available then it should be used, as it will likely use the underlying hardware representation to avoid expensive loops. However, a naive implementation can quite simply mimic its functionality --- the following version demonstrates the principle, but a production version would also need to check for special values (zero, NaN, infinities) as well as catching under- and overflows:
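A minimal sketch of such a loop is given here, assuming a finite, strictly positive input. The name ''naive_frexp'' is hypothetical and this is not necessarily the listing the page itself uses; zero, negative values, NaN and infinities are deliberately not handled, and some of them would make the loops run forever.

```c
/* Hypothetical stand-in for frexp(), valid only for finite values > 0.
 * Returns a fraction in the range 0.5 <= x < 1.0 and stores the power of
 * two in *exponent, so that value == fraction * 2^(*exponent). */
double naive_frexp(double value, int *exponent)
{
    int e = 0;

    /* Too large: halve until the value drops below 1.0. */
    while (value >= 1.0) {
        value /= 2.0;
        e++;
    }

    /* Too small: double until the value reaches at least 0.5. */
    while (value < 0.5) {
        value *= 2.0;
        e--;
    }

    *exponent = e;
    return value;
}
```

Each iteration scales by a power of two, which only changes the exponent of a binary floating point value, so the fraction bits are preserved; the cost is one loop iteration per power of two between the input and the target range.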
notes/ieee_754-1985.txt · Last modified: 2013/02/24 00:18 by andy