Binary Number System: Representations

Principles of Floating Point Numbers

Calculations may have to deal with both extremely large and extremely small numbers.
It would be wasteful to reserve enough digits to represent every such number with complete precision, especially when the output does not require that precision. Furthermore, no fixed number of digits can handle all possible numbers: there will always be a number that requires even more digits.

We need to separate range from precision: some precision is needed within a desired range, without having to carry full precision throughout that range.

Answer: scientific notation

Principles of Floating Point

scientific notation: n = f x 10^e
f: the fraction (or mantissa); e: a positive or negative integer called the exponent
The computer version of scientific notation is floating point.

example: 3.14 = 0.314 x 10^1

range – determined by number of digits in the exponent
precision – determined by the number of digits in the fraction
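The decomposition into fraction and exponent can be sketched in Python. This is only an illustration: the function name to_scientific is hypothetical, and the choice of a fraction with 0.1 <= |f| < 1 is an assumption taken from the 3.14 example above.

```python
from decimal import Decimal

def to_scientific(value):
    """Split a decimal string into (fraction, exponent) so that
    value = fraction x 10^exponent with 0.1 <= |fraction| < 1.
    (Hypothetical helper; the normalization convention is an assumption.)"""
    d = Decimal(value)
    if d == 0:
        return Decimal(0), 0
    # adjusted() gives the power of ten of the leading digit;
    # adding 1 moves the decimal point in front of that digit.
    exponent = d.adjusted() + 1
    fraction = d.scaleb(-exponent)
    return fraction, exponent

print(to_scientific("3.14"))     # (Decimal('0.314'), 1)
print(to_scientific("0.00052"))  # (Decimal('0.52'), -3)
```

The number of digits kept in the fraction fixes the precision, while the number of digits allowed for the exponent fixes how far the decimal point can move, that is, the range.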

(A number can be written in several equivalent forms, e.g. 0.314 x 10^1 or 3.14 x 10^0; one form is usually chosen as the standard.)

Floating point is used to model the real-number system of mathematics; however, there are some important differences.

Example: with a three-digit fraction and a two-digit exponent, positive numbers range from +0.100 x 10^-99 to +0.999 x 10^+99.


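Overflow: a result is too large in magnitude to be represented. Underflow: a nonzero result is too small in magnitude to be represented.
Underflow can be less serious than overflow, because zero is often a satisfactory approximation to the result.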

In this format there are only 179,100 positive numbers, 179,100 negative numbers, and one zero: 358,201 numbers in all, out of the infinitely many real numbers between the limits.
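The count can be checked with a few lines of Python. The sketch assumes the format implied by the limits above: a three-digit fraction from 0.100 to 0.999 and a two-digit signed exponent from -99 to +99. The variable names are illustrative only.

```python
# Assumed format: three-digit fraction, two-digit signed exponent.
fraction_digits = 3
exponent_digits = 2

# Fractions 0.100 .. 0.999: the first digit is 1-9, the rest are 0-9.
fractions = 9 * 10 ** (fraction_digits - 1)   # 900

# Exponents -99 .. +99, including 0.
exponents = 2 * 10 ** exponent_digits - 1     # 199

positive = fractions * exponents
total = 2 * positive + 1                      # positives, negatives, and zero

print(positive)  # 179100
print(total)     # 358201
```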

Some numbers (for example, the results of calculations) cannot be expressed exactly in floating point, and therefore must be rounded.
rounding: using the nearest representable number
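Rounding to the nearest representable number can be sketched with Python's decimal module. The three-digit precision matches the example format above; the tie-breaking rule (round half to even) and the name round_to_format are assumptions, since the notes only say "nearest". The sketch also ignores the exponent limits, so it does not detect overflow or underflow.

```python
from decimal import Context, Decimal, ROUND_HALF_EVEN

# Assumed format: three significant decimal digits.
# The tie-breaking rule is an assumption, not something the notes specify.
THREE_DIGITS = Context(prec=3, rounding=ROUND_HALF_EVEN)

def round_to_format(value):
    """Return the three-digit decimal nearest to value (hypothetical helper)."""
    return THREE_DIGITS.create_decimal(value)

print(round_to_format("3.14159"))  # 3.14
print(round_to_format("2.71828"))  # 2.72
```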

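The relative error introduced by rounding is approximately the same for small numbers as for large numbers, because the spacing between adjacent representable numbers grows in proportion to their magnitude.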
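A small Python sketch can illustrate this by rounding the same digits at two very different magnitudes. It assumes a three-digit decimal fraction; the name relative_error and the sample values are illustrative only.

```python
from decimal import Context, Decimal

# Assumed format: three significant decimal digits.
THREE_DIGITS = Context(prec=3)

def relative_error(value):
    """Relative error made when value is rounded to three digits
    (hypothetical helper)."""
    exact = Decimal(value)
    rounded = THREE_DIGITS.create_decimal(exact)
    return abs(rounded - exact) / abs(exact)

small = relative_error("0.00031416")
large = relative_error("314160000")

# Same digits, very different magnitudes: the absolute errors differ
# enormously, but the relative errors agree.
print(small)
print(large)
```

With three digits, the relative error of rounding never exceeds 0.005 (half a unit in the last digit of the smallest fraction, 0.100), whatever the exponent.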

