Since physics is all about numbers we had better develop some understanding of how computers represent them, and the limitations of this representation. Hopefully this example is sufficiently motivating:
0.1 + 0.2 == 0.3
False
Ah…
Let’s begin with something simpler
1 + 1 == 2
True
which is a bit more reassuring. Integers can be represented in binary
3 == 0b11
True
or octal or hexadecimal (with the prefixes 0o and 0x respectively). You can get the binary string representing an integer using the bin function
bin(-2)
'-0b10'
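For completeness, a quick check of the octal and hexadecimal forms (my own example, not from the original notes):

```python
print(0o17, 0x1f)        # 15 31
print(oct(15), hex(31))  # 0o17 0x1f
```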
Python allows for arbitrarily large integers, so there is no possibility of overflow or rounding error
2**100
1267650600228229401496703205376
The only limitation is the memory required to store it.
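A quick way to see how much storage an integer needs (my own check) is its bit_length method:

```python
print((2**100).bit_length())   # 101 bits are needed to store 2**100
```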
NumPy integers are a different story
import numpy as np
np.int64(2**100)
OverflowError: Python int too large to convert to C long
Since NumPy is using C the types have to play nicely. The range of integers that can be represented with 32 bit numpy.int32s is \(\approx\pm 2^{31} \approx \pm 2.1 \times 10^9\) (one bit is for the sign) and with 64 bit numpy.int64s it is \(\approx\pm 2^{63} \approx \pm 9.2 \times 10^{18}\). Apart from the risk of overflow when working with NumPy's integers there are no other gotchas to worry about.
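To illustrate that risk (my own example; whether you also get a RuntimeWarning depends on your NumPy version), arithmetic that exceeds the range simply wraps around rather than being promoted to a wider type:

```python
import numpy as np

x = np.int32(2**31 - 1)   # the largest representable int32
print(x + np.int32(1))    # wraps around to -2147483648
```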
The reason why \(0.1 + 0.2 \neq 0.3\) in Python is that specifying a real number exactly would involve an infinite number of bits, so that any finite representation is necessarily approximate.
The representation computers use for the reals is called floating point arithmetic. It is essentially a form of scientific notation, in which a significand (it contains the significant figures) is multiplied by the base raised to an exponent. The name floating point reflects the fact that the number of digits after the decimal point is not fixed (I'm using the base ten terms for convenience).
This representation requires the choice of a base, and Python’s floating point numbers use binary. Numbers with finite binary representations therefore behave nicely
0.125 + 0.25 == 0.375
True
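Conversely, you can see the approximation involved in storing 0.1 by asking for more digits than the default display shows (a quick check of my own):

```python
print(f"{0.1:.20f}")   # 0.10000000000000000555: the value actually stored, slightly larger than 1/10
print((0.1).hex())     # the same value in hexadecimal floating point notation
```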
For decimal numbers to be represented exactly we'd have to use base ten. This can be achieved with the decimal module. Our \(0.1+0.2\) example then works as expected
from decimal import *
Decimal('0.1') + Decimal('0.2')
Decimal('0.3')
Since there is nothing to single out the decimal representation in physics (as opposed to, say, finance) we won’t have any need for it.
A specification for floating point numbers must give

- a base \(b\) (2 for binary, 10 for decimal),
- a precision \(p\), the number of digits in the significand, so that the significand \(c\) is an integer with \(0\leq c\leq b^p - 1\),
- a range of allowed exponents \(q\), set by \(\text{emin}\) and \(\text{emax}\) through \(\text{emin}\leq q + p - 1 \leq \text{emax}\).
Including one bit \(s\) for the overall sign, a number then has the form \((-1)^s\times c \times b^q\). The smallest positive nonzero number that can be represented is therefore \(b^{1 + \text{emin} - p}\) (corresponding to the smallest value of the exponent) and the largest is \((b^p - 1)\times b^{1 + \text{emax} - p}\), which lies just below \(b^{1 + \text{emax}}\).
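As a concrete instance (an example of my own, with \(b=2\) and \(p=3\)): \(0.15625 = (-1)^0\times 5\times 2^{-5}\), so \(s=0\), \(c=5\) and \(q=-5\).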
The above representation isn’t unique: for some numbers you could make the significand smaller and the exponent bigger. A unique representation is fixed by choosing the exponent to be as small as possible.
Representing numbers smaller than \(b^{\text{emin}}\) involves a loss of precision: the exponent has already reached its minimum value, so the number of digits in the significand falls below \(p\). These are called subnormal numbers. For binary floats, if we stick with the normal numbers and a \(p\)-bit significand the leading bit will be 1 and so can be dropped from the representation, which then only requires \(p-1\) bits.
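As a quick check using NumPy's 64 bit floats (a sketch of my own; the specification itself is described next), the smallest positive normal number is about \(2.2\times 10^{-308}\), while the smallest positive subnormal is \(2^{-1074}\approx 5\times 10^{-324}\):

```python
import numpy as np

print(np.finfo(np.float64).tiny)   # smallest positive normal double, 2**-1022
print(np.nextafter(0.0, 1.0))      # smallest positive subnormal, 2**-1074
```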
The specification for the floating point numbers used by Python (and many other languages) is contained in the IEEE Standard for Floating Point Arithmetic, IEEE 754. The default Python float uses the 64 bit binary64 representation (often called double precision). Here's how those 64 bits are used: 1 bit for the sign, 11 bits for the exponent, and 52 bits for the significand (giving a precision \(p=53\), thanks to the implicit leading bit).
Another common representation is the 32 bit binary32 (single precision) with 1 sign bit, 8 exponent bits, and 23 significand bits (\(p=24\)).
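One way to make this layout concrete (an illustration of my own, not from the original notes) is to reinterpret the bytes of a float32 as an unsigned integer and print them in binary:

```python
import numpy as np

# -0.15625 = -1.01 (binary) x 2**-3: sign bit 1, exponent field 124 = 0b01111100,
# fraction bits 0100...0 (the leading 1 of the significand is implicit)
bits = np.array([-0.15625], dtype=np.float32).view(np.uint32)[0]
print(f"{int(bits):032b}")   # 1 01111100 01000000000000000000000 (without the spaces)
```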
If this is all a bit theoretical you can just get NumPy's finfo function to tell you all about the machine precision
np.finfo(np.float64)
finfo(resolution=1e-15, min=-1.7976931348623157e+308, max=1.7976931348623157e+308, dtype=float64)
Note that \(2^{-52}\approx 2.22\times 10^{-16}\), which accounts for the value \(10^{-15}\) of the resolution. This can be checked by finding how small a number \(x\) has to be before \(1.0 + x\) is treated as 1.0.
x = 1.0
while 1.0 + x != 1.0:
    x /= 1.01
print(x)
1.099427563084686e-16
For binary32 we have a resolution of \(10^{-6}\).
np.finfo(np.float32)
finfo(resolution=1e-06, min=-3.4028235e+38, max=3.4028235e+38, dtype=float32)
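The same game can be played in single precision (a sketch of my own, analogous to the loop above); the loop now stops at roughly \(2^{-24}\approx 6\times 10^{-8}\):

```python
import numpy as np

x = np.float32(1.0)
while np.float32(1.0) + x != np.float32(1.0):
    x /= np.float32(1.01)
print(x)   # roughly 6e-8, i.e. about 2**-24
```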
One lesson from this is that taking small differences between nearly equal numbers is a potential source of rounding error (and a favourite ingredient of somewhat mean exam questions).
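For example (a sketch of my own, not taken from an exam): the two expressions below for \(\sqrt{x+1}-\sqrt{x}\) are algebraically identical, but the first subtracts two nearly equal numbers and loses most of its significant figures, while the second avoids the cancellation entirely.

```python
import numpy as np

x = 1e15
naive = np.sqrt(x + 1) - np.sqrt(x)          # subtracts two nearly equal numbers
stable = 1 / (np.sqrt(x + 1) + np.sqrt(x))   # same quantity, rewritten to avoid cancellation
print(naive, stable)   # the first retains only a couple of significant figures
```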
As well as a floating point system, IEEE 754 defines Infinity and NaN (Not a Number)
np.array([1, -1, 0]) / 0
RuntimeWarning: divide by zero encountered in true_divide
RuntimeWarning: invalid value encountered in true_divide
array([ inf, -inf, nan])
They behave as you might guess
2 * np.inf, 0 * np.inf, np.inf > np.nan
(inf, nan, False)
NaNs propagate through subsequent operations
2 * np.nan
nan
which means that if you get a NaN somewhere in your calculation, you’ll probably end up seeing it somewhere in the output (which is the idea).
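If you'd rather catch them before they reach the output, NumPy provides predicates for exactly this (a small sketch of my own):

```python
import numpy as np

result = np.array([1.0, np.nan, np.inf])
print(np.isnan(result))     # [False  True False]
print(np.isfinite(result))  # [ True False False]
```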