If you have answered yes, then you are betting on being lucky. If you are saying: “*I always have done so and never had any issue!!!*“, well, (1) you are lucky that you never executed your code with numbers that would cause problem, (2) you are using a library that handles these things under the hood for you, and/or (3) without knowing, you are using an algorithm that is not susceptible to the issues we are going to discuss here.

## How errors are introduced?

Let’s first learn how numbers are represented on computers, and then we look into how that leads to round-off error.

### How numbers are represented using IEEE-754?

Based on IEEE-754, a floating point is represented using the following equation:

where:

- is a sign bit: 0 being positive and 1 being negative,
- is called Mantissa. It is 23 bits long for float-32 and 52 bits long for float-64,
- is the base, which is 2,
- is the exponent which is 8 bits long for float-32 and 11 bits long for float-64,
- and finally, is the exponent bias. It is 127 for float-32 and 1023 for float-64.

To add a bit of precision, when storing the numbers in a normalized form, the Mantissa is assumed to be part of a fractional binary numbers, i.e. it is assumed to represent the in , and is called fraction.

Confusing? I know. Let’s provide an example and see how this works. Let’s say we have:

- ,
- ,
- .

**(Q)** Is this a positive or negative number?

**(A)** It is a positive number since

**(Q)** Is this a float-32 or float-64?

**(A)** It is a float-64 because has 11 bits. Therefore, must have in total 52 bits.

**(Q)** What is the number being represented by these bits?

**(A)** Let’s first convert to it’s decimal value, which is 1025. The bit at position 11 and 1 are set to 1 and all the other bits are zero therefore:

Now Let’s convert the Mantissa to a decimal value. Remember that Mantissa was representing the fraction, so don’t forget the 1 before the period. For Mantissa we have:

So, now if we put all these in our equation we have:

Now that you have learned how numbers are represented on computers, you can see that not all decimal points could be mapped exactly to a number in binary format using the formula mentioned above. That is the first source of error creeping in your computations.

### Round Off Error

All the operations are performed using numbers which are represented using the above formula. As mentioned in previous section, not all decimal numbers have an exact representation in that format and therefore, the moment you start doing computation, if that number is not exactly representable with the above model, you are introducing some errors into your calculations.

But the round-off error is not limited to only the numbers that are not exactly representable. The operations and calculations performed on the number represented using above formula, inherently has some errors. Therefore, even if the numbers are exactly representable, as you do calculations, there would be some errors gradually creeping into your calculations.

Machine Accuracy () is defined as the smallest number that you can add to 1.0 and get a number that is not equal to 1.0. You can check these numbers using python as follows:

```
import numpy as np
print(np.finfo("float32").eps) # will print 1.1920929e-07
print(np.finfo("float64").eps) # will print 2.220446049250313e-16
if (1.0 + np.finfo("float64").eps) == 1.0:
print("equal")
else:
print("not equal")
# "not equal" will be printed
if (1.0 + np.finfo("float64").eps / 2.0) == 1.0:
print("equal")
else:
print("not equal")
# "equal" will be printed
```

Just a reminder that machine accuracy is not the smallest number that you can represent.

Now if you perform operations, and if you are extremely lucky, the total amount of error that you add to your calculation would be on average . However, the issue is that these errors across different operation do not necessarily cancel each other out. Also, depending on the operation that you are doing and the numbers that you are using, things could get bad pretty fast. For example, we all know from elementary math that the roots to a quadratic function, i.e. could be calculated using:

That equation works when you are doing calculations by hands. But if you are coding a library to calculate the roots, the formula above could lead to a very wrong answers if and are not comparable. A more robust way of finding the solutions to a simple quadratic formula is:

first, calculate:

after that the two solutions are:

## Matrix Inversion and Condition Numbers

In a previous post (here) we show how to solve a system of linear equations, or:

We discussed three different cases:

**Case A:**where matrix has the same number of rows and columns and it is a square matrix,**Case B:**where matrix has more rows than columns. We mentioned that this is commonly known as regression, and**Case C:**where matrix has less number of rows than columns. We show that in this case, if there is one solution, there are many, i.e. non-unique solution.

In this post we focus on square matrix, i.e. Case A. Notice that in Case B, we change the matrix into a square matrix by multiplying it with . Also, in Case C, the part that we were doing matrix inversion involved a square matrix. So, we are safe to limit our discussion to only invertible square matrices. We mentioned that if a matrix has non-zero determinant we are able to calculate it’s inverse, or .

But are we really able to calculate ?

Let’s put aside the fact that directly computing is computationally intensive and as the size of the matrix increases, the number of operation that you need to do increases very fast. So, pretty much for many practical reasons, you won’t directly calculate . It would be very inefficient to calculate .

Here, we are going to discuss, that efficiency aside, directly calculating might suffer from numerical inaccuracies. As we discussed, when you are dealing with numbers, you are always dealing with errors. For one, some numbers are not exactly represented. That’s one source of error. If the equations that you are solving deals with natural or physical measurements your would have some errors associated to it, simply because every measurement methods will have some errors associated with it.

So in reality, by the time that you enter your numbers in the computer, your is not the actual error-free values. Considering this error, what you are actually solving is the following equation:

where we have:

- is the true value or the error-free version of, well, , and
- is the error that was introduced into the value of .

Like wise we have and is the error for .

Now the questions that we are interested to answer is:

You can also say, we are looking to find out how sensitive is our results (i.e. outputs) to the error and changes on the input.

Notice that the relative error in the outputs are measured by:

is any norm that you like to use. Usually it is norm.

Likewise, the error in the inputs are:

**NOTE:** recall that when solving a system of linear equations, i.e. , your unknown or the output is and your known or the input is /

Also notice that:

No if we look at

Now we know that , therefore we can replace with and write:

Now if we rearrange the equation above, we get:

You see that the error in the inputs, i.e. are multiplied by and that gives you an upper bound of how much the error of the output could grow based on the errors on the inputs.

That multiplying factor is called the condition number of a matrix, or:

It is expected that if , that you would roughly lose another n-digits of accuracy on top of whatever you are already losing due to floating-point operation. So, as you can see, if the condition number is big, that could be a problem. Let me correct myself. If the condition number is big, that is a problem, if you don’t use a proper method that tames your system to behave nicely. Essentially, If you don’t control the error propagation from the inputs to the outputs, if the condition number is big, your solution (i.e. outputs) are dominated by error.

## Condition Number Range

It would be nice if condition number of a matrix is less than 1, i.e. , right? If the condition number was less than 1, then the error in the inputs would have been attenuated and you would have received a more accurate results. That would be great.

If you are feeling something is not correct with what you just read, well, you guessed it right! Unfortunately, the condition number cannot be less than 1. In fact, 1 is the lowest condition number that you can get. And for sure, it cannot be a negative number, hello!!! remember properties of the norms?

As you can see from the above equation, the higher the condition number is, the higher the upper bound of the error is. And Although we call it upper bound, don’t feel that you are so lucky and you don’t reach that upper bound error. In many practical and real world problems, if you are dealing with a system which has a high condition number, you would need to pay extra attention on how to efficiently solve that problem. In fact, many algorithms are just dealing with this issue.

When a problem has a high condition number, we call that problem **ILL-CONDITIONED**. Shocking, right? Who would have thought we would call a problem with troubling high condition number as ill-conditioned problem.

## Did we forget about any other error?

In the above computation, you noticed that we assumed has error and we introduced . We studied how the error in the inputs, i.e. is going to determine an upper bound for error in the outputs, i.e. .

mmmm!!! I wonder if we forgot something?

Let’s see! In the equation , we have and . is the input and is what we compute when inverting . Oh wait! What about itself?

We assumed that is accurate. However, for the same reasons that we can get some error in vector, we can also have errors in matrix . Either because inaccuracies in number representations, or due to measurements. Let’s say you do a model like this:

In the equation above, you are trying to model using parameters shown as , and you want to know what are the best values for . What you will do is that you go to the field (or the lab), measure bunch of numbers. Essentially you measure , these are your predictors, and then you also measure , this is what you want to predict using your predictors, where subscript denotes the measurement number, eg. first measurement, second measurement, i-th measurement. When doing regression, you do know what are and you also know a value for . What you don’t know is the values for . So, and are your inputs, known values, and are your unknowns or outputs. Essentially what you are solving is this:

As you can see, as much as could deviate from its actual value (sometimes you might read perturbed), the matrix has the same potential. So, like before we define , we want to know how much that would affect our outputs, i.e. . You can show that in this case:

where:

Notice that the above condition is only valid if you have .

## This is nice; but does it matter in any practical problems?

More than you think. In fact, the only time that these things doesn’t cause any problem is when you are doing class projects. There, everything is nicely arranged that you won’t get into any trouble. Unless of course, you are doing a class on perturbation and errors in computing system; then all you get is among the messiest stuff.

Let’s say you get a data and you want to fit a polynomial to it. Polynomial is one of the easiest model, right? You don’t want to get things too complicated. So, you model your problem as:

So, once you collect your data your matrix will like something like this:

Cool; Polynomials are easy. This should be straight forward, I just compute

and I am done. right? I mean polynomials are the most widely used methods; Everyone is using it, so this should be easy right?

As you guessed, right again, nope! It is not that easy. In fact, This matrix could get ill-conditioned so fast, that it has a name. It is called Vandermonde Matrix. Look it up. There are many papers and methods on how to solve a system defined by Vandermonde matrices efficiently and properly.

## But I used MATLAB’s “\” without any issues so far?

Congratulations. Yes, you did use a good approach. As MATLAB says in its own documentation, when solving , one way of solving it would be , but a better way would be doing . As MATLAB documentation says, it is better both in terms of execution time as well as numerical accuracy of the solution.

The things is that “” operator in MATLAB is not exactly equal to calculating the . In fact if you check the MATLAB documentation on `mldivide`

, and scroll down to the algorithm section of the documentation, you would see that depending on the matrix , a different method is being used to solve for X, without really ever inverting .

The good thing about “” is that it even knows how to solve non-square matrices. So, you don’t need to manually multiply .

## Conclusion

When doing computation, you need to understand that unlike the public perception, computers are not accurate in their representation of numbers. You would need to make sure that you are using a method that is not susceptible to errors and perturbations.

This means that sometimes mathematical equations do not directly translate to code. You would need to modify your algorithm and account for errors. We also discussed that the error is not always because of how the computers represent numbers. The error knows how to creep in, like through your measurements. So, while pay attention to reducing errors (or should I say blunders), but there are some errors that you cannot reduce or eliminate. So, invest on algorithms that are not sensitive to errors, or they are smart enough to handle even an erroneous conditions.

Here, we showed how to calculate the upper bound of error using condition number. Although, we did this calculations for a linear system of equations, our favorite type of system, the same concept applies to non-linear equations and models.

We also discussed that it would be naive to think that: “my problem is not complex, I am just solving a quadratic equation!”; or “I am just fitting a simple polynomial! So, I don’t need to bother about controlling errors”. Even in this simple method, the error would creep in and make your computations wrong, if you don’t control the error. Well, let me give you even a simpler problem:

```
import numpy as np
a = np.sqrt(5)
if a * a == 5:
print("EQUAL")
else:
print("NOT EQUAL")
# this would print NOT EQUAL
```

Even a simple calculation as when doing it on a computer.

So, to answer * “Could I directly translate a mathematical formula into computer code?”*, the answer is a big

**.**

*NO*The fact that many of us don’t need to bother with a lot of these things is that, we do have access to a lot of very good numerical libraries that thanklessly handle these things under the hood for us. So, we do not need to bother with these nuances and focus on solving the business problem that we have. So, instead of rewriting everything from scratch, favor using available libraries. You noticed that even writing a function that solves quadratic equation under all conditions wasn’t what we thought it was. Yes, the alternative was as easy as the original.

If, however, you are working on a pretty new type of problem, from time to time you might notice that your algorithm is failing, while the mathematics and equations on paper is telling you that it should work. You might go over and over reviewing your equations and not finding anything wrong. In fact, your equations are not not wrong. It is just that you assumed everything is flawlessly error free, which is not; and since you are not controlling how the errors are propagated, you are allowing the error in the inputs go rampant and dominate your outputs. This is the time to revisit this theory and make adjustment as necessary.