This post is part of my Game Math Series.

I first saw this technique in an article, Hacking Quaternions, written by Jonathan Blow (author of Braid).

To normalize a vector, you divide its individual components by its length.

const Vec3 Normalize(const Vec3 &v)
{
const float len_sq = v.x * v.x + v.y * v.y + v.z * v.z;
const float len_inv = 1.0f / std::sqrt(len_sq);
return Vec3(v.x * len_inv, v.y * len_inv, v.z * len_inv);
}

The bottle neck of this function is the one divided by a square root, since the square root function is generally not fast. We can do better by using a polynomial approximation of the function around if all we want is to re-normalize unit vectors, which is a very common scenario in games.

Jonathan Blow says that it is more than enough to approximate the function with a polynomial of degree 1 (a straight line); for demonstrative purposes, however, I will show how to approximate around using a polynomial of degree 2 (a quadratic curve). I will use the method covered in this post.

Our polynomial of degree 2 includes three terms:

We will approximate the function around , so we will impose all three conditions at , namely the curve value, first-order derivative, and second derivative, at .

With , we get:

After solving the system of equations, we have:

Let’s see how our approximation looks:

The blue curve is the original function , and the red curve is our approximation. It looks pretty accurate around .

Finally, this is what our faster re-normalization function of unit vectors looks like:

inline float FastSqrtInvAroundOne(float x)
{
const float a0 = 15.0f / 8.0f;
const float a1 = -5.0f / 4.0f;
const float a2 = 3.0f / 8.0f;
return a0 + a1 * x + a2 * x * x;
}
const Vec3 Normalize(const Vec3 &v)
{
const float len_sq = v.x * v.x + v.y * v.y + v.z * v.z;
const float len_inv = FastSqrtInvAroundOne(len_sq);
return Vec3(v.x * len_inv, v.y * len_inv, v.z * len_inv);
}

The performance gain is only about 1.25x on my machine, and it’s a little bit faster using just a polynomial of degree 1. It’s not much, but it’s still beneficial to a game since this is a very commonly used function.

I also compared with the performance of Carmack’s Inverse Square Root and found out that both methods are about the same speed on my machine.

*Related*

## About Allen Chou

Physics / Graphics / Procedural Animation / Visuals

Hey Allen, just stumbled upon this from your tweet. I was curious what the error was like on this versus Quake’s FastInvSqrt, so I threw this together: https://plot.ly/~jason.meisel/1/?share_key=CxRuewgKmK6ZNghMEdiU5e

Not surprisingly, your function is slightly more accurate between 0.9 and 1.1. However, even at its worst, FastInvSqrt is only about 0.0022 off, while yours is practically unusable past those bounds.

Neat!

Thanks for taking your time to do the comparison. Yeah, the polynomial approximation is really limited to being used on re-normalizing unit vectors (or any thing that stays around 1.0) and nothing else.

How exactly have i to understand the graphs? If you have a vector with a large or very smalll length, then the error will be significant, wont it? So you should use the approximation if the vector is around the length one?

PS: after the grapghs you wrote, the blue curve is f(x)=1/x but it must be 1/sqrt(x) 😉

Correct. This approximation is only for re-normalizing vectors that are already nearly normalized. For instance, this approach can be used to re-normalize the front, right, and up vectors of your camera matrix every frame. Another example is re-orthogonalizing an orientation matrix, where you convert an orientation matrix to a quaternion, re-normalize the quaternion using this approach, and then convert the quaternion back to the orientation matrix.

Also, thanks for pointing out the typo. It is fixed now.