The Right Way to Calculate Stuff


1-cos(x)

Use 2*sin(x/2)2.

If you have gnuplot, you can use it to show how badly 1-cos(x) degrades when calculated naively, and that the latter formulation behaves well (all the way down to the smallest expressible numbers, which, on my computer, is the same in gnuplot as for doubles in a C/C++ program):

        % gnuplot
        gnuplot> plot [-6:6] 1-cos(x), 2*sin(.5*x)**2
        gnuplot> plot [-4e-8:4e-8] 1-cos(x), 2*sin(.5*x)**2
        gnuplot> plot [-1e-152:1e-152] 1-cos(x), 2*sin(.5*x)**2 
Note also that it's not just for tiny x that 1-cos(x) catastrophically loses precision, it's near any multiple of 2*pi:
        gnuplot> plot [2*pi-4e-8:2*pi+4e-8] 1-cos(x), 2*sin(.5*x)**2 

cos-1(1-x)

Use 2*sin-1sqrt(x/2).
        gnuplot> plot [0:2] acos(1-x), 2*asin(sqrt(.5*x))
        gnuplot> plot [0:1e-15] acos(1-x), 2*asin(sqrt(.5*x))
        gnuplot> plot [0:1e-304] acos(1-x), 2*asin(sqrt(.5*x)) 

ex-1

Use expm1(x) if your math library (BSD Unix or conforming) has it; that's what it's for. If not, if you have tanh (and if it is not implemented naively in terms of ex as (ex-e-x)/(ex+e-x) !) you can use that: by definition,
        tanh(x/2) = (ex/2-e-x/2)/(ex/2+e-x/2)
                  = (ex-1)/(ex+1)
    
so
        ex-1 = tanh(x/2) (ex+1)
    
Or, use the following C code (I found this, with analysis, in William Kahan's web page http://www.cs.berkeley.edu/~wkahan/Math128/Sumnfp.pdf ; it achieves “nearly full working relative accuracy despite cancellation no matter how tiny x may be”):
        #include 
        double expm1(double x)
        {
            double u = exp(x);
            if (u == 1.)
                return x;
            if (u-1. == -1.)
                return -1.;
            return (u-1.)*x/log(u);  /* where log is natural logarithm */
        }
    
(XXX note, the above can be easily modified to get a nice expm1_over_x function as well; this is mentioned and analyzed in Nicholas J. Higham's book Accuracy and Stability of Numerical Algorithms, section 1.14.1, with attribution to Kahan)

ln(1+x)

Use log1p(x) if your math library supports it. If not, you can use atanh (if you have it, and if it is not implemented naively as log((1+x)/(1-x))/2 ):
        ln(1+x) = 2*atanh(x/(x+2))
    
Or, use the following, again due to Kahan (he's the undisputed master of this stuff):
        #include 
        double log1p(double x)
        {
            double u = 1.+x;
            if (u == 1.)
                return x;
            else
                return log(u)*x/(u-1.);
        }
    
(XXX just like the expm1 function mentioned above, this can be easily modified to get a nice log1p_over_x function)

tanh(x)

Assuming your math library doesn't have tanh (or it's broken), use the expm1 function (either from your math library, or the implementation above):
        tanh(x) = (ex-e-x) / (ex+e-x)
                = (e2*x-1) / (e2*x+1)
                = expm1(2*x) / (expm1(2*x) + 2)
    
(note that only one call of the expm1 function is required here).
(XXX uh oh, that overflows prematurely, when tanh should never overflow... should try to rearrange it.)

tanh-1(x)

Use the log1p function (either from your math library, or the implementation above):
        tanh-1(x) = log((1+x)/(1-x)) / 2
                  = log((1-x+2*x)/(1-x)) / 2
                  = log(1 + 2*x/(1-x)) / 2
                  = log1p(2*x/(1-x)) / 2
    

sinh(x)

        sinh(x) = (ex - e-x) / 2
                = (ex-1) * (ex+1) / ex / 2
                = expm1(x) * (expm1(x)+2) / (expm1(x)+1) / 2
    
To avoid overflow, we can re-order it as follows:
        double u = expm1(x);
        return .5 * u / (u+1.) * (u+2.);
    

sinh-1(x)

        sinh-1(x) = log(x + sqrt(x2+1))
                  = log1p(x + sqrt(x2+1) - 1)
                  = log1p(x + x2/(sqrt(x2+1)+1)   (from "sqrt(1+x)-1", below)
                  = log1p(x * (1 + x/(sqrt(x2+1)+1)))
    

cosh(x)-1

XXX To be done

cosh-1(1+x)

XXX To be done

sqrt(1+x)-1,

Often it helps to multiply an expression like this by its “conjugate”...
        sqrt(1+x)-1 = (sqrt(1+x)-1) * 1
                    = (sqrt(1+x)-1) * ((sqrt(1+x)+1) / (sqrt(1+x)+1))
                    = ((sqrt(1+x)-1) * (sqrt(1+x)+1)) / (sqrt(1+x)+1))
                    = (1+x - 1) / (sqrt(1+x)+1))
                    = x / (sqrt(1+x)+1)
    

1-sqrt(1-x)

We could use the conjugate trick again, or simply use the previous result:
        1-sqrt(1-x) = -(sqrt(1-x)-1)
                    = -(sqrt(1+y)-1)         where y = -x
                    = - y / (sqrt(1+y)+1)    by the previous result
                    = x / (sqrt(1-x)+1)
    

(1+x)2-1

This and the next are the inverse functions of the preceding two. They are easy, but instructive, and important to get right.
        (1+x)2-1 = 1 + 2*x + x2 - 1
                 = 2*x + x2
                 = x * (2 + x)
    

1-(1-x)2

        1-(1-x)2 = 1 - (1 - 2*x + x2)
                            = 2*x - x2
                            = x * (2 - x)
    

sin(x)/x

When we say sin(x)/x, we generally mean sin(x)/x if x != 0, 1 if x = 0.

But do we want to make a special case for very small x, to avoid taking the ratio of two tiny numbers?

Assuming a well-behaved implementation of sin(x), it will actually return x for sufficiently small x, down to the smallest expressible floating-point number, so the following is probably okay, uningeneous as it may seem:

(1)         if (x == 0.)
                return 1.;
            else
                return sin(x)/x; 
However, I'm not sure I trust sin(x) down that small, and furthermore the answer is indistinguishable from 1 long before we get anywhere close to the smallest representable number, so let's see how far we can go in the other direction, i.e. how large x can be before it becomes necessary to actually calculate sin(x)/x. Note that the power series expansion of sin(x)/x is:
            sin(x)/x = 1 - x2/3! + x4/5! - x6/7! + ...  
so if x is small enough so that 1-x2/3! is indistinguishable from 1, then sin(x)/x will be indistinguishable from 1 anyway. So we can say:
(2)         if (1. + x*x*(1./6.) == 1.)
                return 1.;
            else
                return sin(x)/x; 
Since there is a lot of slack between (1) and (2) (either of which should theoretically work), I suggest the following very simple test:
            if (1. + x*x == 1.)
                return 1.;
            else
                return sin(x)/x; 

(1-cos(x))/x

The logic here is similar to that for sin(x)/x above.
            (1-cos(x))/x = x/2! - x3/4! + x5/6! - x7/8! + ...  
so if x is so small that x/2 - x3/24 is indistinguishable from x/2, then the result will be indistinguishable from x/2. This is pretty much the same as 1/2 - x2/24 being indistinguishable from 1/2, but maybe off by as much as 1 ULP (if x is just on the wrong side of a power of 2, or something) so we could be conservative and test whether 1/2 - x2/12 is indistinguishable from 1/2 instead. This is exactly the same as asking whether 1 - x2/6 is indistinguishable from 1 (coincidentally, the exact same condition we arrived at for sin(x)/x above). So the simpler (and slightly harder to satisfy) test should be fine:
            if (1. + x*x == 1.)
                return .5*x;
            else
                return cosf1(x)/x; // where cosf1(x) is the robust implementation
                                   // of 1-cos(x) described earlier 

angle between unit vectors

Don't ever use cos-1(u dot v)! If you ever see yourself taking the inverse cosine of a number close to 1 (or -1), you have usually done something bad: at worst, the argument will be slightly > 1 due to floating-point roundoff error, in which case you get a divide-by-zero error; but even if you avoid this by clamping to 1 before taking the inverse cosine, you are still usually throwing away half the bits (or more) that you could have had.

The following works well:

          2*atan2(|u-v|/2, |u+v|/2)
        = 2*atan2(|u-v|, |u+v|) 
however, it ends up requiring more square roots than necessary, so use this instead.
        if (dot(u,v) < 0.)
            return M_PI - 2*asin(|-v-u|/2)
        else
            return 2*asin(|v-u|/2) 

slerp (spherical linear interpolation) and qslerp (quaternion interpolation)

We know from Shoemake's article [Ken Shoemake, Animating rotation with quaternion curves in SIGGraph '85 proceedings] that spherical linear interpolation between unit vectors (or unit quaternions) v0, v1 can be computed as:
                          sin((1-t)*Ang)       sin(t*Ang)
(1)     slerp(v0,v1,t) = ----------------*v0 + -----------*v1
                             sin(Ang)           sin(Ang) 
where Ang is the angle between v0 and v1.

This is very elegant, but there are at least three things to watch out for:

  1. Calculating Ang robustly, especially when near 0 or 180 degrees. A numerically robust way to find Ang is described earlier on this page (“angle between unit vectors”).
  2. When Ang is 0 degrees or close to it, the above expression is a bunch of zero-divided-by-zeros. But note that we can rewrite it to take advantage of tools we already have:
                      sin((1-t)*Ang)/((1-t)*Ang)              sin(t*Ang)/(t*Ang)
    slerp(v0,v1,t) = ----------------------------*(1-t)*v0 + --------------------*t*v0
                            sin(Ang)/Ang                        sin(Ang)/Ang
    
                           sin_over_x((1-t)*Ang)              sin_over_x(t*Ang)
    (2)            =      -----------------------*(1-t)*v0 + -------------------*t*v1
                              sin_over_x(Ang)                  sin_over_x(Ang) 
    where sin_over_x(x) is the robust implementation of sin(x)/x described earlier on this page. Note that this final expression behaves very nicely when Ang is close to 0: all the sin_over_x's approach 1, and so the whole expression approaches (1-t)*v0 + t*v1, i.e. linear interpolation, as it should.
  3. When Ang is 180 degrees or close to it, both expressions (1) and (2) fall apart (infinity minus infinity). And for good reason: v0 and v1 point in opposite directions, so the answer is ambiguous (but not completely arbitrary! -- there is only one degree of freedom); furthermore, even if a particular path is chosen, there is no way to express any solution as a linear combination of v0 and v1, so no clever rewriting of (2) can help.

    In this case we can pick an arbitrary vector v.5 perpendicular to v0 (and hence v1), and rewrite the problem as:

    (3)         slerp(v0,v1,t) = slerp(v0,v.5,2*t),        if t < .5
                                 slerp(v.5,v1,2*(t-.5))    if t >= .5.
                
    Unfortunately I don't see a graceful way to ease into this case (among other problems, it's impossible to come up with a continuous vector function f(v) from the set of all unit vectors to itself such that f(v) is perpendicular to v for all v; this is the unsolvable “combing a fuzzy sphere” problem). The consequences of using (2) when v0 and v1 differ by close to 180 degrees can be unstable arithmetic, resulting in answers that are truly bad (i.e. don't represent *any* of the infinitely many legitimate spherical linear paths from v0 to v1), so we would do well to switch to avoid it in favor of method (3) fairly early. The best I can suggest is to use a semi-arbitrary threshold test, e.g. if (1. + |v1+v0|4 == 1.) then use (3). (On my computer, using double-precision arithmetic, this is equivalent to roughly |v1+v0| < 1e-5 or so.) Note that splitting (3) into two cases guarantees that the beginning and end of the path will be v0 and v1 respectively (whereas just using slerp(v0,v.5,2*t) may result in missing v1 by more than necessary).

    (XXX actually I think we can do a bit better than this by *always* using (3), with a vector v.5 perpendicular to v0 and v1, chosen using some clever application of atan2 that gracefully degrades into semi-randomness but still always produces a vector exactly half way between v0 and v1.)
    (XXX Or, come up with a vector 90 degrees away from v0 in the direction of v1 (gracefully degrading into some random direction when v1 = v0 or -v0) and slerp between v0 and that instead of v0 and v1. But then the clever equations (1) and (2) become completely irrelevant :-) Actually, I think this is the right way to do it and this section should be rewritten...)

    Fortunately for qslerp this case doesn't happen in practice: if q is a unit quaternion then -q represents the same orientation as q, so when qslerping from q0 to q1 we usually negate either q0 or q1 if their dot product is < 0 (i.e. when their angular distance on the 4D sphere is > 90 degrees, i.e. their distance as a 3D rotation is > 180 degrees around some axis), and then (2) becomes stable.


rotation matrix to euler-angles conversion

Several implementations of this have been published which give wrong answers in cases when there is more than one solution (e.g. Shoemake, Euler Angle Conversion, in Graphics Gems IV).

For definiteness, let's consider heading/pitch/roll Euler angles (or yaw/pitch/roll if you like). The subtlety occurs when the pitch is 90 degrees (or close to it): in this case any heading value could alternatively be expressed as roll, and vice-versa. So, if you are asked to decompose a rotation (matrix) in which the pitch is 90 degrees, you can choose the heading arbitrarily and then choose the roll in terms of it (or vice-versa).

Notice that in this case there is one degree of freedom in your answer, but not two! So if you have determined that the pitch is 90 degrees (or sufficiently close), you can't just throw your hands up in the air and return arbitrary numbers (e.g. zero) for both heading and roll; in general, you won't get a representation of the original matrix, which would be a bad bug in your implementation. You can return an arbitrary number for either heading or roll, but then that determines the other.

A good robust way do do this is to start by choosing the heading (or roll) using the atan2 function as usual. Note that if the pitch is 90 degrees or close to it, then you will be taking the atan2 of two numbers very close (or equal) to zero, which is unstable of course, but the worst atan2 can do is return a random angle, which is fine, and even appropriate, in this case. The next step is to create a rotation matrix representing the inverse of the heading (or roll) you have chosen, and multiply it by the original matrix on the appropriate side, effectively factoring out your chosen heading (or roll). Now you are left with a simpler matrix, from which you can robustly extract the remaining two euler angles, and when you compose them all together you are guaranteed to get back the original matrix.


printing floats or doubles exactly

If using printf (C or C++), use %.9g for floats and %.17g for doubles (i.e. 9 or 17 significant digits, respectively).
Reasoning: XXX to be done

drawing arcs of big or infinite circles

XXX to be done

Last Modified: Fri Jul 4 02:17:56 PDT 2003
Don Hatch
hatch@plunk.org
Back to Don Hatch's home page.