So I stumbled upon Michael Steil’s article on Bill Gate’s Easter Egg in Microsoft Basic. While the story of burying an authorship notice among the numeric coefficients of the sine function is quite amusing in itself, my numeric eye’s attention was immediately drawn to the comments for these coefficients, first used in Microsoft’s 40-bit (5-byte) floating point Basic for the KIM and PET computers.

.;E063 05                        6 coefficients for SIN()
.;E064 84 E6 1A 2D 1B            -((2*PI)**11)/11! = -14.3813907
.;E069 86 28 07 FB F8             ((2*PI)**9)/9!   =  42.0077971
.;E06E 87 99 68 89 01            -((2*PI)**7)/7!   = -76.7041703
.;E073 87 23 35 DF E1             ((2*PI)**5)/5!   =  81.6052237
.;E078 86 A5 5D E7 28            -((2*PI)**3)/3!   = -41.3147021
.;E07D 83 49 0F DA A2               2*PI           =  6.28318531
.;E082 A1 54 46 8F 13            "SOFT!" | backwards and with
.;E087 8F 52 43 89 CD            "MICRO" | random upper bits

This was in 1979. Even with range reduction to the first quadrant \(0\le x\le\frac{\pi}{2}\), using an \(n\)th order truncated Taylor series

\[\sin(2\pi t)\approx\sum_{i=0}^{(n-1)/2}(-1)^{i}\frac{\;\;(2\pi t)^{2i+1}}{(2i+1)!};\;\; x\equiv 2\pi t\]

with its rapidly increasing truncation error for increasing arguments doesn’t exactly look plausible (more details on why in Part 2). Besides, back then, every self-respecting developer of math functions for calculators and computers implemented trigonometrics using CORDIC, didn’t they?

On the other hand, four years later still, the 1983 edition of Dr. Dobbs Journal (Vol 8, pp 15+16/810 in the PDF – 400MB download) recommended using the (4,3) Padé approximant

\[\sin(x)\approx S(x) := \frac{x-\frac{14}{120}x^3}{1+\frac{1}{20}x^2}\]

as being “accurate to about 1 part in 10,000 over the interval \(0\le x\le\frac{\pi}{4}\), a degree of precision which is adequate for typical microcomputer applications“. For \(\frac{\pi}{4}\le x\le\frac{\pi}{2}\), they would compute \(\small\sqrt{1-S(\frac{\pi}{2}-x)^2}\); for more details see the article. Note that this looks like a (3,2) approximant, but the even coefficients in the numerator and the odd coefficients in the denominator are 0, so the actual order is indeed (4,3).

Assuming that this notion of “adequacy” was common sense in the early 80s, no surprise colleagues dissuaded me from buying a microcomputer for doing physics calculations around 1980.

But are these numbers really the Taylor coefficients?

  Taylor term         Taylor value   Coefficient
    2*PI              6.283185307    6.28318531
 -((2*PI)**3)/3!     -41.34170224   -41.3147021
  ((2*PI)**5)/5!      81.60524928    81.6052237
 -((2*PI)**7)/7!     -76.70585975   -76.7041703
  ((2*PI)**9)/9!      42.05869394    42.0077971
 -((2*PI)**11)/11!   -15.09464258   -14.3813907

These deviations do not look fortuitous. Let us compare the absolute errors \(\left|f(x)-\sin(x)\right|\) of Microsoft’s polynomial with that of the truncated Taylor series (both 11th degree), and – just for fun – of Dr. Dobb’s (4,3) Padé approximant. Semilogarithmic plot \(0\le x \le\pi\) – reference is the NumPy IEEE double precision implementation of \(\sin(x)\):

We see that in the first octant \(0\le x \le\frac{\pi}{4}\approx 0.79\) Dr. Dobb’s claim to accuracy \(10^{-4}\) is satisfied. For Microsoft’s power series (approximation domain \(0\le x \le\frac{\pi}{2}\)), the error is way too high for an 11th order approximation. Were my colleagues right about avoiding microcomputers? But back then, Microsoft Basic was ubiquitous even in scientific applications – someone must have noticed if the results were off so badly.

So maybe it’s the comments that were off. Microsoft’s 40-bit encoding of floating point numbers is well-documented, so it is easy to decode and compare the actual coefficients:

power  true (decoded)      comment     error
1      6.28318530694      6.28318531   -3.06e-09
3     -41.3417021036     -41.3147021   -0.027  <<<<<<
           ^^                 ^^
5      81.6052236855      81.6052237   -1.45e-08
7     -76.7041702569     -76.7041703    4.31e-08
9      42.007797122       42.0077971    2.20e-08
11    -14.3813906722     -14.3813907    2.78e-08

Indeed we find two transposed digits in the \(x^3\) coefficient.

What follows is the corrected plot (c_ and t_ referring to coefficients as written in the comments and their true values, and cc_ to the corrected comment value after fixing the transposition). For reference, the results of a 45-step CORDIC algorithm have been added; and note that the right half of the graph \(\frac{\pi}{2}\le x\le\pi\) is outside the domain of all of these approximations):

Note also that due to IEEE double precision numbers being used here, roundoff error becomes visible around \(10^{-15}\) instead of \(10^{-9}\).

Finally, given the requirement that a function approximation should consistently yield a specified number of correct significant digits even close to its zeros, what really matters is the relative error:

We see that across the entire first quadrant \(0\le x\le\frac{\pi}{2}\), the relative error of Microsoft’s 11th order polynomial as implemented (t_mspetkim_5_11) is well-balanced and less than \(10^{-10}\), thus consistently within the limits of the 9.6-digit numerical accuracy of 40-bit floats with their 32-bit mantissa \((32 \lg 2\approx 9.6)\), and it is substantially better than the 11th order Taylor error in the right half of this interval. We also see that Cordic’s relative error is inadequate when implemented naively. Somewhat surprising is that after correcting the transposed digits (cc_mspetkim_5_11), the values obtained with the coefficients in the comments show practically constant relative error in the entire interval \([0,\frac{\pi}{2}]\). However, this error is significantly worse than t_mspetkim_5_11.

Conclusion: someone had thoroughly done their homework devising the MS-Basic coefficients. We leave it to the reader’s discretion if the transposition in the comments was a bona fide mistake or a deliberate act of concealment of trade secrets. Or, what is probably closest to the truth as we shall see later, the comments referring to Taylor coefficients were added later, not by the original authors, who obviously knew their approximation theory.

In Part 2, we’ll start to delve into the effects of truncation and roundoff error, more MS-Basic implementations, and some theoretical background of these findings.

Notes

The Easter Egg article contains links to several versions of the source codes. The first release of Microsoft Basic for the Altair used 32-bit (4-byte) floats (6 significant decimal digits), a cursory analysis of some aspects of the algorithms (briefly noting that the coefficients are modified with respect to Taylor but not digging deeper) can be found on the Altairbasic web site.