In Part 2, we delved into the details of how and why truncated Taylor series are not the best choice for approximating elementary functions, and found that MS-Basic used better polynomials than Taylor polynomials. But typical annotations in source code listings seldom noted the obvious deviations from the Taylor coefficients, and lacked any hint at the source of these deviations. This is what sparked my curiosity.

An extensive search eventually led to the original source code for Microsoft’s GW-BASIC interpreter, as of 1983 (according to the web site). The file GW-BASIC/MATH1.ASM contains the following coefficients (lines merged) for the sine function:

$SINCN:
;       COEFFICIENTS FOR SINE FUNCTION
;       SEE HART'S #3341
;       RELATIVE ERROR 8.27
203 111 017 333     #   6.283185272
206 245 135 341     #  -41.34167747
207 043 064 130     #   81.60223119
207 231 046 145     #  -76.57498378
206 036 327 373     #   39.71091766

These are the coefficients (called $SINCN here) for a 9th-degree polynomial, coded as 32-bit MBF floats (vs. the 11th-degree, 40-bit variant; the byte representation varies between octal and hexadecimal in the various sources). The binary values are identical to those in the 1978 BASIC M6502 8K VER 1.1 BY MICRO-SOFT [sic] source, which contains both the 32-bit MBF coefficients SINCON4 for a 9th-degree polynomial and the 40-bit SINCON5 for degree 11. The values in the GW-Basic comments above are good for a little surprise – see below.

But the actual gem is the reference to “HART'S #3341”. The book Computer Approximations by John F. Hart et al. (1968) is a comprehensive textbook on the theory and practice of approximating elementary and special functions (square and cube roots; exponential, hyperbolic, and trigonometric functions and their inverses; gamma, error, and Bessel functions; and elliptic integrals). It includes the mathematical and algorithmic foundations for arriving at optimal (i.e. minimum effort at a given precision) approximations ab initio, and extensive tables of coefficients for polynomial and rational approximations. Paraphrasing the preface, this book takes the next step (the first being Hastings’s 1955 book Approximations for Digital Computers) in turning the art of computer approximation into a science. I am lucky that my library has Hart’s book for loan.

Chapter 3 describes the theory of least maximum approximations and the Remez algorithm. The question is: if you want to approximate a function \(f(x)\) on an interval \([a,b]\) by a polynomial \(p_n(x)\) with coefficients \(c_0, c_1,\ldots,c_n\), how should the \(c_i\) be chosen such that the worst-case error \(\|f-p_n\|_\infty := \max_{x\in[a,b]}|f(x)-p_n(x)|\) (i.e. the maximum absolute value of the error) is minimized? (This is sometimes also called a minimax approximation.) Variants for rational approximations and for minimizing the relative error are also given.
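The core of the Remez algorithm fits into a few lines of NumPy. The sketch below (my own illustration, not code from Hart) performs a single exchange step for a degree-9 polynomial approximating \(\sin\) on \([0,\frac{\pi}{2}]\): it solves a linear system that forces the error to alternate with equal magnitude \(E\) at \(n+2\) reference points, here chosen as Chebyshev extrema. The full algorithm would iterate, moving the reference points to the actual extrema of the error until it equioscillates.

```python
import numpy as np

def remez_step(f, n, points):
    """One Remez exchange step: solve for a degree-n polynomial p and a
    level E such that p(x_k) - f(x_k) = (-1)^k * E at the n+2 points."""
    A = np.vander(points, n + 1, increasing=True)      # columns x^0 .. x^n
    signs = (-1.0) ** np.arange(len(points))           # alternating +1 / -1
    M = np.hstack([A, -signs[:, None]])                # unknowns: c_0..c_n, E
    sol = np.linalg.solve(M, f(points))
    return sol[:-1], sol[-1]                           # coefficients and E

a, b, n = 0.0, np.pi / 2, 9
k = np.arange(n + 2)
pts = (a + b) / 2 + (b - a) / 2 * np.cos(np.pi * k / (n + 1))  # Chebyshev extrema
coef, E = remez_step(np.sin, n, pts)
x = np.linspace(a, b, 10001)
err = np.polynomial.polynomial.polyval(x, coef) - np.sin(x)
print(f"level E = {E:.2e}, max error on grid = {np.abs(err).max():.2e}")
```

Even this single step from a Chebyshev reference already lands very close to the true minimax polynomial; subsequent exchanges refine the last few digits.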

Almost half of the book consists of tables. Here we find SIN 3341 and SIN 3342, which are effectively (up to a scale transformation) coefficients for 9th and 11th degree approximation polynomials for \(\sin(x)\) that minimize the relative error for \(x\in[0,\frac{\pi}{2}]\). The counterparts for minimal absolute error are SIN 3301 and SIN 3302.
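The scale transformation is easy to check numerically. The leading coefficient 6.283185… \(= 2\pi\) suggests (my interpretation) that the polynomial argument \(q\) is the angle measured in full turns, so \(S(q)=\sum_k c_k q^k\) should approximate \(\sin(2\pi q)\) for \(q\in[0,\frac14]\). A quick test with the comment values from the GW-BASIC listing:

```python
import math

# Comment values from the GW-BASIC listing (Hart's SIN 3341, rescaled);
# assumption: q is the angle in full turns, S(q) ~ sin(2*pi*q) on [0, 1/4]
C3341 = [(1, 6.283185272), (3, -41.34167747), (5, 81.60223119),
         (7, -76.57498378), (9, 39.71091766)]

def S(q):
    return sum(c * q**k for k, c in C3341)

for q in (0.125, 0.25):                  # 45 and 90 degrees
    err = S(q) - math.sin(2 * math.pi * q)
    print(f"q = {q}: S(q) = {S(q):.10f}, error = {err:.2e}")
```

Both test angles come out correct to roughly the 8 significant digits promised by the “RELATIVE ERROR 8.27” comment.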

First, let us compare the error graphs of the sine polynomial approximations using Hart’s coefficients vs. the SINCON coefficients of MS Basic, starting with Hart’s 11th-degree polynomials vs. SINCON5:

We see that the Hart polynomials are perfectly balanced; as expected, the oscillations of SIN 3342 are smaller close to the origin, to keep the relative error within consistent bounds. SINCON5 appears to be systematically off; we’ll return to this later.

Due to symmetry, we can limit our attention to \(0\le x\le\frac{\pi}{2}\). For the next graph, which compares the relative errors of the 9th- and 11th-degree approximations, we return to a semilogarithmic scale and additionally evaluate Hart’s polynomials at the respective floating-point precisions at which they were used in the code (3341 at 32 bits, 3342 at 40 bits).

While both the SINCON4 and SINCON5 polynomials systematically differ from Hart 3341 and 3342, they lie within the numerical limits of the respective floating-point representations MBF32 and MBF40.

Note that the cusps correspond to zeros of the error. The equioscillation theorem states that, for sufficiently well-behaved functions, the error \(p_n(x)-f(x)\) of the best approximation attains its maximum magnitude, with alternating signs, at (at least) \(n+2\) points \(x\in[a,b]\) (including the ends of the interval), so the error as a function of \(x\) will have \(n+1\) zeros. This is exactly what we expect to see in the absolute error of the related Hart 3302 approximation, which has been optimized for absolute error:

Astute readers will notice, though, that there is one zero too many – this is an effect of \(\sin(x)\) actually being approximated by a 5th-degree polynomial in \(x^2\), i.e. \(S(x)=x\,P_5(x^2)\), which guarantees that \(S(x)\) is an odd polynomial, including its zero at the origin. Indeed, the error of \(P_5\) has 6 zeros in \([0,\frac{\pi}{2}]\).

For Hart 3342 and the relative error, the zero at the origin is turned into a maximum (\(\frac{\mathrm{err}(x)}{\sin(x)} \to \frac{0}{0}\) – apply L’Hôpital’s rule), so the relative-error graph has one zero less:

Let us return to the quality of SINCON5. Can its deviations from the Hart coefficients be explained by the limited precision of the coefficients as represented in MBF? We compare Hart’s SIN 3342 with versions whose coefficients \(c_i\) are randomly modified by 5 parts in \(10^{10}\) (i.e. \(c_{i,r}=c_i\times(1+\mathcal{N})\), where \(\mathcal{N}\) is a Gaussian random variable with \(3\sigma=5\times10^{-10}\)), which corresponds roughly to the expected round-off error of MBF40.

The orange graphs are 10 random variants of Hart 3342. SINCON5 is qualitatively and quantitatively well within the range of these random variations.
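For reproducibility, here is a sketch of the perturbation experiment, using the decoded SINCON5 coefficients (listed as t_sincon5 further below) and assuming, as before, that the argument \(q\) is the angle in full turns:

```python
import numpy as np

# Decoded SINCON5 coefficients (t_sincon5, tabulated further below);
# S(q) = sum_k c_k q^k approximates sin(2*pi*q) on [0, 1/4]
C = {1: 6.28318530694, 3: -41.3417021036, 5: 81.6052236855,
     7: -76.7041702569, 9: 42.007797122, 11: -14.3813906722}

def max_rel_err(coeffs):
    q = np.linspace(1e-3, 0.25, 2001)           # avoid the 0/0 at the origin
    s = sum(c * q**k for k, c in coeffs.items())
    return np.max(np.abs(s - np.sin(2 * np.pi * q)) / np.sin(2 * np.pi * q))

base = max_rel_err(C)
rng = np.random.default_rng(0)
# perturb each coefficient by a Gaussian with 3*sigma = 5e-10 (~ MBF40 round-off)
worst = max(max_rel_err({k: c * (1 + rng.normal(0.0, 5e-10 / 3))
                         for k, c in C.items()})
            for _ in range(10))
print(f"unperturbed: {base:.2e}, worst of 10 perturbed variants: {worst:.2e}")
```

Since the perturbations are of the order of the MBF40 round-off, the worst perturbed variant stays in the same error regime as the unperturbed polynomial.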

According to various sources (e.g. the Pagetable article), Monte Davidoff wrote the math package, and as far as I can tell (aside from byte order and octal vs. hexadecimal notation), it has never been touched again: all published versions of the MS-Basic source code implement identical coefficients SINCON4 and SINCON5.

The numbers given in the comments, though, vary greatly, as illustrated by the following graphs:

Curiously, the GW-Basic comments give accurate values that agree with Hart’s SIN 3341, although the “true” coefficients are less accurate – which raises the question of whether even these comments might be a later addition.

For the 11th degree approximations, the original coefficients in the comments are way off, in both cases as a result of typos.

t_ = TRUE coefficients converted from binary representation
c_ = coefficients from COMMENTS

c_mspetkim_5:

 n  t_sincon5        c_mspetkim_5  error
 1    6.28318530694    6.28318531  -3.0600002531855353e-09
 3  -41.3417021036   -41.3147021   -0.02700000360000132
         ^^               ^^
 5   81.6052236855    81.6052237   -1.4500002976092219e-08
 7  -76.7041702569   -76.7041703    4.310000178975315e-08
 9   42.007797122     42.0077971    2.2000001820288162e-08
11  -14.3813906722   -14.3813907    2.7800000523825474e-08

c_m6502mac_sincon5:

 n  t_sincon5        c_m6502mac_sincon5  error
 1    6.28318530694    6.283185307         -6.000000496442226e-11
 3  -41.3417021036   -41.34170209          -1.3600001125269046e-08
 5   81.6052236855    81.60522369          -4.5000092541158665e-09
 7  -76.7041702569   -76.704133676         -3.658090000158154e-05
 9   42.007797122     42.07777095          -0.06997382800000196
         ^^^^^^^^         ^^^^^^^
11  -14.3813906722   -14.381383816         -6.8562000006267e-06

This concludes our musings on the MS-Basic implementation of the sine function and its background. Given the material at hand, the question of the true origin of MS-Basic’s SINCON4 and SINCON5 cannot be answered conclusively. In Part 4, we’ll take a look at the competition.

Additional material: the C64-Wiki identifies Hart’s book as the likely source of MS-Basic’s elementary functions and gives additional insight. Wikipedia’s article on approximation theory gives a good first overview of the topic.

Implementation notes: all graphs were created using Python 3.10, NumPy, Mpmath, and Matplotlib. Coefficients were taken from Hart’s book and from published source code. To demonstrate MBF32, MBF40, and FLOATnn effects, no emulation was available; instead, the Mpmath precision was set to the number of significant bits of the respective format.
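The precision trick looks like this in practice. The sketch below mimics MBF32’s 24 significant bits by limiting Mpmath’s working precision, which rounds every intermediate result; it is not a faithful MBF emulation (MBF also differs in exponent range and rounding details). Coefficients are the GW-BASIC comment values, with the same angle-in-turns scaling as before:

```python
from mpmath import mpf, workprec, sin, pi

# GW-BASIC comment values; S(q) ~ sin(2*pi*q) for q in [0, 1/4] (q in turns)
COEFFS = [mpf('6.283185272'), mpf('-41.34167747'), mpf('81.60223119'),
          mpf('-76.57498378'), mpf('39.71091766')]

def sin_mbf32(q):
    with workprec(24):                   # MBF32 carries 24 mantissa bits
        q = mpf(q)
        q2, acc = q * q, mpf(0)
        for c in reversed(COEFFS):
            acc = acc * q2 + c           # Horner in q^2, each step rounded
        return acc * q

# error at 45 degrees vs a full-precision sine
print(abs(sin_mbf32(0.125) - sin(pi / 4)))
```

All operations inside the `workprec(24)` block are rounded to 24 bits, so the result carries both the approximation error of the polynomial and the round-off of the reduced precision, just as in the original interpreter.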