Solving cubic polynomials: Difference between revisions
| Line 153: | Line 153: | ||
This approach has been implemented in plain C code as {{srcbranch|python3|models/johnpye/fprops/cubicroots.c}}. The code includes debugging output that be turned off by disabling <code>CUBICROOTS_DEBUG</code>. There is also a test suite that can be compiled by passing <code>-DTEST</code> to the compiler. Currently this code is embedded within [[FPROPS]], but it is trivial to convert it to a standalone code. | This approach has been implemented in plain C code as {{srcbranch|python3|models/johnpye/fprops/cubicroots.c}}. The code includes debugging output that be turned off by disabling <code>CUBICROOTS_DEBUG</code>. There is also a test suite that can be compiled by passing <code>-DTEST</code> to the compiler. Currently this code is embedded within [[FPROPS]], but it is trivial to convert it to a standalone code. | ||
The algorithm generally gives solutions with absolute tolerance of <math>\pm 10^{-11}</math>, in some specific cases, such as those with double or near-double roots, does a bit worse, due to limitations of double-precision floating point arithmetic. | |||
One example of a base case is | |||
:<math>\left(x + 1000\right)\left(x - 1000\right)\left(x - 1001\right) = 0</math> | |||
:<math>x^3 -1001 x^2 -1000000 x+ 1001000000 = 0 </math> | |||
This appears to fail because <math>P>>0</math> and <math>Q<<0</math>, causing <math>D<<<0</math> but at the same time<math>\frac{\left|Q\right|}{\sqrt{\left|P\right|^3}} \approx 1</math>. The precision with which <math>\frac{QS}{\left|P\right|\sqrt{\left|P\right|}}</math> can be evaluated appears to be suffering. | |||
This can possibly be improved by a variable substitution such as <math>w = x/\sqrt{\left|P\right|}</math>. | |||
== Calculation checking == | == Calculation checking == | ||
Revision as of 06:09, 3 July 2022
The purpose of this page is to describe a robust process for finding real roots of a real cubic polynomial. It turns out that this is one of the great problems in the history of mathematics, and not at all simple, as this video helpfully explains. This page attempts to stick to the essential details and describe how to implement a solution suitable for coding.
General solution
We will consider the monic cubic polynomial in <math>x</math>, with <math>a</math>, <math>b</math> and <math>c</math> real-valued:
- <math> x^3 + a x^2 + b x + c = 0 </math>
To solve for <math>x</math>, we first calculate the coefficients of an equivalent 'depressed cubic' <math>y^3 - 3P y - 2Q = 0</math>, without loss of generality, by substituting <math>x = y - a/3</math>:
- <math>3P = \frac{a^2}{3} - b</math>
- <math>2Q = -\frac{2}{27} a^3 + \frac{1}{3} ab - c</math>
If <math>P</math> and <math>Q</math> are both zero, then we can stop here with a triple root at <math>y = 0</math>, which is equivalent to <math>x = -\frac{a}{3}</math>.
If only <math>P</math> is zero, then <math>y^3 = 2Q</math>, so <math>x = \sqrt[3]{2Q}-\frac{a}{3}</math> is the single real root (regardless of the sign of <math>Q</math>).
If only <math>Q</math> is zero, the depressed cubic is simply <math>y^3 - 3Py = y \left(y^2 - 3P \right) = 0</math>, which has a real root at <math>y=0</math> and two more at <math>y = \pm \sqrt{3P}</math>, which may be either real or imaginary, depending on the sign of <math>P</math>.
Otherwise, the depressed cubic can then be converted into a quadratic in <math>z^3</math> via the 'magic' Vieta substitution <math>y = z + \frac{P}{z}</math>,
- <math>\left(z^3\right)^2 - 2Q \left(z^3\right) + P^3 = 0</math>
for which the solution is
- <math>z^3 = Q \pm \sqrt{Q^2 - P^3}</math>(1)
The original cubics (in <math>x</math> and <math>y</math>) have real coefficients, which means that any complex roots must appear with their complex conjugate. As such, the cubic equations must have either one or three real roots, and two or zero complex roots. The determinant <math>\Delta = Q^2 - P^3</math> is critical in determining which case arises. (Note that the discriminant can also be arrived at via another path with the same result.)
Case of negative determinant <math>\Delta<0</math>
If <math>\Delta = Q^2 - P^3 < 0</math> then <math>P^3</math> and hence <math>P</math> must be positive, and we must enter the complex domain to solve
- <math>z^3 = R</math> or <math>z^3 = \overline{R}</math>
where, using <math>\sqrt{\Delta}=-i\sqrt{-\Delta}</math>,
- <math>R = Q \pm i \sqrt{P^3 - Q^2}</math>
or equivalently,
- <math>\left|R\right| = \sqrt{Q^2 + P^3 - Q^2} = \sqrt{P^3} </math>
- <math>\arg R = \pm\theta</math>, where
- <math>\theta = \cos^{-1} \left(\frac{Q}{\sqrt{P^3}}\right)</math>.
The solution for <math>z</math> is then
- <math>z = \sqrt{P} \exp \left(i \frac{\pm \theta + 2\pi k}{3}\right)</math> where <math>k \in \left\lbrace 0,1,2 \right\rbrace</math>, for <math>z^3 = R</math>
There are six values for <math>z</math> here, arising from combining the three roots <math>k \in \left\lbrace 0,1,2 \right\rbrace</math> and the <math>\pm</math> cases for <math>\theta</math>.
Using these values of <math>z</math>, we can now evaluate the solutions <math>y</math> of the depressed cubic, via <math>y = z + \frac{P}{z}</math>. Here, we make use of the facts that <math>\frac{1}{r e^{i\theta}} = \frac{1}{r} e^{-i\theta}</math> and <math>e^{i\phi}+e^{-i\phi}=2\cos\phi</math>
- <math>
\begin{alignat}{2} y &= \sqrt{P} \exp \left(i \frac{\pm \theta + 2\pi k}{3}\right) + \frac{P}{\sqrt{P} \exp \left(i \frac{\pm \theta + 2\pi k}{3}\right)} \\ &= \sqrt{P} \exp \left(i \frac{\pm \theta + 2\pi k}{3}\right) + \sqrt{P} \exp \left(-i \frac{\pm \theta + 2\pi k}{3}\right) \\ &= 2\sqrt{P} \cos \left(\frac{\pm \theta + 2\pi k}{3}\right) \end{alignat} </math>
It can readily be seen that the six solutions have reduced to three, because <math>\cos\left(-\phi\right)=\cos\phi</math> and <math>\cos\left(\phi+2\pi\right)=\cos\phi</math>, and also that all of the solutions are real. We can drop the <math>\pm</math>, therefore, and write:
- <math>y = 2\sqrt{P} \cos \left(\frac{\theta + 2\pi k}{3}\right)</math>, with <math>k \in \left\lbrace 0,1,2 \right\rbrace</math>,
Finally, substituting <math>y</math> and <math>\theta</math>, a final expression for <math>x</math> is:
- <math>x = -\frac{a}{3} + 2\sqrt{P} \cos \left(\frac{1}{3}\cos^{-1} \left(\frac{Q}{\sqrt{P^3}}\right) + \frac{2\pi k}{3}\right)</math>, with <math>k \in \left\lbrace 0,1,2 \right\rbrace</math>.
After allowing for differences in notation, this solution can be seen to be identical to the one obtained via use of the cosine triple-angle identity.
Case of positive determinant <math>\Delta>0</math>
When <math>\Delta = Q^2 - P^3 > 0</math>, we can no longer infer that <math>P > 0</math>, and the complex domain is less helpful. The solution arises from the use of two 'triple angle' hyperbolic trigonometric identities, which can be readily derived from the hyperbolic trigonometric summation formulae:

- <math>\cosh 3\theta = 4 \cosh^3 \theta - 3 \cosh \theta</math>
- <math>\sinh 3\theta = 4 \sinh^3 \theta + 3 \sinh \theta</math>
We would like to get our depressed polynomial <math>y^3 - 3Py - 2Q = 0</math> into a form similar to either of the above. We can substitute <math>y = u~\mathrm{hyp}\left(\theta\right)</math> (where <math>\mathrm{hyp}</math> is an unspecified hyperbolic trigonometric function) to obtain:
- <math>u^3~\mathrm{hyp}^3\left(\theta\right) - 3 P u~\mathrm{hyp}\left(\theta\right) - 2 Q = 0</math>
Then multiply the equation by <math>4/u^3</math> (requiring that <math>u \ne 0</math>):
- <math>4 \mathrm{hyp}^3\left(\theta\right) - 3 \frac{4P}{u^2} \mathrm{hyp}\left(\theta\right) - \frac{8Q}{u^3} = 0</math>
Now, if <math>P > 0</math>, we can require that <math>\frac{4P}{u^2} = 1</math> and use <math>\mathrm{hyp}=\cosh</math>. Or, if <math>P < 0</math>, we can require that <math>{4P}/{u^2} = -1</math> and <math>\mathrm{hyp}=\sinh</math>. Together,
- <math>\cosh 3\theta = 4 \cosh^3 \theta - 3 \cosh \theta = \frac{8Q}{u^3} \text{ (for }P>0\text{), or}</math>
- <math>\sinh 3\theta = 4 \sinh^3 \theta + 3 \sinh \theta = \frac{8Q}{u^3}</math> (for <math>P<0</math>).
There are no problems with the <math>P<0</math> case, because <math>\sinh^{-1}</math> is always defined, and with <math>u=2\sqrt{\left|P\right|}</math>, we get <math>\frac{8Q}{u^3} = \frac{Q}{\sqrt{\left|P\right|^3}}</math>.
However, for the <math>P>0</math> case, <math>\cosh^{-1}</math> will only be defined when <math>\frac{8Q}{u^3}\ge 1</math>. We carefully choose <math>u = 2\sqrt{\left|P\right|}\sgn{Q}</math>, which means that our condition is
- <math>\frac{8Q}{u^3} = \frac{8 \left|Q\right| \sgn{Q}}{4P \cdot 2\sqrt{\left|P\right|}\sgn{Q}} = \frac{\left|Q\right|}{\sqrt{\left|P\right|^3}} = \frac{Q\sgn Q}{\sqrt{\left|P\right|^3}} \ge 1</math>
Given that <math>\Delta = Q^2-P^3 > 0</math> and <math>P>0</math>, hence <math>\left|Q\right| > \sqrt{\left|P\right|^3}</math>, we can see that this condition is always satisfied with the chosen <math>u</math>.
The overall result for <math>\Delta > 0</math> can then be written, using that <math>\left|Q\right| = Q \sgn{Q}</math>:
- <math>y = 2S\sqrt{\left|P\right|}~\mathrm{hyp}\left(\frac{1}{3}~\mathrm{hyp}^{-1} \frac{QS}{\sqrt{\left|P\right|^3}}\right)
\text{ where } \begin{cases} \text{if }P>0 \text{,}& \mathrm{hyp}=\cosh\text{ and } S = \sgn{Q} \\ \text{if }P<0 \text{,}& \mathrm{hyp}=\sinh\text{ and } S = 1 \end{cases} </math>
and hence
- <math>x = -\frac{a}{3} + 2S\sqrt{\left|P\right|}~\mathrm{hyp}\left(\frac{1}{3}~\mathrm{hyp}^{-1} \frac{QS}{\sqrt{\left|P\right|^3}}\right)
\text{ where } \begin{cases} \text{if }P>0 \text{,}& \mathrm{hyp}=\cosh\text{ and } S = \sgn{Q} \\ \text{if }P<0 \text{,}& \mathrm{hyp}=\sinh\text{ and } S = 1 \end{cases} </math>
The case <math>P=0</math> was already handled earlier. The above solution is a real root to the original cubic for the case <math>\Delta > 0</math>. However, it remains to be shown here why there are no other real roots in this case.
Collapsing the cases
The cases of <math>Q=P=0</math>, as well as <math>P=0</math> and <math>Q=0</math>, are all very simple, as noted above.
For <math>\Delta < 0</math>, note that it always be the case that <math>P>0</math>, hence <math>\left|P\right| = P</math> and we have:
- <math>x = -\frac{a}{3} + 2\sqrt{\left|P\right|} \cos \left(\frac{1}{3}\cos^{-1} \left(\frac{Q}{\sqrt{\left|P\right|^3}}\right) + \frac{2\pi k}{3}\right)</math>, with <math>k \in \left\lbrace 0,1,2 \right\rbrace</math>.
For <math>\Delta > 0</math>, we obtained
- <math>x = -\frac{a}{3} + 2S\sqrt{\left|P\right|}~\mathrm{hyp}\left(\frac{1}{3}~\mathrm{hyp}^{-1} \frac{QS}{\sqrt{\left|P\right|^3}}\right)
\text{ where } \begin{cases} \text{if }P>0 \text{,}& \mathrm{hyp}=\cosh\text{ and } S = \sgn{Q} \\ \text{if }P<0 \text{,}& \mathrm{hyp}=\sinh\text{ and } S = 1 \end{cases} </math>
A unified equation is therefore
- <math>x = -\frac{a}{3} + 2S\sqrt{\left|P\right|}~\mathrm{trig}\left(\frac{1}{3}~\mathrm{trig}^{-1} \left(\frac{QS}{\left|P\right|\sqrt{\left|P\right|}}\right)+\frac{2\pi k}{3}\right)
\text{ where } \begin{cases} \text{if }\Delta \le 0\text{,} & \mathrm{trig}=\cos\text{, } S = 1 \text{, and }k \in \left\lbrace0,1,2\right\rbrace \\ \text{if }\Delta > 0\text{ and }P>0\text{,} & \mathrm{trig}=\cosh\text{, } S = \sgn{Q} \text{, and }k=0 \\ \text{if }\Delta > 0\text{ and }P<0\text{,} & \mathrm{trig}=\sinh\text{, } S = 1 \text{, and }k=0 \\ \end{cases} </math>
The case of <math>\Delta=0</math> gives three real roots, two of which are a double real root, and can be handled by the <math>\cos</math> branch of the equation above. For <math>Q > 0</math>, then the single root will be that with <math>k=0</math>, because <math>\cos^{-1}\theta = 0</math>. But it <math>Q < 0</math> then the single root will be at <math>k=1</math>, because <math>\cos^{-1}\theta = \pi</math>.
Implementation in C
This approach has been implemented in plain C code as python3:models/johnpye/fprops/cubicroots.c. The code includes debugging output that be turned off by disabling CUBICROOTS_DEBUG. There is also a test suite that can be compiled by passing -DTEST to the compiler. Currently this code is embedded within FPROPS, but it is trivial to convert it to a standalone code.
The algorithm generally gives solutions with absolute tolerance of <math>\pm 10^{-11}</math>, in some specific cases, such as those with double or near-double roots, does a bit worse, due to limitations of double-precision floating point arithmetic.
One example of a base case is
- <math>\left(x + 1000\right)\left(x - 1000\right)\left(x - 1001\right) = 0</math>
- <math>x^3 -1001 x^2 -1000000 x+ 1001000000 = 0 </math>
This appears to fail because <math>P>>0</math> and <math>Q<<0</math>, causing <math>D<<<0</math> but at the same time<math>\frac{\left|Q\right|}{\sqrt{\left|P\right|^3}} \approx 1</math>. The precision with which <math>\frac{QS}{\left|P\right|\sqrt{\left|P\right|}}</math> can be evaluated appears to be suffering.
This can possibly be improved by a variable substitution such as <math>w = x/\sqrt{\left|P\right|}</math>.
Calculation checking
The following sympy code can be used to confirm the above results. Note that this page uses the depressed form <math>y^3-3Py-2Q=0</math>, while most other sources use <math>y^3+3py +q=0</math>, with results that are equivalent but have more constants lying around.
Invalid language.
You need to specify a language like this: <source lang="html">...</source>
Supported languages for syntax highlighting:
a4c, abap, abc, abnf, actionscript, ada, agda, alan, algol, ampl, amtrix, applescript, arc, arm, as400cl, ascend, asciidoc, asp, aspect, assembler, ats, autohotkey, autoit, avenue, awk, ballerina, bat, bbcode, bcpl, bibtex, biferno, bison, blitzbasic, bms, bnf, boo, c, carbon, ceylon, charmm, chill, chpl, clean, clearbasic, clipper, clojure, clp, cmake, cobol, coffeescript, coldfusion, conf, cpp2, critic, crk, crystal, cs_block_regex, csharp, css, d, dart, delphi, diff, dockerfile, dts, dylan, ebnf, ebnf2, eiffel, elixir, elm, email, erb, erlang, euphoria, exapunks, excel, express, factor, fame, fasm, felix, fish, fortran77, fortran90, frink, fsharp, fstab, fx, gambas, gdb, gdscript, go, graphviz, haml, hare, haskell, haxe, hcl, html, httpd, hugo, icon, idl, idlang, inc_luatex, informix, ini, innosetup, interlis, io, jam, jasmin, java, javascript, js_regex, json, jsp, jsx, julia, kotlin, ldif, less, lhs, lilypond, limbo, lindenscript, lisp, logtalk, lotos, lotus, lua, luban, makefile, maple, markdown, matlab, maya, mercury, meson, miranda, mod2, mod3, modelica, moon, ms, msl, mssql, mxml, n3, nasal, nbc, nemerle, netrexx, nginx, nice, nim, nix, nsis, nxc, oberon, objc, ocaml, octave, oorexx, org, os, oz, paradox, pas, pdf, perl, php, pike, pl1, plperl, plpython, pltcl, po, polygen, pony, pov, powershell, pro, progress, ps, psl, pure, purebasic, purescript, pyrex, python, q, qmake, qml, qu, r, rebol, rego, rexx, rnc, rpg, rpl, rst, ruby, rust, s, sam, sas, scad, scala, scilab, scss, shellscript, slim, small, smalltalk, sml, snmp, snobol, solidity, spec, spn, sql, squirrel, styl, svg, swift, sybase, tcl, tcsh, terraform, tex, toml, tsql, tsx, ttcn3, txt, typescript, upc, vala, vb, verilog, vhd, vimscript, vue, wat, whiley, wren, xml, xpp, yaiff, yaml, yaml_ansible, yang, zig, znn
References
Very useful:
- Burkard Polster, 2019 '500 years of NOT teaching THE CUBIC FORMULA. What is it they think you can't handle?' (Youtube)
- RWD Nickalls, 'A new approach to solving the cubic: Cardan's solution revealed', The Mathematical Gazette 77(480), pp. 354-359. doi:10.2307/3619777
- Wikipedia, 'Cubic equation' (accessed 3 Jul 2022).
- EW Weisstein, 'Cubic Formula' in MathWorld--A Wolfram Web Resource (accessed 3 Jul 2022).
Also interesting:
- Cardano, 1545 'Artis Magnae, Sive de Regulis Algebraicis Liber Unus ', PDF. See also wikipedia. (This is the original source for Cardano's formula, but it's quite inscrutible!)
- Lagrange, Joseph-Louis 1771, "Réflexions sur la résolution algébrique des équations: De la résolution des équations du troisième degré", pp. 207-254, in OEuvres complètes, tome 3, 205-421. Via mathdoc.fr, accessed 3 Jul 2022.
- T Zho., D Wang and H Hong, 2011. 'Solution formulas for cubic equations without or with constraints', J. Symbol. Comput. 46, pp. 904-918. doi:10.1016/j.jsc.2011.02.001
- I Baydoun, 2018 'Analytical formula for the roots of the general complex cubic polynomial', arXiv:1512.07585v2
- AT Tiruneh, 2020, 'A simplified expression for the solution of cubic polynomial equations using function evaluation', arXiv:2002.06976v1
- RG Kulkharni, 2022, 'Unified Method for Solving General Polynomial Equations of Degree Less Than Five' arXiv:2201.01197