Jacques Charles François Sturm's eponymous theorem is chiefly remarkable for its formulation. Proved in 1829, it was considered obscure. It gives an algebraic decision procedure (an algorithm) for determining the number of real roots of a polynomial as well as their locations. As a practical algorithm, it's not too useful (the numbers involved grow very quickly, and high accuracy must be maintained). But as an algorithm for proving roots exist, or for finding bounds on some parameters of a polynomial to get roots, it can be useful.
I assume Sturm intended his theorem as a tool for working on differential equations. Various polynomial characteristics exist for these. knowing whether a real root exists on some interval is important for determining the qualitative behaviour of solutions.
But that such an algorithm even exists is decidedly non-trivial. Indeed, over a century later its importance was recognized by Alfred Tarski, who used it as a chief tool in creating a decision procedure for checking propositions in the language of algebra on the real numbers. (This will (I hope) be covered in a later node...)
Definition. Let P(x)∈R[x] be a separable polynomial in one variable x with real coefficients. The Sturm chain of polynomials is the chain of polynomials defined by
- P0(x) = P(x).
- P1(x) = P'(x).
- Pk+1(x) = - (Pk-1(x) % Pk(x)),
- where (borrowing from C and C-like languages...) a(x) % b(x) is the remainder when dividing the polynomial a(x) by b(x) (e.g. by synthetic division).
Note that the Sturm chain is in practice finite: If the degree deg P(x) = n, then ∀k≤n: deg Pk(x) = n-k, so Pn is constant, and all subsequent Pks are zero. The theorem only looks at changes of sign, so only the first n+1 elements of the chain can be of interest.
Theorem. Let P(x)∈R[x] be a separable polynomial in one variable x with real coefficients, and let the Sturm chain P0(x), P1(x), ... be as above. For all t∈R, let N(t)∈N0 be the number of sign changes in the chain P0(t), P1(t), ... . Then for any interval I=[a,b] with P(a), P(b) ≠ 0, the number of roots of P(x) in I is N(a)-N(b).
We determine the number and approximate locations of the roots of the polynomial P(x)=x5-5x2+3. This polynomial is separable (has no repeated roots), as can be seen e.g. by following the method described in Noether's "separable" writeup, and verifying that P(x) and P'(x) share no common factor.
Calculation (courtesy of Emacs' calc package) yields this Sturm chain:
P0(x) = x5-5x2+3
P1(x) = 5x4-10x
P2(x) = 3x2-3
P3(x) = 10x-5
P4(x) = 9/4.
How many real roots are there? It is easy to see
that there can be no roots with absolute value
greater than 2. We calculate N(-2) and N(2). The signs of the chain at t=-2 are -++-+, giving N-2
=3; the signs of the chain at t=2 are +++++, giving N2
=0, so there are 3 real roots.
We can also calculate N-1=3 (signs -+0-+), N0=2 (signs +0--+) and N1=1 (signs --0++). So there is 1 root between -1 and 0, 1 root betwen 0 and 1, and 1 root between 1 and 2. If we like, we can continue in this fashion -- but now that we have intervals each containing one root, it's faster to use any iterative method for finding roots, such as the Newton-Raphson method or the "regula falsi" method (or, better yet, one of their accelerated versions -- we're seeking the roots of a polynomial, which is an exceedingly well-behaved function!).
By comparison, we could have directly used the intermediate value theorem. It immediately implies that there is a real root (for any polynomial of odd degree). In fact, with a bit more work we see that any separable polynomial of odd degree has an odd number of real roots. But we'd have to guess the location of the roots. Knowing that there is one root we'd have to guess that there are two more roots. Even knowing that there are 3, we'd be pushed to show that there aren't 5! For higher degree polynomials, the problem becomes progressively higher; Sturm's theorem is the easy way out.