proof of Sturm's theorem

The proof of Sturm's theorem is of less interest than the theorem itself. While the theorem is interesting for its applications (in logic, in some calculations) and for its novelty, the proof is interesting due to the weird and wonderful mixture of algebra and elementary real analysis to get at the desired result. The mixture is almost inevitable, given its echoes in the formulation of the theorem.

This proof uses the concepts and terminology of the theorem; why not read it first?

Let P(x) be a real separable polynomial, and let P₀(x)=P(x), P₁(x)=P'(x), ... be its Sturm chain and N(x) be the number of sign changes in the chain at point x. To prove that the number of roots in an interval [a,b] with P(a),P(b)≠0 is N(a)-N(b), we need to show that N(x) is constant except at roots of P, and that it decreases by one at each root.

Lemma. For any t, the Sturm chain cannot have two consecutive zeroes at t.
Proof. We prove by induction on i that P_i-1(t) and P_i(t) cannot both be zero.

For i=1:

P₀(x)=P(x) and P₁(x)=P'(x). By our assumption that P(x) is separable (see Noether's writeup there), P(x) and P'(x) have no common factor. In particular, if P(t)=P'(t)=0 then x-t is a common factor (and therefore t is a root with multiplicity >1 and P(x) is not separable). Thus P₀(t) and P₁(t) cannot both be 0.

For i>1:

Suppose we had P_i(t)=P_i+1(t)=0. We know that
P_i-1(x) = q(x)⋅P_i(x) - P_i+1(x), (*)
so then P_i-1(t)=0 too, contrary to the induction hypothesis.

To prove the theorem, imagine a left to right sweep of the real number line. Polynomials are continuous functions, so clearly N(x) cannot change except when passing through a root of one of the P_i(x)'s. We need to show it decreases by 1 for a root of P₀(x), and stays constant for a root of P_i(x), i>0.

When P(a)=P₀(a)=0:

If P(x) switches from being positive to being negative when passing through a, then it is (locally) decreasing there, and P'(x)<0 in the neighborhood of a (recall that -- by our lemma -- P'(a)≠0). Thus, the chain of signs switches from "+-..." to "--..." when passing through a, so N decreases by 1.

If P(x) switches from being negative to being positive at a, then it is (locally) increasing there, and P'(x)>0 near a. This time the sign of the chain switches from "-+..." to "++...", and again N decreases by 1 when passing through a.

The value of N at a itself is of no interest, as the theorem specifically does not apply there when one of the endpoints is a root.

When P_i(a)=0, i>0:

From (*) we have that

P_i-1(a) = 0 - P_i+1(a),

and from the lemma we know that P_i-1(a), P_i+1(a) ≠ 0. So, passing through a, the chain of signs either switches from "...++-..." to "...+--...", or from "...+--..." to "...++-...", or from "...-++..." to "...--+...", or from "...--+..." to "...-++". Each of these patterns of signs contributes exactly one sign change. So the total number of sign changes remains the same, passing through a.

Even at a, the pattern of signs is "...-0+..." or "...+0-...", and contributes one sign change. So N(x) is constant in the neighborhood of a.

That's it! A little mixture of real analysis and algebra, and Sturm's theorem is proved.

Recommended Reading

About Everything2

User Picks

Editor Picks

New Writeups

Sturm's theorem	proof of the ham sandwich theorem	Separable	algebra
Intermediate Value Theorem	Four color theorem	Polynomial	continuous
Strom Thurmond	real	constant