In mathematics as in a lot of other things there is a lot to be said for changing one's frame of reference from time to time. For example, the
Laplace transform allows us to shift between
differential equations and
algebraic equations, and the
Fourier transform which allows us to strip
periodic functions into groups of similar components. Mobius inversion is an
invertible process that allows us to do similar things in the
discrete world.
To begin with, of course, we need a frame of reference to switch from. Let us set the scene. Let P be a partially ordered set with partial ordering ≤. We say that P is locally finite if, for any two elements x and y of P with x ≤ y, there are only finitely many elements z of P satisfying the relation x ≤ z ≤ y. This is the context within which we shall work.
Let us collect all the real-valued functions f(x,y) for x and y in P with the property that f(x,y) = 0 if it is not the case that x ≤ y. We shall call this collection B. Certainly the sum of two elements of B is again in B. Similarly, if r is a real number and f is in B then rf(x,y) is also in B. So we see that B is a vector space over the real numbers. We take this one step further and define a product on B as follows: for f and g in B write
f*g(x,y) := Σx ≤ z ≤ yf(x,z)g(z,y).
This is where the assumption that P is locally finite comes into play for only under this condition can we guarantee that the above sum is well-defined. Endowed with this product, B becomes an algebra over the real numbers. We call B the incidence algebra of P.
If we let δ(x,y) denote the Kronecker delta function (δ(x,y) = 1 precisely when x = y and is 0 otherwise), we see that δ*f(x,y) = f*δ(x,y) = f(x,y) for all functions f in B. Thus δ serves as the identity element of B. The natural question, then, is how do we characterize the elements of B which are invertible? In other words, for which functions f in B does there exist a function g in B such that f*g = g*f = δ?
Let us first assume that f is invertible in B. Then there is a g in B such that
Σx ≤ z ≤ yf(x,z)g(z,y). = δ(x,y) for all x and y in P (by definition of invertibility). Since δ(x,x) = 1 for all elements x of P, we see that both f and g must have the property that f(x,x) and g(x,x) are nonzero. Now, we ask ourselves, is it sufficient for f to have the above property in order for it to be invertible?
Well, let's take a function f in B with the property that f(x,x) ≠ 0 for all x in P. We shall try to find a g such that f*g = δ. Fix an x in P. Since f(x,x)g(x,x) = δ(x,x) = 1, we see that g(x,x) must be the number f(x,x)-1. Now we need to figure out what g(x,y) is when x < y (that is, x ≤ y but x ≠ y). Note that in this case δ(x,y) = 0. So
Σx ≤ z ≤ yf(x,z)g(z,y) = 0 meaning that
Σx < z ≤ yf(x,z)g(z,y) = -f(x,x)g(x,y) and so
g(x,y) = (-f(x,x))-1Σx < z ≤ yf(x,z)g(z,y).
Since P is locally finite, we can assume by induction that we have found g(z,y) for every z such that x < z ≤ y. Thus the sum on the right hand side of the above equation is actually defined and we have an expression for g(x,y). So we have a function g such that f*g = δ. There is one slight problem, however. f*g and g*f are different creatures. Well, by applying a similar process to the one we used above, we can find a function g' in B such that g'*f = δ. Let us pause for a second and survey the situation:
g' = g'*δ = g'*(f*g) = (g'*f)*g = δ*g = g.
So g does in fact act as an inverse for f. We have therefore characterized the invertible functions in B. This gives us the following theorem.
Theorem. An element f of B is invertible if and only if f(x,x) ≠ 0 for all x in P.
Now, the ζ function on P is a function which encodes all the information about the partial order ≤. It is the function ζ(x,y) which is 1 if x ≤ y and 0 otherwise. Since x ≤ x for all x in P, we see that ζ(x,x) ≠ 0 for all x in P. Thus, by our previous theorem, ζ is an invertible element of B. We denote by μ its inverse function. μ is called the Mobius function of P under the partial ordering ≤. We are now in position to state the Mobius Inversion formula.
The Mobius Inversion Formula. Let P be a locally finite partially ordered set as above with a minimal element which we shall denote by 0 (so 0 ≤ x for all x in P).Let f and g be real-valued functions on P satisfying
g(x) = Σy≤xf(y) for all x in P,
then we have
f(x) = Σy≤xg(y)μ(y,x) for all x in P.
Proof. Note that y ≤ x means that 0 ≤ y ≤ x so the sums given above are well-defined. Let us fix an element x of P. Then
Σy≤xg(y)μ(y,x) = Σy≤x(Σz≤yf(z))μ(y,x)
by the relationship enjoyed by g and f. But we can easily introduce a ζ into the picture since if z ≤ y, ζ(z,y) = 1, and ζ(z,y) = 0 otherwise. So
Σy≤xg(y)μ(y,x) = Σy≤x(Σzf(z)ζ(z,y))μ(y,x).
Now we switch the order of summation which is the standard maneuver to perform when one encounters a double sum. Doing this gives us
Σy≤xg(y)μ(y,x) =
Σzf(z)Σy≤xζ(z,y)μ(y,x).
But Σy≤xζ(z,y)μ(y,x) = ζ*μ(z,x) = δ(z,x) since ζ and μ are inverses. Therefore, finally, we get
Σy≤xg(y)μ(y,x) =
Σzf(z)δ(z,x) = f(x). QED.
In the case where P is the set of natural numbers and ≤ is the partial order given by d ≤ n if d divides n, we get the version of the Mobius Inversion Formula given in the writeup above. Admittedly, that is the context in which the formula is applied the most frequently, and with much success. We can rest easy, however, knowing that the formula is there to bail us out in messier situations, too. For example, if S is a finite set, we can consider its power set 2S under the partial ordering of inclusion. In this case, the Mobius function is given by μ(x,y) = (-1)|y| - |x|. It is an entertaining exercise to show how applying the Mobius inversion formula to this set gives rise to the Principle of Inclusion and Exclusion.
Note: In the context of number theory, the multiplication we have described above is called Dirichlet convolution. The ζ function is usually denoted by u in this situation to avoid confusion.