Got math?

This is the E2 usergroup e^2, which was originally a proper subset of the e2science group (e^2 ⊂ e2science). At first, the group name was e π i + 1 = 0, but some whiners thought that would be too hard to send a message to:

/msg <var>e</var><sup>_&pi;_<var>i</var></sup>_+_1_=_0 After typing that, I forgot what I was going to say.

So here we are instead with a simpler (but more boring) name e2theipiplus1equalszero. Update: more complainers. Now we're just e^2. (Now does that mean e² or e XOR 2? That is my secret.) Tough luck for those without a caret key.

e^2 often erupts into long mathematical discussions, giving members more /msgs than they care to digest. So, you have a few other options if the math is going to get seriously hairy:

  • Send to only those members of the group currently online:

    /msg? e^2 Brouwer was a Communist!
     
  • Speak in the chatterbox. But be prepared to give non-math noders headaches.
  • Add the room Mathspeak to the list of rooms you monitor in squawkbox or Gab Central. Mathspeak died of loneliness.

You may want to read some of these while you are calculating ln π.


Venerable members of this group:

Wntrmute, cjeris, s19aw, Brontosaurus, TanisNikana, abiessu, Siobhan, nol, flyingroc, krimson, Iguanaonastick, Eclectic Scion, haggai, redbaker, wazroth, small, Karl von Johnson, Eidolos, Ryouga, SlackinWhileSleepin, ariels, quantumlemur, futilelord, Leucosia, RPGeek, Anark, ceylonbreakfast, fledy, Oolong@+, DutchDemon, jrn, allispaul, greth, chomps, JavaBean, waverider37, IWhoSawTheFace, DTal, not_crazy_yet, Singing Raven, pandorica, Gorgonzola, memplex, tubular, Tom Rook
This group of 45 members is led by Wntrmute

In the discussion of topological manifolds, one often comes across the useful concept of starting with two manifolds M1, M2 and building a new manifold from them, using the product topology: M1 × M2. A fiber bundle is a natural and useful generalization of this concept.

Intuitively, the product topology places a copy of M1 at each point in M2. Alternatively, it places a copy of M2 at each point of M1. An example of this is R2 = R × R, where we take a line R as our base, and place another line at each point of the base, forming a plane. We could formulate another example with M1 = S1, a circle, and M2 = (0,1), a line segment. The product topology here just gives us a piece of a cylinder, since this is what we get when we place a circle at each point of a line segment, or place a line segment at each point of a circle.

A fiber bundle is an object that is closely related to this idea. In any local neighborhood, a fiber bundle just looks like M1 × M2. Globally, however, a fiber bundle is generally not a product manifold.

The prototype example for our discussion will be the Möbius band, as it is one of the simplest examples of a non-trivial fiber bundle. We can create the Möbius band by starting with the circle S1, and (similarly to the case with the cylinder) at each point on the circle we attach a copy of the open interval (0,1), but in a nontrivial manner. Instead of just attaching a bunch of parallel intervals to the circle, our intervals perform a 180° twist as we go around. This gives the manifold a much more interesting geometry, as the boundary consists of only one curve, and the band is no longer orientable (there is no "inside" face or "outside" face).

Now, if we look at the object we've formed, we note that locally, it is indistinguishable from the cylinder. That is, the "twist" in the Möbius band is not located at any particular point on the band; it is entirely a global property of the manifold. Motivated by this example, we seek to generalize the language of product spaces, to include objects like the Möbius band which are only locally a product space. This generalization is what we will come to know as a fiber bundle.

The Informal Description

When we build up the language to describe a fiber bundle, we want to think intuitively that a fiber bundle is "locally a product space" in the same sense that a manifold is "locally Euclidean". Thus, our language describing fiber bundles will mimic our language of manifolds closely.

The fiber bundle itself will be called the total space, E. It will be constructed from a base manifold M, and a fiber F. In our examples of the cylinder and the Möbius band, we call S1 the base manifold and the interval (0,1) the fiber. However, since globally the cylinder and the Möbius band differ, we're definitely going to need some additional data to distinguish them.

We will find that a general point q in E can be directly associated with a point p in the base manifold, M. However, we cannot directly associate q with a point in the fiber, F. This is the first sign of asymmetry between M and F, and it can be seen readily in the case of the Möbius band:

Say we want to parameterize the Möbius band by a point θ on the circle, and a real number f in the interval (0,1). Concretely, say θ = 0 and f = ¾. Now, transport the point around the Möbius band by increasing θ and keeping f fixed at ¾. When θ → 2π, we should return to the same point q in E, since it corresponds to the same+ θ and f, and hence the same q. However, because of the inescapable twist in the Möbius band, the point we return to is associated with θ = 0, f = ¼. Our parameterization for F somehow "flips" when we move one turn around the Möbius band.

What you should take away from this is that parameterization for F only works in a local sense; not globally. In this way, it is a great deal like coordinate parameterization on a manifold. In a neighborhood of a point, we can parameterize points in E by (p,f), but when we go to another neighborhood, we use a different parameterization (p,f ').

How to express this mathematically? First, we associate each point in E with a point in M, which we can globally do. This association can be accomplished by a projection map π: E → M, which projects points q in E to their associated point p in M. This map is generally not one-to-one, of course; we want it to map entire fibers Fp to points p, capturing the fact that we're attaching a copy of the fiber F to each point p. We can enforce this condition by the requirement that the preimage++ π -1(p) = F, for each p in M. Now, we still want to locally parameterize these fibers, but leave the definition open to include different parameterizations of F for different neighborhoods. This is the part that may be familiar from the standpoint of coordinate parameterization.

When we defined coordinates on manifolds, it was accomplished by an open covering {Ui}, and a homeomorphism Φi for each Ui associating it with an open set of Rn. We will do something very similar in the case of fiber bundles. We take an open covering of M, {Ui}, and a set of smooth homeomorphisms, {Φi}, associating an open set of E given by π -1(Ui) with a product space. Formally,

Φi: Ui × F → π -1(Ui)

Since we have a map Φi between π -1(Ui) and Ui × F, we can locally express points in E using points in Ui × F. We first project the point q in E to a point p in M, using π(q) = p; we then find an open neighborhood Uj of p, whose corresponding map Φj associates q with a point in Uj × F, i.e. a pair (p,f).

Now, it is possible that we could have chosen a different Uk about p, and thus a different map Φk associating it with a different point (p,f ') in Uk × F. This is fine, but we need to understand the relationship between f and f '. In other words, to distinguish the fiber bundle properly, we have to know about all possible choices of fiber parameterization. In the case of the cylinder, there was only one fiber parameterization, because the space was globally a product space. In the case of the Möbius band, there are two possible parameterizations, and we can make the transformation explicit by f = 1 - f '. Neither parameterization f nor f ' works globally; we can cover the circle with two overlapping segments, and choose one parameterization for one segment, and the opposite for the other segment.
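
To make the two parameterizations concrete, here is a minimal numerical sketch of the Möbius band's two-chart structure (an illustration added here, not part of the original construction; the chart labels and the choice of which overlap component carries the flip are arbitrary assumptions). Carrying a fiber value once around the circle, handing it from one chart to the other on the overlaps, brings it back flipped:

  # A toy model of the Möbius band's two-chart structure: cover the circle by
  # two overlapping arcs; on one component of the overlap the fiber coordinate
  # passes through unchanged, on the other it flips, f -> 1 - f.
  import math

  def t_12(theta, f):
      """Transition function on the overlap (which half carries the flip is
      an arbitrary choice made for this illustration)."""
      if 0 < (theta % (2 * math.pi)) < math.pi:   # one overlap component
          return f                                # identity
      else:                                       # the other component
          return 1.0 - f                          # the twist

  # Start in chart 1 with fiber value f = 3/4, hand off to chart 2 on the
  # first overlap, then hand back to chart 1 on the second overlap.
  f = 0.75
  f = t_12(math.pi / 2, f)        # no flip here
  f = t_12(3 * math.pi / 2, f)    # flip happens here
  print(f)                        # 0.25 -- the point comes back "upside down"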

Changes in parameterization of the fiber are known as transition functions. These are written formally as tij = Φi-1 • Φj: (Ui ∩ Uj) × F → (Ui ∩ Uj) × F, so they may be thought of as smoothly carrying a point from one product space to another, in the overlapping region Ui ∩ Uj. However, there are more enlightening ways to look at the transition functions. First of all, note that they carry the point p to itself: (p,f) → (p,f '). Thus, we can think of a transition function as a family of maps from F to itself, one for each point p in the overlap. Symbolically,

tij(p): F → F

Now, notice that tij satisfies group axioms:

tij(p) [ tjk(p) ] = tik(p) is a transition function.

tii(p) = (identity map) is a transition function

tji(p) = tij-1(p) is a transition function.

Thus, we can now think of {tij(p)}All p,i,j as a group. Specifically, since we are interested in smooth transition functions, we think of {tij(p)} as a Lie group. This group is denoted G, and is called the structure group of the fiber bundle. Of course, we can't forget that each transition function is also a map from the fiber to itself. This can be thought of as a realization of the group, i.e. an action of the group G on the set F of points in the fiber. This group action is also required to be smooth, since that was an original requirement on the transition functions. To summarize, tij is a map

tij: Ui ∩ Uj → G

into the structure group G which acts smoothly on the fiber F. The transition functions characterize the fiber bundle. In the case of the cylinder, the structure group is just the trivial group of one element, the identity. In the Möbius band, the structure group is the group of two elements, Z2, given by {1,a} where a2 = 1. In other words, we only have two parameterizations, and thus only one transition function other than the identity, which is its own inverse ( 1-(1-f) = f ).

In these two cases, the structure group has a discrete, finite number of elements, and thus the dimensionality of the Lie group is zero. Keep in mind that these are very simple cases of fiber bundles, and generally the Lie group consists of a continuous spectrum of transition functions; in other words, we call this a Lie group for a reason. Also keep in mind that we have a choice when determining our structure group, since we don't have to use all of the elements. For example, if the fiber bundle is trivial, like the cylinder, we can use any group G we want, but only use the identity element when defining transition functions. However, it makes the most sense to choose the smallest group that is convenient for our purposes.

The Bloated Non-Elegant Attempt at a Formal Description

So, we have finally laid out all the pieces we need to describe a fiber bundle. Let's give a preliminary formal definition, before eventually refining it more nicely.

    A Differential Fiber Bundle (E, π, M, F, G) consists of the following:

  1. A differential manifold E called the total space
  2. A differential manifold M called the base manifold
  3. A differential manifold F called the fiber
  4. A surjective map π: E → M, called the projection, such that π -1(p) = Fp, the fiber at p in M.
  5. An open covering Ui of M with a diffeomorphism Φi: Ui × F → π -1(Ui) called the local trivialization, with π(Φi(p,f)) = p
  6. A Lie group, G, known as the structure group, which acts on the fiber F.

    Finally, there is the requirement that the transition functions, tij(p) = Φi-1 • Φj, are smooth and live in G, the structure group:
    Φj(p,f) = Φi(p, tij(p)f).

So, the base manifold and fiber tell you exactly what the bundle looks like locally. At the level of the manifold M, open neighborhoods just look like pieces of Rn, and M's transition functions tell us precisely how they are sewn together. At the level of the bundle E, open neighborhoods are just pieces of Rn × F, and there is an additional sewing operation. We need to glue the fiber Fp over p from the patch Ui to the same fiber Fp from the patch Uj. The structure group gives you the additional information required to tell you how to "glue" the fibers together. In this light, a fiber bundle is often seen as a natural generalization of the very concept of a manifold.

Special Types of Fiber Bundles

In the general case of fiber bundles, F can be any differential manifold and G can be any Lie group that acts on F. By adding further requirements, we can define certain special bundles.

  • The Trivial Bundle

    Almost unnecessary to include, except that it makes explicit the connection showing that bundles are generalizations of product manifolds. Simply put, a trivial bundle is a product manifold. The base and fiber are interchangeable, and the structure group is just the trivial group of one element. The trivial bundle can be covered with one patch: the entire base manifold M1. The "local trivialization" Φ is really a global trivialization, since Φ covers all of M1. The trivial bundle can clearly be formed using any two manifolds as base and fiber.

  • Vector Bundles

    A vector bundle is defined by two things: first, the requirement that F be isomorphic to Rk, i.e. that F is a vector space. Second, that the structure group acts linearly on the vector space. Since the structure group acts on F linearly and F is a vector space, the transition functions have a k-dimensional representation on the fibers. In other words, the transition functions can be represented by k × k matrices.

  • Special Case: The Tangent Bundle

    For example, take the base manifold to be M, and the fiber at p to be the tangent space TpM, which is indeed a vector space. The projection operator π sends TpM → p. For the open covering, we can use the same coordinate patches {Ui} that we used to define M.

    This space is known as the Tangent Bundle of M, E = TM. We have a local trivialization in any given patch, simply given by the coordinate representation of the vectors Vi in TpM. In other words, the coordinate charts not only give us a local parameterization for M, they also give us a local parameterization for TM, i.e. the vector components.

    In the neighborhood UA, we use coordinates {xi} and write Vp = Vpi ∂/∂xi ∈ TpM; in the neighborhood UB, we use coordinates {yj} and write Vp = Wpj ∂/∂yj ∈ TpM.

    At the level of the manifold, the transition functions are given by x(y) and y(x), but at the level of the bundle, we see that Vpi = Wpj ∂xi/∂yj|p.

    The transition functions are therefore tAB(p) = ∂xi/∂yj|p. This map can be thought of as an n × n matrix mapping the components {Wj} to the components {Vi}. Since we can write down arbitrary coordinate transformations on M, we can construct essentially any invertible matrix as a transition function tAB. Thus, the structure group of the tangent bundle of an n-dimensional manifold is GLnR, the set of all invertible n × n real matrices. The transition functions of the manifold itself produce the maps tAB which glue the fibers together (a concrete two-chart example, worked out in code, appears just after this list).

    Vector bundles in general are quite useful, due to their concrete nature. They are often viewed as generalizations of the tangent bundle. In this light, we can use similar formalisms for defining parallel transport and curvature in vector bundles, mainly because we can still use objects with indices, like Γαβδ.

  • Principal Bundles

    A principal bundle is a fiber bundle in which the fiber over any point p ∈ M is a copy of the bundle's structure group, F = G. Since G is a Lie group, it is a manifold by definition. The group G can act on itself by left-multiplication. For example, imagine placing a circle at each point in a manifold, but smoothly rotating the circles as you move around the manifold. The circle can be thought of as a copy of U1 (since U1 is a circle when viewed as a manifold), and the structure group is also U1; it can be considered the group of rotations of a circle, but in this context of principal bundles we are thinking of U1 as a group which rotates itself.

  • The Frame Bundle

    Take a manifold M, and let the fiber over p ∈ M be the space of all ordered bases {ei} for TpM. An ordered basis provides a frame at p. So, we are looking at all frames {ei} at each p ∈ M. The reason we choose "ordered" bases is so that two bases {ei} and {fj} which are merely permutations of one another count as distinct frames. This is not quite a vector bundle, because a given element of E is a set of n linearly independent vectors. The independence condition prevents the fiber from being a vector space. For example, there is no "zero" element in the frame bundle. Now, note that given any initial basis for TpM, you can get to any other by operating on this basis by a suitable element of GLnR:

    fj = gij ei

    Since the {fj} are linear combinations of the {ei}, and they are linearly independent since g is nonsingular, the {fj} do indeed form a basis for TpM. Moreover, this exhausts the space of frames. If you provide me with a frame {hj}, I can always express each hj as a sum of basis vectors in my basis. This is equivalent to writing hj = gij ei, where gij is an invertible matrix.

    Thus, by starting with any fiducial or "point-of-reference" basis, we can get all other bases by acting with elements of GLnR. Then, we can just label a given frame by the element of GLnR that got us there from the fiducial frame. In this way, the fiber of all frames is nothing but the group GLnR! In other words, the frame bundle is equivalent to a principal bundle.

    The equivalence we've just shown seems useful. Is there any way of naturally going the other direction? That is, can we produce some kind of useful fiber bundle from a principal bundle? The answer is yes, from a principal bundle we can build associated k-dimensional vector bundles, provided that G has a k-dimensional representation. This is useful, because vector bundles are the most concretely-defined fiber bundles.

  • Associated Vector Bundles

    The basic idea of constructing associated vector bundles is as follows: Rip out the copy of G at each p ∈ M. Replace by a vector space V of dimension k. Find a k-dimensional representation ρk of G. Then choose the transition functions to be ρk(tij) = k × k matrices acting on V.

    More formally, let V be a k-dimensional vector space on which G acts via a k-dimensional representation, ρ. Then, given a principal bundle P, define the associated vector bundle P ×ρ V by starting with P × V and imposing the equivalence relation

    (u,v) ~ (u • g -1, ρ(g)v)

    This equivalence relation does indeed replace each fiber G with a copy of V. Let's say u ∈ P is written locally as (x,h) where x ∈ M, h ∈ G. Then:

    (x, h, v) ~ (x, h • h -1, ρ(h)v) = (x, e, ρ(h)v).

    Thus, we can always rotate h into the identity thereby effectively collapsing the G-fiber to a point. At the same time, ρ(h)v ∈ V, so the V-fiber persists. Hence, we've replaced the G-fiber with a V-fiber.

    It is easily seen above that the transition functions are just ρ(tij).

    So, the large-scale picture you should have in mind is of a single principal bundle, underneath which we place a multitude of associated vector bundles. For every matrix representation of the structure group G, there exists a unique associated vector bundle. This should give you an idea of why they are called "principal bundles".
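
As promised in the tangent bundle item above, here is a short symbolic sketch (using sympy; the charts and the sample vector are arbitrary choices made for illustration) of the transition functions tAB = ∂xi/∂yj for two overlapping charts on the plane, Cartesian {x, y} and polar {r, φ}. The Jacobian is an invertible 2 × 2 matrix at each point of the overlap, i.e. an element of GL2R:

  # Transition functions of the tangent bundle for Cartesian vs. polar charts.
  import sympy as sp

  r, phi = sp.symbols('r phi', positive=True)
  x_of_y = sp.Matrix([r * sp.cos(phi), r * sp.sin(phi)])   # x^i as functions of y^j = (r, phi)

  # The bundle's transition function: the Jacobian dx^i/dy^j, which lies in
  # GL(2, R) wherever r > 0.
  J = x_of_y.jacobian(sp.Matrix([r, phi]))

  # Components transform as V^i = (dx^i/dy^j) W^j.  Take the unit radial
  # vector in polar components as an example.
  W = sp.Matrix([1, 0])
  V = sp.simplify(J * W)

  print(J)                      # [[cos(phi), -r*sin(phi)], [sin(phi), r*cos(phi)]]
  print(V.T)                    # cos(phi), sin(phi): the Cartesian components
  print(sp.simplify(J.det()))   # r, nonzero on the overlap, so J is invertible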

What Actually is a Fiber Bundle? (The more elegant description)

As mathematicians, we are inclined to rigorously define the tools that we use. Specifically, where do they live, and what distinguishes them? For a fiber bundle, we have not yet explained this. Is it the total space? Is it the collection of spaces? What is the specific object we are calling the "bundle", and how does it specify all of the underlying structure? This is a delicate question, which is why it has been put off until we could get a more intuitive conceptual picture (also, jrn's inquiries made me realize that this was a gaping hole in the writeup).

Upon close inspection, you may notice that the fiber bundle is entirely specified by the projection map, π, subject to a rigorous series of requirements. All the other objects are defined through π. E and M are its domain and range, and it is required that they are both differential manifolds. Since it is required that E is a differential manifold, it is assumed that its differential structure is already fixed, but this structure is subject to all the requirements in the definition. F is given by π's preimage of a point in M. Local trivializations Φi are required to exist and be compatible, but they play a similar role to that of local coordinate charts on M. Since we required that E has a specific topology and differential structure, the local trivializations are just all possible maps which are compatible with this structure. Once all trivializations are given, this implicitly defines the set of all transition functions, and hence the structure group, G. Thus, all the pieces are truly given by just the projection map, and for this reason, it is the projection map itself which is often referred to as the "fiber bundle".

It will sometimes be useful to deem two different bundles π1 and π2 "equivalent". To do so, we need to be sure first that the two total spaces E1 and E2 are equivalent differential manifolds, i.e. there exists a diffeomorphism f: E1 → E2. However, there must be an equivalence of the maps as well, so that the base manifolds are the same. In other words,

π1(q) = π2(f(q)) for all q in E1

Since a map specifies a bundle, this diffeomorphism equates the two bundles.

Sections

A section S of a fiber bundle is a map from the base manifold into the total space, picking out a point on each fiber Fp over any point p on the base. It's possible to think of a section as a fiber-valued function defined on the base. Since the section just picks out a point on the same fiber, we can project back down and get to the original point:

π [ S(p) ] = p

A section of a tangent bundle is more commonly known as a vector field.

Parallel Transport

Fiber bundles generally run into the same issue of ambiguity that we saw in the tangent bundle. There is no natural way to compare points in different fibers. As such, there is no god-given notion of parallel or horizontal transport. As another way to put it, it is not as yet meaningful to speak of "constant" sections. Naively setting f = constant will not work, as this will give us a different value f ' = tij(p) • f ≠ constant in a different fiber parameterization. In contrast, there is a well-defined notion of vertical transport, since the projection map π will tell us that we are still at the same point in the base. To fix some notion of "horizontalism", we need additional information, and this turns out to be the most general notion of a connection. Connections lead to curvature, which really moves us outside the scope of this writeup.

Physics

All of classical physics (the four fundamental forces of nature) can essentially be presented geometrically, using the language of fiber bundles. The electromagnetic field, for example, is often presented in the form of a rank-2 antisymmetric tensor, Fμν. This tensor, it turns out, is just the curvature tensor of a U1 principal bundle. The strong and weak nuclear forces also have field strength tensors, which are really just curvature tensors of SU3 and SU2 principal bundles, respectively. Gravity manifests itself in the form of curvature of the tangent bundle, or (if you prefer) the curvature of the base manifold. Matter is described by sections of associated vector bundles. The data needed to prescribe an associated vector bundle is given by a representation. In other words, a particle's electric charge, color and flavor are really described by the associated vector bundle in which the particle is defined. A section of an associated vector bundle is better known by physicists as a wavefunction.


+Try not to be confused by the fact that θ = 0 and 2π at the same time. This is a manifold issue, not a bundle issue. Don't let it distract you from the real parameterization problem, that of the fiber.

++Even when a map is not invertible, there is a well-defined notion of a preimage; that is, the space of all points q in E which map to the point p in M.

The exponential map in mathematics provides a concrete relationship between the tangent space of a manifold, and the manifold itself. There are many ways to conceptually approach the exponential map, and therefore a few different "definitions" are provided.

From the Infinitesimal to the Macroscopic

Often in mathematics (and very often in physics), one deals with the effect on a function f(λ) when we change λ by a small parameter, ε. Formally, we can expand f(λ + ε) in a Taylor series about λ:

f(λ + ε) = f(λ) + ε df/dλ + ½ ε2 d2f/dλ2 + …

Now, if ε is infinitesimally small, by which we mean smaller in magnitude than any finite number you can imagine, we can ignore terms of order ε2 and higher, and the result is simply:

f(λ + ε) = (1 + ε d/dλ) f(λ)

We can think of this as an operator, T(ε) acting on f(λ), whose operation is to translate λ by an infinitesimal amount, ε.

f(λ + ε) = T(ε) f(λ), where T(ε) = 1 + ε d/dλ

What if we wanted a more general operator, T(Δλ), whose operation translated λ by a finite amount, Δλ? We could read off this operator from the fully-expanded Taylor series above, but it is more instructive to think of this finite-translation operator as a product of a large number of successive infinitesimal-translation operators:

T(Δλ) = [ T(ε) ]N, where Δλ = N × ε.

We then take the limit as N → infinity (and ε → 0). In other words,

T(Δλ) = LimN → infinity [ (1 + (Δλ d/dλ) /N ) ]N

Does this formula look familiar yet? Let us pretend for the time being that the operator Δλ d/dλ is just a number, k. Then the formula looks like:

T(Δλ) = LimN → infinity [ (1 + k/N ) ]N

This you should recognize as one definition of the exponential function,

= ek.

We carry the notation over to describe the translation operator:

T(Δλ) = eΔλ d/dλ

This is typically evaluated by expanding the exponential in its usual power series. It is straightforward to check that its action on a function just gives the full Taylor series expansion of that function.
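
As a sanity check, here is a small symbolic computation (using sympy; the test function, the step, and the truncation order are arbitrary choices made for illustration) showing that the truncated series for eΔλ d/dλ acting on f really does reproduce f(λ + Δλ):

  # exp(dlam * d/dlam) as a translation operator: sum the series and compare
  # with the exact translated function at a sample point.
  import sympy as sp

  lam, dlam = sp.symbols('lam dlam')
  f = sp.sin(lam) * sp.exp(-lam / 3)        # any smooth test function

  N = 12                                    # truncation order of the series
  series = sum(dlam**n / sp.factorial(n) * sp.diff(f, lam, n) for n in range(N))

  vals = {lam: 0.7, dlam: 0.3}
  print(sp.N(series.subs(vals)))                     # series approximation
  print(sp.N(f.subs(lam, lam + dlam).subs(vals)))    # exact f(lam + dlam); the two agree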

Now, we make a small conceptual transition. Instead of thinking of this as an operator-valued function with respect to the interval, Δλ, it is more natural to think of it as an operator-valued function on derivatives, Δλ d/dλ = d/dt. In other words, we can vary the translation distance by varying our reparameterization t = f(λ). So, we symbolically rewrite this as

T(d/dt) = ed/dt

This is a more natural form of our translation operator, the exponential map. You can think of this as generating a translation T, given a derivative d/dt.

From Vector Fields to Integral Curves

On an arbitrary differential manifold M, imagine a smoothly varying family of curves Φ(p), covering the manifold (or at least filling some open set in the manifold) without intersecting. Such a family of curves is known as a congruence of curves. At each point p in the manifold, there is exactly one curve passing through p. Such a curve is associated with a particular vector Vp in the tangent space of M at p; Vp is the velocity of the curve. Since we can do this at every point p ∈ M, this determines a smoothly varying vector field V(p).

We can go the other direction, too. On a differential manifold M, a smooth vector field V(p) determines a smoothly varying family of curves Φp: R → M, called the integral curves of V. You can think of this set of curves as the effect of trying to smush our flat tangent space TpM onto our curved manifold, M.

Φp(λ) = (x1p(λ), x2p(λ), ..., xnp(λ)) in a specific coordinate representation {xi}.

This family of curves provides a map from the tangent space to the manifold, which we call the exponential map, exp: TM → M. The curves Φp are determined by demanding that the velocity of each curve Φ(λ) is equal to the vector field evaluated at that point, VΦ(λ). This demand can be represented in a coordinate-dependent manner:

dxi/dλ = Vi(x1(λ), x2(λ), ..., xn(λ)).

This is simply a set of first-order ordinary differential equations for xi(λ). There always exists a unique solution about a sufficiently small neighborhood of p. Note that this requirement implies that the directional derivative d/dλ = Vi ∂/∂xi, i.e. that the curve parameter λ appears in the directional derivative associated with the vector field V.
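
For instance, here is a small numerical sketch (using scipy; the vector field is an arbitrary choice made for illustration) that solves this system of ODEs for the rotation field V = (-y, x), whose integral curves are circles about the origin:

  # Integral curves: solve dx^i/dlam = V^i(x) numerically.
  import numpy as np
  from scipy.integrate import solve_ivp

  def V(lam, x):
      # the vector field evaluated at the point x = (x1, x2)
      return [-x[1], x[0]]

  p = [1.0, 0.0]                                   # starting point p
  sol = solve_ivp(V, (0.0, 2 * np.pi), p, dense_output=True,
                  rtol=1e-9, atol=1e-9)

  print(sol.sol(np.pi / 3))        # a point on the unit circle
  print(sol.sol(2 * np.pi))        # ~ [1, 0]: the curve closes after 2*pi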

For notational use, we make the association p ↔ (x1(λ0), x2(λ0), ..., xn(λ0)). Then, explicitly, we have:

xi(λ0 + ε) = xi(λ0) + ε dxi/dλ + ...

= [ 1 + ε d/dλ + ½ ε2 d2/dλ2 + ... ]|λ0 xi

xi(λ0 + ε) = eε d/dλ xi

As before, we notice that ε d/dλ = ε V is a vector by itself. That is, instead of thinking of this as a map which inputs a vector V and gives us a curve, and inputs a distance ε and moves us this distance along the curve to produce a point in M, we can think of this as a map which inputs vectors ε V and outputs the point on our manifold found by moving a unit distance along its integral curve. We can cut through all the unnecessary notation by simply evaluating our expression at ε = 1:

xi(λ0 + 1) = ed/dλ xi

This is the exponential map of d/dλ acting on xi. We could be more explicit by expressing d/dλ as Vk ∂/∂xk:

xi(λ0 + 1) = exp { Vk ∂/∂xk } xi.

This expression may seem strange-looking, as we are taking partial derivatives with respect to xk of xi, which we expect to just give us a Kronecker delta, δik, but don't forget that Vk is also dependent on the {xi}. Thus, the expansion of this formula should look like:

xi(λ0 + 1) = [ xi(λ0) + Vi|λ0 + ½ Vk ∂Vi/∂xk|λ0 + ... ]

Now, this formula was only guaranteed to work in a small neighborhood of p (meaning we cannot justify setting ε = 1 the way we did), but we can get around this by restricting the domain, i.e. requiring that our vector fields be small enough to keep within some neighborhood of p in M. Moreover, we can often find solutions which cover a large portion of the manifold M. For example, if we just choose a coordinate vector field ∂/∂xk, then the integral curves produced are simply the coordinate curves: xi(λ) = constant for i ≠ k, and xk(λ) = constant + λ. This exponential map will be well-defined as far as the coordinate chart reaches, which may be nearly the entire manifold (for example, the sphere S2 minus one point can be covered by a single stereographic chart). For this reason, the exponential map is often thought of as a map from the local structure of TpM to the more global structure of M itself.
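
To see the series and the integral curve agree, take the one-dimensional field V = x d/dx (chosen here because its integral curves are known exactly: x(λ) = x(λ0) eλ - λ0). A short symbolic check (sympy; the truncation order is arbitrary) shows exp { Vk ∂/∂xk } moving a point a unit parameter distance along its integral curve:

  # Exponential map for V = x d/dx: the series should multiply x by e.
  import sympy as sp

  x = sp.symbols('x')
  Vx = x                            # the single component V^x

  term, total = x, sp.Integer(0)
  for n in range(20):               # sum_n (1/n!) (V^k d/dx^k)^n applied to x
      total += term / sp.factorial(n)
      term = Vx * sp.diff(term, x)  # apply V^k d/dx^k once more

  x0 = 2.0
  print(float(total.subs(x, x0)))   # ~ 5.43656
  print(x0 * float(sp.E))           # exact flow after unit parameter: x0 * e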

A Lie derivative (pronounced "lee", named after the mathematician Sophus Lie) is a well-defined way of taking derivatives of vector fields on a manifold. Computationally, it can be quite simple, though conceptually it's actually very subtle. The depth of this concept requires a good understanding of the tangent space at different points on a manifold.

Given a differential manifold M, it is useful to be able to take derivatives of vector fields. However, until we provide additional information, the concept of a "derivative" of a vector field is not well-defined.

Comparing Tangent Vectors

We want to think of the derivative as "the rate at which a vector field changes as we move a given distance" in the manifold. A small problem with this is that there is as yet no notion of "distance" in our manifold. This is additional structure that we would need to impose, if we wanted to include "distance" in our definition of the derivative. A much greater fundamental problem is that we are trying to measure the rate of change of a vector field as we move from one point to another in the manifold, which means we are implicitly comparing tangent vectors defined at different points p and q in M. There is no god-given way to do this, since tangent vectors at p live in the tangent space TpM, and tangent vectors at q live in a different space, TqM.

There are many ways of defining maps between these two spaces, but there is no special or natural map. Choosing a particular map between tangent spaces on the manifold imposes additional structure, and this structure is known as a connection. A derivative which uses a connection defined on M is called a covariant derivative, and will not be used here.

The Lie Derivative provides an alternative method for differentiating vector fields, which does not require a connection. Instead, the additional information specified to compare tangent vectors is known as a congruence of curves.

Congruence of Curves

A congruence of curves defined on a manifold M is simply a smoothly varying family of curves which fill+ the manifold, without intersecting. Each point p in the manifold lies in exactly one curve.

Here is the key concept which allows the Lie derivative to function computationally: As was stated in the tangent space node, each vector in TpM can be considered the velocity of a curve passing through p. In other words, TpM is the space of equivalence classes of curves passing through p. Thus, any given curve defines a vector at each point in the curve. Therefore, a congruence of curves defines a smooth vector field at each point in the manifold.

Can we go the other direction? That is, given a smoothly varying vector field W on M, can we produce a unique congruence of curves in M, such that each point p in M is associated with a curve whose velocity at p is equal to Wp, the vector field evaluated at p? The short answer is yes,++ and the resulting congruence is called the set of integral curves of W.

So, let us take our vector field W which we want to differentiate, and transform it into a congruence of curves using this method. As we know, W is also associated with a directional derivative operator at each point, which we shall call ∂/∂μ. The integral curves of W can then be parameterized by the parameter μ. We can write an integral curve of W passing through p as αp(μ). Now, as was stated previously, we need to provide an additional congruence of curves in order to differentiate W properly. Equivalently, we can provide a vector field V, since we know V is itself uniquely associated with some congruence of curves. We shall write V as ∂/∂λ, and the integral curves of V will be parameterized by λ.

So, we have two congruences of curves. How do we take the derivative of one with respect to another? Conceptually, we want to look at how the curves of W change when we move a small distance Δλ along curves of V. We still have to deal with the issue of comparing vectors at different points in the manifold; we've just transformed the problem into comparing curves at different points in the manifold. Fortunately, the congruence of curves given by V gives us a natural way of transporting a curve of W to different points on the manifold. We can define a new congruence of curves about the point p in the following manner:

Lie Dragging (You may need to draw a picture to follow along)

At p, look at the integral curve of W passing through p. Call this curve αλ(μ). Move a distance Δλ along the integral curve of V passing through p. Call this new point q. To produce a transported curve α*λ + Δλ(μ) passing through q, simply transport each point of αλ(μ) this same distance Δλ along the integral curve of V passing through that point. This produces a new curve which we can compare with αλ + Δλ(μ) by simply taking the difference between their velocities:

Δ V [ W ] = α'(μ)|λ + Δλ - (α*λ + Δλ(μ))'

A note about notation: we've introduced a great many concepts at once, and it's good to keep our head on straight about why things are written down the way they are. αλ(μ) is an integral curve of W passing through p with parameter μ. We write the subscript "λ" instead of "p" to accentuate the fact that p is given as a point on an integral curve of V, parameterized by λ. αλ + Δλ(μ) is simply another integral curve of W, this one instead passing through q, the point obtained by moving a distance Δλ along an integral curve of V. α*λ + Δλ(μ) is not an integral curve. It is the curve found by transporting αλ(μ) a distance Δλ along integral curves of V passing through each μ of αλ(μ). The family of curves produced in this manner is said to be Lie dragged. αλ + Δλ(μ) and α*λ + Δλ(μ) intersect each other at q, which is why we can compare their velocities.

Now, why did I write the velocities of α and α* in the way I did? Well, since αλ + Δλ(μ) is an integral curve of W, the velocity of this curve is exactly what we mean by Wq, the vector field evaluated at q. This can be written by just taking the velocity of integral curves at arbitrary λ, and evaluating it specifically at the point q, which corresponds to λ + Δλ (I'm consistently using the convention that a prime means differentiation by μ). For α*, we must first transport the curve before computing its velocity. This is noted symbolically by putting the subscript λ + Δλ inside the parentheses. This will become important shortly.

Computation of the Lie Derivative

We can turn this difference into a derivative by dividing by Δλ and taking the limit as Δλ goes to zero. This specifies the Lie derivative:

£V[W] = Lim(Δλ → 0) [α'(μ)|λ + Δλ - (α*λ + Δλ(μ))' ] / Δλ

The simplest way to compute this is to expand these terms to first order in a Taylor series in Δλ (higher order terms vanish when taking the limit Δλ → 0):

£V[W] = Lim(Δλ → 0) [ α'(μ)|λ + Δλ(∂/∂λ)α'(μ)|λ - (αλ(μ) + Δλ(∂/∂λ)αλ(μ))' ] / Δλ

= Lim(Δλ → 0) [ Δλ(∂/∂λ)α'(μ)|λ - (Δλ(∂/∂λ)αλ(μ))' ] / Δλ

The Δλ's cancel, so we can just get rid of the limit:

= (∂/∂λ)(∂/∂μ) αλ(μ) - (∂/∂μ)(∂/∂λ) αλ(μ)

= [ (∂/∂λ)(∂/∂μ) - (∂/∂μ)(∂/∂λ) ] αλ(μ)

In a particular coordinate system:

= Vj ∂/∂xj [ Wi ∂/∂xi (αλ(μ)) ] - Wj ∂/∂xj [ Vi ∂/∂xi (αλ(μ)) ]

The second derivatives of α cancel, and we get:

£V[W] = [ Vj ∂Wi/∂xj - Wj ∂Vi/∂xj ] ∂αλ(μ)/∂xi

We've been interpreting this as the velocity of a curve, α, but we can now think of this as a directional derivative operator acting on the function αλ(μ). In this way, it is readily seen that the components of the vector we've produced are simply the coefficients of ∂α/∂xi:

£V[W]i = Vj ∂Wi/∂xj - Wj ∂Vi/∂xj

We usually write this as the commutator, [ V,W ], meaning the result of commuting directional derivative operators V and W. Written this way, it is often simply called the Lie bracket of V with W.
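
Here is a direct symbolic implementation of that component formula (using sympy; the vector fields are arbitrary examples chosen to show one vanishing and one non-vanishing bracket):

  # Lie bracket from components: [V, W]^i = V^j dW^i/dx^j - W^j dV^i/dx^j.
  import sympy as sp

  x, y = sp.symbols('x y')
  coords = [x, y]

  def lie_bracket(V, W):
      return [sp.simplify(sum(V[j] * sp.diff(W[i], coords[j])
                              - W[j] * sp.diff(V[i], coords[j])
                              for j in range(len(coords))))
              for i in range(len(coords))]

  rotation = [-y, x]                    # -y d/dx + x d/dy
  dilation = [x, y]                     #  x d/dx + y d/dy

  print(lie_bracket(rotation, dilation))   # [0, 0]: rotations commute with scaling
  print(lie_bracket(rotation, [1, 0]))     # [0, -1], i.e. -d/dy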

Lie Derivatives of Other Tensors

We don't have to stop here; we can now take the Lie derivative of arbitrary tensors. For example, we can take the Lie derivative of a one-form. We first must fix our Lie derivative with two reasonable requirements. First, the Lie derivative of a scalar is just the directional derivative:

£V[ f ] = ∂f/∂λ

Then we note that a scalar function can be formed by operating with a one-form on a vector field:

ω(W) = ωi Wi

Then we finally require that our derivative satisfies a Leibniz rule,

£V[ ωi Wi ] = £V[ω]i Wi + ωi £V[W]i

We can then compute all of these terms, plugging in a coordinate basis vector field for W = ∂j (Wi = δij):

∂(ωj)/∂λ = £V[ω]j + ωi £V[∂j]i

Vi ∂(ωj)/∂xi = £V[ω]j + ωi (-∂jVi)

£V[ω]j = Vi ∂(ωj)/∂xi + ωi ∂Vi/∂xj

In a similar fashion, we can compute the Lie derivative of tensors of arbitrary rank. Generally, the Lie derivative is most useful in its rank-1 interpretation, the change in the congruence of curves as described above. In this case, it is also simpler computationally, as it is just given by the Lie bracket [ V,W ].
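
As a check on the one-form formula (this is an added illustration; the particular V, W and ω below are arbitrary choices), one can verify symbolically that it satisfies the Leibniz rule used to derive it:

  # Verify  Lie_V[omega(W)] = (Lie_V omega)(W) + omega(Lie_V W)  for sample fields.
  import sympy as sp

  x, y = sp.symbols('x y')
  coords = [x, y]
  n = len(coords)

  V = [x * y, -y]                  # vector field (arbitrary)
  W = [sp.sin(x), x + y]           # vector field (arbitrary)
  omega = [x**2, y * x]            # one-form components (arbitrary)

  def lie_vec(V, W):               # (Lie_V W)^i, the Lie bracket
      return [sum(V[j] * sp.diff(W[i], coords[j]) - W[j] * sp.diff(V[i], coords[j])
                  for j in range(n)) for i in range(n)]

  def lie_form(V, om):             # (Lie_V omega)_j, the formula above
      return [sum(V[i] * sp.diff(om[j], coords[i]) + om[i] * sp.diff(V[i], coords[j])
                  for i in range(n)) for j in range(n)]

  scalar = sum(omega[i] * W[i] for i in range(n))                    # omega(W)
  lhs = sum(V[i] * sp.diff(scalar, coords[i]) for i in range(n))     # directional derivative
  rhs = sum(lie_form(V, omega)[i] * W[i] + omega[i] * lie_vec(V, W)[i]
            for i in range(n))
  print(sp.simplify(lhs - rhs))    # 0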

Coordinate Bases

Now that we have a new way of comparing tangent spaces, how can we make use of it? Well, the most common use appears when we look at basis vectors that we want to use in different tangent spaces. An important question is, when we choose a set of basis vectors at each tangent space, and this set of basis vectors varies smoothly in the manifold, can we find a coordinate chart in some neighborhood of a point p whose coordinate basis vectors correspond to our choice of basis vectors? In other words, given a set of basis vectors {eμ(p)}, can we find a coordinate system {xμ} whose partial derivatives {∂/∂xμ} are the associated directional derivative operators of {eμ}?

Before telling you why, let me just tell you the answer. The necessary and sufficient condition for this to be possible is that the Lie brackets of all the basis vectors vanish: [ eμ(p), eν(p) ] = 0, for all μ, ν.

Computationally, it's easy to see why this is a necessary condition. If it's possible to write {eμ} as a set of partial derivatives {∂μ}, then:

[ eμ, eν ] = [ ∂μ, ∂ν ] = 0,

because partial derivatives commute (when acting on smooth functions). So, if the Lie bracket does not vanish, clearly we cannot write the vectors in terms of partial derivatives of a given coordinate system. However, if the Lie bracket does vanish, how do we know we can always find such an appropriate coordinate system?

I don't intend to give a formal proof, but such a coordinate system can always be found via the integral curves of {eμ(p)}. The reason this works (and fails when the Lie bracket does not vanish) is that the integral curves agree when we drag them in the way we did before, when calculating £V[W]. Since the Lie derivative is zero, that means that α(μ)|λ + Δλ = α*λ + Δλ(μ), i.e. the Lie-dragged curve is the same as the integral curve. This ensures that our coordinate system is not ambiguous (when we move a parameter distance Δμ along one coordinate then a distance Δλ along another, we get the same result as if we reverse the order). There are deep topological reasons behind all of this, but this is the basic idea.
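
A concrete illustration of the criterion (added here; the frames are the standard polar examples, written in Cartesian components so that the bracket formula above applies directly): the polar coordinate basis {∂/∂r, ∂/∂φ} has vanishing Lie brackets, while the orthonormal frame {∂/∂r, (1/r)∂/∂φ} does not, so no coordinate chart has the latter as its coordinate basis:

  # Coordinate basis test via Lie brackets, in Cartesian components.
  import sympy as sp

  x, y = sp.symbols('x y', positive=True)
  coords = [x, y]
  r = sp.sqrt(x**2 + y**2)

  def bracket(V, W):
      return [sp.simplify(sum(V[j] * sp.diff(W[i], coords[j])
                              - W[j] * sp.diff(V[i], coords[j])
                              for j in range(2)))
              for i in range(2)]

  e_r   = [x / r, y / r]      # d/dr
  e_phi = [-y, x]             # d/dphi
  e_hat = [-y / r, x / r]     # (1/r) d/dphi, unit length

  print(bracket(e_r, e_phi))  # [0, 0]   -> can be a coordinate basis
  print(bracket(e_r, e_hat))  # nonzero  -> cannot be a coordinate basis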

How the Lie Derivative differs from the Covariant Derivative

If we are given a vector field V, we specify the Lie derivative, £V. If we are given a connection Γ, we specify the covariant derivative, ∇μ. You might now be tempted to ask, is there a relationship between V and Γ? That is, given a vector field, V, can we produce a connection, Γ, such that £V = ∇?

The simple answer is no. The Lie derivative and the covariant derivative are simply two different beasts. One way of understanding this is to note that a Lie derivative is a map from (p,q) tensors to (p,q) tensors, and the covariant derivative is a map from (p,q) tensors to (p,q+1) tensors. The "equation" £V = ∇ simply makes no sense. It is possible to write down some relationships between the two, but it is really best to think of them as different objects which live in different spaces.


+Really, we're just looking at some small region of M. Thus, the lines of longitude on a sphere work, if we stay away from the poles. If we want to work near the poles, we can use some other congruence of curves. You're not always able to find a congruence that works globally, but you can always find one locally, which is good enough for us.
++The long answer involves a proof, and perhaps some way of constructing these curves. This probably calls for a writeup on the exponential map.

Thanks to unperson for providing some ideas on how to clarify this writeup.

All rational numbers, written as decimals, fall into two categories: terminating and repeating. Terminating decimals are those with a definite last digit, such as -10, 2.5 or 1.793. Repeating decimals have a digit or series of digits that repeats ad infinitum, and include 0.673333... and 2.142857142857....

Although repeating decimals can be expressed in the ellipsis notation that I just used, there are easier ways. One such way is to put a vinculum bar over the repeating digits. Thus, 0.67333... and 2.142857... become

    _       ______
0.673 and 2.142857 

respectively. Another involves placing a dot over the first and last digits in the sequence. Using this format, the same two examples become:

    .       .    .
0.673 and 2.142857
In spoken English, the decimal would be called "two point one four two eight five seven repeating." Or, of course, you could just write out the fraction.

To find the fraction, follow these steps (a short code sketch after the list automates them):

  1. Subtract the decimal's integral part. (Thus, 2.142857 repeating becomes 0.142857 repeating.) We will call this number n.
  2. Multiply n by a power of ten large enough that the result minus n is a terminating decimal (i.e., shift the decimal point by one full repeating block). The result can be expressed as a multiple of n. (In the example above, we would have 142,857.142857 repeating = 1,000,000n.)
  3. Subtract the two values to get a number equal to a multiple of n. (We now have 142,857 = 999,999n.)
  4. Now divide to find the value of n. (In the example, n = 142,857/999,999 = 1/7.)
  5. Finally, add the integer you subtracted at the beginning. The decimal can be expressed as a fraction or mixed number. (We now have 2+1/7 or 15/7 as our final answer.)
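
The steps above are easy to automate. Here is a small sketch (the function name and the way the digits are passed in are choices made for this example) using Python's exact Fraction type:

  # Convert a repeating decimal to an exact fraction, following the steps above.
  from fractions import Fraction

  def repeating_to_fraction(integer_part, fixed, repeating):
      """2.142857142857...  ->  repeating_to_fraction(2, '', '142857')
         0.67333...         ->  repeating_to_fraction(0, '67', '3')"""
      m, k = len(fixed), len(repeating)
      # Steps 1-4: drop the integer part, shift by one full period, subtract,
      # and divide; Fraction reduces the result to lowest terms.
      numerator = int((fixed + repeating) or '0') - int(fixed or '0')
      n = Fraction(numerator, (10**k - 1) * 10**m)
      # Step 5: add the integer part back.
      return integer_part + n

  print(repeating_to_fraction(2, '', '142857'))   # 15/7
  print(repeating_to_fraction(0, '67', '3'))      # 101/150
  print(repeating_to_fraction(2, '', '9'))        # 3, since 2.999... = 3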

What makes a decimal repeating? A decimal repeats when the denominator of its simplest fractional form has a prime factor other than 2 or 5; it terminates, in other words, exactly when its denominator is a factor of a power of ten. Thus, 387/1000 is terminating, but 2801/3600 is not. Of course (as N-Wing reminds me), in bases other than decimal, an n-mal will terminate exactly when its denominator is a factor of a power of n. Thus, a bimal will only terminate when its denominator is a power of 2.
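
The divisibility criterion is just as easy to check mechanically (a small sketch; the examples are the ones from the paragraph above, plus a base-2 "bimal"):

  # A fraction terminates in base b exactly when, in lowest terms, every prime
  # factor of its denominator also divides b.
  from fractions import Fraction
  from math import gcd

  def terminates(frac, base=10):
      d = frac.denominator          # Fraction is already in lowest terms
      g = gcd(d, base)
      while g > 1:                  # strip out the prime factors shared with the base
          while d % g == 0:
              d //= g
          g = gcd(d, base)
      return d == 1

  print(terminates(Fraction(387, 1000)))       # True
  print(terminates(Fraction(2801, 3600)))      # False: a factor of 3 survives
  print(terminates(Fraction(3, 8), base=2))    # True: 0.011 in binary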

But be warned! Every terminating decimal can also be expressed as a repeating decimal: even benign 2.5 can become 2.5000000... or 2.4999999.... While this may seem surprising, it is quite true and even important: Cantor himself had to account for it when showing that the real numbers are uncountable.