Got math?

This is the E2 usergroup e^2, which was originally a proper subset of the e2science group (e^2 ⊂ e2science). At first, the group name was e^(πi) + 1 = 0, but some whiners thought that would make it too hard to send the group a message:

/msg <var>e</var><sup>_&pi;_<var>i</var></sup>_+_1_=_0 After typing that, I forgot what I was going to say.

So here we are instead with a simpler (but more boring) name: e2theipiplus1equalszero. Update: more complainers. Now we're just e^2. (Now, does that mean e² or e XOR 2? That is my secret.) Tough luck for those without a caret key.

e^2 often erupts into long mathematical discussions, giving members more /msgs than they care to digest. So, you have a few other options if the math is going to get seriously hairy:

  • Send to only those members of the group currently online:

    /msg? e^2 Brouwer was a Communist!
     
  • Speak in the chatterbox. But be prepared to give non-math noders headaches.
  • Add the room Mathspeak to the list of rooms you monitor in squawkbox or Gab Central. Mathspeak died of loneliness.

You may want to read some of these while you are calculating ln π.


Venerable members of this group:

Wntrmute, cjeris, s19aw, Brontosaurus, TanisNikana, abiessu, Siobhan, nol, flyingroc, krimson, Iguanaonastick, Eclectic Scion, haggai, redbaker, wazroth, small, Karl von Johnson, Eidolos, Ryouga, SlackinWhileSleepin, ariels, quantumlemur, futilelord, Leucosia, RPGeek, Anark, ceylonbreakfast, fledy, Oolong@+, DutchDemon, jrn, allispaul, greth, chomps, JavaBean, waverider37, IWhoSawTheFace, DTal, not_crazy_yet, Singing Raven, pandorica, Gorgonzola, memplex, tubular, Tom Rook
This group of 45 members is led by Wntrmute

A tangent space is a useful structure which can be extracted from the differential structure of a differential manifold M. On such a manifold, it is handy to be able to define vectors, so that we can calculate things like, say, the velocity of a given curve defined on M. However, since a general manifold doesn't have to be defined as a subspace of flat Euclidean space, we can no longer just think of a vector as some "arrow" living in the ambient space.

In Euclidean space, we would just draw an arrow, state that it is "located" at the coordinates of the non-pointy end of the arrow, and read off its "direction" and "magnitude" from the coordinates of the arrow's point. We were able to get away with this nonsense because there is no curvature in Euclidean space, and our vectors are therefore given by straight lines. The essential core of a vector is linearity, and it is that property of vectors that we cannot lose when generalizing to curved spaces. Therefore, we can no longer define our vectors as living in the manifold itself: if we try to represent them as "curved" arrows in the manifold, there is no proper way of "adding" them together, and we lose linearity.

The bottom line is, we need a new place for our vectors to live. So, we seek to define the tangent space to M at any given point p in the manifold. This can be done in a few different ways.

If M is an n-dimensional differential manifold, the tangent space to M at p, denoted T_pM, is an n-dimensional vector space. You might like to think of it as the "closest flat approximation to M at p", but so far, it's really just a copy of R^n where vectors at p can be defined.

Curves in the Manifold

Let's begin in a familiar setting. We look at the velocity vector for a curve in R^3. Define the curve as a map C: (-1,1) → R^3. In local coordinates on R^3 (which we can just take to be the Cartesian coordinates), we can represent C by (x^1(t), x^2(t), x^3(t))*. The tangent vector to C inside R^3 is given by just taking the derivative with respect to the curve parameter, t: (dx^1/dt|_p, dx^2/dt|_p, dx^3/dt|_p). This can be interpreted as the velocity vector of the curve at the point p.
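For concreteness, a quick worked example (a standard one, added here for illustration): take the helix C(t) = (cos t, sin t, t), which passes through p = (1, 0, 0) at t = 0. Differentiating each coordinate gives

    (dx^1/dt, dx^2/dt, dx^3/dt)|_p = (−sin t, cos t, 1)|_(t=0) = (0, 1, 1)

so the velocity vector of the helix at p is (0, 1, 1).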

Now, there is no reason we can't do this for arbitrary manifolds; in a small enough neighborhood of a point p, we have a local coordinate representation of any curve passing through p, and thus we can compute the tangent vector to such a curve in those coordinates. We can then compare the tangent vectors to multiple curves passing through p in a natural way, since we can use the same coordinate representation to look at both curves.

All of this is useful to us, because in a way we can now associate vectors defined at a point p on M with curves in M passing through p; "vectors" are really just the velocities of curves. This definition is nice, because it is a completely coordinate-independent statement. Note, however, that the relationship between vectors at p and curves passing through p is not a one-to-one mapping. For a given vector v_p at p, there are many different curves passing through p which have the same velocity v_p. Therefore, in order to associate the tangent space at p with the set of curves passing through p, we must first define an equivalence class of curves.

Consider the set of curves on M passing through p. An element C_j of this set is a map C_j: (-1,1) → M such that C_j(0) = p.

Define two such curves C_j and C_k to be "equivalent" when their tangent vectors at p agree in a local coordinate representation {x^i} about p. You can check that this definition is independent of the choice of coordinates {x^i}.

Now, we can define a tangent vector to M at p to be an equivalence class of curves on M through p, defined as above. The tangent space, T_pM, is therefore the space of all such equivalence classes. Notice we use the notation v_p, highlighting the fact that vectors at different points on the manifold live in different tangent spaces. Before defining additional structure, there is no way of comparing two vectors v_p and v_q at different points on the manifold.
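A quick example of the equivalence (an illustration I'm adding, not in the original): in M = R^2 with p the origin, the curves C_1(t) = (t, 0) and C_2(t) = (t + t^2, 0) both satisfy C(0) = p and both have velocity (1, 0) at t = 0, so they define the same tangent vector at p, even though the curves differ away from p. The curve C_3(t) = (2t, 0) has velocity (2, 0) and so belongs to a different equivalence class.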

This is one good way of defining the tangent space. It is nice because its definition does not rely on our choice of coordinates. However, it is a bit abstract, and therefore it is common to use the coordinate-dependent version, especially when doing calculations:

Coordinate Representations of Curves

Given a point p in M, and local coordinates {x^i} on a neighborhood of p, the tangent space T_pM is the space of all possible velocities (dC^1/dt|_p, dC^2/dt|_p, ..., dC^n/dt|_p) for curves C(t) mapping into M and passing through p. The components of a vector v at p are given by the components of this coordinate representation of dC/dt:

v^i = dC^i/dt|_p

Note that the components of v are coordinate-dependent. If we chose a different patch to represent the neighborhood of p in M, we would generally get different components v^i. While v is a coordinate-independent object, its components are entirely coordinate-dependent.

Directional Derivatives

There is another way of defining the tangent space, which involves looking at derivatives of functions defined on M. At a point p, given a tangent vector v_p (defined in one of the two ways above) and a function f: M → R, we ask: how quickly is f changing in the v-direction?

The notation we will use is v_p(f) = the directional derivative of f at p in the v-direction. To calculate v_p(f), we choose an associated curve C in M through p (from the equivalence class of curves corresponding to v) with velocity v at p. Then, the rate of change of f in the v-direction is given by the rate of change of f ∘ C with respect to the curve parameter t, evaluated at p:

v_p(f) = d/dt(f(C(t)))|_p.

This definition for v_p(f) is manifestly coordinate-independent, since we haven't chosen a coordinate chart yet. However, we haven't yet shown that this definition is independent of our choice of curve, C (remember, we just picked one out of the set of curves with velocity v at p). We can do this by choosing a coordinate chart, {x^i}, and carrying through the t-derivative using the chain rule:

v_p(f) = d/dt(f(C(t)))|_p = (∂f/∂x^i)(dx^i/dt)|_p

where x^i(t) is just the coordinate representation of the curve C in M. Note we are implicitly summing over the coordinate index i, using the Einstein summation convention. Simplifying a little of the cumbersome notation, we write:

v_p(f) = (dx^i/dt) (∂_i f)

Thus, the directional derivative of f is specified by n parameters, dx^i/dt, for i = {1,...,n}. These parameters are exactly the components v^i in the coordinate representation given above. Now let us think of v_p as an operator on functions on M. By the above equation, it is clear that this is a linear operator, and that its operation on functions is completely specified by the components v^i. This means we have a new representation of our tangent space, T_pM:

The tangent space T_pM is the space of all directional derivative operators v_p, acting on smooth functions f: M → R, and returning a real number given by the equation:

v_p(f) = v^i (∂_i f)|_p.

So, now we can think of vectors as linear maps from functions into R. This concept is a bit abstract upon first viewing, so let's play with the algebra until you're a bit more convinced.

First of all, notice that by this definition, adding two vectors together is the same thing as adding their components together, as one would expect. Secondly, multiplying a vector by a given constant just multiplies the components by that constant. Thus, directional derivative operators do form a linear space. Now, let's say we choose a given set of coordinates {x^i} and pick a particular directional derivative, which we will call e_1, given by the components e_1^1 = 1, all other e_1^i = 0. We will show that this is one of a natural set of basis vectors for T_pM. The equation above shows us how this operator acts on functions:

e_1(f) = e_1^i (∂_i f) = ∂_1 f.

In other words, e_1 = ∂_1. Likewise, for any j, e_j = ∂_j, the jth partial derivative. Since every directional derivative is a linear combination of partial derivatives, we can always express any vector in T_pM as a linear combination of the e_j's. Thus, we have established that we can think of {∂_j} as a basis for T_pM.
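To make the operator picture concrete, a small worked example (mine, not from the original writeup): on R^2 with coordinates (x^1, x^2), take the vector v_p = 2∂_1 + 3∂_2 at the point p = (1, 1), and let f(x^1, x^2) = (x^1)^2 x^2. Then

    v_p(f) = v^i (∂_i f)|_p = 2 · (2 x^1 x^2)|_p + 3 · ((x^1)^2)|_p = 2·2 + 3·1 = 7

so f increases at a rate of 7 per unit parameter along any curve through p with velocity v.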

Change of Basis

Now that we're a bit more comfortable with the notion of a vector as a linear map, let's look at the consistency of this definition when going from one coordinate patch to another. Remember, when we first gave the definition of a directional derivative operator, we noted that it was manifestly coordinate-independent. However, when we chose a set of coordinates {x^i}, we got a set of components {v^i} for v, which were coordinate-dependent. So, a natural question to ask would be: how exactly do the components v^i of a vector v transform when we make a new choice of coordinates? This question is treated very thoroughly in the tensor node**, but the intuitive picture is that the components must somehow transform in a way "opposite" to that of the basis vectors, so that the overall definition is coordinate-independent. Specifically, if the coordinate transformation results in a change of basis which can be represented as a matrix acting on basis vectors,

f_i = ∂/∂y^i = M_i^j e_j = M_i^j ∂/∂x^j

then the components of a given vector v must transform via the inverse matrix:

(v')^i = (M^-1)^i_j v^j.

Since we are expressing the basis vectors as partial derivatives, the matrix is just given by the chain rule,

M_i^j = ∂x^j/∂y^i

and its inverse is given by using the chain rule in the opposite direction,

(M^-1)^i_j = ∂y^i/∂x^j.
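A standard illustration (added here as an example, not part of the original): take Cartesian coordinates {x^1, x^2} = (x, y) and new coordinates {y^1, y^2} = (r, θ), the polar coordinates on R^2, related by x = r cos θ and y = r sin θ. The chain rule gives the new basis vectors directly:

    ∂/∂r = (∂x/∂r) ∂/∂x + (∂y/∂r) ∂/∂y = cos θ ∂/∂x + sin θ ∂/∂y
    ∂/∂θ = (∂x/∂θ) ∂/∂x + (∂y/∂θ) ∂/∂y = −r sin θ ∂/∂x + r cos θ ∂/∂y

Reading off the coefficients gives the matrix M_i^j = ∂x^j/∂y^i, and the components of any vector transform by its inverse.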

We have written down three equivalent approaches to understanding the tangent space T_pM. In this third picture, we can think of T_pM as all possible directional derivative operators at p on functions defined on M. In local coordinates, these are all expressible as linear combinations of the partial derivatives {∂/∂x^j|_p}, j = {1,...,n}. The transformation rule for a change of coordinates is just given by the chain rule on partial derivatives; the basis vectors transform like derivatives, and the components of a vector transform inversely to the basis vectors. If we wanted, we could now define vectors in a fourth way, i.e. as merely a set of components {v^i} which transform via the inverse chain rule, as above. This definition would be entirely equivalent to the other three given above, but there is no corresponding intrinsic description of the space T_pM. Vectors in this picture lose their intrinsic value as a mathematical object; they no longer "live" anywhere as an element of a topological space. This definition is the most common, however, because it's easiest for purposes of direct computation.

A Bigger Picture

Now that we've constructed the tangent space T_pM from several different perspectives, it is a useful question to ask, "How is T_pM related to T_qM, for different points p and q in the manifold M?" We know that since the manifold is smooth, the vector spaces should somehow mesh smoothly with each other. We can also be sure about how this "meshing" should work on a global scale; this is how we define the tangent bundle, denoted simply TM. However, there is no god-given way of answering this question locally; we need to add additional structure to our manifold before we can compare tangent spaces. This leads to discussions of connections, parallel transport, covariant differentiation, and curvature. All of these notions can be defined before we add the further structure of a metric to the manifold.


*Try not to be confused by the fact that I'm using superscripts for the coordinate indices. This isn't supposed to represent "x squared" and "x cubed". It's just the most standard notation used for coordinate indices in this particular branch of mathematics and physics.
**This writeup was partially constructed as penance for my writeup in the tensor node. Since I produced that writeup, other noders created much better descriptions of tensors, describing them as multilinear maps acting on functions, rather than a set of components which transforms. It seemed silly to change my writeup at that point, considering that everything had been explained by others already, so I figured it would make better sense to try describing the place that vectors live, which is a similar but distinct concept.

Once again the opportunity has arisen to explain in simple language what's going on hereinabove.

This is topology for the common man.

GLOSSARY:

Topology: the study of bottle caps, screw-on jar lids, catsup bottle lids, and toupees. (You see, all part of the set X="tops").

Totally disconnected: the state of being out of touch with reality. Aunt Minerva's been in and out of psychiatric hospitals for years, and she's heavily medicated. She's a delightful old dear, but is indeed totally disconnected.

Cantor set: when the singing rabbi does a few numbers, then sits down and takes a break. The typical Cantor does two sets, one at the beginning of the service, and one at the end.

Cardinality: anything to do with a certain species of bird, the male of which displays fiery red-colored plumage.

Continuum: When one cleans with a vacuum cleaner, the act is referred to as "vacuuming." I will vacuum, she vacuums, I am vacuuming, we have vacuumed. However, if one stops this act and then resumes it, one may say "I will continuum."

R!: C'mon, silly, the middle initial of "Toys Я Us," with exclamation added.

Homeomorphic: I don't know what this is, but I'm pretty sure it has to do with that Brokeback Mountain stuff.

"Equipped with the product topology:" a line used by computer salesmen to get you to buy the top-of-the-line machine that's loaded with features.

_______

The definition is sublime. It makes so much sense. The phrase "each of its members is its own connected component" has to do with a gentleman's, ahem, member. Any man worth his salt should be pretty damn sure that his member is a connected component. Hey, remember the gal who cut her boyfriend's member off? His member wasn't connected anymore.

(Topology)

A set X is called totally disconnected if each of its members is its own connected component.

The most famous example of a totally disconnected set is the Cantor set. Note that it has the cardinality of the continuum, 2^ℵ₀. Totally disconnected sets can be big, even in R! This set is homeomorphic to the topological space {0,1}^N; in fact, every set F^N with F finite is totally disconnected when equipped with the product topology.

The most common method of representing numbers involves making strings out of a (small) finite alphabet, using each position within the string as a multiplier for the symbol in that position. To be less abstract: a decimal positional notation is now in use by nearly every culture on Earth.

We'll get into the mathematics of positional notation later, but we'll start by saying the only really interesting mathematical property of a positional notation is its base, that is, the number of symbols in the alphabet that the strings are derived from. Our decimal positional system is base ten, that is, we use ten symbols (which vary from culture to culture) to represent numbers. In this system, a string position multiplies the contribution of a symbol there by a power of ten:

                    354 - represents "three hundred fifty-four"
three hundreds -----'||
five tens (fifty) ---'|
four -----------------'
The advent of computers has made systems with other bases (particularly two, eight, and sixteen) occasionally useful. (Now that I've introduced the concept, I'll drop the pretense of using words for numbers).
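Since every positional system follows the same rule, evaluating a digit string is a short loop: start at zero and, for each symbol, shift left one position and add. Here is a minimal Python sketch (mine, not part of the original writeup; the function name and the 16-symbol alphabet are arbitrary choices):

    def string_to_value(digits, base):
        """Interpret a string of digit symbols as an integer in the given base."""
        alphabet = "0123456789abcdef"        # enough symbols for bases up to 16
        value = 0
        for ch in digits:
            d = alphabet.index(ch.lower())   # symbol -> numeric digit value
            if d >= base:
                raise ValueError("symbol not in this base's alphabet")
            value = value * base + d         # shift one position, add the digit
        return value

    print(string_to_value("354", 10))   # 354 = 3*100 + 5*10 + 4
    print(string_to_value("2e0", 16))   # 736

(Python's built-in int("2e0", 16) does the same job; the loop just makes the positional rule explicit.)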

This little piggy goes to market...

It is far easier to perform complex calculations using positional notations than with physical analogues or earlier agglomerative symbols, primarily because positional notation allows straightforward algorithms for doing them:

                            _______________
    735          DCCXXXV   |ooooo-----ooooo|
  x  49        x    XLIX   |ooooooo-----ooo|
  -----        ---------   |ooo-----ooooooo|
   6615        arrrrgh!!    
  2940                     |o-----ooooooooo| 
  -----                    |oooooo-----oooo|
  36015                         
                               go buy me a bunch of abaci...

Positional notations probably date from the earliest methods of counting using physical analogues. A crude form of positional notation was in use by preliterate people who, when counting objects using their fingers, ran out of fingers on one hand and used each finger on the other hand to represent a whole hand's worth of objects. Later methods involved putting small pebbles into cups or pits in the ground. Assigning positional importance to physical analogues continues with the use of the abacus in many places today.

Once people began writing things down, they invented symbols to stand in for numbers. The earliest number symbols resembled "tallies": New symbols for larger values piled up in the same way the smaller symbols did, and their order was not important.

The Babylonians invented the first written positional notation, using a base 60 (sexagesimal) system. Although keeping track of the system's 59 symbols seems unwieldy today, the symbols were actually agglomerative collections of simpler symbols representing 1 and 10. At any rate, 59 symbols is not enough for a base 60 system, as 60 symbols are required. The missing symbol was a "zero" symbolizing no contribution at all for a particular position. As a result, there was no way to tell apart numbers that differ only by empty positions (in decimal terms, no way to distinguish 72, 702, and 720). However, the Babylonian system evolved to include a kind of zero: "placeholder" symbols eventually appeared in the middle of numbers, and Babylonian astrologers began using these placeholders at the beginnings and endings of numbers around the time of Buddha.

Two cultures later developed positional notations involving true zeroes: the Mayans, who had a base 20 system, and the Hindus, who began using a base 10 system in the early centuries of the Common Era (to be more precise, the earliest archaeological evidence for decimal numbers in India comes from that time). This system spread west into Persia in the succeeding centuries, where it was picked up by Islamic scholars after the conquest. The most influential Islamic work on decimal numbers is an 825 treatise by al-Khwarizmi. This was in turn read by Leonardo of Pisa, who introduced the notations and algorithms to Western Europe in his 1202 work Liber abaci. The method became known in Europe as augrim or algorism, both attempts to render the name al-Khwarizmi, and precursors to our modern word algorithm. During the Renaissance, algorism competed with other methods of calculation (including a graphical abacus) until the introduction of movable type made it the most feasible.

Other bases

Most of the time, other bases are used for the sole purpose of demonstrating that you can use other bases. For example, base 3 uses the symbols {0, 1, 2}, and positions represent powers of 3 (1, 3, 9, 27, 81, …). 736 in base 3 would be 1000021.
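Going the other way, from a number to its digit string, is the usual repeated-division algorithm: the remainder mod the base is the last digit, and the quotient carries the rest. A companion Python sketch (again mine, for illustration):

    def value_to_string(n, base):
        """Write a nonnegative integer as a digit string in the given base."""
        alphabet = "0123456789abcdef"
        digits = []
        while n > 0:
            digits.append(alphabet[n % base])   # least significant digit first
            n //= base                          # move to the next position
        return "".join(reversed(digits)) or "0"

    print(value_to_string(736, 3))    # 1000021
    print(value_to_string(736, 16))   # 2e0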

However, as described earlier, bases 2 ("binary"), 8 ("octal"), and 16 ("hexadecimal") have become useful for various aspects of computing.

Some mathematical stuff

Given an alphabet S = {S_0, S_1, S_2, … S_(b-1)} containing b symbols, a number N can be represented by selecting one symbol S_(d_i) from the set (a "digit") for each integer i, where

d_i = [b^(-i)N] − b[b^(-i-1)N]

where [x] represents "the greatest integer in x".

It should be evident that 0 ≤ d_i < b. Consequently, d_i is completely determined† for all i, and we can recover the original number as N = Σ d_i b^i. Once all of the d_i have been determined, we can represent the number as a string. For a positive integer:

… S_(d_i) S_(d_(i-1)) … S_(d_1) S_(d_0)
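The bracket formula for d_i is easy to check numerically; for nonnegative integers, Python's floor division N // b**i computes [b^(-i)N] exactly. A quick sketch (added for illustration):

    def digit(N, b, i):
        """d_i = [b^(-i)N] - b[b^(-i-1)N]: the i-th digit of N in base b."""
        return N // b**i - b * (N // b**(i + 1))

    print([digit(354, 10, i) for i in range(3)])   # [4, 5, 3], i.e. d_0=4, d_1=5, d_2=3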

Nonintegers can be represented in this way, but since the digit strings could theoretically extend to infinity in both directions, it's necessary to "anchor" the string in some way. The usual way to do this is to introduce a "radix" symbol ("decimal point" in English-speaking countries) which is placed between S_(d_0) and S_(d_(-1)). This way, the symbols appearing after the radix symbol represent the fractional part of the number.

… S_(d_i) S_(d_(i-1)) … S_(d_1) S_(d_0) . S_(d_(-1)) S_(d_(-2)) …

Every real number is "finite", and so there is an integer j such that b^(j-1) ≤ N < b^j, and so d_k = 0 for all k ≥ j. It becomes unnecessary to include all of those "leading zeroes" in the string, but it's occasionally convenient to do so, and necessary after the radix symbol for representing fractions. In the fractional part (mantissa) of the string, digits can extend into infinity. If it's known that the digits past a certain point are all 0, they are customarily omitted, but some are frequently added to signify the precision of a measurement. If the fractional part contains an infinite repeating sequence, you will sometimes see a line drawn over the repeating part:

                         ______
1/7 = 0.142857142857…= 0.142857

Rational numbers

I went through all of this just so that I could show that every rational number has an infinite repeating sequence of symbols in its fractional part.

We can represent the reciprocal of an integer q > 0 in a manner analogous to long division. If we define r_0 = 1 and d_(i+1) = [b r_i / q], then

b r_0 = d_1 q + r_1
b r_1 = d_2 q + r_2
b r_2 = d_3 q + r_3
…
b r_i = d_(i+1) q + r_(i+1)

where 0 ≤ d_i < b and 0 ≤ r_i < q.

Multiply both sides of the last equation by 1/(q b^(i+1)):

r_i/(q b^i) = d_(i+1)/b^(i+1) + r_(i+1)/(q b^(i+1))

Since we set r_0 = 1 at the outset,

1/q = d_1/b^1 + r_1/(q b^1)
    = d_1/b^1 + d_2/b^2 + r_2/(q b^2)
    = d_1/b^1 + d_2/b^2 + … + d_i/b^i + r_i/(q b^i)

The definition of the "greatest integer in" function determines d_i and r_i for all i. In addition, r_i completely determines d_(i+1) and r_(i+1). Thus, r_i = r_j means d_(i+1) = d_(j+1) and r_(i+1) = r_(j+1), so that whenever i < j and r_i = r_j, the series of digits S_(d_(i+1)) … S_(d_j) repeats forever.

Since q is an integer, all of the r_i are integers. Thus, there are only q possible values for the various r_i. So eventually, r_i = r_j for some i, j where i < j, triggering the infinite repeating sequence in 1/q's expansion.

Multiplying 1/q by some integer p will also result in an infinite sequence, but I'll leave that proof to you.
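The remainder-recycling argument is also an algorithm: carry the remainders along, and the first time one recurs, the digits generated between the two occurrences form the repeating block. A Python sketch of it (mine; the names are arbitrary):

    def expand(p, q, b=10):
        """Fractional digits of p/q (0 < p < q) in base b, plus where the repeat starts."""
        seen = {}                      # remainder -> index of the digit it produced
        digits = []
        r = p
        while r != 0 and r not in seen:
            seen[r] = len(digits)
            r *= b                     # b*r_i = d_(i+1)*q + r_(i+1)
            digits.append(r // q)
            r %= q
        return digits, (None if r == 0 else seen[r])

    print(expand(1, 7))   # ([1, 4, 2, 8, 5, 7], 0)  i.e. 1/7 = 0.(142857)(142857)...
    print(expand(1, 6))   # ([1, 6], 1)              i.e. 1/6 = 0.1(6)(6)...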

Not only that...

If the fractional part of a digit string contains an infinite repeating sequence, the resulting number must be rational. Let's say that a sequence of k digits repeats forever, starting at digit i. Thus:

r = d_1/b^1 + … + d_(i-1)/b^(i-1)
       + d_i/b^i + … + d_(i+k-1)/b^(i+k-1)
       + d_i/b^(i+k) + … + d_(i+k-1)/b^(i+2k-1)
       + d_i/b^(i+2k) + … + d_(i+k-1)/b^(i+3k-1)
       + …

We can group the terms such that the first group contains the first i-1 terms, and each group after that contains the next k terms. Then, we multiply the first group by b^(i-1)/b^(i-1), and each successive group by b^(i+nk-1)/b^(i+nk-1):

r = (d_1 b^(i-2) + … + d_(i-1) b^0) / b^(i-1)
       + (d_i b^(k-1) + … + d_(i+k-1) b^0) / b^(i+k-1)
       + (d_i b^(k-1) + … + d_(i+k-1) b^0) / b^(i+2k-1)
       + (d_i b^(k-1) + … + d_(i+k-1) b^0) / b^(i+3k-1)
       + …

If we let S = d_1 b^(i-2) + … + d_(i-1) b^0 and T = d_i b^(k-1) + … + d_(i+k-1) b^0,

r = S/b^(i-1) + (T/b^(i-1)) · Lim(n→∞) Σ(p=1..n, b^(-pk))

The sum of the geometric series Lim(n→∞) Σ(p=1..n, b^(-pk)) converges to 1/(b^k − 1), so

r = S/b^(i-1) + T/(b^(i-1)(b^k − 1))
 = (S(b^k − 1) + T) / (b^(i-1)(b^k − 1))

which is clearly†† a rational number.
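As a quick check of this formula (an example I'm adding; it is not in the original): for r = 0.142857142857… in base 10, the repeat starts at digit i = 1 with period k = 6, so S is empty (S = 0) and T = 142857. The formula gives

    r = (0·(10^6 − 1) + 142857) / (10^0 · (10^6 − 1)) = 142857/999999 = 1/7

as expected.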

Irrational numbers

A method similar to the one used for rational numbers can be used to show that a digit sequence exists for all real numbers, not just rationals. For a real number a, taking r_0 = 1 and b r_i = d_(i+1) a + r_(i+1) still results in 1/a = d_1/b^1 + d_2/b^2 + … + d_i/b^i + …, but since a might not be integral, the various r_i might not be integers, and a pair of r_i might never match. Since a repeating sequence appears only for a rational number, an irrational number must be represented by an infinite non-repeating sequence.

Oh boy, the most tiresome flamewar on sci.math!

The positional representation of fractional real numbers introduces ambiguity into the system. The sum of the geometric series

Lim(j→∞) Σ(i=1..j, b^(-i))

converges to 1/(b-1), and so

Lim(j→∞) Σ(i=1..j, (b-1)·b^(-i)) = 1.

This means that whenever a number is a multiple of b^i for some i, there are two ways of representing the number:

  1. If we assign d_i as above, d_j = 0 for all j < i.
  2. However, when [b^(-i)N] − b[b^(-i-1)N] > 0, the limit above allows us to assign d_i = [b^(-i)N] − b[b^(-i-1)N] − 1, after which we can assign d_j = b−1 for all j < i and still get the same number.

In base 10, this means the string "1" represents the same number as the string "0.99999…". Mathematicians care about this because: 1) cranks love to argue the point, and 2) it causes difficulty when mapping the power set of the integers to the reals.
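To see the two rules in action (a worked instance added for illustration): take N = 1, b = 10, i = 0 (1 is trivially a multiple of 10^0). Rule 1 gives d_0 = 1 and d_j = 0 for all j < 0: the string "1". Rule 2 gives d_0 = 1 − 1 = 0 and d_j = 9 for all j < 0: the string "0.999…". By the limit above, 9 · Σ 10^(-i) = 9/9 = 1, so both strings represent the same number.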


†Well, opinions differ on the truth of this; see Law of Trichotomy for an explanation.

††A famous result of Euler's shows that for any integers a, b with gcd(a, b) = 1, there must be some k < a such that a divides b^k − 1. Remind me to node this some time.