The above writeups deal very well with the concept of a function being differentiable (and little more is needed to deal with functions from R), but what about functions of several variables? Clearly the existing definition will not suffice, as one cannot divide by a vector.
This writeup will deal with the slightly more general case of functions from a normed space U to another normed space V (for example R^n and R^m).
This is by no means a trivial problem: differentiability of functions from one normed space to another was the subject of research papers as late as the beginning of the 20th century. The approach here is that of the French mathematician Fréchet.
The rise and fall of partial derivatives
The natural thing to do when dealing with a function of several variables is to ask what happens if you fix all but one of them and change the remaining one. This gives you partial derivatives, and more generally directional derivatives, which are dealt with in more detail here.
In practice partial derivatives work very well and enable you to do things like finding extrema, but they do not tell you much about what it actually means for a function to be differentiable.
There are also fairly "nice" looking functions where some directional derivatives exist, but not all, or where all directional derivatives exist, but if you approach a point along some other path, for example a parabola, then the resulting function is not differentiable.
The directional derivative in a given direction is basically the rate of change of the function if you move only in the given direction.
Let u be a unit vector in U and a a point in U.
Let g:R→V be defined by g(t)=f(a+tu).
Then the directional derivative of f in the direction u at the point a is g'(0).
A bit of work will give you that this is the same as (∇f).u
Explicitly, if f is a function from R^2 to R and u has coordinates (u1,u2) then the directional derivative in the direction u should be:
u1∂f/∂x + u2∂f/∂y
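As a rough numerical illustration of the definition g'(0) and of the formula above, here is a minimal sketch in Python. The function f, the point a and the step size t are all assumptions chosen for the example; none of them come from the writeup.

```python
import math

def f(x, y):
    # A smooth example function (an assumption for illustration only).
    return x * x + 3.0 * x * y

def directional_derivative(func, a, u, t=1e-6):
    # Central-difference estimate of g'(0), where g(t) = func(a + t*u).
    ax, ay = a
    ux, uy = u
    forward = func(ax + t * ux, ay + t * uy)
    backward = func(ax - t * ux, ay - t * uy)
    return (forward - backward) / (2.0 * t)

a = (1.0, 2.0)
u = (1.0 / math.sqrt(2.0), 1.0 / math.sqrt(2.0))  # a unit vector

numeric = directional_derivative(f, a, u)

# Partial derivatives worked out by hand: df/dx = 2x + 3y, df/dy = 3x.
formula = u[0] * (2.0 * a[0] + 3.0 * a[1]) + u[1] * (3.0 * a[0])

print(numeric, formula)
```

For a well-behaved function like this one the two numbers agree up to rounding error, which is what the formula leads you to expect.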
However, if you take f to be the function f(x,y) = x^2 y/(x^2+y^2) (with f(0,0) = 0), you will have a nasty surprise at (0,0). All the partial and directional derivatives exist, but if you look closely you will find:
∂f/∂x = ∂f/∂y = 0 at (0,0), so the formula above gives 0 in every direction,
but the directional derivative in the direction (1/√2,1/√2) is (1/√2)^3.
The reason this happens is that this function is not actually differentiable at (0,0), but without a proper definition of differentiability you can't know that.
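The mismatch can be checked numerically. The sketch below assumes the function in question is f(x,y) = x^2 y/(x^2+y^2) with f(0,0) = 0, which is consistent with the value (1/√2)^3 quoted above; the helper name and step size are made up for the example.

```python
import math

def f(x, y):
    # Assumed counterexample: x^2 y / (x^2 + y^2), extended by f(0, 0) = 0.
    if x == 0.0 and y == 0.0:
        return 0.0
    return x * x * y / (x * x + y * y)

def directional_derivative_at_origin(u, t=1e-6):
    # Central-difference estimate of g'(0) for g(t) = f(t*u).
    ux, uy = u
    return (f(t * ux, t * uy) - f(-t * ux, -t * uy)) / (2.0 * t)

partial_x = directional_derivative_at_origin((1.0, 0.0))
partial_y = directional_derivative_at_origin((0.0, 1.0))

s = 1.0 / math.sqrt(2.0)
diagonal = directional_derivative_at_origin((s, s))

# What the formula u1*df/dx + u2*df/dy would predict for the diagonal direction.
predicted = s * partial_x + s * partial_y

print(partial_x, partial_y, predicted, diagonal)
```

Both partial derivatives come out as zero, so the formula predicts zero, while the directly computed directional derivative is close to (1/√2)^3.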
Higher order partial derivatives usually commute, i.e. ∂²f/∂x∂y = ∂²f/∂y∂x, but sometimes they don't, and merely looking at them will not tell you what conditions are needed for this to happen (lecturers will often say that it works for "nice" functions). What I also found irritating is that there was no object which I could point at and call THE derivative of the function.
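To see mixed partial derivatives failing to commute, here is a small numerical sketch. The function used, xy(x^2-y^2)/(x^2+y^2) with value 0 at the origin, is a standard textbook example and is an assumption of this illustration, not something taken from the writeup; the step sizes are also arbitrary choices.

```python
def f(x, y):
    # Textbook example (assumed here): x*y*(x^2 - y^2)/(x^2 + y^2), with f(0, 0) = 0.
    if x == 0.0 and y == 0.0:
        return 0.0
    return x * y * (x * x - y * y) / (x * x + y * y)

def df_dx(x, y, h=1e-6):
    # Central-difference estimate of the partial derivative in x.
    return (f(x + h, y) - f(x - h, y)) / (2.0 * h)

def df_dy(x, y, h=1e-6):
    # Central-difference estimate of the partial derivative in y.
    return (f(x, y + h) - f(x, y - h)) / (2.0 * h)

# Outer step, deliberately much larger than the inner step h above.
k = 1e-3

# Differentiate in x first, then in y, at the origin.
x_then_y = (df_dx(0.0, k) - df_dx(0.0, -k)) / (2.0 * k)

# Differentiate in y first, then in x, at the origin.
y_then_x = (df_dy(k, 0.0) - df_dy(-k, 0.0)) / (2.0 * k)

print(x_then_y, y_then_x)
```

The two orders of differentiation give values close to -1 and +1 respectively, so for this function the order matters at the origin.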
A New Hope
For a function from R to R, being differentiable also means that there is a first-order Taylor expansion for f, i.e.
f differentiable at x ⇔ f(x+h)=f(x)+hf'(x)+o(h)
Fréchet's approach builds on this idea of approximating the change of a function with a linear map.
A function f from a normed space U to a normed space V is said to be differentiable at a point x if there is a linear map α(x):U → V such that:
f(x+h) = f(x) + α(x)(h) + o(||h||), i.e. ||f(x+h) - f(x) - α(x)(h)|| / ||h|| → 0 as h → 0
The linear map α(x), usually written Df(x) or f'(x), is the derivative of f at x. At first this may seem a very strange and convoluted idea, but it works very well.
If we go back to the case when U=V=R, we can see that Df(x)(h)=hf'(x) (where f' here means the "normal" 1-D derivative); it is just the linear map with slope f'(x).
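One way to get a feel for the definition is to check numerically that the remainder f(x+h) - f(x) - Df(x)(h) shrinks faster than h. The sketch below does this for a map from R^2 to R^2; the map f, the hand-computed linear map and the test point are all assumptions made for the example, and the Euclidean norm is used on both spaces.

```python
import math

def f(x, y):
    # Example map from R^2 to R^2 (an assumption for illustration only).
    return (x * x - y, math.sin(x) * y)

def candidate_derivative(x, y, h):
    # The linear map h -> J h, where J is the matrix of partial
    # derivatives of f at (x, y), worked out by hand.
    h1, h2 = h
    return (2.0 * x * h1 - h2,
            math.cos(x) * y * h1 + math.sin(x) * h2)

def norm(v):
    # Euclidean norm on R^2.
    return math.hypot(v[0], v[1])

def remainder_ratio(x, y, h):
    # ||f(x+h) - f(x) - Df(x)(h)|| / ||h||
    base = f(x, y)
    moved = f(x + h[0], y + h[1])
    linear = candidate_derivative(x, y, h)
    remainder = (moved[0] - base[0] - linear[0],
                 moved[1] - base[1] - linear[1])
    return norm(remainder) / norm(h)

# Shrink h along a fixed direction and watch the ratio.
ratios = [remainder_ratio(1.0, 2.0, (s, -s)) for s in (1e-1, 1e-2, 1e-3)]
print(ratios)
```

The ratios decrease roughly in proportion to the size of h, which is the behaviour the definition asks for. A numerical check along one direction is of course only a sanity check, not a proof of differentiability.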
But what does this do for me?
At this point you may be wondering why bother using something abstract like a linear map when you could just use partial derivatives; what does one gain through this approach? This definition of differentiability is a solid one which enables you to prove many things. To solve one of the problems of partial derivatives I mentioned, we have the following theorem:
If there exists δ>0 such that in the open ball centered at x and of radius δ all the partial derivatives of f exist and are continuous then f is differentiable at x (The converse is false (thanks jrn))
This definition also allows you to produce a necessary and sufficient condition for partial derivatives to commute.
Partial derivatives and Df(x) are closely related. If Df(x) exists then the matrix representing it in the canonical basis is the matrix of partial derivatives (but if the function is not differentiable you may still be able to write down this matrix, it just won't satisfy the definition of Df(x)).
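The parenthetical remark can be illustrated numerically. The sketch below assumes the function x^2 y/(x^2+y^2) with value 0 at the origin (an assumption of this example), writes down its matrix of partial derivatives at (0,0), and then tests that matrix against the definition of Df(x). The helper names and step sizes are invented for the illustration.

```python
import math

def f(x, y):
    # Assumed example: x^2 y / (x^2 + y^2), extended by f(0, 0) = 0.
    if x == 0.0 and y == 0.0:
        return 0.0
    return x * x * y / (x * x + y * y)

def partials_at_origin(h=1e-6):
    # The 1x2 matrix of partial derivatives at (0, 0), by central differences.
    return ((f(h, 0.0) - f(-h, 0.0)) / (2.0 * h),
            (f(0.0, h) - f(0.0, -h)) / (2.0 * h))

def remainder_ratio(row, h):
    # |f(0+h) - f(0) - row.h| / ||h||, which would have to tend to 0
    # if the matrix really represented Df(0).
    linear_part = row[0] * h[0] + row[1] * h[1]
    return abs(f(h[0], h[1]) - f(0.0, 0.0) - linear_part) / math.hypot(h[0], h[1])

row = partials_at_origin()
ratios = [remainder_ratio(row, (s, s)) for s in (1e-1, 1e-3, 1e-6)]
print(row, ratios)
```

The matrix exists (both entries are zero), but the ratio stays near 1/(2√2) however small h gets, so the matrix does not satisfy the definition of Df at the origin.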
This definition of differentiability also enables you to prove many theorems closely resembling those valid for functions from R to R.
While this definition may be confusing at first, it certainly is a useful one which provides many results, and may even seem natural after a while.