The idea behind shape from shading is to recover the shape of an object
given pictures of it taken under different lighting conditions.
Some implementations are able to get a rough idea of the object
from a single picture, but to get good results you
generally need more.

A typical application is the analysis of satellite snapshots
(for example to obtain a height map of an area, or to acquire
the geometry of buildings). Pictures of the same spot are taken
at different times of day and information is extracted by analysing
the shadows cast on the ground.

Shape from shading also makes it possible to build cheap 3D scanners.
With a digital camera, white paint and a spotlight it is possible
to acquire the shape of objects and then use them in your favorite
ray tracing program. I successfully implemented this technique
and I explain it below.

A related scheme is shape acquisition from a set of stereoscopic views.
Seeing the same object from two slightly different angles,
just like your two eyes do, makes the perception
of volumes possible. This is called stereoscopic vision.
Shape from shading, by contrast, is easier if only the
lighting conditions vary and the viewer stays in the same spot for
all the pictures.

**If I were a light beam**

The first step in shape from shading is to understand how light is
scattered by the objects that surround us, and many models have been developed
to do this. They all boil down to the bidirectional reflectance distribution
function (BRDF) and to an energy balance that applies the first law of thermodynamics:
the power per unit area leaving a small surface is equal to the sum of
all the incident powers per unit area, minus whatever the surface absorbs.

Inter-reflections make this calculation very complex. Suppose two objects
face each other: they both send light back and forth to each other and,
as a result, each object is indirectly lit by itself!
To simplify this, a simplified illumination model is commonly used in computer graphics.
It divides light into three parts:

- Ambient: this is a constant component that models inter-reflections.
Even if not directly lit by a source, objects receive light. Ambient lighting
supposes that the amount of indirect lighting is constant throughout the scene
(or a part of the scene). This approximation contrasts with radiosity, which computes inter-reflections explicitly.
- Diffuse: this is the component of light emitted by the object in every direction.
The surface looks equally bright from every viewing direction.
It represents the "rough" aspect of the surface.
- Specular: this is the component of light emitted that creates a highlight on
the surface. The power emitted per solid angle has a peak around the
direction in which a mirror would reflect the ray.
It represents the "shiny" aspect of the surface.

On the beautiful diagram below, you can see a sphere lit from the top-right
corner and you can spot each of the three regions:
ambient (`X`), diffuse (`.`), and specular (`O`).

.... + light
X...OO.. source
XX....O...
XXX.......
XXX.....
XXX.

The Phong illumination model is commonly used to model the specular term.
Working with shiny objects is complex because the power emitted isn't
uniformly distributed. The easiest way to do shape from shading is to
use non-shiny surfaces, or to flag specular regions as "corrupted data".

The diffuse term can be calculated with Lambert's law, also called
Lambert's cosine law. It has the great advantage that the radiance
(the power emitted per unit area and per unit solid angle)
is the same in every direction and depends only on the amount of light received (the irradiance).
This means that the surface looks equally bright
whatever angle it forms with a detector (camera or eye);
the brightness seen by a receptor is indeed the power it captures per solid angle.
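
To make the three components concrete, here is a minimal Python sketch that adds an ambient constant, a Lambertian diffuse term and a Phong specular term for one surface point. The function name `shade` and all the coefficients (`ambient`, `k_diffuse`, `k_specular`, `shininess`) are illustrative assumptions, not values taken from a real material.

```python
import math

def normalize(v):
    """Return v scaled to unit length."""
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def shade(normal, to_light, to_viewer,
          ambient=0.1, k_diffuse=0.7, k_specular=0.2, shininess=20):
    """Intensity seen by the viewer at one surface point.

    All coefficients are hypothetical defaults chosen for illustration.
    """
    n = normalize(normal)
    s = normalize(to_light)
    v = normalize(to_viewer)

    # Diffuse term (Lambert's cosine law): depends only on the angle
    # between the normal and the light direction, never on the viewer.
    cos_theta = max(0.0, dot(n, s))
    diffuse = k_diffuse * cos_theta

    # Specular term (Phong): peaks when the viewer sits along r, the
    # mirror reflection of the light direction about the normal.
    specular = 0.0
    if cos_theta > 0.0:
        r = tuple(2.0 * dot(n, s) * nc - sc for nc, sc in zip(n, s))
        specular = k_specular * max(0.0, dot(r, v)) ** shininess

    return ambient + diffuse + specular
```

With `k_specular=0.0` the result no longer depends on `to_viewer`, which is exactly the property that makes matte surfaces convenient for shape from shading.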

**Let's get those normals**

Since the irradiance (power received per unit area)
can be obtained by measuring the radiance (power emitted per unit area and
per unit solid angle), and since the irradiance depends on the angle between
the normal to the surface and the incident ray of light, it is possible
to extract those normals from multiple observations.

Let's say the light source is located far enough away from the object to consider
that all rays are parallel and carry a constant power per unit area P (in W*m^-2).
A rough example of such a source is the Sun.
If the normal to the surface makes an angle θ with the light ray, the power
received per unit area is P*cos(θ).

_________________/
light ________________/
beam _______________/
______________/ θ
/______

Lambert's law states that the radiance is k*P*cos(θ) for some constant k of the surface,
and thus the intensity measured by the detector is I = λ*k*P*cos(θ), with λ a constant of the detector.
Grouping all constants into a single K, this relation can be expressed with a
scalar product of vectors. If s is a unit vector that points to the light source and n is
the unit normal vector to the surface, then cos(θ) = s . n and:

I = K * s . n

Say we have three pictures with the light incoming from three different directions. This yields
three intensity measurements, hence :

I_{1} = K * s_{1} . n

I_{2} = K * s_{2} . n

I_{3} = K * s_{3} . n

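The normal is obtained by solving this 3x3 linear system, which has a unique
solution as long as the three light directions do not lie in the same plane.
The solution is the vector K*n: its direction gives the normal and its length gives K.
This operation should be repeated for each point of the object.
With this bunch of normals and a small assumption about the surface, it is possible
to determine the surface completely. In a nutshell, we're almost done!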
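
As a rough sketch of this step for a single pixel, the 3x3 system can be solved with Cramer's rule in plain Python. The helper names `det3` and `normal_from_intensities` are hypothetical, and the code assumes the three light directions are known unit vectors and that the pixel is lit in all three pictures.

```python
import math

def det3(m):
    """Determinant of a 3x3 matrix given as three rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def normal_from_intensities(lights, intensities):
    """Recover (K, n) for one pixel from three measurements.

    lights: three unit vectors s_1, s_2, s_3 pointing to the light source.
    intensities: the measured values I_1, I_2, I_3 for this pixel.
    Solves S * g = I for g = K * n using Cramer's rule.
    """
    d = det3(lights)
    if abs(d) < 1e-12:
        raise ValueError("light directions are coplanar; the system is singular")

    g = []
    for col in range(3):
        # Replace column `col` of S by the intensity vector.
        m = [list(row) for row in lights]
        for row in range(3):
            m[row][col] = intensities[row]
        g.append(det3(m) / d)

    # n is a unit vector, so the length of g = K * n is K itself.
    k = math.sqrt(sum(c * c for c in g))
    if k == 0.0:
        raise ValueError("pixel is black in all three pictures")
    return k, tuple(c / k for c in g)
```

Calling `normal_from_intensities` once per pixel gives the whole normal field. With more than three pictures, a least-squares solve would be the natural extension, but that goes beyond what is described here.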

**Let's get this surface**

If the surface is differentiable (that means that it is continuous, i.e. it
has no holes in it, and that it has a normal vector everywhere, which is the case for almost
everything at our scale), let's express its equation as:

f(x,y) - z = 0

A normal vector to this surface at point (x, y, f(x,y)) is
(df/dx(x,y), df/dy(x,y), -1). It should be parallel to the normal
n = (n_{x}, n_{y}, n_{z}) computed in the step above, which gives
df/dx = -n_{x}/n_{z} and df/dy = -n_{y}/n_{z}.
f is therefore known through its partial derivatives. The method I suggest
to obtain f is finite differencing, which yields a huge linear system that can be solved
by Gauss-Seidel iteration.

Finite differencing at a glance:

df/dx(x,y) ∼ (f(x+h,y) - f(x,y)) / h

Take h as small as possible; on a picture, the smallest available step is one pixel.
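
Here is one possible way to turn the normals into heights, written as a small Python sketch. It is an assumption about how the linear system might be set up, not necessarily the exact system used for the results on this page: each pixel is repeatedly replaced by the average of what its neighbours predict through the forward differences (h = 1 pixel), which is a Gauss-Seidel sweep on the least-squares equations. The names `gradient_from_normal` and `integrate_gradients` and the default `sweeps` count are hypothetical.

```python
def gradient_from_normal(n):
    """Convert a normal (n_x, n_y, n_z) into (df/dx, df/dy)."""
    nx, ny, nz = n
    return -nx / nz, -ny / nz

def integrate_gradients(p, q, sweeps=2000):
    """Estimate heights f from p ~ df/dx and q ~ df/dy on a pixel grid.

    p[y][x] approximates f[y][x+1] - f[y][x] and q[y][x] approximates
    f[y+1][x] - f[y][x]. f[0][0] is pinned to 0 because gradients only
    determine the surface up to a constant offset.
    """
    rows, cols = len(p), len(p[0])
    f = [[0.0] * cols for _ in range(rows)]
    for _ in range(sweeps):
        for y in range(rows):
            for x in range(cols):
                if x == 0 and y == 0:
                    continue  # pinned reference height
                estimates = []
                if x + 1 < cols:
                    estimates.append(f[y][x + 1] - p[y][x])
                if x > 0:
                    estimates.append(f[y][x - 1] + p[y][x - 1])
                if y + 1 < rows:
                    estimates.append(f[y + 1][x] - q[y][x])
                if y > 0:
                    estimates.append(f[y - 1][x] + q[y - 1][x])
                # Gauss-Seidel: values updated earlier in this sweep
                # are used immediately, in place.
                f[y][x] = sum(estimates) / len(estimates)
    return f
```

A fixed number of sweeps keeps the sketch short; a real implementation would more likely stop when the largest change in a sweep falls below a tolerance.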

**Strengths and drawbacks**

You'd be surprised to see how fast this method is. Normal calculation is an O(n)
step, where n is the number of pixels.
Solving the linear system obtained by finite differencing
is also fast: it is sparse and diagonally dominant, so each Gauss-Seidel sweep costs O(n).
All in all, you can do shape from shading in about O(n) operations, times the number of sweeps needed to converge.

But the surface must meet strict conditions: it must be perfectly diffuse
(i.e. it should not be shiny or show a specular highlight) because only diffusion is considered, it must
have a uniform color because intensities are computed on the same basis,
and the whole surface must be directly hit by the light because shadows are not supported.
So this method isn't well suited to outdoor objects (for example satellite snapshots),
but I obtained very interesting results with white wax/Play-Doh models and my digital camera.
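
Highlights and shadows cannot be handled by the method, but they can at least be flagged as corrupted data before solving for the normals. The sketch below is a simple assumption about how to do that: the function name `usable_pixels` and the `dark` and `bright` thresholds are hypothetical and would need tuning for a real camera.

```python
def usable_pixels(images, dark=0.02, bright=0.98):
    """Return a mask that is True where every picture has a usable value.

    images: list of 2D lists of intensities scaled to [0, 1].
    dark / bright: hypothetical thresholds for shadows and highlights.
    """
    rows, cols = len(images[0]), len(images[0][0])
    return [[all(dark < img[y][x] < bright for img in images)
             for x in range(cols)]
            for y in range(rows)]
```

Pixels where the mask is False would simply be skipped when computing normals, leaving holes to fill or ignore afterwards.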