Many workers in image processing borrow techniques from the field of signal processing.

Signals (such as radio signals, sounds, etc.) are functions of time: the amplitude (or some other dependent variable) changes with time.

In images, we have a variable (a single value for a monochrome image, a vector of three values for a color image) which changes across the width and height of the image. This means that whereas the radio signal was a "one dimensional" function (i.e. dependent upon one value, time), the image is a "two dimensional" function (i.e. dependent upon both the x and y locations of each pixel).
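
To make this concrete, here is a minimal numpy sketch (the pixel values are invented for illustration): a monochrome image is a single-valued function of pixel location, while a color image is a three-valued one.

```python
import numpy as np

# Hypothetical 4x4 monochrome image: one intensity value per (row, col).
mono = np.array([
    [ 10,  20,  30,  40],
    [ 50,  60,  70,  80],
    [ 90, 100, 110, 120],
    [130, 140, 150, 160],
], dtype=np.uint8)

# Hypothetical color image: a vector of three values (R, G, B) per pixel.
color = np.zeros((4, 4, 3), dtype=np.uint8)
color[1, 2] = (255, 0, 0)   # pure red at row 1, column 2

# Evaluating the "two dimensional function" at a pixel location:
print(mono[2, 3])    # single value  -> 120
print(color[1, 2])   # three values  -> [255   0   0]
```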

We can generalise this still further - a motion picture represents a function in three dimensions, as does a volumetric medical scan, and an animation of a volumetric medical scan is a function in four dimensions. If the medical scan animation contains wavelength/frequency information, we can generalise to five or more dimensions.
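
As a quick numpy illustration of this generalisation (the array sizes here are arbitrary), the same n-dimensional transform machinery applies regardless of how many dimensions the data has:

```python
import numpy as np

# A motion picture as a three dimensional function: value(frame, y, x).
movie = np.random.rand(16, 32, 32)        # 16 frames of 32x32 grayscale

# A volumetric medical scan: value(z, y, x).
volume = np.random.rand(24, 64, 64)

# An animated volume adds a fourth dimension: value(t, z, y, x).
animated_volume = np.random.rand(10, 24, 64, 64)

# The same transform applies in any dimensionality:
spectrum = np.fft.fftn(animated_volume)   # 4-D Fourier transform
print(spectrum.shape)                     # (10, 24, 64, 64)
```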

In any case, we can use the same mathematical techniques that we have been using for a century or more to analyse radio and electrical signals to analyse these newly available higher-dimensional signals.

The most traditional technique for signal processing is the Fourier transform. Other techniques include (but are not limited to) the cosine transform, the Haar wavelet transform, the Daubechies wavelet transform and the Gabor transform, also sometimes called the Gabor wavelet transform.

A Fourier transform looks at our signal and tries to decompose it as a sum of sine waves of varying amplitudes and frequencies. In two dimensions, it is a sum of sine waves of varying amplitudes, frequencies and directions.
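
A small numpy sketch of this idea (the test image and its frequencies are invented): an image built from a single sinusoid has its 2-D Fourier transform concentrated in the matching pair of coefficients.

```python
import numpy as np

# Hypothetical 64x64 test image: a single diagonal sinusoid.
x, y = np.meshgrid(np.arange(64), np.arange(64))
image = np.sin(2 * np.pi * (3 * x + 5 * y) / 64)

# The 2-D Fourier transform expresses the image as a sum of sinusoids;
# each coefficient records the amplitude and phase of one
# (frequency, direction) pair.
coeffs = np.fft.fft2(image)

# Almost all the energy sits in the coefficient for frequency (3, 5)
# and its mirror image.
magnitude = np.abs(coeffs)
peaks = np.argwhere(magnitude > magnitude.max() / 2)
print(peaks)   # [[ 5  3] [59 61]]  (numpy indexes rows, i.e. y, first)
```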

A cosine transform attempts the same thing, but decomposes the signal into cosine waves instead of sine waves. JPEG compression uses a cosine transform (the discrete cosine transform, or DCT).
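
Here is a rough sketch of that process using scipy's DCT routines (the pixel block is invented, and zeroing a corner of the coefficients is a crude stand-in for JPEG's actual quantisation step):

```python
import numpy as np
from scipy.fft import dctn, idctn

# One hypothetical 8x8 block of pixel values, as JPEG would process it.
block = np.arange(64, dtype=float).reshape(8, 8)

# Forward 2-D discrete cosine transform (type II, as used by JPEG).
coeffs = dctn(block, norm='ortho')

# JPEG gets its compression by quantising these coefficients; here we
# crudely mimic that by keeping only the 4x4 low-frequency corner.
coeffs[4:, :] = 0
coeffs[:, 4:] = 0

approx = idctn(coeffs, norm='ortho')
print(np.abs(block - approx).max())   # modest error despite dropping 75% of coefficients
```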

Both sine waves and cosine waves are periodic functions of infinite extent - they go on forever. This has a drawback: each coefficient of the transform depends upon every part of the image. This is a bad thing, because we want to look at each part of the image independently.
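
This global dependence is easy to demonstrate with numpy (the image and the chosen pixel are arbitrary): change one pixel, and every Fourier coefficient moves.

```python
import numpy as np

image = np.random.rand(32, 32)
before = np.fft.fft2(image)

# Change a single pixel...
image[7, 7] += 1.0
after = np.fft.fft2(image)

# ...and every one of the 1024 coefficients changes.
changed = np.abs(after - before) > 1e-9
print(changed.all())   # True
```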

A wavelet transform uses pulses instead of waves: it looks at our signal and tries to decompose it as a sum of pulses, or wavelets, of varying amplitudes, dilations and translations (i.e. pulses with different heights, widths and locations).
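
A minimal sketch using the PyWavelets library (the test signal is invented): in a multi-level discrete wavelet transform, each level corresponds to one dilation, and position within a level gives the translation.

```python
import numpy as np
import pywt

# A hypothetical test signal: a short pulse in a flat background.
signal = np.zeros(64)
signal[21:25] = 1.0

# Multi-level discrete wavelet transform with the Haar mother wavelet.
# Each returned array is one dilation (scale); position within an
# array is the translation (location).
coeffs = pywt.wavedec(signal, 'haar', level=3)
for name, c in zip(['approx', 'detail3', 'detail2', 'detail1'], coeffs):
    nonzero = np.count_nonzero(np.abs(c) > 1e-12)
    print(name, len(c), nonzero)

# Only the few coefficients whose translation overlaps the pulse are
# nonzero - a local feature touches only a handful of coefficients,
# unlike the Fourier case.
```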

The pulse that is used is called the mother wavelet, and wavelet transforms are classified according to their mother wavelet. Haar wavelets use a step function, Daubechies wavelets use a weird-looking spiky Daubechies function, and Gabor wavelets use a sinusoid windowed by a Gaussian envelope - the Gaussian being the normal distribution from your schoolboy statistics textbooks.
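
The sketch below shows the three mother wavelets just mentioned, using PyWavelets for the Haar and Daubechies filters and plain numpy for the Gabor pulse (its frequency and width are arbitrary choices):

```python
import numpy as np
import pywt

# The Haar mother wavelet: a simple step (two filter taps).
print(pywt.Wavelet('haar').dec_hi)   # [-0.707..., 0.707...]

# A Daubechies mother wavelet: the spiky 'db4' function (eight taps).
print(pywt.Wavelet('db4').dec_hi)

# A Gabor wavelet: a sinusoid under a Gaussian envelope.
t = np.linspace(-3, 3, 256)
gabor = np.exp(-t**2 / 2) * np.cos(2 * np.pi * 1.5 * t)
```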

Wavelet transforms also differ, however, according to how the mother wavelet is transformed. Traditional wavelet transforms use large discrete translations, which can cause problems when analysing objects located between these translation values. They also use large discrete dilations, which cause similar problems. Confusingly, amplitudes usually have plenty of freedom to change - the exception to the rule.
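
The contrast can be seen in PyWavelets, which offers both a discrete transform (dyadic dilations, coarse translations) and a continuous one (any dilation, at every translation). The pulse location and scale range below are invented for illustration.

```python
import numpy as np
import pywt

signal = np.zeros(128)
signal[45:49] = 1.0   # a pulse that need not sit on a dyadic grid

# Discrete transform: dilations jump by factors of two and
# translations by whole steps, so a pulse may straddle cells.
dwt_coeffs = pywt.wavedec(signal, 'haar', level=4)

# Continuous transform: any dilation we like, at every translation -
# far more freedom, and far more computation.
scales = np.arange(1, 32)
cwt_coeffs, _ = pywt.cwt(signal, scales, 'morl')
print(cwt_coeffs.shape)   # (31, 128): one row per scale, one column per sample
```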

We have a classical engineering tradeoff between the computational complexity (read: speed) of the transform and the degree of freedom offered to the transform to represent the objects in the signal or image.

The Gabor transform attempts (and fails) to find an optimal coverage of the transformation space of the mother wavelet, which is usually, but not necessarily, a Gaussian.
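
As an illustrative sketch (all parameter values here are arbitrary), a 2-D Gabor kernel is a Gaussian envelope multiplied by an oriented sinusoid; convolving an image with it picks out structure at one frequency and orientation.

```python
import numpy as np
from scipy.signal import convolve2d

# Build a 2-D Gabor kernel: a Gaussian envelope times an oriented
# sinusoid (the orientation, width and frequency are arbitrary).
y, x = np.mgrid[-8:9, -8:9]
theta = np.pi / 4                        # orientation
xr = x * np.cos(theta) + y * np.sin(theta)
kernel = np.exp(-(x**2 + y**2) / (2 * 4.0**2)) * np.cos(2 * np.pi * xr / 8.0)

# The convolution responds strongly to image structure at that one
# frequency and orientation.
image = np.random.rand(64, 64)
response = convolve2d(image, kernel, mode='same')
print(response.shape)   # (64, 64)
```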

In real applications we must choose our mother wavelet and our coverage of the transformation space after careful consideration of the data that we intend to analyse. This is an engineering problem, not a scientific one.

William Payne, 28 Sept 2003.
