David Marr, who tragically died of leukaemia in his 30s, was among the first to propose a model of brain function based on the advances in computing of the 1970s and 1980s. He saw the brain as analogous to a very complex computer, and applied his ideas to try to model the function of the visual system. Marr saw this computational approach as having three levels:

Computational theory: What is it that the model is trying to accomplish? What are the processes for?

Algorithmic level: What algorithm is needed? What sort of processes might be needed?

Mechanism (implementational) level: What mechanism is needed to implement the algorithm? In the brain, this corresponds to its neurology.

There is a problem with such theories: we cannot really be sure that they truly represent actual brain function. We cannot infer the algorithm from the properties of cells, nor can we infer the mechanism just from knowing the algorithm.

Marr’s theory has five stages after the initial visual input from the eyes:

Grey-level representation:

At this stage the intensity of light at each point in the visual field is made explicit. Colour is irrelevant.
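As a minimal sketch of what "making intensity explicit" might mean computationally, the snippet below collapses each colour pixel to a single intensity value. The Rec. 601 luma weights used here are a standard engineering choice, not part of Marr's theory; the tiny test image is invented for illustration.

```python
# Sketch: a grey-level representation keeps only the intensity of light
# at each point, discarding colour. (Weights are the Rec. 601 luma
# coefficients -- an assumption, not something Marr specified.)

def grey_level(rgb_image):
    """Map each (R, G, B) pixel to a single intensity value."""
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row]
            for row in rgb_image]

# A 2x2 toy image: red, green / blue, white.
image = [[(255, 0, 0), (0, 255, 0)],
         [(0, 0, 255), (255, 255, 255)]]
grey = grey_level(image)  # white comes out brightest, blue darkest
```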

Raw primal sketch:

At this stage, the locations of intensity changes are made explicit by finding “zero-crossings”. The “Laplacian of a Gaussian” is applied at each point in the visual field. Mathematically speaking, this involves taking the second derivative of the (smoothed) curve of light intensity, which gives the rate of change of the rate of change of light intensity. The points where this new curve crosses zero, changing sign from positive to negative or vice versa, correspond to the changes in light intensity.
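The zero-crossing idea can be sketched in one dimension. A real implementation would convolve a 2-D image with a Laplacian-of-Gaussian filter; this toy version omits the Gaussian smoothing step and uses an invented step-edge intensity profile, keeping only the core idea of finding sign changes in the second derivative.

```python
# Sketch: locating an intensity change as a zero-crossing of the
# discrete second derivative of a 1-D intensity profile.

def second_derivative(intensity):
    """Discrete second derivative: f[i-1] - 2*f[i] + f[i+1]."""
    return [intensity[i - 1] - 2 * intensity[i] + intensity[i + 1]
            for i in range(1, len(intensity) - 1)]

def zero_crossings(values):
    """Indices where the curve changes sign (positive <-> negative)."""
    return [i for i in range(len(values) - 1)
            if values[i] * values[i + 1] < 0]

# A dark-to-bright edge: intensity ramps from ~10 up to ~80.
profile = [10, 12, 20, 48, 70, 78, 80]
d2 = second_derivative(profile)
print(zero_crossings(d2))  # → [1]: one crossing, at the edge
```

The single zero-crossing marks where the intensity curve bends from “curving upward” to “curving downward”, i.e. the middle of the edge.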

Primal sketch:

At this stage, the locations of changes in light intensity are interpreted as contours and boundaries. There is a potential problem here: changes in light intensity do not necessarily correspond to the edges of objects. They can also be caused by the textures of objects, or by random noise in the visual input. To combat such problems, intensity changes are measured over several differently sized areas. If there is an intensity change over a small area but not a larger one, that change is likely to be random noise. If there is a change over a large area but not a small one, it can be interpreted as a gradual shading pattern.
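The multi-scale idea above can be sketched in one dimension: a change that survives coarse smoothing behaves like a genuine edge, while one that vanishes behaves like noise. Marr and Hildreth used Gaussian filters of different widths; a plain moving average stands in for them here, and the two intensity profiles are invented for illustration.

```python
# Sketch: comparing an intensity change at a fine and a coarse scale.
# An isolated spike (noise) is attenuated by coarse smoothing;
# a sustained step (a real edge) survives it.

def smooth(profile, width):
    """Moving average with the given window width (a stand-in for
    Gaussian smoothing at a particular scale)."""
    half = width // 2
    return [sum(profile[max(0, i - half):i + half + 1]) /
            len(profile[max(0, i - half):i + half + 1])
            for i in range(len(profile))]

def contrast(profile):
    """Overall intensity range of the profile."""
    return max(profile) - min(profile)

noisy_spike = [50, 50, 50, 90, 50, 50, 50]  # isolated fluctuation
true_edge   = [10, 10, 10, 10, 90, 90, 90]  # sustained change

for name, p in [("spike", noisy_spike), ("edge", true_edge)]:
    fine   = contrast(p)             # change visible at the small scale
    coarse = contrast(smooth(p, 5))  # change left after coarse smoothing
    print(f"{name}: fine={fine}, coarse={coarse}")
# spike: fine=40, coarse=10  -> present only at the fine scale: noise
# edge:  fine=80, coarse=80  -> present at both scales: a real edge
```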

2 ½ D sketch:

At this stage, local depth and surface orientation are made explicit. The depth information we detect is combined to perceive the localised depth of objects, still from the viewer's point of view.

3D model:

Here all the visual information is combined to produce a complete 3D representation of the world.

Marr’s theory takes a somewhat object-centred approach: in his system, object recognition involves finding axes of symmetry to interpret objects. He also identifies the use of “generalised cones” in recognising things such as animals or humans. For instance, the brain would treat the human form as a set of conical shapes, rather like an artist’s mannequin. This sort of generalisation makes sense: when we look at a person, it would be very inefficient for the visual system to process every hair on their head individually. Instead, Marr would say, we generalise to get an overall impression of “hairiness”.

Marr’s theory is a workable system and could probably be implemented on a computer. However, of all the possible ways visual information could be processed, it is unlikely that the brain actually does it in the way Marr described.