Within the field of machine learning and expert systems, a decision tree is a particular type of classifier. In such a situation, you are trying to classify an object that can belong to one of several different classes. At each internal node of the tree, a choice is made as to which edge to follow, on the basis of the value of some attribute of the object. Eventually, you reach a leaf node, and you classify the object using the label at that leaf.
For example, consider different types of fruit, say
apples, bananas, limes and lemons. Each fruit, let's say, is described in terms of shape, texture and colour. Then a decision tree might look like:
               colour?
    yellow      green      red
      |           |          |
    shape?    texture?    apple.
     |    |      |    |
   long round smooth dimpled
     |    |      |    |
 banana. lemon. apple. lime.
Hence, let's say I come across a fruit that's green, round and dimpled. Starting at the top of the tree, I ask what colour it is. It's green, so I go down the middle edge. The next question is: what is its texture? It's dimpled, so I go down the right edge. Hence I conclude that the fruit is a lime.
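This walk down the tree can be sketched in code as nested conditionals (a minimal Python sketch; the function name and the string encoding of the attributes are made up for illustration):

```python
# A hand-constructed version of the fruit decision tree above,
# written as nested conditionals. Each if-branch corresponds to
# following one edge of the tree.
def classify_fruit(colour, shape, texture):
    if colour == "yellow":
        return "banana" if shape == "long" else "lemon"
    elif colour == "green":
        return "apple" if texture == "smooth" else "lime"
    elif colour == "red":
        return "apple"

# The green, round, dimpled fruit from the example:
print(classify_fruit("green", "round", "dimpled"))  # lime
```

Each call answers the questions in the same order the tree does: colour first, then (if needed) shape or texture.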
These trees can be constructed by hand, but more impressively, given a training set (in other words, lots of examples of fruits described in terms of their attributes), they can be constructed automatically using various heuristics. In this sense, a decision tree builder "learns" to classify fruit.
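One common such heuristic is information gain, used by the ID3 and C4.5 family: at each node, split on the attribute that most reduces the entropy of the class labels. Here is a minimal Python sketch of that criterion, run on a made-up five-fruit training set chosen to match the example:

```python
import math
from collections import Counter

# Toy training set: each fruit described by its attributes plus its class.
fruits = [
    {"colour": "yellow", "shape": "long",  "texture": "smooth",  "class": "banana"},
    {"colour": "yellow", "shape": "round", "texture": "dimpled", "class": "lemon"},
    {"colour": "green",  "shape": "round", "texture": "smooth",  "class": "apple"},
    {"colour": "green",  "shape": "round", "texture": "dimpled", "class": "lime"},
    {"colour": "red",    "shape": "round", "texture": "smooth",  "class": "apple"},
]

def entropy(examples):
    # Shannon entropy of the class labels, in bits.
    counts = Counter(e["class"] for e in examples)
    total = len(examples)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def information_gain(examples, attribute):
    # Expected drop in entropy after splitting on `attribute`.
    remainder = 0.0
    for v in set(e[attribute] for e in examples):
        subset = [e for e in examples if e[attribute] == v]
        remainder += len(subset) / len(examples) * entropy(subset)
    return entropy(examples) - remainder

# Pick the attribute with the highest information gain as the root split.
best = max(["colour", "shape", "texture"],
           key=lambda a: information_gain(fruits, a))
print(best)  # colour
```

On this toy data the heuristic picks colour as the first split, which is exactly the question at the root of the hand-drawn tree; a full tree builder then recurses on each subset until the leaves are pure.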
This is one of the earliest and simplest types of machine learning. It can be used for many types of problems; for example, if you apply for an American Express card, chances are it's an automatically generated decision tree that's going to decide whether you get one or not (on the basis of attributes like yearly income, past defaults on loans, etc.). It turns out that automatically generated decision trees perform better than humans on this task!
Many decision tree building algorithms are freely available. The best known of these is C4.5, written by Ross Quinlan; others include CART and AQ. A decision tree learner is also included in the free machine learning toolkit Weka.