A regression tree is, on the face of it, simply a decision tree that predicts a continuous variable (alternately called an interval variable) instead of a categorical variable (alternately called a nominal variable).

Put another way, a regression tree predicts a value from a continuum instead of from a discrete set. This definition glosses over some important and interesting details.

In order to grow a regression tree, one cannot simply apply splitting methods from decision trees. Many decision tree splitting metrics (e.g. entropy, Gini index) are defined over class proportions and have no extension, or no simple one, to continuous response variables.

Often, regression tree programs choose a node's split using either the reduction in variance produced by a candidate split or the p-value of an F-test on that split.
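
As a rough illustration of the variance criterion, here is a minimal sketch for a single numeric predictor. The function names (variance_reduction, best_split) and the choice of midpoints between distinct predictor values as candidate thresholds are assumptions made for this example, not a description of any particular program. The F-test alternative is not shown.

```python
from statistics import pvariance

def variance_reduction(parent, left, right):
    """Parent variance minus the size-weighted variance of the two children."""
    n = len(parent)
    weighted = (len(left) / n) * pvariance(left) + (len(right) / n) * pvariance(right)
    return pvariance(parent) - weighted

def best_split(xs, ys):
    """Return (threshold, reduction) for the best split of the form x <= threshold.

    Candidate thresholds are midpoints between consecutive distinct x values
    (an assumption of this sketch). Returns (None, 0.0) if no split helps.
    """
    best = (None, 0.0)
    values = sorted(set(xs))
    for lo, hi in zip(values, values[1:]):
        threshold = (lo + hi) / 2
        left = [y for x, y in zip(xs, ys) if x <= threshold]
        right = [y for x, y in zip(xs, ys) if x > threshold]
        gain = variance_reduction(ys, left, right)
        if gain > best[1]:
            best = (threshold, gain)
    return best
```

Growing a full tree would apply something like best_split recursively to each child node, over every predictor, until a stopping rule is met.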

Finally, a predicted value must somehow be assigned to each leaf node (or terminal node). Conventionally, the prediction at a leaf is the average of the response variable over the training set observations (or records) that the regression tree assigns to that leaf.
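
The leaf-averaging convention can be sketched for the simplest possible tree, one with a single split and two leaves. The names fit_leaf_values and predict, and the fixed threshold passed in by the caller, are hypothetical choices for this example only.

```python
from statistics import mean

def fit_leaf_values(xs, ys, threshold):
    """Assign each training record to a leaf and average the responses per leaf."""
    left = [y for x, y in zip(xs, ys) if x <= threshold]
    right = [y for x, y in zip(xs, ys) if x > threshold]
    return mean(left), mean(right)

def predict(x, threshold, leaf_values):
    """Predict the stored average of whichever leaf the new record falls into."""
    left_value, right_value = leaf_values
    return left_value if x <= threshold else right_value

# Toy training data: the predictor xs and the continuous response ys.
xs = [1, 2, 3, 10, 11, 12]
ys = [2.0, 4.0, 6.0, 10.0, 20.0, 30.0]
leaves = fit_leaf_values(xs, ys, threshold=6.5)
```

Every new record that lands in the same leaf receives the same prediction, so a tree with k leaves can output at most k distinct values.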