A previous node suggested the following:

A statistical method in which the relationship between the mean value of a random variable and the corresponding values of one or more independent variables is observed.

I would like to clarify a few things. This is correct only in the most technical sense; it fails to note many of the subtleties, some of which I discuss below.

1) The node above is only technically correct for what we call Ordinary Least Squares Regression, or OLS, which carries a variety of assumptions. Among these is normality: strictly speaking, it is the errors (equivalently, the dependent variable conditional on the predictors) that are assumed to be normally distributed random variables, not the variables themselves. However, this assumption can be loosened, thus making the general definition above actually too general.
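To make the OLS mechanics concrete, here is a minimal sketch of a least-squares fit. This is my own illustration (Python with numpy; the data and the "true" coefficients are invented), not anything from the node above:

    # Fit y = b0 + b1*x by minimizing the sum of squared residuals (OLS).
    import numpy as np

    rng = np.random.default_rng(0)
    x = rng.normal(size=100)
    y = 2.0 + 3.0 * x + rng.normal(scale=0.5, size=100)  # true intercept 2, slope 3

    X = np.column_stack([np.ones_like(x), x])     # design matrix with intercept column
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)  # least-squares coefficient estimates
    print(beta)                                   # roughly [2.0, 3.0]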

2) Regression is almost always employed when there is more than one independent variable. When we have simply one dependent and one independent variable, the standardized Beta coefficient that is generated is functionally equivalent to r, the correlation coefficient (a quick check follows below). Furthermore, a bivariate model (one independent, one dependent variable) almost certainly violates another of the assumptions of regression, that of correct specification. An ill-specified model (regression, of course, being a model of reality) leads to estimators (in the form of Beta coefficients) which are unreliable. Sadly, there is no way of knowing a priori whether they are biased upwards or downwards.
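A quick numerical check of that bivariate point, again my own sketch (numpy here; any stats package would do): standardize both variables, and the fitted slope comes out equal to Pearson's r.

    # In bivariate regression, the standardized slope equals the correlation r.
    import numpy as np

    rng = np.random.default_rng(1)
    x = rng.normal(size=200)
    y = 0.7 * x + rng.normal(size=200)

    r = np.corrcoef(x, y)[0, 1]              # Pearson correlation coefficient
    zx = (x - x.mean()) / x.std()            # z-score both variables
    zy = (y - y.mean()) / y.std()
    beta_std = np.polyfit(zx, zy, deg=1)[0]  # slope of the standardized fit

    print(np.isclose(r, beta_std))           # True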

There are additional considerations to take into account with regression, but these should go into another discussion.

To clarify the clarification posted by dogboy, the assumptions of simple and multiple linear regression are as follows (these are called the Gauss-Markov conditions):

  1. Linear relationship between the independent variables and the dependent variable; if this is not the case, then one of the following steps is necessary:
    • A polynomial term needs to be applied;
    • The variable should be transformed (logarithm, square root, arcsine square root, etc.; see the sketch after this list);
    • The suitability of a regression approach should be reconsidered.
  2. Homoscedasticity of the residuals
  3. Independence of the observations
  4. The independent variables must either be controlled or measured with great precision compared to the dependent variable; if this is not the case, then major axis regression should be applied
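The transformation advice in point 1 can be seen in action with a residual check. This is a sketch under my own assumptions (Python with numpy and matplotlib, fabricated data), not part of the original writeup: curvature in the residuals signals non-linearity, a funnel shape signals heteroscedasticity, and a log transform often repairs both at once.

    # Fit a line, then inspect the residuals; curvature suggests non-linearity
    # and a funnel shape suggests heteroscedasticity (conditions 1 and 2 above).
    import numpy as np
    import matplotlib.pyplot as plt

    rng = np.random.default_rng(2)
    x = rng.uniform(1, 10, size=200)
    y = np.exp(0.3 * x) * rng.lognormal(sigma=0.2, size=200)  # multiplicative noise

    slope, intercept = np.polyfit(x, y, deg=1)   # naive straight-line fit
    residuals = y - (intercept + slope * x)

    plt.scatter(x, residuals)                    # shows curvature and a funnel
    plt.axhline(0, color="k")
    plt.xlabel("x"); plt.ylabel("residual")
    plt.show()

    # On the log scale the model is linear with stable variance:
    log_slope, log_intercept = np.polyfit(x, np.log(y), deg=1)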

If these conditions are not met, not only will the beta coefficients be biased, but the model's predictive capacity will likely be misrepresented.

Re*gres"sion (r?*gr?sh"?n), n. [L. regressio: cf. F. r'egression.]

The act of passing back or returning; retrogression; retrogradation.

Sir T. Browne.

Edge of regression (of a surface) Geom., the line along which a surface turns back upon itself; -- called also a cuspidal edge. -- Regression point Geom., a cusp.

 

© Webster 1913.
