In statistics, omitted-variable bias (OVB) is the bias that appears in estimates of parameters in a regression analysis when the assumed specification is incorrect, in that it omits an independent variable (possibly non-delineated) that should be in the model.
Omitted-variable bias in linear regression
Two conditions must hold true for omitted-variable bias to exist in linear regression:
- the omitted variable must be a determinant of the dependent variable (i.e., its true regression coefficient is not zero); and
- the omitted variable must be correlated with one or more of the included independent variables.
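The two conditions can be illustrated with a small simulation. The following is a minimal sketch, assuming a single included regressor x and a single omitted variable z, both with mean zero and unit variance and with correlation rho. The function name `short_regression_slope` and all numeric values are illustrative choices, not part of the source. Under these assumptions the slope from regressing y on x alone approaches beta + delta * rho, so it moves away from beta only when both delta and rho are nonzero.

```python
import numpy as np

def short_regression_slope(delta, rho, n=200_000, beta=2.0, seed=0):
    """Regress y on x alone (omitting z) and return the estimated slope on x."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(n)
    # z has unit variance and correlation rho with x
    z = rho * x + np.sqrt(1.0 - rho**2) * rng.standard_normal(n)
    u = rng.standard_normal(n)
    y = beta * x + delta * z + u
    # Least squares without an intercept (all variables have mean zero)
    return float(x @ y / (x @ x))

# Both conditions hold: slope is pulled toward beta + delta * rho
both = short_regression_slope(delta=1.5, rho=0.6)
# z is correlated with x but does not affect y: slope stays near beta
no_effect = short_regression_slope(delta=0.0, rho=0.6)
# z affects y but is uncorrelated with x: slope stays near beta
uncorrelated = short_regression_slope(delta=1.5, rho=0.0)

print(f"both conditions hold:   {both:.3f}")
print(f"z does not determine y: {no_effect:.3f}")
print(f"z uncorrelated with x:  {uncorrelated:.3f}")
```

Only the first call should return a slope noticeably different from the assumed true value beta = 2.0.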
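As an example, consider a linear model of the form

$$y_i = x_i \beta + z_i \delta + u_i, \qquad i = 1, \ldots, n,$$

where
- $x_i$ is a 1 × p row vector, and is part of the observed data;
- $\beta$ is a p × 1 column vector of unobservable parameters to be estimated;
- $z_i$ is a scalar and is part of the observed data;
- $\delta$ is a scalar and is an unobservable parameter to be estimated;
- the error terms $u_i$ are unobservable random variables having expected value 0 (conditionally on $x_i$ and $z_i$);
- the dependent variables $y_i$ are part of the observed data.
We let

$$X = \left[ \begin{array}{c} x_1 \\ \vdots \\ x_n \end{array} \right] \in \mathbb{R}^{n\times p},$$

and

$$Y = \left[ \begin{array}{c} y_1 \\ \vdots \\ y_n \end{array} \right],\quad Z = \left[ \begin{array}{c} z_1 \\ \vdots \\ z_n \end{array} \right],\quad U = \left[ \begin{array}{c} u_1 \\ \vdots \\ u_n \end{array} \right] \in \mathbb{R}^{n\times 1}.$$

Then through the usual least squares calculation, the estimated parameter vector $\hat{\beta}$, based only on the observed x-values but omitting the observed z-values, is given by

$$\hat{\beta} = (X'X)^{-1}X'Y$$

(where the "prime" notation means the transpose of a matrix).
Substituting for Y based on the assumed linear model,

$$\begin{aligned}
\hat{\beta} & = (X'X)^{-1}X'(X\beta + Z\delta + U) \\
& = (X'X)^{-1}X'X\beta + (X'X)^{-1}X'Z\delta + (X'X)^{-1}X'U \\
& = \beta + (X'X)^{-1}X'Z\delta + (X'X)^{-1}X'U.
\end{aligned}$$

Taking expectations conditional on X and Z, the final term $(X'X)^{-1}X'U$ falls out by the assumption that U has zero expectation. Simplifying the remaining terms:

$$\begin{aligned}
E[ \hat{\beta} ] & = \beta + (X'X)^{-1}X'Z\delta \\
& = \beta + \text{bias}.
\end{aligned}$$

The second term above is the omitted-variable bias in this case. Note that $(X'X)^{-1}X'Z$ is the coefficient vector from a least squares regression of Z on X, so the bias equals $\delta$ times the portion of $z_i$ which is "explained" by $x_i$.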
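The bias expression (X'X)^{-1}X'Zδ can be checked numerically. The sketch below is one possible check, not a procedure from the source. It holds X and Z fixed, redraws only the errors U many times, and compares the average of the estimates from the regression that omits Z with the predicted bias. The dimensions, coefficient values, and the way Z is constructed are all hypothetical choices made for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)
n, p = 50, 2
beta = np.array([1.0, -0.5])   # hypothetical true coefficients on X
delta = 2.0                    # hypothetical true coefficient on Z

# Fixed design: Z is built to be correlated with the first column of X
X = rng.standard_normal((n, p))
Z = 0.8 * X[:, 0] + 0.3 * rng.standard_normal(n)

# Bias predicted by the formula (X'X)^{-1} X'Z delta
bias_formula = np.linalg.solve(X.T @ X, X.T @ Z) * delta

# Monte Carlo: redraw only the errors U, then regress Y on X alone
reps = 20_000
U = rng.standard_normal((reps, n))
Y = X @ beta + Z * delta + U                      # shape (reps, n)
beta_hat = np.linalg.solve(X.T @ X, X.T @ Y.T).T  # shape (reps, p)
bias_simulated = beta_hat.mean(axis=0) - beta

print("bias from formula:   ", bias_formula)
print("bias from simulation:", bias_simulated)
```

The two printed vectors should agree up to Monte Carlo error, because averaging over redrawn errors approximates the expectation of the estimator conditional on X and Z.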
References
- Greene, W. H. (1993). Econometric Analysis, 2nd ed. Macmillan. pp. 245–246.