In an economic model, a parameter or variable is said to be endogenous when it is correlated with the error term. Endogeneity can arise from measurement error, autoregression with autocorrelated errors, simultaneity, omitted variables, and sample selection errors (Kennedy 2008, p. 139).
For example, in a simple supply and demand model, when predicting the quantity demanded in equilibrium, the price is endogenous because producers change their price in response to demand and consumers change their demand in response to price. In contrast, a change in consumer tastes or preferences would be an exogenous change on the demand curve. In this case, the price variable is said to have total endogeneity once the demand and supply curves are known.
In Econometrics
In econometrics, the problem of endogeneity occurs when an independent variable is correlated with the error term in a regression model. The OLS estimate of that variable's regression coefficient is then biased and inconsistent. Methods for overcoming this include instrumental variable regression and the Heckman selection correction.
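As a rough illustration of the bias and of the instrumental variable idea, the following sketch simulates a regressor that shares an unobserved component with the error term. The data-generating process, the coefficient values, and the names `w` (instrument), `c` (unobserved component) and `cov` (helper) are all assumptions made for this example; it is a toy simulation, not a general-purpose estimator.

```python
import random

def cov(a, b):
    """Sample covariance of two equal-length lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    return sum((p - ma) * (q - mb) for p, q in zip(a, b)) / (n - 1)

rng = random.Random(42)
n = 50_000
beta = 2.0  # assumed true coefficient (illustrative)

w = [rng.gauss(0, 1) for _ in range(n)]  # instrument: moves x, unrelated to the error
c = [rng.gauss(0, 1) for _ in range(n)]  # unobserved component shared by x and the error
x = [wi + ci + rng.gauss(0, 1) for wi, ci in zip(w, c)]
y = [beta * xi + ci + rng.gauss(0, 1) for xi, ci in zip(x, c)]

# OLS slope: biased, because x is correlated with the error term (c + noise).
ols_slope = cov(x, y) / cov(x, x)

# Simple IV estimator with a single instrument: Cov(w, y) / Cov(w, x).
iv_slope = cov(w, y) / cov(w, x)

print(f"OLS slope: {ols_slope:.3f}")
print(f"IV slope:  {iv_slope:.3f}")
```

Under these assumptions the population OLS slope is beta + Cov(x, c)/Var(x) = 2 + 1/3, while the IV ratio targets beta itself, so for a sample this large the two printed estimates should sit near those values.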
Sources
The following are some common sources of endogeneity.
Omitted Variable
In this case, the endogeneity comes from an uncontrolled confounding variable: a variable that is correlated both with an independent variable in the model and with the error term. (Equivalently, the omitted variable both affects the independent variable and separately affects the dependent variable.) Assume that the "true" model to be estimated is,
- yi = α + βxi + γzi + ui
but we omit zi (perhaps because we don't have a measure for it) when we run our regression. zi will get absorbed by the error term and we will actually estimate,
- yi = α + βxi + εi (where εi = γzi + ui)
If the correlation of x and z is not 0 and z separately affects y (meaning γ ≠ 0), then x is correlated with the error term ε.
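The effect of dropping z can be checked with a small simulation. The sketch below is only illustrative: the coefficient values, the 0.5 loading of x on z, and the helper names `cov` and `residuals` are assumptions, not part of the model above. It compares the slope from the regression that omits z with the slope obtained after partialling z out of both x and y (the Frisch-Waugh-Lovell route to the multiple-regression coefficient).

```python
import random

def cov(a, b):
    """Sample covariance of two equal-length lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    return sum((p - ma) * (q - mb) for p, q in zip(a, b)) / (n - 1)

def residuals(a, b):
    """Residuals from an OLS regression of a on b (with intercept)."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    slope = cov(a, b) / cov(b, b)
    return [(p - ma) - slope * (q - mb) for p, q in zip(a, b)]

rng = random.Random(0)
n = 50_000
alpha, beta, gamma = 1.0, 2.0, 3.0  # illustrative "true" parameters

z = [rng.gauss(0, 1) for _ in range(n)]
x = [0.5 * zi + rng.gauss(0, 1) for zi in z]  # x is correlated with z
u = [rng.gauss(0, 1) for _ in range(n)]
y = [alpha + beta * xi + gamma * zi + ui for xi, zi, ui in zip(x, z, u)]

# Regression of y on x alone: z is absorbed into the error term.
short_slope = cov(x, y) / cov(x, x)

# Controlling for z: regress the z-residual of y on the z-residual of x.
x_res = residuals(x, z)
y_res = residuals(y, z)
long_slope = cov(x_res, y_res) / cov(x_res, x_res)

# Population slope of the short regression: beta + gamma * Cov(x, z) / Var(x).
implied_short = beta + gamma * 0.5 / 1.25

print(f"slope omitting z:       {short_slope:.3f}")
print(f"implied by the formula: {implied_short:.3f}")
print(f"slope controlling for z: {long_slope:.3f}")
```

With these values Cov(x, z) = 0.5 and Var(x) = 1.25, so the short regression is pulled away from beta by gamma * 0.4, while the regression that controls for z should recover a slope close to beta.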
Measurement Error
Suppose that we do not get a perfect measure of one of our independent variables. Imagine that instead of observing xi we observe xi* = xi + νi, where νi is the measurement "noise". When we try to estimate the following univariate regression,
- yi = α + βxi + εi
we actually end up estimating,
- yi = α + β(xi* − νi) + εi
- yi = α + βxi* + ui (where ui = εi − βνi)
Since both xi* and ui depend on νi, they are correlated. Measurement error in the dependent variable, however, does not cause endogeneity (though it does increase the variance of the error term).
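A short simulation can make both claims concrete. This is a sketch under assumed values (unit-variance regressor and noise, slope of 2, and a hypothetical helper `ols_slope`); it relies on the standard classical errors-in-variables result that noise in the regressor shrinks the OLS slope by the factor Var(x) / (Var(x) + Var(ν)).

```python
import random

def ols_slope(x, y):
    """Slope from a univariate OLS regression of y on x (with intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

rng = random.Random(1)
n = 50_000
alpha, beta = 1.0, 2.0  # illustrative "true" parameters

x_true = [rng.gauss(0, 1) for _ in range(n)]
y = [alpha + beta * xi + rng.gauss(0, 1) for xi in x_true]

# Noisy measurement of the regressor: observed value = true value + noise.
x_obs = [xi + rng.gauss(0, 1) for xi in x_true]

# Noisy measurement of the dependent variable instead.
y_obs = [yi + rng.gauss(0, 1) for yi in y]

slope_clean = ols_slope(x_true, y)     # benchmark with perfect measurement
slope_noisy_x = ols_slope(x_obs, y)    # regressor measured with error
slope_noisy_y = ols_slope(x_true, y_obs)  # outcome measured with error

print(f"no measurement error:      {slope_clean:.3f}")
print(f"error in the regressor:    {slope_noisy_x:.3f}")
print(f"error in the dependent var: {slope_noisy_y:.3f}")
```

Because the assumed noise variance equals the variance of the true regressor, the slope on the noisy regressor should come out near beta / 2, whereas noise in the dependent variable leaves the slope near beta and only makes the estimate less precise.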
Simultaneity
Suppose that two variables are codetermined, with each affecting the other. Suppose that we have two "structural" equations,
- yi = β1xi + γ1zi + ui
- zi = β2xi + γ2yi + vi
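We can show that estimating either equation results in endogeneity. In the case of the first structural equation, we will show that E(zi ui) ≠ 0. First, solving for zi we get (assuming that 1 − γ1γ2 ≠ 0),
- zi = [(β2 + γ2β1) / (1 − γ1γ2)] xi + [1 / (1 − γ1γ2)] vi + [γ2 / (1 − γ1γ2)] ui
Assuming that xi and vi are uncorrelated with ui, we find that,
- E(zi ui) = [γ2 / (1 − γ1γ2)] E(ui ui) ≠ 0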
Therefore, attempts at estimating either structural equation will be hampered by endogeneity.
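The correlation between z and u in the first structural equation can also be seen numerically. The sketch below assumes illustrative parameter values and independent standard normal draws for x, u and v (none of which come from the text); it generates z and y by solving the two structural equations jointly and then compares the sample covariance of z and u with γ2 / (1 − γ1γ2) times the sample variance of u.

```python
import random

def cov(a, b):
    """Sample covariance of two equal-length lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    return sum((p - ma) * (q - mb) for p, q in zip(a, b)) / (n - 1)

rng = random.Random(7)
n = 50_000
beta1, gamma1 = 1.0, 0.5  # first structural equation (illustrative values)
beta2, gamma2 = 1.0, 0.5  # second structural equation (illustrative values)
denom = 1.0 - gamma1 * gamma2  # must be nonzero for a solution to exist

x = [rng.gauss(0, 1) for _ in range(n)]
u = [rng.gauss(0, 1) for _ in range(n)]
v = [rng.gauss(0, 1) for _ in range(n)]

# Solve the two equations jointly: z in terms of x, v and u, then y from z.
z = [((beta2 + gamma2 * beta1) * xi + vi + gamma2 * ui) / denom
     for xi, ui, vi in zip(x, u, v)]
y = [beta1 * xi + gamma1 * zi + ui for xi, zi, ui in zip(x, z, u)]

# Sanity check: the generated data also satisfy the second structural equation.
max_gap = max(abs(zi - (beta2 * xi + gamma2 * yi + vi))
              for xi, yi, zi, vi in zip(x, y, z, v))

cov_zu = cov(z, u)                        # sample analogue of E(z u)
implied_cov = gamma2 / denom * cov(u, u)  # gamma2 / (1 - gamma1*gamma2) * Var(u)

print(f"sample Cov(z, u): {cov_zu:.3f}")
print(f"implied value:    {implied_cov:.3f}")
```

The two printed numbers should be close to each other and clearly different from zero, which is the sense in which z is endogenous in the first equation; setting `gamma2 = 0.0` removes the feedback from y to z and should drive the covariance toward zero.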
In time series
The endogeneity problem is particularly relevant in the context of time series analysis of causal processes. It is common for some factors within a causal system to be dependent for their value in period t on the values of other factors in the causal system in period t-1. Suppose that the level of pest infestation is independent of all other factors within a given period, but is influenced by the level of rainfall and fertilizer in the preceding period. In this instance it would be correct to say that infestation is exogenous within the period, but endogenous over time.
References
Kennedy, Peter (2008). A Guide to Econometrics (6th ed.), p. 139.