The Least Squares Assumptions for Multiple Regression

Assumptions of Multiple Regression

These are the conditions under which the OLS estimator is valid and has the statistical properties we rely on, such as unbiasedness and consistency.

Assumption 1: Zero Conditional Mean

$E(u \mid X_1 = x_1, \ldots, X_k = x_k) = 0$

Meaning: On average, the omitted factors $u$ are unrelated to the included regressors $X$. Put differently: once you control for the regressors, there’s no leftover systematic relationship between $u$ and $X$.

Why it matters: If this fails, your regression suffers from omitted variable bias.

Example: If PctEL (percent English learners) belongs in the model but you leave it out, and it’s correlated with STR, then the STR coefficient is biased.

Solution: Include the omitted variable (if you can measure it).

Assumption 2: i.i.d. Sampling

$(X_{1i}, \ldots, X_{ki}, Y_i)$, $i = 1, \ldots, n$, are i.i.d.

Meaning: Each observation comes from the same population and ...
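
To make the omitted variable bias from Assumption 1 concrete, here is a minimal simulation sketch. Everything in it is an assumption chosen for illustration: the data-generating process, the coefficient values (-1.0 on STR, -0.65 on PctEL), the strength of the correlation between PctEL and STR, and the helper name `ols` are all hypothetical and are not estimates from any real data set. It fits the regression once with PctEL omitted and once with it included, then compares the STR coefficients.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Hypothetical data-generating process (illustrative numbers only):
# test scores depend on both STR and PctEL, and PctEL is
# positively correlated with STR.
str_ = rng.normal(20.0, 2.0, size=n)
pct_el = 15.0 + 2.0 * (str_ - 20.0) + rng.normal(0.0, 5.0, size=n)
u = rng.normal(0.0, 10.0, size=n)  # error term, independent of both regressors
score = 700.0 - 1.0 * str_ - 0.65 * pct_el + u


def ols(y, *regressors):
    """Return OLS coefficients (intercept first) via least squares."""
    X = np.column_stack([np.ones_like(y), *regressors])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta


beta_short = ols(score, str_)          # PctEL omitted
beta_long = ols(score, str_, pct_el)   # PctEL included

print("STR coefficient, PctEL omitted: ", round(beta_short[1], 2))
print("STR coefficient, PctEL included:", round(beta_long[1], 2))
```

Under these assumed values, the regression that includes PctEL should recover an STR coefficient close to the true -1.0. The regression that omits it should instead land near -1.0 + (-0.65)(2.0) = -2.3, because the omitted PctEL effect is absorbed into the error term, which is then correlated with STR.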