Heterogeneity and Nonlinearity
As we saw in the previous section, a linear causal model in combination with the CIA leads to a linear CEF with a causal interpretation. Assuming the CEF is linear, …
Differences-in-differences: Pre and Post, Treatment and Control
The fixed effects strategy requires panel data, that is, repeated observations on the same individuals (or firms or whatever the unit of observation might be). Often, however, the regressor of …
The Wald Estimator
The simplest IV estimator uses a single binary (0-1) instrument to estimate a model with one endogenous regressor and no covariates. Without covariates, the causal regression model is Yi …
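As a minimal sketch of the setup described above (one binary instrument, one endogenous regressor, no covariates), the Wald estimate can be computed as the difference in mean outcomes by instrument status divided by the difference in mean treatment by instrument status. The function name `wald_estimate` and the toy data are illustrative assumptions, not taken from the text.

```python
from statistics import mean


def wald_estimate(y, d, z):
    """Ratio of the reduced-form difference in means to the first-stage difference.

    y: outcomes, d: endogenous regressor, z: binary (0/1) instrument.
    """
    y_z1 = [yi for yi, zi in zip(y, z) if zi == 1]
    y_z0 = [yi for yi, zi in zip(y, z) if zi == 0]
    d_z1 = [di for di, zi in zip(d, z) if zi == 1]
    d_z0 = [di for di, zi in zip(d, z) if zi == 0]

    first_stage = mean(d_z1) - mean(d_z0)
    if first_stage == 0:
        # Without a first stage the ratio is undefined.
        raise ValueError("instrument does not shift the regressor")
    return (mean(y_z1) - mean(y_z0)) / first_stage


# Made-up data in which the outcome rises by 2 whenever d switches on.
z = [0, 0, 0, 0, 1, 1, 1, 1]
d = [0, 0, 0, 1, 0, 1, 1, 1]
y = [1 + 2 * di for di in d]

estimate = wald_estimate(y, d, z)
print(estimate)
```

In this toy example the instrument moves the treatment rate from 0.25 to 0.75, so the reduced-form difference is scaled up by a factor of two.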
Tricky Points
The language of conditional quantiles is tricky. Sometimes we talk about "quantile regression coefficients at the median," or "effects on those at the lower decile." But it’s important to remember …
LATE with Multiple Instruments
The multiple-instruments extension is easy to see. This is essentially the same as a result we discussed in the grouped-data context. Consider a pair of dummy instruments, Z1i and Z2i. …
Regression Meets Matching
The past decade or two has seen increasing interest in matching as an empirical tool. Matching as a strategy to control for covariates is typically motivated by the CIA, as …
Regression DD
As with the fixed effects model, we can use regression to estimate equations like (5.2.2). Let NJs be a dummy for restaurants in New Jersey and dt be a time-dummy …
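For intuition, here is a rough sketch of the two-group, two-period comparison that such a regression summarizes. With a group dummy, a period dummy, and their interaction, the interaction coefficient equals the difference-in-differences of the four cell means, so the sketch computes that directly. The function name `did_estimate` and all numbers are hypothetical; they are not the restaurant data discussed in the text.

```python
from statistics import mean


def did_estimate(rows):
    """Difference-in-differences from (outcome, treated_group, post_period) rows.

    treated_group and post_period are 0/1 flags.
    """
    def cell_mean(group, period):
        return mean(y for y, g, t in rows if g == group and t == period)

    change_treated = cell_mean(1, 1) - cell_mean(1, 0)
    change_control = cell_mean(0, 1) - cell_mean(0, 0)
    return change_treated - change_control


# Hypothetical outcomes: (outcome, treated_group, post_period)
rows = [
    (20, 1, 0), (22, 1, 0),   # treated group, before
    (21, 1, 1), (23, 1, 1),   # treated group, after
    (23, 0, 0), (25, 0, 0),   # control group, before
    (21, 0, 1), (23, 0, 1),   # control group, after
]

dd = did_estimate(rows)
print(dd)
```

The regression formulation becomes more convenient than cell means once covariates or additional groups and periods are added.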
Grouped Data and 2SLS
The Wald estimator is the mother of all instrumental variables estimators because more complicated 2SLS estimators can typically be constructed from an underlying set of Wald estimators. The link between …
Quantile Treatment Effects
The $42,000 question regarding any set of regression estimates is whether they have a causal interpretation. This is no less true for quantile regression than Ordinary Least Squares. Suppose we …
Covariates in the Heterogeneous-effects Model
You might be wondering where the covariates have gone. After all, covariates played a starring role in our earlier discussion of regression and matching. Yet the LATE theorem does not …
Control for Covariates Using the Propensity Score
The most important result in regression theory is the omitted variables bias formula: coefficients on included variables are unaffected by the omission of variables when the variables omitted are uncorrelated …
Fixed Effects versus Lagged Dependent Variables
Fixed effects and differences-in-differences estimators are based on the presumption of time-invariant (or group-invariant) omitted variables. Suppose, for example, we are interested in the effects of participation in a subsidized …
Asymptotic 2SLS Inference
4.2.1 The Limiting Distribution of the 2SLS Coefficient Vector. We can derive the limiting distribution of the 2SLS coefficient vector using an argument similar to that used …
The QTE Estimator
The QTE estimator is motivated by the observation that, since the parameters of interest are quantile regression coefficients for compliers, they can (theoretically) be estimated consistently by running quantile regressions …
Average Causal Response with Variable Treatment Intensity*
An important difference between the causal effects of a dummy variable and a variable that takes on the values {0, 1, 2, . . .} is that in the first …
Propensity-Score Methods vs. Regression
Propensity-score methods shift attention from the estimation of E[Yi|Xi, Di] to the estimation of the propensity score, p(Xi) = E[Di|Xi]. This is attractive in applications where the latter is easier …
Appendix: More on fixed effects and lagged dependent variables
To simplify, we ignore covariates and year effects and assume there are only two periods, with treatment equal to zero for everyone in the first period (the punch line is …
Over-identification and the 2SLS Minimand*
Constant-effects models with more instruments than endogenous regressors are said to be over-identified. Because there are more instruments than needed to identify the parameters of interest, these models impose a …
Nonstandard Standard Error Issues
We have normality. I repeat, we have normality. Anything you still can’t cope with is therefore your own problem. Douglas Adams, The Hitchhiker’s Guide to the Galaxy (1979) Today, software …
IV Details
2.5.1 2SLS Mistakes. 2SLS estimates are easy to compute, especially since software like SAS and Stata will do it for you. Occasionally, however, you might be tempted to do it …
Mostly Harmless Econometrics: An Empiricist’s Companion
The universe of econometrics is constantly expanding. Econometric methods and practice have advanced greatly as a result, but the modern menu of econometric methods can seem confusing, even to an …
Linear Regression and the CEF
So what’s the regression you want to run? In our world, this question or one like it is heard almost every day. Regression estimates provide a valuable baseline for almost …
Acknowledgments
We had the benefit of comments from many friends and colleagues as this project progressed. Special thanks are due to Alberto Abadie, David Autor, Amitabh Chandra, Monica Chen, John DiNardo, …
Asymptotic OLS Inference
In practice, we don’t usually know what the CEF or the population regression vector is. We therefore draw statistical inferences about these quantities using samples. Statistical inference is what much …
Organization of this Book
We begin with two introductory chapters. The first describes the type of research agenda for which the material in subsequent chapters is most likely to be useful. The second discusses …
Saturated Models, Main Effects, and Other Regression Talk
We often discuss regression models using terms like saturated and main effects. These terms originate in an experimentalist tradition that uses regression to model discrete treatment-type variables. This language is …
Questions about Questions
‘I checked it very thoroughly,’ said the computer, ‘and that quite definitely is the answer. I think the problem, to be quite honest with you, is that you’ve never actually …
The Experimental Ideal
It is an important and popular fact that things are not always what they seem. For instance, on the planet Earth, man had always assumed that he was more intelligent …
The Selection Problem
We take a brief time-out for a more formal discussion of the role experiments play in uncovering causal effects. Suppose you are interested in a causal “if-then” question. To be …
Random Assignment Solves the Selection Problem
Random assignment of Di solves the selection problem because random assignment makes Di independent of potential outcomes. To see this, note that E[Yi|Di = 1] - E[Yi|Di = 0] = …
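A small simulation can illustrate the point. The sketch below assumes a constant treatment effect of 1.0 and a made-up selection rule in which units with high untreated outcomes choose treatment; none of these specifics come from the text. It compares the naive difference in means under self-selection with the same comparison under coin-flip assignment.

```python
import random
from statistics import mean


def diff_in_means(y, d):
    """Observed treatment-control difference in mean outcomes."""
    treated = [yi for yi, di in zip(y, d) if di == 1]
    control = [yi for yi, di in zip(y, d) if di == 0]
    return mean(treated) - mean(control)


rng = random.Random(0)          # fixed seed so the run is reproducible
n = 20000
true_effect = 1.0               # assumed constant causal effect

# Potential outcomes: y0 without treatment, y1 with treatment.
y0 = [rng.gauss(0.0, 1.0) for _ in range(n)]
y1 = [v + true_effect for v in y0]


def observed(d):
    """Observed outcome: y1 for treated units, y0 for untreated units."""
    return [b if di == 1 else a for a, b, di in zip(y0, y1, d)]


# Hypothetical self-selection: treatment depends on the untreated outcome.
d_selected = [1 if v > 0 else 0 for v in y0]
# Random assignment: treatment is independent of potential outcomes.
d_random = [1 if rng.random() < 0.5 else 0 for _ in range(n)]

naive_selected = diff_in_means(observed(d_selected), d_selected)
naive_random = diff_in_means(observed(d_random), d_random)
print(naive_selected, naive_random)
```

Under self-selection the comparison mixes the causal effect with selection bias; under random assignment the two groups have the same expected y0, so the difference in means should land close to the assumed effect.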
Regression Analysis of Experiments
Regression is a useful tool for the study of causal questions, including the analysis of data from experiments. Suppose (for now) that the treatment effect is the same for everyone, …
Making Regression Make Sense
‘Let us think the unthinkable, let us do the undoable. Let us prepare to grapple with the ineffable itself, and see if we may not eff it after all.’ Douglas …
Regression Fundamentals
The end of the previous chapter introduces regression models as a computational device for the estimation of treatment-control differences in an experiment, with and without covariates. Because the regressor of …
Economic Relationships and the Conditional Expectation Function
Empirical economic research in our field of Labor Economics is typically concerned with the statistical analysis of individual economic circumstances, and especially differences between people that might account for differences …