Program evaluation and causal inference with high-dimensional data
- Alexandre Belloni
- Victor Chernozhukov
- Ivan Fernandez-Val
- Christian Hansen
Published on 19 March 2016
This Working Paper has been replaced by Program evaluation with high-dimensional data
In this paper, we provide efficient estimators and honest confidence bands for a variety
of treatment effects including local average (LATE) and local quantile treatment effects (LQTE)
in data-rich environments. We can handle very many control variables, endogenous receipt of
treatment, heterogeneous treatment effects, and function-valued outcomes. Our framework covers
the special case of exogenous receipt of treatment, either conditional on controls or unconditionally
as in randomized control trials. In the latter case, our approach produces efficient estimators and
honest bands for (functional) average treatment effects (ATE) and quantile treatment effects (QTE).
To make informative inference possible, we assume that key reduced form predictive relationships
are approximately sparse. This assumption allows the use of regularization and selection methods to
estimate those relations, and we provide methods for post-regularization and post-selection inference
that are uniformly valid (honest) across a wide range of models. We show that a key ingredient
enabling honest inference is the use of orthogonal or doubly robust moment conditions in estimating
certain reduced form functional parameters. We illustrate the use of the proposed methods with an
application to estimating the effect of 401(k) eligibility and participation on accumulated assets.
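As a rough illustration of what a doubly robust moment condition looks like in the simplest case covered above (the ATE under exogenous treatment conditional on controls), the sketch below computes the standard augmented inverse-probability-weighting (AIPW) score. It is not the paper's procedure: the function name aipw_ate and its arguments are hypothetical, and the nuisance predictions g1, g0 (outcome regressions) and m (propensity score) are assumed to have been fitted beforehand, for example by Lasso or another regularized method.

```python
from math import sqrt
from statistics import mean, stdev


def aipw_ate(y, d, g1, g0, m):
    """Doubly robust (AIPW) estimate of the ATE with a plug-in standard error.

    All inputs are equal-length sequences (hypothetical names, for illustration):
      y  : observed outcomes
      d  : binary treatment indicators (0 or 1)
      g1 : predictions of E[Y | D=1, X]
      g0 : predictions of E[Y | D=0, X]
      m  : predictions of P(D=1 | X), assumed strictly between 0 and 1
    """
    scores = [
        (g1_i - g0_i)                        # regression-based contrast
        + d_i * (y_i - g1_i) / m_i           # correction from treated units
        - (1 - d_i) * (y_i - g0_i) / (1 - m_i)  # correction from control units
        for y_i, d_i, g1_i, g0_i, m_i in zip(y, d, g1, g0, m)
    ]
    ate = mean(scores)
    # Standard error from the sample standard deviation of the scores.
    se = stdev(scores) / sqrt(len(scores))
    return ate, se


# Toy inputs, made up purely to show the call signature.
toy_y = [1.0, 2.0, 3.0, 4.0]
toy_d = [1, 0, 1, 0]
toy_ate, toy_se = aipw_ate(toy_y, toy_d, [2.0] * 4, [1.0] * 4, [0.5] * 4)
print(toy_ate, toy_se)
```

The score is called doubly robust because its mean equals the ATE if either the outcome regressions or the propensity score is correctly specified, and orthogonal because small errors in the fitted nuisance functions affect the estimate only at second order.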
The results on program evaluation are obtained as a consequence of more general results on
honest inference in a general moment condition framework, which arises from structural equation
models in econometrics. Here too the crucial ingredient is the use of orthogonal moment conditions,
which can be constructed from the initial moment conditions. We provide results on honest inference
for (function-valued) parameters within this general framework where any high-quality, modern
machine learning methods can be used to learn the nonparametric/high-dimensional components
of the model. These include a number of supporting auxiliary results that are of major independent
interest: namely, we (1) prove uniform validity of a multiplier bootstrap, (2) offer a uniformly valid
functional delta method, and (3) provide results for sparsity-based estimation of regression functions
for function-valued outcomes.
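To give a sense of how a multiplier bootstrap can produce simultaneous bands, the sketch below implements a generic Gaussian multiplier bootstrap for a sup-t critical value over a finite grid of parameters. It is an illustration under stated assumptions rather than the paper's construction: the function name sup_t_critical_value is hypothetical, and the input is assumed to be a matrix of estimated influence-function values (one row per observation, one column per grid point).

```python
import random
from math import ceil, sqrt


def sup_t_critical_value(scores, alpha=0.05, n_draws=1000, seed=0):
    """Gaussian multiplier bootstrap critical value for a sup-t statistic.

    scores : list of n rows, each a list of p estimated influence-function
             values (hypothetical input; every column must have nonzero variance)
    Returns the (1 - alpha) bootstrap quantile of
        max_j | n^{-1/2} * sum_i xi_i * scores[i][j] / sigma_j |,
    where xi_i are i.i.d. standard normal multipliers.
    """
    rng = random.Random(seed)
    n, p = len(scores), len(scores[0])

    # Center each column and compute its standard deviation.
    means = [sum(row[j] for row in scores) / n for j in range(p)]
    centered = [[row[j] - means[j] for j in range(p)] for row in scores]
    sigmas = [sqrt(sum(row[j] ** 2 for row in centered) / n) for j in range(p)]

    sup_stats = []
    for _ in range(n_draws):
        xi = [rng.gauss(0.0, 1.0) for _ in range(n)]  # fresh multipliers per draw
        sup_stats.append(max(
            abs(sum(x * row[j] for x, row in zip(xi, centered)))
            / (sqrt(n) * sigmas[j])
            for j in range(p)
        ))

    # Empirical (1 - alpha) quantile of the simulated sup statistics.
    sup_stats.sort()
    index = min(n_draws - 1, max(0, ceil((1 - alpha) * n_draws) - 1))
    return sup_stats[index]


# Usage with simulated scores at three grid points (illustrative data only).
demo_rng = random.Random(42)
demo_scores = [[demo_rng.gauss(0.0, 1.0) for _ in range(3)] for _ in range(100)]
critical_value = sup_t_critical_value(demo_scores, n_draws=500)
print(critical_value)
```

A band of the form estimate_j ± critical_value * sigma_j / sqrt(n) then covers all grid points jointly, which is why the critical value exceeds the pointwise normal quantile once there is more than one grid point.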