Centre for Microdata Methods and Practice

Incomplete English auction models with heterogeneity

Working Paper

This paper studies identification and estimation of the distribution of bidder valuations in an incomplete model of English auctions. As in Haile and Tamer (2003), bidders are assumed to (i) bid no more than their valuations and (ii) never let an opponent win at a price they are willing to beat. Unlike the model studied by Haile and Tamer (2003), the requirement of independent private values is dropped, enabling the use of these restrictions on bidder behavior with affiliated private values, for example through the presence of auction-specific unobservable heterogeneity. In addition, a semiparametric index restriction on the effect of auction-specific observable heterogeneity is incorporated, which, relative to nonparametric methods, can be helpful in alleviating the curse of dimensionality with a moderate or large number of covariates. The identification analysis employs results from Chesher and Rosen (2017) to characterize identified sets for bidder valuation distributions and functionals thereof.
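
As a rough illustration of how restrictions (i) and (ii) bound the valuation distribution, the sketch below computes Haile-Tamer-style pointwise bounds in the much simpler case of symmetric independent private values and a fixed number of bidders per auction; the paper's setting (affiliated values, covariates) is richer, and all function and variable names here are illustrative, not the paper's.

```python
# Stylized Haile-Tamer (2003) bounds on the valuation CDF, assuming symmetric
# iid private values and n_bidders per auction. Not the paper's estimator.
import numpy as np
from scipy.optimize import brentq

def valuation_cdf_bounds(bids, prices, n_bidders, increment, grid):
    """Pointwise bounds on F_V over `grid`.

    bids:   flat array of all observed bids (assumption (i): bid <= valuation)
    prices: transaction price per auction (assumption (ii): losers'
            valuations <= price + minimum bid increment)
    """
    # (i) bid <= valuation  =>  F_V(v) <= empirical CDF of bids at v.
    upper = np.array([(bids <= v).mean() for v in grid])

    # (ii) second-highest valuation <= price + increment. For n iid draws the
    # CDF of the second-highest is psi(F) = n F^(n-1) - (n-1) F^n, which is
    # increasing in F, so F_V(v) >= psi^{-1}(ECDF of price + increment at v).
    def psi(F, n=n_bidders):
        return n * F ** (n - 1) - (n - 1) * F ** n

    g = np.array([(prices + increment <= v).mean() for v in grid])
    lower = np.array([brentq(lambda F: psi(F) - gv, 0.0, 1.0) for gv in g])
    return lower, upper
```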

31 May 2017

Fixed-effect regressions on network data

Working Paper

This paper studies inference on fixed effects in a linear regression model estimated from network data. An important special case of our setup is the two-way regression model, which is a workhorse method in the analysis of matched data sets. Networks are typically quite sparse and it is difficult to see how the data carry information about certain parameters. We derive bounds on the variance of the fixed-effect estimator that uncover the importance of the structure of the network. These bounds depend on the smallest non-zero eigenvalue of the (normalized) Laplacian of the network and on the degree structure of the network. The Laplacian is a matrix that describes the network, and its smallest non-zero eigenvalue is a measure of connectivity, with smaller values indicating less-connected networks. These bounds yield conditions for consistent estimation and convergence rates, and allow us to evaluate the accuracy of first-order approximations to the variance of the fixed-effect estimator. The bounds are also used to assess the bias and variance of estimators of moments of the fixed effects.
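
A minimal sketch of the two network quantities the bounds depend on, assuming a connected undirected network stored as a SciPy sparse adjacency matrix (function and variable names are ours, not the paper's):

```python
# Degree structure and smallest non-zero eigenvalue of the normalized
# Laplacian L = I - D^{-1/2} A D^{-1/2}, assuming a connected network
# with no isolated nodes.
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import eigsh

def connectivity_measures(adjacency):
    degrees = np.asarray(adjacency.sum(axis=1)).ravel()
    d_inv_sqrt = sp.diags(1.0 / np.sqrt(degrees))
    n = adjacency.shape[0]
    lap = sp.identity(n) - d_inv_sqrt @ adjacency @ d_inv_sqrt
    # Two smallest eigenvalues; for a connected network the first is 0 and
    # the second is the smallest non-zero one ("SM" is fine for moderate n).
    vals = eigsh(lap, k=2, which="SM", return_eigenvectors=False)
    return degrees, np.sort(vals)[-1]
```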

30 May 2017

Inference under covariate-adaptive randomization

Working Paper

This paper studies inference for the average treatment effect in randomized controlled trials with covariate-adaptive randomization. Here, by covariate-adaptive randomization, we mean randomization schemes that first stratify according to baseline covariates and then assign treatment status so as to achieve "balance" within each stratum.
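
As a concrete, hypothetical instance of such a scheme, the sketch below implements simple stratified block randomization: within each stratum defined by baseline covariates, exactly half of the units (up to rounding) are assigned to treatment.

```python
# A minimal stratified randomization sketch; names are illustrative.
import numpy as np

def stratified_assignment(strata, rng=None):
    """strata: integer array giving each unit's stratum label."""
    rng = rng or np.random.default_rng(0)
    treat = np.zeros(len(strata), dtype=int)
    for s in np.unique(strata):
        idx = rng.permutation(np.flatnonzero(strata == s))
        treat[idx[: len(idx) // 2]] = 1  # first half treated, rest control
    return treat
```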

24 May 2017

Who should be treated? Empirical welfare maximization methods for treatment choice

Working Paper

One of the main objectives of empirical analysis of experiments and quasi-experiments is to inform policy decisions that determine the allocation of treatments to individuals with different observable covariates. We study the properties and implementation of the Empirical Welfare Maximization (EWM) method, which estimates a treatment assignment policy by maximizing the sample analog of average social welfare over a class of candidate treatment policies. The EWM approach is attractive in terms of both statistical performance and practical implementation in realistic settings of policy design. Common features of these settings include: (i) feasible treatment assignment rules are constrained exogenously for ethical, legislative, or political reasons, (ii) a policy maker wants a simple treatment assignment rule based on one or more eligibility scores in order to reduce the dimensionality of individual observable characteristics, and/or (iii) the proportion of individuals who can receive the treatment is a priori limited due to a budget or a capacity constraint. We show that when the propensity score is known, the average social welfare attained by EWM rules converges at least at rate n^(-1/2) to the maximum obtainable welfare uniformly over a minimally constrained class of data distributions, and this uniform convergence rate is minimax optimal. We examine how the uniform convergence rate depends on the richness of the class of candidate decision rules, the distribution of conditional treatment effects, and the lack of knowledge of the propensity score. We offer easily implementable algorithms for computing the EWM rule and an application using experimental data from the National JTPA Study.
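
A minimal sketch of the EWM idea under simplifying assumptions not taken from the paper: a binary treatment, a known propensity score, and a candidate class of threshold rules on a single eligibility score, with the welfare of each rule estimated by inverse propensity weighting.

```python
# Pick the threshold rule 1{score >= c} maximizing the IPW sample analog of
# average welfare E[g(X) Y(1) + (1 - g(X)) Y(0)]. Names are illustrative.
import numpy as np

def ewm_threshold_rule(score, d, y, e):
    """score: eligibility score; d: observed treatment; y: outcome;
    e: known propensity score P(D=1|X)."""
    candidates = np.unique(score)
    welfare = np.array([
        np.mean(np.where(score >= c, d * y / e, (1 - d) * y / (1 - e)))
        for c in candidates
    ])
    best = np.argmax(welfare)
    return candidates[best], welfare[best]
```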

19 May 2017

Generic inference on quantile and quantile effect functions for discrete outcomes

Working Paper

This paper provides a method to construct simultaneous confidence bands for quantile and quantile effect functions for possibly discrete or mixed discrete-continuous random variables. The construction is generic and does not depend on the nature of the underlying problem. It works in conjunction with parametric, semiparametric, and nonparametric modeling strategies and does not depend on the sampling scheme. It is based upon projection of simultaneous confidence bands for distribution functions. We apply our method to analyze the distributional impact of insurance coverage on health care utilization and to provide a distributional decomposition of the racial test score gap. Our analysis generates interesting new findings, and complements previous analyses that focused on mean effects only. In both applications, the outcomes of interest are discrete, rendering standard inference methods invalid for obtaining uniform confidence bands for quantile and quantile effect functions.
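
A minimal sketch of the projection step, assuming a simultaneous band [F_lo, F_hi] for the distribution function on a grid of support points has already been constructed: inverting the upper band for F yields the lower band for the quantile function, and vice versa. All names are illustrative.

```python
# Invert a (monotone) confidence band for a CDF into a band for the quantile
# function via the generalized inverse Q(tau) = inf{x : F(x) >= tau}.
import numpy as np

def quantile_band(xgrid, F_lo, F_hi, taus):
    """xgrid: sorted support points; F_lo, F_hi: monotone band values on
    xgrid; taus: quantile levels of interest."""
    def left_inverse(F_vals, tau):
        idx = np.searchsorted(F_vals, tau, side="left")
        return xgrid[min(idx, len(xgrid) - 1)]

    Q_lo = np.array([left_inverse(F_hi, t) for t in taus])
    Q_hi = np.array([left_inverse(F_lo, t) for t in taus])
    return Q_lo, Q_hi
```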

19 May 2017

Confidence bands for coefficients in high dimensional linear models with error-in-variables

Working Paper

We study high-dimensional linear models with error-in-variables. Such models are motivated by various applications in econometrics, finance and genetics. These models are challenging because of the need to account for measurement errors to avoid non-vanishing biases, in addition to handling the high dimensionality of the parameters. A recent and growing literature has proposed various estimators that achieve good rates of convergence. Our main contribution complements this literature with the construction of simultaneous confidence regions for the parameters of interest in such high-dimensional linear models with error-in-variables. These confidence regions are based on the construction of moment conditions that have an additional orthogonality property with respect to nuisance parameters. We provide a construction that requires us to estimate an auxiliary high-dimensional linear model with error-in-variables for each component of interest. We use a multiplier bootstrap to compute critical values for simultaneous confidence intervals for a target subset of the components. We show its validity despite possible (moderate) model selection mistakes, while allowing the number of target coefficients to be larger than the sample size. We apply our results to two examples, discuss their implications, and conduct Monte Carlo simulations to illustrate the performance of the proposed procedure for each variable whose coefficient is the target of inference.
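
A minimal sketch of a generic multiplier-bootstrap critical value of the kind described, assuming an n x p matrix psi of estimated orthogonalized moment contributions (one column per target coefficient) is already available; this is a stylized version, not the paper's exact procedure.

```python
# Multiplier bootstrap for a simultaneous critical value: the (1 - alpha)
# quantile of the max of studentized Gaussian-multiplier draws across the
# p target components.
import numpy as np

def multiplier_bootstrap_cv(psi, alpha=0.05, n_boot=2000, seed=0):
    rng = np.random.default_rng(seed)
    n, p = psi.shape
    sd = psi.std(axis=0)
    max_stats = np.empty(n_boot)
    for b in range(n_boot):
        xi = rng.standard_normal(n)             # Gaussian multipliers
        t = (xi @ psi) / (np.sqrt(n) * sd)      # studentized draws
        max_stats[b] = np.max(np.abs(t))
    return np.quantile(max_stats, 1 - alpha)    # simultaneous critical value
```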

17 May 2017

Approximate permutation tests and induced order statistics in the regression discontinuity design

Working Paper

In the regression discontinuity design (RDD), it is common practice to assess the credibility of the design by testing whether the means of baseline covariates change at the cutoff (or threshold) of the running variable. This practice is partly motivated by the stronger implication derived by Lee (2008), who showed that under certain conditions the distribution of baseline covariates in the RDD must be continuous at the cutoff. We propose a permutation test based on the so-called induced order statistics for the null hypothesis of continuity of the distribution of baseline covariates at the cutoff, and introduce a novel asymptotic framework to analyze its properties. The asymptotic framework is intended to approximate a small sample phenomenon: even though the total number n of observations may be large, the number of effective observations local to the cutoff is often small. Thus, while traditional asymptotics in RDD require a growing number of observations local to the cutoff as n → ∞, our framework keeps the number q of observations local to the cutoff fixed as n → ∞. The new test is easy to implement, asymptotically valid under weak conditions, exhibits finite sample validity under stronger conditions than those needed for its asymptotic validity, and has favorable power properties relative to tests based on means. In a simulation study, we find that the new test controls size remarkably well across designs. We then use our test to evaluate the plausibility of the design in Lee (2008), a well-known application of the RDD to study incumbency advantage.
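
A minimal sketch of the test's mechanics under illustrative assumptions (cutoff normalized to zero, a single baseline covariate, and a Cramér-von Mises-type distance; the paper's exact statistic may differ): take the q observations closest to the cutoff on each side (the induced order statistics) and run a permutation test of distributional equality of the covariate.

```python
# Permutation test of covariate-distribution continuity at an RDD cutoff,
# using the q closest observations on each side. Names are illustrative.
import numpy as np

def rdd_permutation_test(r, w, q=25, n_perm=999, seed=0):
    """r: running variable (cutoff at 0); w: baseline covariate."""
    rng = np.random.default_rng(seed)
    left = w[r < 0][np.argsort(-r[r < 0])[:q]]    # q closest from below
    right = w[r >= 0][np.argsort(r[r >= 0])[:q]]  # q closest from above

    def cvm(a, b):  # Cramér-von Mises-type distance between ECDFs
        pooled = np.concatenate([a, b])
        Fa = (a[None, :] <= pooled[:, None]).mean(axis=1)
        Fb = (b[None, :] <= pooled[:, None]).mean(axis=1)
        return np.sum((Fa - Fb) ** 2)

    stat = cvm(left, right)
    pooled = np.concatenate([left, right])
    perm_stats = np.empty(n_perm)
    for i in range(n_perm):
        z = rng.permutation(pooled)               # relabel the 2q observations
        perm_stats[i] = cvm(z[:q], z[q:])
    return (1 + np.sum(perm_stats >= stat)) / (1 + n_perm)  # p-value
```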

16 May 2017

Inference on breakdown frontiers

Working Paper

A breakdown frontier is the boundary between the set of assumptions which lead to a specific conclusion and those which do not. In a potential outcomes model with a binary treatment, we consider two conclusions: first, that the ATE is at least a specific value (e.g., nonnegative), and second, that the proportion of units who benefit from treatment is at least a specific value (e.g., at least 50%). For these conclusions, we derive the breakdown frontier for two kinds of assumptions: one which indexes deviations from random assignment of treatment, and one which indexes deviations from rank invariance. These classes of assumptions nest the point identifying assumptions of random assignment and rank invariance at one extreme and, at the other, no constraints on treatment selection or on the dependence structure between potential outcomes. The frontier provides a quantitative measure of the robustness of conclusions to deviations from the point identifying assumptions. We derive √N-consistent sample analog estimators for these frontiers. We then provide an asymptotically valid bootstrap procedure for constructing lower uniform confidence bands for the breakdown frontier. As a measure of robustness, this confidence band can be presented alongside traditional point estimates and confidence intervals obtained under point identifying assumptions. We illustrate this approach in an empirical application to the effect of child soldiering on wages. We find that conclusions are fairly robust to failure of rank invariance when random assignment holds, but are much more sensitive to both assumptions for even small deviations from random assignment.
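
In the one-dimensional special case of a single sensitivity parameter, the frontier reduces to a breakdown point. The hypothetical sketch below locates it by bisection, given a user-supplied (and here assumed continuous and decreasing) function lb(c) returning the identified lower bound on the ATE when the deviation from random assignment is at most c.

```python
# Breakdown point for the conclusion "ATE >= 0": the largest sensitivity
# parameter c at which the identified lower bound lb(c) is still nonnegative.
from scipy.optimize import brentq

def breakdown_point(lb, c_max=1.0):
    if lb(0.0) < 0:      # conclusion fails even under random assignment
        return 0.0
    if lb(c_max) >= 0:   # conclusion robust over the whole range considered
        return c_max
    return brentq(lb, 0.0, c_max)  # lb crosses zero exactly once
```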

15 May 2017

Understanding the effect of measurement error on quantile regressions

Working Paper

The impact of measurement error in explanatory variables on quantile regression functions is investigated using a small variance approximation. The approximation shows how the error-contaminated and error-free quantile regression functions are related. A key factor is the distribution of the error-free explanatory variable. Exact calculations probe the accuracy of the approximation. The order of the approximation error is unchanged if the density of the error-free explanatory variable is replaced by the density of the error-contaminated explanatory variable, which is easily estimated. It is then possible to use the approximation to investigate the sensitivity of estimates to varying amounts of measurement error.
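
A small simulation sketch (not the paper's approximation) of the phenomenon being approximated: median regression slopes attenuate as classical measurement error in the regressor grows. The data-generating process is invented purely for illustration.

```python
# Median regression with increasing classical measurement error in x.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 5000
x = rng.normal(size=n)                      # error-free regressor
y = 1.0 + 2.0 * x + rng.normal(size=n)      # outcome, true slope 2

for me_sd in [0.0, 0.5, 1.0]:               # increasing measurement error
    x_obs = x + me_sd * rng.normal(size=n)  # error-contaminated regressor
    df = pd.DataFrame({"y": y, "x_obs": x_obs})
    fit = smf.quantreg("y ~ x_obs", df).fit(q=0.5)
    print(me_sd, fit.params["x_obs"])       # slope shrinks as me_sd grows
```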

10 May 2017

Uncertain identification

Working Paper

Uncertainty about the choice of identifying assumptions is common in causal studies, but is often ignored in empirical practice. This paper considers uncertainty over models that impose different identifying assumptions, which, in general, leads to a mix of point- and set-identified models. We propose performing inference in the presence of such uncertainty by generalizing Bayesian model averaging. The method considers multiple posteriors for the set-identified models and combines them with a single posterior for models that are either point-identified or that impose non-dogmatic assumptions. The output is a set of posteriors (post-averaging ambiguous belief) that are mixtures of the single posterior and any element of the class of multiple posteriors, with weights equal to the posterior model probabilities. We suggest reporting the range of posterior means and the associated credible region in practice, and provide a simple algorithm to compute them. We establish that the prior model probabilities are updated when the models are "distinguishable" and/or they specify different priors for reduced-form parameters, and characterize the asymptotic behavior of the posterior model probabilities. The method provides a formal framework for conducting sensitivity analysis of empirical findings to the choice of identifying assumptions. In a standard monetary model, for example, we show that, in order to support a negative response of output to a contractionary monetary policy shock, one would need to attach a prior probability greater than 0.32 to the validity of the assumption that prices do not react contemporaneously to such a shock. The method is general and allows for dogmatic and non-dogmatic identifying assumptions, multiple point-identified models, multiple set-identified models, and nested or non-nested models.
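
A minimal numerical sketch of the suggested report in the two-model case, assuming (hypothetically) one point-identified model with posterior mean m1, one set-identified model whose posterior means range over [lb2, ub2], and posterior model probabilities w1 and 1 - w1: the post-averaging range of posterior means mixes the point with every element of the set.

```python
# Range of post-averaging posterior means across the class of posteriors.
def post_averaging_range(m1, lb2, ub2, w1):
    w2 = 1.0 - w1
    return (w1 * m1 + w2 * lb2, w1 * m1 + w2 * ub2)

print(post_averaging_range(m1=0.5, lb2=-1.0, ub2=1.0, w1=0.7))  # (0.05, 0.65)
```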

18 April 2017