In a randomized control trial, the precision of an average treatment effect estimator can be improved either by collecting data on additional individuals or by collecting additional covariates that predict the outcome variable. We propose the use of pre-experimental data, such as a census or a household survey, to inform the choice of both the sample size and the covariates to be collected. Our procedure seeks to minimize the resulting average treatment effect estimator's mean squared error, subject to the researcher's budget constraint. We rely on an orthogonal greedy algorithm that is conceptually simple, easy to implement (even when the number of potential covariates is very large), and does not require any tuning parameters. In two empirical applications, we show that our procedure can lead to substantial gains of up to 58%, measured either as reductions in data collection costs or as improvements in the precision of the treatment effect estimator.
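To make the trade-off in the abstract concrete, the following is a minimal Python sketch of a greedy covariate choice under a budget. It is not the paper's exact procedure. The function name `greedy_covariate_choice`, the linear per-individual cost (`base_cost` plus the costs of the chosen covariates), and the error proxy (residual variance in the pre-experimental data divided by the affordable sample size) are all assumptions made for illustration.

```python
import numpy as np

def greedy_covariate_choice(X, y, cov_costs, base_cost, budget):
    """Illustrative sketch only; names and cost model are assumptions.

    X, y      : pre-experimental covariates (n_pilot x p) and outcome.
    cov_costs : assumed per-individual cost of collecting each covariate.
    base_cost : assumed per-individual cost of a bare interview (> 0).
    budget    : total data collection budget.
    Returns (selected indices, sample size, MSE proxy), or None if the
    budget cannot pay for a single individual.
    """
    cov_costs = np.asarray(cov_costs, dtype=float)
    Xc = X - X.mean(axis=0)            # centre, so no intercept is needed
    yc = y - y.mean()
    norms = np.linalg.norm(Xc, axis=0)
    norms[norms == 0] = np.inf         # constant columns get a score of zero

    available = np.ones(Xc.shape[1], dtype=bool)
    selected = []
    residual = yc.copy()
    best = None

    while True:
        # Sample size affordable with the covariates chosen so far.
        unit_cost = base_cost + cov_costs[~available].sum()
        n = int(budget // unit_cost)
        if n < 1:
            break
        # Assumed proxy for the estimator's MSE: residual variance / n.
        mse_proxy = float(np.mean(residual ** 2)) / n
        if best is None or mse_proxy < best[2]:
            best = (list(selected), n, mse_proxy)
        if not available.any():
            break
        # Greedy step: covariate most correlated with the current residual.
        scores = np.abs(Xc.T @ residual) / norms
        scores[~available] = -np.inf
        j = int(np.argmax(scores))
        selected.append(j)
        available[j] = False
        # Orthogonal step: refit on all selected covariates by least squares.
        coef, *_ = np.linalg.lstsq(Xc[:, selected], yc, rcond=None)
        residual = yc - Xc[:, selected] @ coef

    return best
```

In this sketch, each added covariate lowers the residual variance but raises the cost per individual and so shrinks the affordable sample; the function returns the step along the greedy path where the proxy is smallest.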
The original version of the working paper, posted on 01 April, 2016, is available here.
Authors
Research Fellow Columbia University
Sokbae is an IFS Research Fellow and a Professor at Columbia University, with an interest in Econometrics, Applied Microeconomics and Statistics.
Research Fellow University College London
Pedro is a Professor of Economics at University College London and an economist in the IFS' Centre for Microdata Methods and Practice (cemmap).
Research Associate LMU Munich
Daniel is a Research Associate of the IFS in cemmap and a Professor of Statistics and Econometrics at LMU Munich.
Working Paper details
- DOI: 10.1920/wp.cem.2016.1516
- Publisher: Institute for Fiscal Studies
Suggested citation
Carneiro, P., Lee, S. and Wilhelm, D. (2016). Optimal data collection for randomized control trials. London: Institute for Fiscal Studies. Available at: https://ifs.org.uk/publications/optimal-data-collection-randomized-control-trials (accessed: 26 April 2024).