centre for microdata methods and practice

cemmap is an ESRC research centre

Optimal data collection for randomized control trials

Authors: Pedro Carneiro, Sokbae (Simon) Lee and Daniel Wilhelm
Date: 01 April 2016
Type: cemmap Working Paper, CWP15/16
DOI: 10.1920/wp.cem.2016.1516

Abstract

In a randomized control trial, the precision of an average treatment effect estimator can be improved either by collecting data on additional individuals, or by collecting additional covariates that predict the outcome variable. We propose the use of pre-experimental data such as a census, or a household survey, to inform the choice of both the sample size and the covariates to be collected. Our procedure seeks to minimize the resulting average treatment effect estimator's mean squared error, subject to the researcher's budget constraint. We rely on an orthogonal greedy algorithm that is conceptually simple, easy to implement (even when the number of potential covariates is very large), and does not require any tuning parameters. In two empirical applications, we show that our procedure can lead to substantial gains of up to 58%, either in terms of reductions in data collection costs or in terms of improvements in the precision of the treatment effect estimator, respectively.
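
The trade-off described in the abstract, more individuals versus more covariates under a fixed budget, can be illustrated with a small sketch. It is not the authors' procedure: it is a simplified orthogonal greedy selection run on hypothetical pre-experimental data, and the cost model (a fixed `base_cost` per individual plus a per-individual cost for each covariate), the function name `greedy_design`, and the criterion "residual variance divided by affordable sample size" as a stand-in for the estimator's mean squared error are all assumptions made for illustration.

```python
import math
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def center(v):
    mean = sum(v) / len(v)
    return [a - mean for a in v]


def greedy_design(columns, y, costs, base_cost, budget):
    """Pick covariates and a sample size from pre-experimental data.

    columns:   list of covariate columns (each a list with len(y) entries)
    costs:     assumed per-individual cost of collecting each covariate
    base_cost: assumed per-individual cost of surveying with no covariates
    Returns None if the budget cannot pay for a single individual.
    """
    n_pilot = len(y)
    resid = center(y)
    # Remaining columns are kept orthogonal to the already selected ones.
    remaining = {j: center(c) for j, c in enumerate(columns)}
    norms = {j: math.sqrt(dot(c, c)) for j, c in remaining.items()}
    selected, per_unit, best = [], base_cost, None
    while True:
        n = int(budget // per_unit)  # sample size the budget still affords
        if n < 1:
            break
        proxy = dot(resid, resid) / n_pilot / n  # residual variance / n
        if best is None or proxy < best["mse_proxy"]:
            best = {"covariates": list(selected), "sample_size": n,
                    "mse_proxy": proxy}
        # Skip constant columns and columns collinear with the selected set.
        usable = [j for j, c in remaining.items()
                  if norms[j] > 0 and dot(c, c) > 1e-10 * norms[j] ** 2]
        if not usable:
            break
        # Greedy step: covariate most correlated with the current residual.
        j = max(usable, key=lambda k: abs(dot(remaining[k], resid)) / norms[k])
        c = remaining.pop(j)
        cc = dot(c, c)
        b = dot(c, resid) / cc
        # Orthogonal step: project the new direction out of the residual
        # and out of every remaining column (Gram-Schmidt).
        resid = [r - b * ci for r, ci in zip(resid, c)]
        for k in list(remaining):
            g = dot(c, remaining[k]) / cc
            remaining[k] = [vi - g * ci for vi, ci in zip(remaining[k], c)]
        selected.append(j)
        per_unit += costs[j]
    return best


# Hypothetical pilot data: x0 predicts the outcome, x1 is pure noise.
random.seed(0)
x0 = [random.gauss(0, 1) for _ in range(200)]
x1 = [random.gauss(0, 1) for _ in range(200)]
y = [2 * a + random.gauss(0, 0.1) for a in x0]
result = greedy_design([x0, x1], y, costs=[0.5, 0.5],
                       base_cost=1.0, budget=1000.0)
```

In this toy setting the sketch pays for the predictive covariate and declines the noise covariate, because the second one shrinks the affordable sample size without reducing residual variance by much. The stopping rule here simply compares the proxy across greedy steps, so no tuning parameter is involved, in the spirit of the description above.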

The original version of the working paper, posted on 01 April 2016, is available here.

New version:
Pedro Carneiro, Sokbae (Simon) Lee and Daniel Wilhelm, October 2017, Optimal data collection for randomized control trials, cemmap Working Paper, CWP45/17, The IFS
Pedro Carneiro, Sokbae (Simon) Lee and Daniel Wilhelm, March 2017, Optimal data collection for randomized control trials, cemmap Working Paper, CWP15/17, The IFS
