Machine Learning for Treatment Effects and Structural Equation Models

Date & Time

From: 16 May 2016
Until: 17 May 2016




The Institute for Fiscal Studies
7 Ridgmount Street,


HE Delegates: £75
Charity or Government: £200
Other Delegates: £450

NOTE: This course is now fully booked. Please email to request that your name be added to the waiting list.

The course will provide a practical introduction to modern high-dimensional function fitting methods, also known as machine learning (ML) methods, for efficient estimation of and inference on treatment effects and structural parameters in empirical economic models. Every participant will use R, so that the techniques can be internalized immediately and applied in their own academic and industry work. All lectures, except the introductory one, will be accompanied by R code that reproduces the empirical examples and can be run during the lectures themselves. Thus, there will be no gap between theory and practice.

Editions: This is the 6th edition of the course, to be given at CEMMAP UK and Warwick UK. Previous editions were given at NBER in 2014, at MIT in 2015 and 2016 as part of the 14.387 course “Applied Econometrics”, at Sciences Po Paris in 2015, and at the Summer School of Econometrics of the Bank of Italy in Perugia in 2015.


Causal Inference in Approximately Sparse Linear Structural Equation Models.

  • Approximately sparse econometric models as generalizations of conventional econometric models
  • “Double lasso” or “double partialling out” methods for efficient estimation of and inference on causal parameters in these models.
  • Various empirical examples.
  • References: 3, 4.

Understanding of the Inference Strategy via the Double Partialling Out and Adaptivity.

  • Theory: Frisch-Waugh Partialling Out. Adaptivity.
  • Laying a strategy for the use of non-sparse and generic ML methods.
  • R Practicum: Mincer Equations, Barro-Lee, and Acemoglu-Johnson-Robinson examples.
  • References: 3, 4, 6.
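
The Frisch-Waugh partialling-out idea listed above can be illustrated with a minimal sketch in base R. It uses simulated data and ordinary least squares only; the variable names and the data-generating process are assumptions made for illustration and are not taken from the course materials. The point is that the coefficient on the treatment d from the full regression coincides with the coefficient obtained by first residualizing y and d on the controls and then regressing residual on residual.

```r
# Frisch-Waugh partialling out on simulated data (all names are illustrative).
set.seed(1)
n <- 500
x <- matrix(rnorm(n * 3), n, 3)                        # controls
d <- drop(x %*% c(1, -1, 0.5)) + rnorm(n)              # treatment depends on controls
y <- 2 * d + drop(x %*% c(0.5, 0.5, 0.5)) + rnorm(n)   # true effect of d is 2

# Full regression: coefficient on d with the controls included.
alpha_full <- unname(coef(lm(y ~ d + x))["d"])

# Partialling out: residualize y and d on the controls,
# then regress the y-residuals on the d-residuals.
res_y <- resid(lm(y ~ x))
res_d <- resid(lm(d ~ x))
alpha_fwl <- unname(coef(lm(res_y ~ res_d))["res_d"])

print(c(full = alpha_full, partialled_out = alpha_fwl))
```

The two numbers agree up to floating-point error. Roughly speaking, the "double" methods in this course keep the same two-residual structure but replace the two OLS residualizations with lasso or another ML fit.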

ML Methods for Prediction = Reduced Form Estimation. Evaluation of ML Methods using Test Samples.

  • Penalization Regression Methods: Ridge, Lasso, Elastic Nets, etc.
  • Regression Trees, Random Forest, Boosted Trees.
  • Modern Nonlinear Regression via Neural Nets and Deep Learning
  • Aggregation and Cross-Breeding of the ML methods.
  • R Practicum: Simulated, Wage, and Pricing Examples.
  • References: 1, 2, 9-11.

ML Methods for Causal Parameters — “Double” Machine Learning for Causal Parameters in Treatment Effect Models and Nonlinear Econometric Models

  • Using generic ML (beyond Lasso) to estimate coefficients in Partially Linear Models
  • Using generic ML to estimate ATE, ATT, LATE in Heterogeneous Treatment Effect Models
  • Using generic ML methods to estimate structural parameters in Moment Condition problems.
  • R Practicum: 401(k) Example.
  • References: 5, 6, 7, 8.
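
As a rough illustration of the partially linear case listed above, the sketch below estimates the coefficient alpha in y = alpha*d + g(x) + e by residualizing y and d on x with out-of-fold predictions and then regressing residual on residual. Everything here is an assumption made for illustration: the data are simulated, the helper `fit_predict` is hypothetical, and a polynomial regression stands in for a generic ML learner. It is not the course's own code, and the sample-splitting scheme is one common choice rather than the only one.

```r
# Cross-fitted partialling out in a partially linear model (illustrative sketch).
set.seed(2)
n <- 2000
x <- runif(n, -2, 2)
d <- sin(x) + rnorm(n)          # treatment with nonlinear confounding
y <- 1 * d + cos(x) + rnorm(n)  # true alpha is 1

# Hypothetical stand-in for any ML learner: fit on training data,
# return predictions at the test points.
fit_predict <- function(target, x_train, x_test) {
  train <- data.frame(target = target, x = x_train)
  fit <- lm(target ~ poly(x, 5), data = train)
  predict(fit, newdata = data.frame(x = x_test))
}

# Out-of-fold residuals of y and d on x.
k <- 5
fold <- sample(rep(1:k, length.out = n))
res_y <- numeric(n)
res_d <- numeric(n)
for (j in 1:k) {
  test <- fold == j
  res_y[test] <- y[test] - fit_predict(y[!test], x[!test], x[test])
  res_d[test] <- d[test] - fit_predict(d[!test], x[!test], x[test])
}

# Residual-on-residual regression without intercept, with a
# heteroskedasticity-robust standard error.
alpha_hat <- sum(res_d * res_y) / sum(res_d^2)
eps <- res_y - alpha_hat * res_d
se_hat <- sqrt(mean(res_d^2 * eps^2) / mean(res_d^2)^2 / n)

print(c(estimate = alpha_hat, std_error = se_hat))
```

Swapping a random forest, boosted trees, or a neural net into `fit_predict` leaves the rest of the sketch unchanged, which is the sense in which the learner is "generic".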

Scalability: Working with Large Data. MapReduce, Hadoop and all that

  • MapReduce, Sufficient Statistics, Linear Estimators
  • MapReduce and Computation of Nonlinear Estimators via Distributed Gradient Descent
  • MapReduce in R.
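
The link between MapReduce, sufficient statistics, and linear estimators can be sketched in base R without any cluster. In this hypothetical example the data are simulated and the four "workers" are just row chunks. The map step computes X'X and X'y on each chunk, and the reduce step adds them up. Because OLS depends on the data only through these sums, the result matches a single-machine fit.

```r
# OLS via map (per-chunk sufficient statistics) and reduce (summation).
set.seed(3)
n <- 1000
X <- cbind(1, matrix(rnorm(n * 2), n, 2))   # intercept plus two regressors
y <- drop(X %*% c(1, 2, -1)) + rnorm(n)

# Split row indices into 4 hypothetical "workers".
chunks <- split(seq_len(n), rep(1:4, length.out = n))

# Map: each chunk returns its own X'X and X'y.
mapped <- lapply(chunks, function(idx) {
  Xc <- X[idx, , drop = FALSE]
  list(xtx = crossprod(Xc), xty = crossprod(Xc, y[idx]))
})

# Reduce: sufficient statistics are additive across chunks.
xtx <- Reduce(`+`, lapply(mapped, `[[`, "xtx"))
xty <- Reduce(`+`, lapply(mapped, `[[`, "xty"))

beta_mr <- drop(solve(xtx, xty))            # solve the normal equations
beta_ols <- unname(coef(lm(y ~ X - 1)))     # single-machine benchmark

print(rbind(map_reduce = beta_mr, lm = beta_ols))
```

Each chunk only ever ships a small matrix and a short vector, whatever its number of rows, which is what makes the approach attractive for large data.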


Please bring your computer to class. Install R and RStudio. Install the packages “hdm”, “glmnet”, “nnet”, “randomForest”, “rpart”, “rpart.plot”, and “gbm” from CRAN (e.g. type install.packages("gbm")). If you are not familiar with R, try out several of the introductory tutorials that are available online. Please read about and understand the idea of cross-validation (k-fold cross-validation) as a way to prevent overfitting, and the bias-variance tradeoff in nonparametric estimation. I will mention these briefly in class, but I will count on you understanding these background concepts. A good reference is “Elements of Statistical Learning”, which is available from Tibshirani’s website.
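
For readers new to k-fold cross-validation, here is a minimal base R sketch. It is an illustration only: the data are simulated, and polynomial degree is used as the tuning parameter. Each candidate degree is fit on k-1 folds and scored on the held-out fold, and the degree with the lowest average held-out error is selected. This guards against the overfitting that in-sample error would reward.

```r
# 5-fold cross-validation to choose a polynomial degree (illustrative sketch).
set.seed(4)
n <- 300
x <- runif(n, -2, 2)
y <- sin(2 * x) + rnorm(n, sd = 0.3)
dat <- data.frame(x = x, y = y)

k <- 5
fold <- sample(rep(1:k, length.out = n))    # random fold assignment
degrees <- 1:10

cv_mse <- sapply(degrees, function(p) {
  errs <- sapply(1:k, function(j) {
    fit <- lm(y ~ poly(x, p), data = dat[fold != j, ])   # train on k-1 folds
    pred <- predict(fit, newdata = dat[fold == j, ])     # predict held-out fold
    mean((dat$y[fold == j] - pred)^2)
  })
  mean(errs)                                # average held-out error
})

best_degree <- degrees[which.min(cv_mse)]
print(round(cv_mse, 3))
print(best_degree)
```

Low degrees underfit (high bias) and very high degrees chase noise (high variance); the held-out error is what exposes the tradeoff between the two.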


1. The Elements of Statistical Learning, by T. Hastie, R. Tibshirani, and J. Friedman. The book can be downloaded for free!

2. An Introduction to Statistical Learning with Applications in R, by G. James, D. Witten, T. Hastie, and R. Tibshirani. The website has a lot of handy resources.

3. “High-Dimensional Methods and Inference on Treatment and Structural Effects in Economics,” Journal of Economic Perspectives, 2014, Belloni et al. Stata replication code is here. R code implementation is in package “hdm”.

4. “Inference on Treatment Effects After Selection Amongst High-Dimensional Controls (with an Application to Abortion and Crime),” arXiv 2011, The Review of Economic Studies, 2013, Belloni et al. Stata and Matlab programs are here; replication files here. R code implementation in package “hdm”.

5. “Robust Inference on Average Treatment Effects with Possibly More Covariates than Observations,” arXiv 2013, Journal of Econometrics, 2015, M. Farrell.

6. “Post-Selection and Post-Regularization Inference: An Elementary, General Approach,” Annual Review of Economics, 2015, V. Chernozhukov, C. Hansen, and M. Spindler. R code implementation in package “hdm”.

7. “Program Evaluation and Causal Inference with High-Dimensional Data,” arXiv 2013, Econometrica, 2016+, A. Belloni et al. R code implementation in package “hdm”. Replication files via Econometrica website.

8. “Double Machine Learning for Causal and Treatment Effects,” MIT Working Paper, V. Chernozhukov, D. Chetverikov, M. Demirer, E. Duflo, C. Hansen, W. Newey.

9. “Big Data: New Tricks for Econometrics,” Journal of Economic Perspectives, 2014, H. Varian.

10. “Economics in the age of big data,” Science, 2014, L. Einav, J. Levin.

11. “Prediction Policy Problems,” American Economic Review P&P, 2015, J. Kleinberg, J. Ludwig, S. Mullainathan, Z. Obermeyer.