Working Paper

The sorted effects method: discovering heterogeneous effects beyond their averages


Victor Chernozhukov, Ivan Fernandez-Val, Ye Luo

Published Date

21 December 2015


Working Paper (CWP74/15)

The partial (ceteris paribus) effects of interest in nonlinear and interactive linear models are heterogeneous, as they can vary dramatically with the underlying observed or unobserved covariates. Despite the apparent importance of heterogeneity, a common practice in modern empirical work is to largely ignore it by reporting average partial effects (or, at best, average effects for some groups; see, e.g., Angrist and Pischke (2008)). While average effects provide very convenient scalar summaries of typical effects, by definition they fail to reflect the entire variety of the heterogeneous effects. In order to discover these effects much more fully, we propose to estimate and report sorted effects: a collection of estimated partial effects sorted in increasing order and indexed by percentiles. By construction, the sorted effect curves completely represent and help visualize all of the heterogeneous effects in one plot. They are as convenient and easy to report in practice as the conventional average partial effects. We also provide a quantification of uncertainty (standard errors and confidence bands) for the estimated sorted effects. We apply the sorted effects method to demonstrate several striking patterns of gender-based discrimination in wages, and of race-based discrimination in mortgage lending.
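To make the idea concrete, the following is a minimal sketch of how sorted effects might be computed for a binary regressor in a logit model. It is an illustration under stated assumptions, not the authors' implementation: the data are simulated, the helper names (`fit_logit`, `partial_effects`) are hypothetical, and `np.percentile` with its default interpolation stands in for the empirical quantile of the estimated partial effects.

```python
import numpy as np

def logistic(z):
    """Logistic link function."""
    return 1.0 / (1.0 + np.exp(-z))

def fit_logit(X, y, n_iter=25):
    """Fit a logit model by Newton-Raphson (X must include an intercept column)."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = logistic(X @ beta)
        grad = X.T @ (y - p)                         # score of the log-likelihood
        hess = (X * (p * (1.0 - p))[:, None]).T @ X  # negative Hessian
        beta = beta + np.linalg.solve(hess, grad)
    return beta

def partial_effects(X, beta, j):
    """Partial effect of the binary regressor in column j for each observation:
    P(y=1 | d=1, x) - P(y=1 | d=0, x)."""
    X1 = X.copy()
    X1[:, j] = 1.0
    X0 = X.copy()
    X0[:, j] = 0.0
    return logistic(X1 @ beta) - logistic(X0 @ beta)

# Simulated data (an assumption made purely for illustration).
rng = np.random.default_rng(0)
n = 2000
x = rng.normal(size=n)
d = rng.integers(0, 2, size=n).astype(float)
X = np.column_stack([np.ones(n), d, x])
true_beta = np.array([-0.5, 1.0, 1.5])
y = (rng.random(n) < logistic(X @ true_beta)).astype(float)

beta_hat = fit_logit(X, y)
effects = partial_effects(X, beta_hat, j=1)

# Sorted effects: percentiles of the estimated partial effects.
percentiles = np.arange(5, 96, 5)
spe = np.percentile(effects, percentiles)

# The average partial effect collapses the whole curve into one number.
ape = effects.mean()

for u, value in zip(percentiles, spe):
    print(f"percentile {u:2d}: sorted effect = {value:.3f}")
print(f"average partial effect = {ape:.3f}")
```

Plotting `spe` against `percentiles` gives the sorted effect curve; the average partial effect is a single horizontal line on the same plot, which shows how much variation the scalar summary hides.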

Using differential geometry and functional delta methods, we establish that the estimated sorted effects are consistent for the true sorted effects, and derive asymptotic normality and bootstrap approximation results, enabling construction of pointwise confidence bands (pointwise with respect to percentile indices). We also derive functional central limit theorems and bootstrap approximation results, enabling construction of simultaneous confidence bands (simultaneous with respect to percentile indices). These statistical results rely, in turn, on establishing Hadamard differentiability of the multivariate sorting operator, a result of independent mathematical interest.
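The distinction between pointwise and simultaneous bands can be sketched with a generic bootstrap construction. The example below is a hedged illustration, not the procedure derived in the paper: it uses a simulated interactive linear model, a plain nonparametric (resampling) bootstrap, and a standard sup-t critical value for the simultaneous band. The function names `sorted_effects_ols` and `bootstrap_bands` are hypothetical.

```python
import numpy as np

def sorted_effects_ols(d, x, y, percentiles):
    """Fit y ~ 1 + d + x + d*x by OLS and return percentiles of the
    estimated partial effect of d, which equals b_d + b_dx * x."""
    X = np.column_stack([np.ones_like(x), d, x, d * x])
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    effects = b[1] + b[3] * x
    return np.percentile(effects, percentiles)

def bootstrap_bands(d, x, y, percentiles, n_boot=200, alpha=0.10, seed=0):
    """Pointwise and simultaneous (sup-t) bootstrap bands for the sorted effects."""
    rng = np.random.default_rng(seed)
    n = len(y)
    est = sorted_effects_ols(d, x, y, percentiles)
    draws = np.empty((n_boot, len(percentiles)))
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)  # resample observations with replacement
        draws[b] = sorted_effects_ols(d[idx], x[idx], y[idx], percentiles)
    se = draws.std(axis=0, ddof=1)        # bootstrap standard errors
    t_abs = np.abs(draws - est) / se      # bootstrap |t|-statistics per percentile
    # Pointwise: a separate critical value at each percentile index.
    crit_pt = np.quantile(t_abs, 1 - alpha, axis=0)
    # Simultaneous: one critical value from the maximal |t| across indices.
    crit_sim = np.quantile(t_abs.max(axis=1), 1 - alpha)
    return {
        "estimate": est,
        "pointwise": (est - crit_pt * se, est + crit_pt * se),
        "simultaneous": (est - crit_sim * se, est + crit_sim * se),
    }

# Simulated data (an assumption made purely for illustration).
rng = np.random.default_rng(1)
n = 500
x = rng.normal(size=n)
d = rng.integers(0, 2, size=n).astype(float)
y = 1.0 + 0.5 * d + 0.8 * x + 0.7 * d * x + rng.normal(size=n)

percentiles = np.arange(10, 91, 10)
bands = bootstrap_bands(d, x, y, percentiles)
lo_pt, hi_pt = bands["pointwise"]
lo_sim, hi_sim = bands["simultaneous"]

for u, e, a, b_, c, f in zip(percentiles, bands["estimate"], lo_pt, hi_pt, lo_sim, hi_sim):
    print(f"u={u:2d}  est={e:.3f}  pointwise=[{a:.3f}, {b_:.3f}]  simultaneous=[{c:.3f}, {f:.3f}]")
```

Because the simultaneous critical value is taken from the maximum of the bootstrap t-statistics over all percentile indices, the simultaneous band is never narrower than the pointwise band at any index. It is intended to cover the whole sorted effect curve at once rather than each point separately.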