This paper studies the properties of the wild bootstrap-based test proposed in Cameron et al. (2008) for testing hypotheses about the coefficients in a linear regression model with clustered data. Cameron et al. (2008) provide simulations that suggest this test works well even in settings with as few as five clusters, but existing theoretical analyses of its properties all rely on an asymptotic framework in which the number of clusters is “large.” In contrast to these analyses, we employ an asymptotic framework in which the number of clusters is “small,” but the number of observations per cluster is “large.” In this framework, we provide conditions under which an unstudentized version of the test is valid in the sense that it has limiting rejection probability under the null hypothesis that does not exceed the nominal level. Importantly, these conditions require, among other things, certain homogeneity restrictions on the distribution of covariates. In contrast, we establish that a studentized version of the test may only over-reject the null hypothesis by a “small” amount in the sense that it has limiting rejection probability under the null hypothesis that does not exceed the nominal level by more than an amount that decreases exponentially with the number of clusters. We obtain results qualitatively similar to those for the studentized version of the test for closely related “score” bootstrap-based tests, which permit testing hypotheses about parameters in nonlinear models. We illustrate the relevance of our theoretical results for applied work via a simulation study and empirical application.
The Wild Bootstrap with a Small Number of Large Clusters
16 August 2019
Working Paper (CWP40/19)