My PhD dissertation was a critical assessment of the use of randomised programme evaluations to inform policy, with a specific focus on educational interventions and the problem of external validity. Summaries of its main contributions are below. For information on what has been published in journals, see the ‘publications & presentations’ page.

Randomized trials for policy: a review of the external validity of treatment effects

The paper provides a first survey of the literature on external validity, using as a starting point recent debates regarding the use of randomised evaluations to inform policy. Besides synthesising contributions to the programme evaluation literature, I consider definitions of external validity from other sub-disciplines within economics, such as experimental economics and the time-series forecasting literature, as well as from disciplines such as philosophy and medicine. Following Cook and Campbell (1979), I argue that the fundamental challenge arises from interactive functional forms. This somewhat neglected point provides a framework in which to understand how and why extrapolation may fail. In particular, it suggests that replication cannot resolve the external validity problem unless informed by some prior theoretical understanding of the causal relationship of interest. Finally, the problem of interaction can be used to show that the assumptions required for simple external validity are conceptually equivalent to those required for obtaining unbiased estimates of treatment effects using non-experimental methods. This undermines the idea that internal validity needs to be rigorously assessed whereas external validity can be ascertained subjectively. Theory may play a role in aiding extrapolation, but the extent to which this will be possible in practice remains an open question.
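
To illustrate why an interactive functional form matters for extrapolation, consider a minimal linear sketch. The notation is an assumption introduced here for exposition and is not taken from the paper: $T_i$ is a randomly assigned treatment indicator, $W_i$ is a covariate that interacts with treatment, and $P$ and $P'$ denote the experimental population and the population of policy interest.

```latex
\begin{align*}
Y_i &= \alpha + \beta T_i + \gamma W_i + \delta\, T_i W_i + \varepsilon_i,
  \qquad T_i \perp (W_i, \varepsilon_i) \\[4pt]
\mathrm{ATE}_P &= E_P[Y_i \mid T_i = 1] - E_P[Y_i \mid T_i = 0]
  = \beta + \delta\, E_P[W_i] \\[4pt]
\mathrm{ATE}_{P'} - \mathrm{ATE}_P &= \delta \left( E_{P'}[W_i] - E_P[W_i] \right)
\end{align*}
```

Under these assumptions, the experimental estimate carries over to the new population only if there is no interaction ($\delta = 0$) or if the mean of $W_i$ is the same in both populations. Repeating the experiment in further populations does not by itself reveal which variable plays the role of $W_i$, which is one way to read the claim that replication needs to be guided by theory.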

The external validity of class size effects: teacher quality in Project STAR

The external validity of treatment effects is of fundamental importance for policy. The paper explores this issue in the context of experimental evaluations of class size effects on student outcomes. While the existing literature assumes an additively separable educational production function, the way in which class size is hypothesised to affect outcomes more plausibly implies an alternative specification in which the marginal effect of size depends on teacher/class quality. To investigate this possibility, a novel measure of quality is used to estimate possible interaction effects between teacher quality and class size in the Tennessee Project STAR dataset. Results are mixed across grades and subjects but include statistically and economically significant effects that suggest dependence between the class size effect and class quality. It is straightforward to show that interaction effects have implications for external validity. Together these results suggest that the external validity of class size effects will depend on the typically unobserved distribution of teacher quality in the populations of interest.
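
The dependence of the class size effect on the quality distribution can be illustrated with a small simulation. This is a sketch on synthetic data only: the coefficients, variable names, and the two populations are hypothetical choices made for illustration. They are not estimates from Project STAR, and this is not the specification used in the paper.

```python
import numpy as np

# Hypothetical coefficients, chosen for illustration only.
B_SMALL, B_QUALITY, B_INTERACT = 0.20, 0.50, 0.15


def simulate(n, quality_mean, rng):
    """Simulate scores where the small-class effect depends on quality."""
    small = rng.integers(0, 2, size=n)               # random assignment to small class
    quality = rng.normal(quality_mean, 1.0, size=n)  # teacher/class quality
    noise = rng.normal(0.0, 1.0, size=n)
    score = (1.0 + B_SMALL * small + B_QUALITY * quality
             + B_INTERACT * small * quality + noise)
    return small, quality, score


def diff_in_means(small, score):
    """Experimental estimate of the average small-class effect."""
    return score[small == 1].mean() - score[small == 0].mean()


def fit_interaction(small, quality, score):
    """OLS of score on a constant, small, quality, and small x quality."""
    X = np.column_stack([np.ones_like(score), small, quality, small * quality])
    coef, *_ = np.linalg.lstsq(X, score, rcond=None)
    return coef


rng = np.random.default_rng(0)
# Population A hosts the experiment; population B has higher average quality.
small_a, q_a, y_a = simulate(50_000, 0.0, rng)
small_b, q_b, y_b = simulate(50_000, 2.0, rng)

ate_a = diff_in_means(small_a, y_a)
ate_b = diff_in_means(small_b, y_b)

# Extrapolating from A to B requires the interaction coefficient
# and the mean of quality in B, which is typically unobserved.
coef_a = fit_interaction(small_a, q_a, y_a)
predicted_b = coef_a[1] + coef_a[3] * q_b.mean()

print(f"Effect in A: {ate_a:.3f}")
print(f"Effect in B: {ate_b:.3f}")
print(f"Effect in B predicted from A's interaction model: {predicted_b:.3f}")
```

In this setup the simple difference in means from population A understates the effect in population B, while the prediction that uses the interaction term and B's mean quality comes close to it. The catch is that the prediction needs B's quality distribution, which would typically be unobserved in practice.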

Constructing a value-added teacher quality measure using data from a randomized trial

Value-added measures of teacher quality are typically constructed using longitudinal, non-experimental administrative datasets. The primary challenge to identification is the non-random matching of teachers and students. We instead outline an approach to constructing a value-added measure from experimental data in which teachers are observed for only a single time period but students and teachers are randomly assigned to classes of certain types. Specifically, we use the Project STAR data to construct a teacher quality measure that is independent of class size and compare the ranking of teachers obtained in this way to alternative quality measures.