A note on the philosophy literature on external validity

Later this month (August 2019) I’ll be presenting a paper at the 14th conference of the International Network for Economic Method (INEM). The paper is titled, “From ‘data mining’ to ‘machine learning’: the role of randomised trials and the credibility revolution”. An apparent puzzle is that there’s a session on external validity – which was the subject of my economics PhD, a working paper and short publication – in which I’m not presenting. Surely if I am going to be presenting at conferences on the method or philosophy of economics I should be presenting my work on external validity? The short answer is: I already did in 2012. But I think the longer explanation is also worth giving.

First, the paper I will be presenting at INEM (and ENPOSS) builds explicitly on work I’ve done on external validity (henceforth ‘EV’).

Second, and more importantly, my contribution to the philosophy literature on EV was not just presented at the Evidence and Causality in the Sciences (ECitS) 2012 conference but subsequently finalised as a paper in 2012 and revised in 2013. Unfortunately that paper was not published at the time, for reasons that were at best flimsy. Preoccupied with finishing my economics PhD and changing jobs, I delayed resubmitting the manuscript. When I returned to academia in 2016 I discovered that a paper on the subject had been published in Philosophy of Science. More surprising was that, apart from some differences in verbiage and references, the core arguments of that paper appear to be the same as about 30-40% of my own, but with no reference to it or to my work in economics. Then, earlier this year, another paper appeared in the Journal of Economic Methodology. The core arguments of this paper, too, are very similar to the other 30-40% of my paper (dealing with issues like causal process tracing and related matters). In the second instance, my economics work is cited but misunderstood or misrepresented: the citation suggests that my views differ from the author's when in fact, as is clear from the 2012/13 paper, they are almost entirely the same.

Needless to say, this creates a rather awkward situation. Not least because I believe, for reasons I will not ventilate in detail at this point, that it is implausible that the two authors were unaware of, or uninfluenced by, my 2012/13 work. But it is now simply impossible to publish my own work, despite clearly having a claim to intellectual priority. These concerns have been taken up in the relevant fora, but the wheels turn slowly. And it will be informative to test the extent to which academic philosophy is committed to principles of intellectual priority. In the interim it makes for an 'interesting' context for intellectual engagement…

Economics: scientists and plumbers, or bullshit and mathiness?

On the 6th of January 2017 the Annual American Economic Association conference is scheduled to host a plenary address entitled The Economist as Plumber: Large Scale Experiments to Inform the Details of Policy Making. The speaker is the academic economist Esther Duflo, widely acclaimed for popularising the use of randomised control trials (RCTs).

Given my PhD work in economics on external validity of RCTs and implications for policy, and parallel work in philosophy, I have a few thoughts on this subject. In a draft paper (first presented in 2015) entitled When is Economics Bullshit? I argue that practitioners promoting RCTs have systematically overstated the policy-relevance of results and thereby produced ‘bullshit’ (as defined in the famous essay by philosopher Harry Frankfurt).

A consistent problem in critiquing so-called 'randomistas' is that the goalposts have been constantly shifted. Early advocacy for RCTs within economics reflected a 'missionary zeal' (Bardhan). It has been suggested that experimental methods have led to a 'credibility revolution': giving credibility to applied microeconomic work that apparently did not exist before. One recipient of the John Bates Clark medal argued that the introduction of RCTs indisputably rendered economics 'a science'. In the policy domain I, along with other economists, have come across much grander and/or more extreme claims. But when challenged, proselytisers scale back the claims and deny ever overclaiming. So from missionary zeal, revolution and science we now have plumbing…

I look forward to reading Duflo’s speech/paper, but my own view of the methodology and philosophy of economics and RCTs suggests that plumbing is a very poor analogy.

In my own paper, motivated in part by claims that RCTs render economics 'a science', I tackle the question of scientific status head on. Using a revival of the so-called demarcation question (basically: how do we demarcate science from non-science or pseudoscience?) in philosophy, I argue that economics cannot (yet) be classified as a science, may never be classifiable as such, and in the way it is used by some economists too often verges on pseudoscience and/or bullshit.

The similarities between this very critical view and that of Romer’s recent critique of macroeconomics (which was made public later) are interesting. Romer focuses more on the use of mathematical modelling whereas my focus is on empirical methods. I will write a detailed comment on Romer’s piece later this year; I agree with some aspects but strongly disagree with others.

In its two presentations so far, my paper on bullshit has been relatively well-received by philosophers of science but not so well-received by philosophers of economics. There is good reason for this: the paper is even more an indictment of the current trend in philosophy of economics than it is of economics itself. The paper notes that in the absence of sufficient technical training and understanding of economics, philosophers in this area have increasingly taken the safer route of becoming apologists for the discipline. In effect, they compete to provide explanations of why economists are correct in their approach. (Exceptions to this, such as Nancy Cartwright – who has collaborated with Angus Deaton in providing important and influential critiques of RCTs – arguably prove the rule: Cartwright’s reputation was already established in philosophy of physics, causality and metaphysics).

The result, unfortunately, is that philosophy of economics currently has very little to add to economists’ critical understanding of their own discipline. Some critics, such as Skidelsky, argue that economists should read more philosophy, but while I am sympathetic to his overall stance I do not think economists would find much worth reading at present. Combining the abject failure of the ‘mainstream’ of philosophy of economics with the low quality of most economists’ reflections on methodological issues leaves us with few critical insights that could move the discipline beyond parochial or self-interested debates.

Updated reference list on external validity

My review of the external validity literature is slowly working its way through the peer review process. Parts were presented from 2011 onwards, but it was first published as a full working paper here and then updated for the Annual Bank Conference in Development Economics in 2014. A short version of one key contribution from that work has been published here.

Since these pieces, however, the reference list has been expanded in two important ways.

First, I became aware of a number of references and parallel literatures outside of economics that had either been missed in the original review, or published subsequent to the first version. Notably: in biostatistics (Elizabeth Stuart and co-authors), educational statistics (Elizabeth Tipton) and causal graphs (Elias Bareinboim and Judea Pearl).

Second, feedback through peer review questioned the omission of structural contributions to the topic – suggesting that this favoured the ‘design-based’ literature most closely associated with randomised control trials (RCTs). That was certainly not the intention. The rationale of the original review was to focus on the problem of external validity within the theoretical framework used by most RCT studies, in order to clearly delineate structuralist critiques from more fundamental external validity challenges.

I still think that it is absolutely critical to emphasise this distinction. However, there are contributions from the structural literature that propose something of a middle ground. Notably, work by Heckman, Vytlacil and co-authors argues for the merits of using the theoretical framework of Marginal Treatment Effects (MTEs). And one interesting recent empirical contribution (by Amanda Kowalski), which has done so, is forthcoming in the Journal of Economic Perspectives. Given this, I have added a number of references from that literature and expanded the review to cover this middle ground between structural and design-based contributions.

While the paper proceeds through the publication process, I thought it would be useful to post the most recently submitted (May 2016) version of the reference list for those who may be interested. It can be found here.

Some thoughts on Taylor and Watson’s (2015) RCT on the impact of study guides on school-leaving results in South Africa

Since 2010 most of my time spent on academic research has focused on two particular areas:

  1. The use of randomised control trials (RCTs) to support inappropriate, or overly strong, policy claims or recommendations
  2. Empirical examples of how this has manifested in the economics of education.

I was therefore somewhat frustrated when I attended a presentation at the Economic Society of South Africa conference in 2013 and found some rather strong policy claims being made on the basis of very weak evidence (even by the standards of practitioners favouring RCTs). I raised my concerns with the relevant author, but I see that the recently published working paper contains the same problems.

It therefore seems appropriate to summarise my concerns with this work: partly so that interested parties can understand its flaws, but mainly to provide an illustration of how the new fad for RCT-based policy is often oversold.[1] That’s important, because despite seemingly ample evidence I often get economists saying: “Oh but no-one really uses RCT results in that way”.
