
Evidence-Based Policy: A Practical Guide to Doing it Better, Nancy Cartwright and Jeremy Hardie. Oxford University Press, 2013, ix + 196 pages

Published online by Cambridge University Press:  04 March 2014

Naftali Weinberger
Affiliation: University of Wisconsin – Madison, USA

Type: Reviews
Copyright © Cambridge University Press 2014

Policymakers are increasingly privileging randomized control trials (RCTs) as the best evidence for causal claims. In an RCT one randomly assigns subjects into treatment and control groups. If the randomization is successful, then the difference in expected outcomes between the two groups provides an estimate of the average causal effect of the treatment on the outcome for the population in the study. RCTs can only show that a causal relation obtains in the study's population. To extrapolate a causal relationship from the study population to a different population (the target population) further assumptions are required. Specifically, one must assume that within the target population the causal factor is capable of playing a similar role to the one it plays in the study population and that this factor is accompanied by the background conditions required for it to bring about its effect. In Evidence-Based Policy: A Practical Guide to Doing it Better, Nancy Cartwright and Jeremy Hardie provide an extremely accessible guide for how policymakers can use their background knowledge to evaluate whether these assumptions are met in a particular case.
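
The estimation step described above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the function name `simulate_rct`, the outcome distribution, and the true average effect of 2.0 are all hypothetical and do not come from the book. Each simulated subject has their own treatment effect, and the difference in group means recovers the average of those effects for the study population only.

```python
import random
from statistics import mean


def simulate_rct(n=20_000, true_average_effect=2.0, seed=0):
    """Simulate a two-arm RCT and return the difference in mean outcomes."""
    rng = random.Random(seed)
    treated, control = [], []
    for _ in range(n):
        # Hypothetical baseline outcome; it varies from subject to subject.
        baseline = rng.gauss(10.0, 3.0)
        # Hypothetical individual effect; only its average is targeted.
        individual_effect = rng.gauss(true_average_effect, 1.0)
        # Random assignment: a fair coin flip, independent of the subject.
        if rng.random() < 0.5:
            treated.append(baseline + individual_effect)
        else:
            control.append(baseline)
    return mean(treated) - mean(control)


estimate = simulate_rct()
print(f"estimated average causal effect: {estimate:.2f}")
```

Nothing in the simulation says what would happen in a population with a different distribution of baselines or individual effects, which is the gap the rest of this review is concerned with.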

The authors model extrapolative inferences as having the form of a deductive argument, which they call the effectiveness argument (45; see footnote 1). The conclusion of the argument is that a particular factor that had a positive causal effect in the study population will have a positive causal effect in at least some members of the target population. This is a weak conclusion that is compatible with the policy having a net negative effect. Although it is not sufficient to justify implementing a particular policy, it is necessary. Policymakers must establish at least this conclusion before implementing a policy.

The effectiveness argument contains three premises. Premise 1 is that a factor, X, has a positive effect on an outcome in one population. This is what an ideal RCT establishes (see footnote 2). It is a mistake to infer from the first premise that X will have a similar effect in other populations; to make this inference two additional premises are required. Premise 2 is that X can play a similar causal role in the target population. Premise 3 states that the support factors necessary for X playing this role are present in the target population. Support factors for X are other factors required for X to have its effect.

Premises 2 and 3 block two ways that a causal claim can fail to generalize from one population to another. To illustrate, consider the following example. A study in Tamil Nadu established that educating mothers promoted healthier infants. Unfortunately, a similar intervention in Bangladesh failed to improve infant health. Why? The authors suggest that what explains the difference is that in Bangladesh mothers-in-law (rather than mothers) are in charge of distributing the food in the family. Premise 2 does not obtain, since educating mothers does not play the same causal role in Bangladesh as it does in Tamil Nadu. Educating mothers-in-law, in contrast, could play a similar causal role.

In saying that educating mothers-in-law could play a similar causal role, the authors leave open the possibility that it might fail to do so were certain support factors absent. Educating the mothers-in-law might have no impact on infant health if the family lacks an adequate food supply. Causes do not typically work in a vacuum, but rather require other factors to bring about an effect. Cartwright and Hardie borrow J. L. Mackie's terminology, on which causes are INUS conditions. An INUS condition is an Insufficient but Necessary part of an Unnecessary but Sufficient condition for an effect. In other words, when X is an INUS condition for Y, Y obtains if and only if BX ∨ Z is true, where BX (the conjunction of B and X) is a minimal sufficient condition for Y and Z is a disjunction of other minimal sufficient conditions for Y. Within this framework, one can easily see that B is a support factor for X, since only in conjunction with B does X bring about Y. The authors, like those concerned to identify causes, pick out one factor (X) as the cause, but there is no non-pragmatic distinction between causes and support factors. When X's support factors are not present, premise 3 does not obtain and the policy will not have its intended effect.
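
The four parts of the INUS definition can be checked mechanically against the formula "Y if and only if BX ∨ Z". The following is a minimal sketch that treats B, X and Z as simple true/false conditions; the function name `outcome` and the variable names are illustrative and not taken from the book.

```python
from itertools import product


def outcome(b: bool, x: bool, z: bool) -> bool:
    """Y obtains if and only if (B and X) or Z."""
    return (b and x) or z


# Insufficient: X on its own, without B or Z, does not bring about Y.
x_insufficient = not outcome(b=False, x=True, z=False)
# Necessary part: remove X from the conjunction BX and B alone does not suffice.
x_necessary_part = not outcome(b=True, x=False, z=False)
# Sufficient: whenever B and X both hold, Y holds whatever the value of Z.
bx_sufficient = all(outcome(True, True, z) for z in (False, True))
# Unnecessary: Y can still obtain through Z when BX is false.
bx_unnecessary = any(
    outcome(b, x, True) for b, x in product((False, True), repeat=2) if not (b and x)
)

# Full truth table for inspection.
for b, x, z in product((False, True), repeat=3):
    print(f"B={b!s:5} X={x!s:5} Z={z!s:5} -> Y={outcome(b, x, z)}")
```

The second check is the one that matters for premise 3: the roles of B and X are symmetric in the formula, so calling X "the cause" and B "the support factor" is a matter of emphasis.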

Although premises 2 and 3 are intuitively distinct, one must refer to what the authors call causal principles to make this distinction precise. Here is the causal principle for Tamil Nadu (see footnote 3):

\begin{equation*} ({\rm TN})\,{I} = {a}_1 + {a}_2 {I}_0 + {a}_3 {B}_{m} {E}_{m} + {a}_4 {Z} \end{equation*}

The lowercase ‘a’s are coefficients and the uppercase letters are random variables – I refers to infant health, I0 is infant health at an earlier time, Em is education of the mother, Bm are the support factors for Em, and Z represents all other causes of I that do not interact with BmEm. The equation represents how infant health would change if one were to intervene on one of the right-hand-side variables while holding the others constant. The difference between a failure of premise 3 and a failure of premise 2 is as follows. Premise 3 is false if the value of Bm differs in the two populations. Premise 2 is false if the variable Em does not appear in the causal principle for one of the populations. According to the authors, the educational intervention failed in Bangladesh because premise 2 was false. Bangladesh has the following causal principle:

\begin{equation*} ({\rm BD})\,{I} = {a}_1 + {a}_2 {I}_0 + {a}_3 {B}_{{ml}} {E}_{{ml}} + {a}_4 {Z} \end{equation*}

Eml refers to the education of the mother-in-law. Since (BD) does not contain a variable for Em, premise 2 does not obtain.
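
The two kinds of failure can be made concrete by treating (TN) and (BD) as functions and asking what a unit increase in a variable does to I while everything else is held fixed. This is a rough sketch under stated assumptions: the coefficient values, the baseline values, and the helper `effect_of` are hypothetical, and the same coefficients are reused in both principles purely for simplicity.

```python
# Hypothetical coefficients a1..a4, shared by both principles for simplicity.
A1, A2, A3, A4 = 0.0, 1.0, 0.5, 1.0


def principle_tn(v):
    """(TN): I = a1 + a2*I0 + a3*Bm*Em + a4*Z."""
    return A1 + A2 * v["I0"] + A3 * v["Bm"] * v["Em"] + A4 * v["Z"]


def principle_bd(v):
    """(BD): I = a1 + a2*I0 + a3*Bml*Eml + a4*Z. Em does not appear."""
    return A1 + A2 * v["I0"] + A3 * v["Bml"] * v["Eml"] + A4 * v["Z"]


def effect_of(principle, variable, values):
    """Change in I from raising one variable by a unit, holding the rest fixed."""
    raised = dict(values, **{variable: values[variable] + 1.0})
    return principle(raised) - principle(values)


# Hypothetical baseline values for every variable.
base = {"I0": 1.0, "Z": 0.0, "Bm": 1.0, "Em": 0.0, "Bml": 1.0, "Eml": 0.0}

# Premise 3 at work: same principle, different value of the support factor.
tn_supported = effect_of(principle_tn, "Em", base)
tn_unsupported = effect_of(principle_tn, "Em", dict(base, Bm=0.0))

# Premise 2 at work: Em is not a variable in (BD), so changing it does nothing.
bd_mother = effect_of(principle_bd, "Em", base)
bd_mother_in_law = effect_of(principle_bd, "Eml", base)

print(tn_supported, tn_unsupported, bd_mother, bd_mother_in_law)
```

In both failure cases the computed effect of educating the mother is zero, so the two failures cannot be told apart by the outcome alone. They differ only in how the principle is written.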

But what determines whether Em appears in Bangladesh's causal principle? Suppose that instead of treating (BD) as Bangladesh's causal principle, we used the following causal principle, which applies to both populations:

\begin{equation*} ({\rm C})\,{I} = {a}_1 + {a}_2 {I}_0 + {a}_3 {B}_{{m}} {E}_{m} + {a}_4 {B}_{{ml}} {E}_{{ml}} + {a}_5 {Z} \end{equation*}

(C) contains both Em and Eml, so premise 2 is satisfied. Since the values of the support factors can differ between the populations, the effects of Em and Eml can differ as well (as, in fact, they do). If one represents Bangladesh using (BD), premise 2 does not obtain, but if one represents it as (C), it does. Absent some reason for choosing (BD) over (C), whether premise 2 obtains will be objectionably language dependent.
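
To see how (C) can cover both populations, consider an illustrative extreme case. The zero values below are assumptions made for the sake of the example, not figures from the book. Suppose the support factors for the mother-in-law's education are absent in Tamil Nadu, so that B_ml = 0 there. Then (C) becomes:

\begin{equation*} I = a_1 + a_2 I_0 + a_3 B_{m} E_{m} + a_5 Z \end{equation*}

Suppose instead that the support factors for the mother's education are absent in Bangladesh, so that B_m = 0 there. Then (C) becomes:

\begin{equation*} I = a_1 + a_2 I_0 + a_4 B_{ml} E_{ml} + a_5 Z \end{equation*}

Up to a relabelling of the adjustable coefficients, these have the forms of (TN) and (BD) respectively. Under (C), the effect on I of a unit change in Em is a_3 B_m, which vanishes wherever B_m = 0 even though Em still appears in the principle.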

One reason to prefer (BD) to (C) is that if one models the difference between the populations with (C), one misses the fact that the policy's success depends not on which particular member of a family one educates, but rather on whether one educates the person with power over the family's food distribution. At one point the authors suggest that for each population, the relevant causal principle should look as follows:

\begin{equation*} ({\rm P})\,{I} = {a}_1 + {a}_2 {I}_0 + {a}_3 {B}_{{pw}} {E}_{{pw}} + {a}_4 {Z} \end{equation*}

The subscript pw means “person with the power”. I’d like to suggest that instead of considering (P) as an alternative to the distinct principles for each population ((TN) and (BD)), we should rather think of it as an alternative to (C). Like (C), (P) applies to both populations, but only (P) captures the common causal role played by the variables Em and Eml in (C).

Cartwright and Hardie talk as if one can determine whether premise 2 obtains by considering whether a factor appears in a population's causal principle, but populations do not wear causal principles on their sleeves. A population can have one causal principle relative to one set of measured variables, and a different principle relative to another set. If the treatment variable is ‘mother's education’ one causal principle applies, if it is ‘education of the person in power’ another does. The insight behind premise 2 is that choosing one variable set over another can aid extrapolation. This insight has been neglected in the literature on causation. In order to make this point, however, one needs to separate the cases in which one compares two populations using a single model from those in which one compares two ways of modelling the same population. Premise 3 concerns the way that two populations could differ relative to a single way of specifying the variables. Premise 2 concerns the question of whether the factor under consideration would be a variable in the optimal model (see footnote 4).

Cartwright and Hardie suggest that a policymaker should perform two searches – a horizontal search and a vertical search – prior to implementing a policy. These searches correspond to premises 3 and 2, respectively (see footnote 5). In a horizontal search, one considers whether the support factors in the study population obtain in the target population as well. In a vertical search, one thinks about whether one has described the cause at the right level of description.

How useful are these searches for determining whether a policy will succeed? Cartwright and Hardie describe an intervention to improve reading scores by means of reducing class size that was successful in Tennessee, but failed in California. A horizontal search would have revealed that California was missing support factors that were present in Tennessee. Specifically, unlike Tennessee, California had a shortage of both teachers and classroom space. In cases like this, where one knows some of the necessary conditions for a policy to work, horizontal searches are clearly useful. In situations where both populations have the conditions necessary for bringing about the effect, horizontal searches are less useful. Would it have been worth performing the intervention had California had enough teachers to implement it, but fewer teachers-per-student than in Tennessee? All else being equal, this would reduce the efficacy of the intervention, but all else is never equal. Perhaps the teachers in California are better on average and this compensates for the negative effects of the higher student-to-teacher ratio. Alternatively, maybe good teachers can only do so much if the classes are too big. Knowing what the support factors are is insufficient for determining how varying these factors changes the effect. For this reason, horizontal searches are better suited for ruling out policies in which support factors are absent than for justifying policies when they are present.

We’ve already seen an example of a vertical search in the Tamil Nadu case. The principle ‘educate the person in power’ extrapolates to Bangladesh; ‘educate the mother’ does not. The level of abstraction at which we describe a causal factor is important. How can we translate this insight into practical advice? By abstracting away from the properties of a population we end up with claims that apply to a wider range of populations, but not all ways of abstracting work equally well. In the Tamil Nadu case, switching from ‘educate the mother’ to the more general ‘educate the person in power’ worked, but why should we abstract to this general principle? Why not ‘educate the person who supervises the child’ (supposing that mothers play this role in Tamil Nadu)? This principle is as abstract as the one they suggest and it yields different advice for applying the lessons from Tamil Nadu to Bangladesh. How can one know which principle to adopt by looking just at Tamil Nadu? Without some guidance regarding which ways of abstracting are preferable, vertical searches do not yield a verdict on whether a causal relation extrapolates to the target population. Cartwright and Hardie identify this need, but they do not provide much guidance concerning how to satisfy it.

In horizontal and vertical searches a policymaker relies on her background knowledge in considering whether a policy will work. The authors say little about how to determine if one has reliable background knowledge in the first place. Consider the case Cartwright and Hardie discuss of a nurse who is able to quickly detect whether an infant has a certain disease (131–2). Since this disease is treatable only if it is detected early, the hospital would like to teach the nurse's skill to other nurses. Through careful deliberation, the nurse discovered that she detects the disease through monitoring whether the infant changes colour, shows heightened activity, and has reduced appetite. Assuming that the nurse is correct about how she makes her diagnoses, it will be possible to teach the other nurses how to make similar diagnoses by looking for these changes. In this case, the nurse was in fact correct, and the hospital was able to teach other nurses to make better predictions. Yet, even though the nurse's judgement was reliable, there is little reason to think that people's causal judgements are generally reliable, especially when one is implementing a complicated policy. This is why we need RCTs in the first place. It would therefore be unsatisfactory if extrapolation relied entirely on causal intuitions.

Fortunately, the nurse's hypothesis about how she makes correct predictions is testable. Consider the following model for the case (Figure 1):

Figure 1.

This model represents the possible causal relations between the variables. It includes three measured variables on the path from the disease to the diagnosis. These measured variables are called mediators. The arrow going directly from the disease to the diagnosis represents all the causal paths between the treatment and the outcome (here, the disease and the diagnosis) that do not go through the measured mediators. Using causal mediation techniques, one can determine how much each path contributes to the total effect. Doing so requires more complicated experimental designs than standard RCTs (Imai et al. 2011). Initially, one might think that one could measure the influence along a path going through a mediator by randomizing the mediator. The reason this does not work is that when one randomizes the mediator, one severs the causal connection from the treatment to the mediator. Randomizing the mediator enables one to estimate the effect of the mediator on the outcome, but this is not the quantity one wants to estimate in causal mediation. The desired quantity is the causal contribution of the path going from the treatment to the mediator to the outcome, but randomization disrupts this path. Despite this complication that arises in measuring the relative contributions of the different paths, they are in principle measurable (Pearl 2012) and social scientists have developed preliminary experimental designs for measuring them (Imai et al. 2013). The nurse's hypothesis about how she makes her predictions can be verified by measuring the contributions of the paths going through the mediators.
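
The difference between a path's contribution and the effect of a randomized mediator is easiest to see in a toy model. The sketch below assumes a linear model with no interactions, in which a path's contribution is the product of the coefficients along it. All coefficient values and function names are hypothetical, and the diagnosis is treated as a numeric score. In practice these coefficients are unknown and must be estimated, and the decomposition need not be this simple in nonlinear models.

```python
# Hypothetical path coefficients for a linear model with no interactions.
DISEASE_TO_MEDIATOR = {"colour": 0.75, "activity": 0.5, "appetite": 0.25}
MEDIATOR_TO_DIAGNOSIS = {"colour": 0.5, "activity": 0.25, "appetite": 0.5}
DIRECT = 0.125  # all paths that bypass the measured mediators


def mediators(disease):
    """Mediator values that the disease status would naturally produce."""
    return {name: a * disease for name, a in DISEASE_TO_MEDIATOR.items()}


def diagnosis(disease, mediator_values):
    """Diagnosis score as a linear function of disease and mediators."""
    mediated = sum(MEDIATOR_TO_DIAGNOSIS[m] * v for m, v in mediator_values.items())
    return DIRECT * disease + mediated


def total_effect():
    """Effect of the disease on the diagnosis through every path."""
    return diagnosis(1, mediators(1)) - diagnosis(0, mediators(0))


def path_contribution(name):
    """Let one mediator respond to the disease while all else stays disease-free."""
    shifted = mediators(0)
    shifted[name] = mediators(1)[name]
    return diagnosis(0, shifted) - diagnosis(0, mediators(0))


def randomized_mediator_effect(name):
    """Set one mediator to 1 versus 0 by fiat, ignoring the disease entirely."""
    low = mediators(0)
    high = dict(low, **{name: 1.0})
    return diagnosis(0, high) - diagnosis(0, low)


for name in DISEASE_TO_MEDIATOR:
    print(name, path_contribution(name), randomized_mediator_effect(name))
print("total", total_effect())
```

Setting the mediator by fiat recovers only the mediator-to-diagnosis coefficient. It says nothing about how strongly the disease moves the mediator, which is the other half of the path's contribution.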

Causal mediation techniques aid in extrapolation, since testing a hypothesis about the way a cause operates in the study population often enables one to predict whether it will work in other populations. If the nurse's predictions were largely based on infant colour, then other people capable of detecting these colour changes would probably make similarly good predictions. A central thesis of Evidence-Based Policy is that knowing how a cause works (which requires more than knowing the support factors and the causally relevant description) is essential to knowing whether it will generalize. But the authors say little about how we can learn what we need to know. Causal mediation techniques help answer this question.

Cartwright and Hardie intend their book as a practical guide for doing evidence-based policy better and succeed in their intention. They encourage policymakers to ask a broader set of questions than merely whether the policy has been shown to work somewhere. Without considering these additional questions, policymakers have little basis for thinking that a policy that worked elsewhere will also work in their particular situation. Through horizontal and vertical searches, policymakers can use their background knowledge to avoid investing resources in projects that are unlikely to succeed. After doing these searches, one still needs to determine which projects are likely to succeed. I have here suggested that causal mediation techniques are one way to enhance extrapolation. Whether causal mediation is the most fruitful path remains to be seen (see footnote 6).

Footnotes

1 The effectiveness argument contains a set of assumptions that, if true, would justify an extrapolation. In addition to presenting these assumptions, the authors also give an account of evidence for when an extrapolation is justified. According to this account (19), any evidence e for a premise in the effectiveness argument is also evidence for the conclusion of the argument. This account is untenable, since evidential relevance is not, in general, transitive; just because e is evidence for a premise that is evidence (relative to an argument) for a conclusion does not entail that e is evidence for that conclusion (Hesse 1970). That a card is red is evidence that it is the queen of hearts, which entails that it is a queen. But that a card is red is not evidence that it is a queen. Fortunately, none of their claims about extrapolation depends on this theory of evidence.
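
The card example in footnote 1 can be checked by counting. The sketch below assumes a standard 52-card deck and reads "e is evidence for h" as "e raises the probability of h". That reading is an assumption for illustration, and the helper names are hypothetical.

```python
from fractions import Fraction
from itertools import product

RANKS = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
SUITS = ["hearts", "diamonds", "clubs", "spades"]
DECK = list(product(RANKS, SUITS))


def is_red(card):
    return card[1] in ("hearts", "diamonds")


def is_queen(card):
    return card[0] == "Q"


def is_queen_of_hearts(card):
    return card == ("Q", "hearts")


def prob(event, given=None):
    """Exact probability of an event for a card drawn uniformly at random."""
    pool = [c for c in DECK if given is None or given(c)]
    return Fraction(sum(1 for c in pool if event(c)), len(pool))


# Learning that the card is red doubles the probability of the queen of hearts.
print(prob(is_queen_of_hearts), prob(is_queen_of_hearts, given=is_red))
# It leaves the probability of a queen unchanged.
print(prob(is_queen), prob(is_queen, given=is_red))
```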

2 Some have criticized RCTs on the grounds that we have no assurance that the populations will be even approximately balanced in studies with small samples (Worrall 2007); see Reiss (2013, chapter 11) for discussion. Cartwright and Hardie purposely put this issue to the side. They assume that RCTs are valid for the test population and ask whether their results can be generalized to other populations.

3 I have altered the notation of the causal principles in several ways to improve clarity. All of the coefficients are adjustable parameters, so a1 in one principle need not have the same value as a1 in another.

4 The question of which model is optimal is related to that of why models with fewer adjustable parameters are preferable to those with more, cf. Forster (2007), Forster and Sober (1994), Whewell (1840).

5 The authors do not explicitly note the correspondence between premise 3 and a horizontal search and between premise 2 and a vertical search.

6 I would like to thank Nancy Cartwright, Dan Hausman, Fabienne Peter and Reuben Stern for helpful feedback.

References

Forster, M. R. 2007. A philosopher's guide to empirical success. Philosophy of Science 74: 588–600.
Forster, M. and Sober, E. 1994. How to tell when simpler, more unified, or less ad hoc theories will provide more accurate predictions. British Journal for the Philosophy of Science 45: 1–35.
Hesse, M. 1970. Theories and the transitivity of confirmation. Philosophy of Science 37: 50–63.
Imai, K., Keele, L., Tingley, D. and Yamamoto, T. 2011. Unpacking the black box of causality: learning about causal mechanisms from experimental and observational studies. American Political Science Review 105: 765–789.
Imai, K., Tingley, D. and Yamamoto, T. 2013. Experimental designs for identifying causal mechanisms. (With discussions.) Journal of the Royal Statistical Society, Series A (Statistics in Society) 176: 5–51.
Pearl, J. 2012. The mediation formula: a guide to the assessment of causal pathways in nonlinear models. In Causality: Statistical Perspectives and Applications, ed. Berzuini, C., Dawid, P. and Bernardinelli, L. Chichester: John Wiley & Sons.
Reiss, J. 2013. The Philosophy of Economics. New York, NY: Routledge.
Whewell, W. 1840. The Philosophy of the Inductive Sciences: Founded upon their History (Vol. 1). London: John W. Parker.
Worrall, J. 2007. Evidence in medicine and evidence-based medicine. Philosophy Compass 2: 981–1022.