I. Introduction
The Sanitary and Phytosanitary (SPS) Agreement has attracted academic interest and political controversy since its conclusion as part of the World Trade Organization's (WTO) Uruguay Round agreements in 1995.Footnote 1 The Agreement contains novel, science-based requirements that function as the principal mechanism for distinguishing genuine health, environmental and quarantine risk regulatory measures from other trade measures, presumed to be motivated by protectionism.Footnote 2 If a dispute arises over a Member's measures, it may be taken before the WTO dispute settlement body where legal decision-makers (panels and a standing Appellate Body) evaluate questions such as the ‘sufficiency’ of the scientific evidence underlying a measure and its relationship to a science-based risk assessment. These provisions may require international judicial bodies to undertake a searching review of the scientific underpinnings of SPS risk regulation, which could place significant constraints on the scope of domestic regulatory autonomy in this field.
While many found the first SPS dispute brought under the Agreement—that of Hormones in 1998—reassuring given the efforts made by the Appellate Body to articulate a flexible notion of risk assessment applicable under the SPS Agreement,Footnote 3 later decisions evolved in a direction that gave a large role to science (and experts advising panels)Footnote 4 in evaluating the international legal legitimacy of SPS measures. In November 2008, the WTO Appellate Body issued its decision in a further round of the Hormones dispute (this time examining whether amended European measures complied with the earlier Appellate Body rulings, warranting the suspension of trade sanctions).Footnote 5 The Continued Suspension decision articulated new requirements around the so-called ‘standard of review’ to be applied by panels in evaluating the scientific basis and adequacy of a Member's risk assessment. The standard of review governs the extent of the investigative authority enjoyed by WTO dispute settlement bodies when it comes to examining disputed facts, including scientific evidence. In the wake of the Continued Suspension case, some commentators saw this decision as an indication in WTO dispute settlement of (a return to) greater deference to the scientific judgment of domestic authorities in SPS risk regulation.Footnote 6
Two years later in November 2010 the Appellate Body issued its first ruling since Continued Suspension in the case of Australia–Apples.Footnote 7 The Australian appeal squarely raised the question of whether the panel applied the correct standard of review in evaluating the relevant risk assessment. The Appellate Body's interpretation of its findings in Continued Suspension suggests that despite the articulation of a new standard, its application may still result in detailed scrutiny of Members' risk assessment findings, at least in the context of SPS disputes raising quarantine risk issues.
What the Australia–Apples rulings bring to the fore are problems with treating all SPS risk situations alike as regards the stringency of the review standard applied. The SPS Agreement covers a diverse range of human health, food safety and quarantine risks,Footnote 8 and potentially also extends to some environmental risks.Footnote 9 As not all SPS risk situations are alike, this article argues there is a need to differentiate between the stringency of the standard of review applicable in different risk situations. Frameworks developed in the social scientific literature for understanding the limitations of conventional science-based risk assessment can be adapted to supply key parameters for differentiating risk situations on the basis of the extent of scientific uncertainty and the level of socio-political contestation with respect to risks.Footnote 10 This approach could provide a principled and more transparent method for the use of science in evaluating WTO Members’ risk assessments and discerning legitimate SPS measures. In turn, practice in the WTO SPS context may inform broader questions regarding the use of science and expert evidence in international dispute settlement involving environmental risks, questions which are receiving increased attention.Footnote 11
Part II of the article highlights the multifaceted role of the standard of review in SPS disputes and discusses how questions over the appropriate stringency of review have been approached, both in the setting of the SPS Agreement and its jurisprudence, and in comparative contexts such as the judicial review of risk regulation in the United States (US) and European Union (EU). Part III discusses the deficiencies of a stringent science-based review of risk regulatory measures in dealing with the diversity of possible risk situations covered by the SPS Agreement and draws on established frameworks in the social scientific literature to articulate other important parameters for application of the standard of review. Parts IV and V then examine the clarifications of the standard of review offered in the WTO Appellate Body's Continued Suspension and Australia–Apples decisions, analysing whether these cases allow scope for a standard of review differentiated according to the risk situation under consideration. Finally, Part VI of the article turns to the question of how a differentiated standard of review, looking to parameters of scientific uncertainty and socio-political contestation regarding risks, might operate in the practice of WTO SPS dispute settlement, with potentially broader significance for cases concerning environmental risk measures arising in other areas of international law.
II. Science and the Standard of Review
A. Role of the Standard of Review
The standard of review applied by WTO panels in disputes under the SPS Agreement—while seemingly a technical question—is in practice intimately tied up with the extent to which the SPS Agreement limits the risk regulatory autonomy of WTO Members in this field.Footnote 12 The SPS Agreement contains the concept of a Member's ‘appropriate level of protection’ (ALOP), otherwise referred to as the notion of ‘acceptable risk’, which is defined as the level of protection deemed appropriate by a WTO Member establishing an SPS measure to protect human, animal, or plant life or health within its territory.Footnote 13 The WTO Appellate Body has, on a number of occasions, stressed that it is the prerogative of Members (and not the WTO) to set ALOPs,Footnote 14 an approach consistent with conventional notions of risk regulation that designate decisions about the levels of risk a society is prepared to accept as a socio-political (and not a scientific) matter.Footnote 15
Nonetheless, in adopting SPS measures designed to ensure an ALOP is achieved, WTO Members must comply with the scientific and risk assessment provisions of the SPS Agreement. In particular, a WTO Member—if not basing its measures on international standards,Footnote 16 or where no such international standards exist—shall ensure that its SPS measure is ‘applied only to the extent necessary to protect human, animal or plant life or health, is based on scientific principles and is not maintained without sufficient scientific evidence’.Footnote 17 It must also ensure that the SPS measure is ‘based on’ a risk assessment ‘as appropriate to the circumstances’Footnote 18 that takes into account specified factors, including scientific evidence.Footnote 19
In past case law the Appellate Body has indicated that there is a close relationship between the scientific evidence and risk assessment requirements of the SPS Agreement, and that the relevant provisions must be ‘constantly’ read together.Footnote 20 Moreover, scientific evidence will only be considered insufficient (which under the SPS Agreement is a justification for the adoption of provisional, precautionary measures)Footnote 21 when a WTO Member is unable to perform ‘an adequate assessment of risks’ as required by the Agreement on the basis of that evidence.Footnote 22
In this context, the standard of review applied in WTO dispute settlement in evaluating a contested SPS measure plays a critical role in mediating between the prerogative of WTO Members to set their own acceptable risk levels and compliance with the requirements set down for legitimate measures in the SPS Agreement. A very lenient application of the standard of review would allow WTO Members significant regulatory autonomy in establishing SPS risk measures but could also forgo the expected benefits of the SPS Agreement that come from tying the adoption of such measures to certain standards of ‘rational’ decision-making.Footnote 23 On the other hand, a very intrusive application of the standard of review might not only make WTO decision-makers the judges of the adequacy of the underlying science but also place significant constraints on the kinds of SPS measures that Members can adopt, undermining their choice of an acceptable level of risk.
1. Role of experts versus ‘other’ risk perspectives in SPS review
The stringency of the standard of review adopted by WTO decision-makers in SPS disputes is also highly influential in determining the types of information and perspectives on risk that will be considered (or privileged) in SPS review. For instance, the standard of review adopted is an important factor in determining the extent of the role played by scientific experts in SPS dispute settlement.Footnote 24 Given the technical nature of many SPS disputes, and the lack of scientific expertise possessed by panels, the appointment of experts to advise panels has been a constant feature of the SPS jurisprudence.Footnote 25 To the extent that panels adopt a more intrusive standard of review in examining the facts of a dispute, their dependence on advising experts is likely to increase in order to permit them to understand and evaluate the scientific arguments put forward by WTO Members. In turn this will tend to amplify the importance of the panel's decisions about how experts are selected and consulted and about the nature of the questions put to them, as well as issues around the diversity of experts consulted in a dispute, their level of independence and impartiality, and the use made of their expertise in panel decision-making.
In the area of international environmental dispute settlement more broadly, similar questions are attracting increasing interest given the equally technical nature of many environmental disputes and hence the difficulties international judicial tribunals face in evaluating complex factual information.Footnote 26 In the Pulp Mills decision of the International Court of Justice, the Court commented on the treatment to be afforded to expert evidence and the methodology to be applied by the Court in considering the ‘vast amount’ of scientific and technical information before it.Footnote 27 A majority of the Court was of the view that ‘despite the volume and complexity of the factual information submitted to it, it is the responsibility of the Court, after having given careful consideration to all the evidence placed before it by the Parties, to determine which facts must be considered relevant, to assess their probative value, and to draw conclusions from them as appropriate’;Footnote 28 an approach that bears many similarities to the SPS ‘objective assessment’ standard. However, dissenting Judges Al-Khasawneh and Simma criticized the majority for clinging ‘to the habits it has traditionally followed for the assessment and evaluation of evidence’, which might only serve to ‘increase doubts in the international legal community whether [the Court], as an institution, is well-placed to tackle complex scientific questions’.Footnote 29 They argued that the Court should have availed itself of powers to have recourse to outside sources of expertise in handling the complex scientific and technical dispute before it.Footnote 30 Significantly, they pointed to the WTO dispute settlement system practice of consulting and questioning experts in SPS disputes as an example, in their view, of international ‘best practice’.
Robust procedures relating to the presentation of expert evidence are all the more important in the light of social scientific research indicating that experts bring to the exercise of evaluation their own implicit understandings and ‘framings’ of the SPS or environmental risks at issue,Footnote 31 which may not always accord with the risk framings adopted by the WTO Member or state whose measures are in dispute.Footnote 32 An intrusive panel review of a WTO Member's risk assessment, drawing heavily on expert views (especially if these views are accepted uncritically), thus risks substituting the risk framings considered important by advising experts for those deemed important by the WTO Member in carrying out the risk assessment.
A similar point can be made in relation to the scope for the inclusion of other, non-scientific, viewpoints in a WTO Member's risk assessment exercise, such as the extent of public concern over a particular risk. The WTO Appellate Body in Hormones indicated that SPS risk assessment is not confined to an evaluation of ‘risk ascertainable in a science laboratory operating under strictly controlled conditions’ but also extends to ‘risk in human societies as they actually exist, in other words, the actual potential for adverse effects on human health in the real world where people live and work and die’.Footnote 33 This broader approach to risk assessment appears to take account of perspectives on risk offered by social scientists in the ‘constructivist’ school, who see value judgments and social processes as inherent to the process of risk assessment.Footnote 34 It also accords more closely with notions of environmental impact assessment in international law that generally allow for a range of information inputs beyond scientific views, including public comment.Footnote 35
One way to take account of non-scientific inputs, including public opinion, in the review of SPS measures is via adjusting the standard of review to place less emphasis on the need for a strong correlation between the measures and the underlying scientific evidence.Footnote 36 In this sense, the standard of review can become a key mechanism under the SPS Agreement allowing for a diversity of approaches to understanding and evaluating SPS risks. Conversely, application of a more stringent standard of review, which takes a close look at the scientific justification for a WTO Member's measures, is likely to foreclose possibilities for the acceptance of non-scientific inputs and public opinion as elements relevant to the review of risk assessment under WTO SPS rules.
2. Interpretation of the standard of review
The text of the SPS Agreement provides no ready solution to the question of the appropriate stringency of review to be applied by panels and the WTO Appellate Body in SPS disputes. The term ‘standard of review’ is not found in the SPS Agreement and no other specific method of evaluation of evidence is required for disputes under the Agreement.Footnote 37 In Hormones, the European Communities sought to argue that the appropriate standard was one of ‘deferential reasonableness’, limiting the panel to assessing the propriety and objectivity of the Communities’ risk assessment process. The Appellate Body, however, indicated that in understanding the standard of review applicable in SPS cases the correct starting point was Article 11 of the Dispute Settlement Understanding (DSU), applicable across all domains of WTO dispute settlement.Footnote 38 The key part of Article 11 provides that in a dispute under one of the WTO Agreements ‘a panel should make an objective assessment of the matter before it, including an objective assessment of the facts of the case and the applicability of and conformity with the relevant covered agreements’.
The only other clarifications of the ‘objective assessment’ standard offered by the WTO Appellate Body in Hormones went to the question of what is not permitted under this standard of review.Footnote 39 The Appellate Body held that the standard is neither that of de novo review nor total deference.Footnote 40 By de novo review, the Appellate Body contemplated a situation where the panel reaches its own assessment of the risks, based on the scientific material put forward by the parties and the panel's advising experts.Footnote 41 On the other hand, ‘total deference’ connoted a situation where the panel simply adopts the same evaluation of the scientific evidence as the Member concerned.Footnote 42 The WTO Appellate Body's judgment indicated that the applicable standard of review falls somewhere between these two extremes but did not offer any further specification.
B. Comparative Experience with Judicial Review of Risk Regulation
As demonstrated by the experience of judicial review of risk regulation from other comparative jurisdictions, such as jurisprudence at the federal level in the US and at the supranational level in the EU, locating the ideal stringency of review of science-based risk regulation on the spectrum between de novo review and total deference is not an easy task.
In the US, where risk regulations introduced by federal executive agencies have been challenged as insufficiently science-based by regulated industries, the review standard employed has been based, since the 1970s, on the ‘hard look’ doctrine.Footnote 43 In recognition of the difficulties presented by the review of technical decision-making, this doctrine professes a policy of judicial restraint with respect to evaluation of the science underlying a regulatory measure, but with a ‘hard look’ at the policy choices made on the basis of that evidence.Footnote 44
In practice, the US federal case law reveals a variety of understandings of how to apply the hard look standard in judicial review cases. This divergence was exemplified in the late 1970s in the judicial writings of two judges of the Court of Appeals for the DC Circuit. Judge Leventhal was a prominent advocate of reviewing judges scrutinizing the substantive underpinnings of an executive agency's decision to determine whether its exercise of discretion was reasonable.Footnote 45 By taking a ‘hard look’ at the agency's record and reasoning supporting a decision, courts following this approach often undertook a searching review of the underlying scientific evidence. In contrast, Chief Judge Bazelon believed that reviewing judges should ‘scrutinize agency proceedings with extreme care’ focusing on ensuring that agency procedures were adequate to allow for public participation in rule-making and full disclosure of areas of scientific uncertainty.Footnote 46 Over time, court rulings following Chief Judge Bazelon's lead devised increasingly stringent procedural requirements to be met by agencies in order to satisfy the demands of the ‘hard look’ doctrine,Footnote 47 although, ultimately, this did not guarantee judicial restraint since judges often needed to probe the science underlying measures to determine whether additional procedures were necessary.Footnote 48 Further, procedural obligations imposed by reviewing courts on agency risk regulation drew criticism for unnecessarily complicating the federal regulatory process without improving transparency in agencies’ use of science,Footnote 49 and were eventually overruled by the US Supreme Court, albeit while leaving the ‘hard look’ doctrine intact.Footnote 50 Federal courts thus remained free to scrutinize the scientific underpinnings of agencies’ risk regulatory measures, leading to the development of increasingly stringent requirements for agency risk assessment.Footnote 51
In the EU, the issue of the stringency of the review to be applied by European courts examining risk regulatory measures has arisen principally in cases challenging measures adopted by the Union institutions on the basis of the precautionary principle.Footnote 52 In general, in cases involving review of the competence of Union institutions to adopt risk regulations in complex and technical cases, the standard of review applied by European courts has been deferential.Footnote 53 However, the European jurisprudence—like that in the US—has demonstrated an increasing tendency to tighten requirements around the adoption of precautionary risk regulations. This tightening has largely taken a procedural direction,Footnote 54 in the sense of specifying rigorous standards around the pre-requisites for decision-making, the quality of the scientific advice supplied to the Union institutions to support decision-making, and the need for transparency and reason-giving.Footnote 55
Even so, European courts have not been prepared to confirm the absolute discretion of Union institutions to regulate health and environmental risks in conditions of scientific uncertainty, perhaps out of a concern that these institutions lack the necessary credibility and social legitimacy to carry off this level of discretion.Footnote 56 In these circumstances, we might expect to see a trend over time in European risk regulation towards a greater degree of scientific rigour and increased reliance on risk assessments.Footnote 57
The body of comparative risk regulatory jurisprudence that has emerged in the US and EU illustrates intermediate points along the standard of review spectrum between de novo review and total deference. The ‘hard look’ doctrine, as it has evolved in the US, might be placed more towards the de novo end of this spectrum, whereas the approach of the European courts has been more deferential. At the same time this experience illustrates the difficulties courts face in seeking to ensure a rational, scientifically sound basis for decision-making while still allowing authorities sufficient autonomy to introduce risk regulations to respond to uncertain or socially unacceptable risks. These difficulties are all the more acute for WTO judicial bodies given the lesser degree of legitimacy they enjoy compared to their national, and even supranational EU-level, counterparts.Footnote 58
C. A Trend towards Quasi De Novo Review in the SPS Jurisprudence
The lack of clarity around the standard of review applicable in WTO SPS disputes contributed to a trend in cases following the Hormones decision towards an approach coming close to de novo review. The foundations for that trend were laid in the Hormones case itself. While in Hormones the Appellate Body indicated some flexibility around the use of scientific evidence in a Member's risk assessment (for instance, stating that the science relied upon did not have to constitute the majority view of the relevant scientific community)Footnote 59 it also ruled that SPS measures must have a ‘rational relationship’ to a science-based risk assessment,Footnote 60 and that in conducting a risk assessment a WTO Member must be able to produce ‘specific’ scientific studies substantiating the risk of concern.Footnote 61 These two requirements, and in particular the latter, encourage panels to look closely at the scientific evidence to determine whether the studies put forward by a defending Member indicate a sufficient basis for a finding of risk and the implementation of risk regulatory measures. The danger of such an approach is that it can lead to scientific risk findings being ‘assigned relevance to the exclusion of everything else’.Footnote 62
The Appellate Body did indicate, however, that a less stringent standard of risk assessment might apply in the context of some SPS risks, such as food safety risks, than in relation to other SPS risks, such as those of quarantine significance arising from the potential introduction, establishment and spread of pests and diseases in a WTO Member's territory. The basis for this distinction was the text of the definition of ‘risk assessment’ in the SPS Agreement, which speaks of an ‘evaluation of the potential for adverse effects on human or animal health’ in the case of food-borne risks and an ‘evaluation of the likelihood of entry, establishment or spread of a pest or disease’ in the case of quarantine risks.Footnote 63 In Hormones, the Appellate Body interpreted ‘potential’ for adverse effects to mean simply the possibility of harm;Footnote 64 whereas ‘likelihood’ was equated in the later case of Salmon with a requirement for the assessment of probabilities.Footnote 65 As discussed further in Part V, whatever the textual reasoning, the normative rationale for this distinction is not compelling and can lead to some odd results in practice.
The cases that followed Hormones in the ensuing 15 years, namely Salmon (1998), Varietals (1999), Japan–Apples (2003), Biotech (2006) and the panel decisions in Continued Suspension (2008), concerned both quarantine/environmental risks and food safety risks, yet stringent scrutiny of WTO Members' risk assessments and the underlying science was a constant feature. It is not possible here to give a comprehensive account of the development of the SPS jurisprudence.Footnote 66 For present purposes, however, a number of findings in these cases are pertinent in illustrating an overall trend in the SPS jurisprudence towards a more stringent review of the scientific underpinnings of risk regulatory measures. These findings include the following:
• Quarantine risk assessment concerning the likelihood of pest or disease entry, establishment and spread in a WTO Member's territory requires an evaluation of risk ‘probabilities’ and not mere ‘possibilities’ (Salmon). Although in risk-assessment practice probabilities of risk are usually evaluated in quantitative terms (ie a one in a million risk of x), the Appellate Body has held that a qualitative assessment is also acceptable (ie evaluating risk as low, medium, high etc).Footnote 67
• The requirement of Article 2.2 of the SPS Agreement for ‘sufficient’ scientific evidentiary support for a Member's SPS measure necessitates a ‘rational relationship’ between the measure and the underlying scientific evidence (Varietals),Footnote 68 emphasizing a panel's assessment of the scientific underpinnings of SPS measures in evaluating their legal legitimacy.Footnote 69
• A Member's measures may be ruled ‘disproportionate’ and illegal under the SPS Agreement where experts advising the panel consider the available scientific evidence establishes the existence of ‘negligible risk’ (Japan–Apples).Footnote 70
• A risk assessment of the SPS matter at issue, conducted at another level of government, which finds no unacceptable risks can be used to call into question the validity of scientific evidence reaching an opposite conclusion (Biotech),Footnote 71 notwithstanding that the assessments may have adopted different framings of the risks.Footnote 72
The trend in the SPS jurisprudence toward heavy reliance on scientific information (largely provided by the experts advising the panel) came full circle with the panel reports in the Continued Suspension case.Footnote 73 The panel was called upon to consider whether new scientific studies on the health effects of hormone residues in meat products—undertaken following the EU's loss in the first Hormones dispute—amounted to a risk assessment sufficient to warrant the EU's amended measures that maintained bans on hormone-treated beef. The methodology of the panel in approaching this task was essentially one of surveying and summarizing the opinions of its expert advisors on various questions. The majority view to emerge from this process was assumed to reflect the ‘best science’ and the basis upon which the EU measures should be judged.Footnote 74
III. Differentiating Risk Situations (Apples from Oranges …)
At this point a reader might well ask why we should be troubled by the emergence of a stringent science-based review of SPS measures given that the SPS Agreement itself places heavy emphasis on the role of science in discerning between legitimate and illegitimate domestic risk regulations. In this context, stringent review of the scientific underpinnings of SPS measures might be regarded as the best means of ensuring an ‘objective assessment’ of the facts in dispute as required by the WTO DSU. However, this approach assumes a monolithic concept of risk, and indeed science, which employs the latter as a litmus test for the legitimacy of assessments of the former. A one-dimensional approach of this kind does in fact underlie conventional science-based processes of risk assessment. According to this view, risk assessment is a scientific process of identifying potential hazards or adverse outcomes, and evaluating their likelihood of occurrence in order to come up with an aggregated risk characterization.Footnote 75
Over time this view of risk assessment—or rather its capacity to capture all dimensions of risk—has come under increasing challenge, particularly from social scientists working in the constructivist tradition. While conventional notions of science and risk stress objectivity, certainty and universality,Footnote 76 constructivist notions highlight the many uncertainties in scientific research and the importance of social processes in contributing to our understanding of what constitutes scientific knowledge and risk.Footnote 77 Building on these critiques, a number of researchers have sought to develop frameworks for defining the circumstances in which ‘normal science’Footnote 78 and conventional risk assessment present useful tools for understanding risk, and those circumstances in which the application of such tools can lead to a distorted view of risk.Footnote 79 The underlying premise of such frameworks is that the use of science in risk decision-making should ‘focus … on the conditions under which it is valid, and whether those conditions prevail in the situation of interest’.Footnote 80
Many of these frameworks have their basis in the seminal work of Silvio Funtowicz and Jerome Ravetz on ‘post-normal’ science. Post-normal science is a way of doing science said to be better suited to understanding complex, contemporary risk problems because it acknowledges uncertainties rather than seeking to ignore them, and makes value judgments explicit.Footnote 81 In their work Funtowicz and Ravetz sought to distinguish between three different types of scenarios concerning the use of science in policy-making, defined in terms of the interaction of two variables, ‘systems uncertainties’ and ‘decision stakes’.Footnote 82 ‘Systems uncertainties’ describes the intensity of uncertainty surrounding an issue where ‘the problem is concerned not with the discovery of a particular fact, but with the comprehension or management of an inherently complex reality’.Footnote 83 ‘Decision stakes’, on the other hand, connote ‘all the various costs, benefits, and value commitments that are involved in the issue through the various stakeholders’.Footnote 84 Where both uncertainties and stakes are low (ie uncertainties can be managed using routine technical measures and processes, and stakes are simple and small) the resultant situation is characterized as one of ‘applied science’ where a scientific ‘business as usual’ approach works effectively.Footnote 85 A more complex situation arises in cases where either uncertainties or stakes are significant. Funtowicz and Ravetz characterize such decision-making as falling within the arena of ‘professional consultancy’, which involves the exercise of professional skill and judgment either to clarify the reliability of particular information or to resolve value questions.Footnote 86 In practice this can give rise to different opinions offered on the same hazard by different experts.
The final situation, in which both uncertainties and decision stakes are high, represents the sphere of post-normal science. When in the ‘wild’ area of post-normal science, Funtowicz and Ravetz contend that all (scientific and professional experts included) are ‘amateurs’ because the questions at issue are essentially ‘trans-scientific’ (that is, they can be asked of, but not answered by, science).Footnote 87 The authors’ prescription for a ‘quality assessment’ of scientific materials in such circumstances is to make use of an ‘extended peer community’ that will use ‘extended facts’, including anecdotal and community knowledge.Footnote 88 In effect, they see the problems of post-normal science as requiring a ‘democratization’ of science itself, not ‘out of some generalized wish for the greatest possible extension of democracy in society’ but rather because ‘an extension of peer communities, with the corresponding extension of facts, is necessary for the effectiveness of this new sort of science in meeting the great challenges of our age’.Footnote 89
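The three-way scheme described above can be restated as a small lookup, offered here only as an illustrative sketch. It assumes that each of the two variables can be reduced to a binary low/high reading, which is a simplification: Funtowicz and Ravetz describe intensities rather than discrete levels. The names `Level`, `Mode` and `classify_mode` are hypothetical and do not come from their work.

```python
from enum import Enum


class Level(Enum):
    """Assumed binary reading of each variable (a simplification)."""
    LOW = "low"
    HIGH = "high"


class Mode(Enum):
    """The three scenarios described by Funtowicz and Ravetz."""
    APPLIED_SCIENCE = "applied science"
    PROFESSIONAL_CONSULTANCY = "professional consultancy"
    POST_NORMAL_SCIENCE = "post-normal science"


def classify_mode(systems_uncertainty: Level, decision_stakes: Level) -> Mode:
    """Map the two variables onto one of the three scenarios.

    Both low    -> applied science
    Exactly one -> professional consultancy
    Both high   -> post-normal science
    """
    high_count = [systems_uncertainty, decision_stakes].count(Level.HIGH)
    if high_count == 0:
        return Mode.APPLIED_SCIENCE
    if high_count == 1:
        return Mode.PROFESSIONAL_CONSULTANCY
    return Mode.POST_NORMAL_SCIENCE
```

On this reading, `classify_mode(Level.HIGH, Level.LOW)` and `classify_mode(Level.LOW, Level.HIGH)` both fall within professional consultancy, reflecting the text's point that significance on either variable alone is enough to leave the sphere of applied science.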
A very useful distillation of these ideas, and their relevance for risk assessment, can be found in the work of Andy Stirling. Stirling's research has focused particularly upon the precautionary principle and approaches for integrating this principle into risk assessment processes. In explaining the need for a precautionary risk-assessment approach to respond to certain kinds of uncertain, complex risk situations, Stirling explores different states of knowledge that may exist in any given situation about the two main parameters of conventional risk assessment, hazards/outcomes and likelihood/probabilities. Using these two parameters he differentiates between four logical permutations of possible states of incomplete knowledge,Footnote 90 which can be summarized as follows:
• (true) Risk: where there is high confidence about possible outcomes and their probabilities as assessors are dealing with familiar systems and controlled conditions (for example, the risk of meltdown of a reactor in a nuclear power plant).
• Uncertainty: where possible outcomes can be characterized but there is insufficient knowledge to assign probabilities because the systems under examination are complex, non-linear and open.
• Ambiguity: where it is the possible outcomes, rather than their probabilities, that are difficult to assess, whether because there is contestation over appropriate risk framings, assumptions and methods, experts disagree and/or issues of behaviour, trust or compliance are pertinent.
• Ignorance: where neither outcomes nor probabilities can be fully characterized such that effects may be unanticipated, there are gaps, surprises and unknowns, and/or the existence of novel causative mechanisms.Footnote 91
Conventional processes of risk assessment, Stirling argues, offer ‘a powerful suite of methods under a strict condition of risk’ but are not ideal in risk situations characterized by uncertainty, ambiguity or ignorance.Footnote 92
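Stirling's four states can be read as the combinations of two yes/no questions: are the possible outcomes well characterized, and are their probabilities? The short Python sketch below is an illustration only. It assumes that each parameter can be reduced to a simple binary flag, which is a simplification of Stirling's framework, and the names `KnowledgeState` and `classify_knowledge_state` are hypothetical labels introduced here.

```python
from enum import Enum


class KnowledgeState(Enum):
    RISK = "risk"
    UNCERTAINTY = "uncertainty"
    AMBIGUITY = "ambiguity"
    IGNORANCE = "ignorance"


def classify_knowledge_state(outcomes_well_characterized: bool,
                             probabilities_well_characterized: bool) -> KnowledgeState:
    """Map the two risk-assessment parameters onto one of the four states.

    Assumption: each parameter is treated as a binary flag; in practice
    the boundaries between states are matters of degree and judgment.
    """
    if outcomes_well_characterized and probabilities_well_characterized:
        return KnowledgeState.RISK          # conventional risk assessment applies
    if outcomes_well_characterized:
        return KnowledgeState.UNCERTAINTY   # outcomes known, probabilities not
    if probabilities_well_characterized:
        return KnowledgeState.AMBIGUITY     # outcomes themselves are contested
    return KnowledgeState.IGNORANCE         # neither can be fully characterized


if __name__ == "__main__":
    print(classify_knowledge_state(True, False).value)
```

On this reading, conventional risk assessment is at its strongest only in the first branch; the remaining three branches are the situations in which Stirling regards it as less than ideal.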
The social scientific literature on risk decision-making thus identifies the circumstances in which standard scientific methods and conventional risk assessment yield inaccurate representations of risk. These insights are also relevant in considering when it is appropriate for a decision-maker reviewing a risk assessment, or associated regulatory measures, to adopt a stringent, science-based standard of review.Footnote 93 If conventional notions of science and risk assessment are strictly applied in the review of measures adopted to deal with a situation of uncertainty, ambiguity or ignorance (or, in Funtowicz and Ravetz's framework, the spheres of professional consultancy or post-normal science), this may privilege a narrow understanding of the nature of the risks at issue, which may not fully capture all the potential adverse outcomes of concern. Consequently, risk regulatory measures may be struck down as not being based on ‘sufficient’ scientific evidence or an ‘adequate’ risk assessment even though they are seeking to respond to a different vision of risk than that captured by conventional scientific methods.
A better calibration between the standard of review applied in SPS disputes and the circumstances in which scientific risk assessment (and hence science-based evaluation) holds its greatest validity could be achieved by allowing for greater flexibility in application of the standard of review, depending on the characteristics of the risk situation under consideration. The frameworks from the social scientific literature discussed above suggest two parameters of particular importance in differentiating between risk situations: the intensity of uncertainty (or incertitude)Footnote 94 surrounding a particular risk and the degree of socio-political contestation with respect to the issue that will give rise to different framings in risk assessment and different value judgments about the importance or otherwise of avoiding particular outcomes. In turn, differences over value orientations will tend to feed into disputes over the assumptions and methods to be applied in risk assessment (especially to deal with areas of incomplete knowledge). These two parameters, designated ‘certainty’ and ‘consensus’, were those put forward by a quintet of distinguished social science professors as a basis for distinguishing risk situations in an article published in the lead-up to the WTO panel's decision in the Biotech case.Footnote 95 The aim of David Winickoff and his co-authors in that article was to advocate for a procedural approach to SPS review, taking into account national values and policy judgments in scientific regulation.
Applying the approach of Winickoff et al more broadly to distinguish ‘apples from oranges’ so to speak, we can use combinations of the uncertainty and socio-political contestation parameters to identify four broad categories of risk situations that may confront decision-makers in an SPS (or other environmental) dispute (see Figure 1). It is acknowledged that, in the real world, the boundaries between each of these four risk situations will be permeable, which may require a decision-maker to judge which category best describes a particular risk situation (how panels and the Appellate Body might go about this task is discussed further in Part VI). It might also be argued that an approach characterizing risk situations according to two parameters only suffers from the same kind of reductionism that founds critiques of conventional risk assessment. While there is some truth to this criticism, the analysis of risk situations according to characteristics of uncertainty and socio-political contestation nonetheless offers a useful contribution in articulating categories of risk situations that fall outside the realm and expertise of conventional scientific risk assessment. Moreover, the parameters themselves are sufficiently broad to subsume a number of different elements commonly encountered in disputes over risk. For instance, the parameter of uncertainty captures problems of ignorance, differences between experts, inadequate methods and complex systems. On the other hand, the parameter of socio-political contestation brings into consideration different value assumptions, different risk framings and different assessments of the costs and benefits of a particular outcome.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20160626064344-54683-mediumThumb-S0020589312000024_fig1g.jpg?pub-status=live)
Figure 1. Differentiating risk situations
The most difficult risk situations for assessment and decision-making purposes tend to be those where levels of uncertainty are high, perhaps because the risk is relatively novel (ie a case of ignorance) or research methodologies are not well-developed (leading to ambiguity), and social consensus around risk framings is low (ie the extent and importance of the risk are perceived differently by different communities).Footnote 96 SPS disputes over risk situations falling into this category might justifiably attract a more deferential standard of review, both because there is more expert disagreement about the appropriate scientific basis for measures and because the values that shape the reasoning of risk assessors will be highly contested.
At the other end of the spectrum we would find risk situations where levels of uncertainty are low and there is a high degree of convergence in the social framing of the risks at issue (ie socio-political contestation over the risk is low). Here we would expect that standard scientific processes and conventional risk assessment would operate effectively and hence a fairly stringent standard of review would be justified to ensure that measures have a credible scientific underpinning and that risk assessment conclusions are adequately justified.
The two further categories of risk situations illustrated in Figure 1—low uncertainty/high socio-political contestation and high uncertainty/low socio-political contestation—will lie at intermediate points on the continuum of risk situations (see Figure 2). In situations of low uncertainty accompanied by high levels of socio-political contestation over the importance of the risk, a standard of review towards the stringent end of the continuum might be applied, similar to the ‘hard look’ approach taken by US federal courts. By contrast, risk situations involving SPS concerns characterized by high levels of uncertainty and also high levels of social consensus as to the importance of the risk (ie socio-political contestation is low) might warrant a more deferential standard of review, subject to procedural constraints such as those that have been employed by the European courts (eg independence of experts, quality of expertise, avenues for outside participation, transparency).
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20160626064346-83474-mediumThumb-S0020589312000024_fig2g.jpg?pub-status=live)
Figure 2. Standard of review continuum
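The correspondence between the four categories of risk situation and the suggested stringency of review can also be set out as a simple lookup. The Python sketch below is offered only as an illustration of the argument made in the text. The ‘high’/‘low’ rankings, the ordering along the continuum and the wording of each entry are a reading of the preceding paragraphs rather than a reproduction of Figures 1 and 2, and the names `REVIEW_BY_SITUATION` and `suggested_review` are hypothetical.

```python
from typing import Dict, Tuple

# Keys are (uncertainty, socio-political contestation).
# Entries are ordered from the most stringent to the most deferential
# end of the continuum, as described in the surrounding text.
REVIEW_BY_SITUATION: Dict[Tuple[str, str], str] = {
    ("low", "low"): "fairly stringent review",
    ("low", "high"): "review towards the stringent end ('hard look')",
    ("high", "low"): "more deferential review, subject to procedural constraints",
    ("high", "high"): "more deferential review",
}


def suggested_review(uncertainty: str, contestation: str) -> str:
    """Return the stringency of review suggested for a risk situation.

    Both arguments must be 'high' or 'low'; any other value is rejected,
    since the sketch does not attempt to model borderline cases.
    """
    key = (uncertainty.lower(), contestation.lower())
    if key not in REVIEW_BY_SITUATION:
        raise ValueError("each parameter must be ranked 'high' or 'low'")
    return REVIEW_BY_SITUATION[key]


if __name__ == "__main__":
    for situation, review in REVIEW_BY_SITUATION.items():
        print(situation, "->", review)
```

A lookup of this kind leaves untouched the prior, and harder, question of how a decision-maker should rank each parameter in a particular dispute.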
Part VI takes up the question of how this approach to application of the standard of review might be operationalized in the practice of SPS dispute settlement. Before doing so, however, it is pertinent to consider the clarifications of the standard of review offered by the Appellate Body in its recent case law and whether they allow for the kind of differentiation between risk situations proposed.
IV. Continued suspension (hormones)—a turning point?
On appeal of the Continued Suspension panel reports, one set of questions that faced the WTO Appellate Body concerned the appropriateness of the panel's approach in dealing with the scientific and expert evidence. Consistent with the de facto approach of de novo review that had been evolving in the SPS case law, the panel had essentially taken the majority view of its advising experts as a baseline for judging the adequacy of the EU's measures and their underlying scientific basis. The Appellate Body dealt with the issue of the appropriateness of the panel's approach in two ways: first, through an analysis of the requirements applicable to the selection and consultation of advising experts,Footnote 97 and secondly, and of most relevance for the present context, by focusing on the nature of the standard of review under the SPS Agreement. The Appellate Body's rulings on both questions demonstrated a strong appreciation of the important role that scientific evidence—and the experts who provide and evaluate that evidence—plays in SPS dispute settlement.
In respect of the standard of review applicable in SPS dispute settlement, the Appellate Body reiterated its previous findings in Hormones that the standard is one of ‘objective assessment’ derived from Article 11 of the DSU. However, it added several important clarifications with respect to what is required in order for a panel to make an objective assessment of a WTO Member's risk assessment. The Appellate Body ruled that it was not the task of the panel to determine the correctness of a Member's risk assessment but rather whether it ‘is supported by coherent reasoning and respectable scientific evidence and is, in this sense, objectively justifiable’.Footnote 98 The Appellate Body then went on to detail the methodology panels should apply in reviewing a WTO Member's risk assessment, including the role of experts in this regard. The Appellate Body specified a four-step methodology consisting of the following stages:Footnote 99
1. Identification of the scientific basis of the SPS measure. The Appellate Body noted that this basis need not reflect a majority view within the scientific community but can embrace divergent or minority perspectives.
2. Verification that the scientific basis comes from a respected and qualified source. Accordingly, the Appellate Body ruled that the scientific material relied upon in risk assessment must have the ‘necessary scientific and methodological rigour to be considered reputable science’. This does not imply general acceptance of a particular view within the scientific community but rather evidence ‘considered to be legitimate science according to the standards of the relevant scientific community’.
3. Assessing whether the reasoning articulated on the basis of the scientific evidence is objective and coherent. To meet this standard the Appellate Body held that the conclusions drawn by a Member need to find sufficient support in the scientific evidence relied upon.
4. Determining whether the results of the risk assessment sufficiently warrant the SPS measure. According to the Appellate Body, the task at this stage was to discern whether there is a rational or objective relationship between the scientific basis identified and the SPS measure adopted, although the scientific basis can be a minority view if coming from a qualified and respected source.
The role of the experts advising the panel is to assist the panel with the above tasks while avoiding the situation of de novo review where the panel asks whether ‘the experts would have done a risk assessment in the same way and would have reached the same conclusions as the risk assessors’.Footnote 100
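For readers who find it helpful to see the structure of the methodology laid out schematically, the four stages can be sketched as an ordered series of checks. The Python sketch below is an illustration only. It assumes that the stages are applied in sequence and that review stops at the first stage not satisfied, which is a reading of the methodology rather than something the Appellate Body states in these terms. The names `PanelFindings` and `first_unmet_step` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PanelFindings:
    """A panel's conclusions on each stage (hypothetical structure).

    There is deliberately no field for 'majority scientific view': the
    scientific basis may be a divergent or minority perspective.
    """
    scientific_basis_identified: bool
    source_respected_and_qualified: bool
    reasoning_objective_and_coherent: bool
    results_warrant_measure: bool


REVIEW_STEPS = (
    ("scientific_basis_identified", "1. scientific basis of the measure identified"),
    ("source_respected_and_qualified", "2. basis comes from a respected and qualified source"),
    ("reasoning_objective_and_coherent", "3. reasoning on that basis is objective and coherent"),
    ("results_warrant_measure", "4. results of the risk assessment sufficiently warrant the measure"),
)


def first_unmet_step(findings: PanelFindings) -> Optional[str]:
    """Return the label of the first stage not satisfied, or None if all are."""
    for attribute, label in REVIEW_STEPS:
        if not getattr(findings, attribute):
            return label
    return None


if __name__ == "__main__":
    print(first_unmet_step(PanelFindings(True, True, False, True)))
```

What the sketch cannot capture is the degree of scrutiny applied at each stage, which is precisely the issue on which the standard of review turns.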
On its face, the methodology articulated by the WTO Appellate Body in Continued Suspension would seem to allow greater scope for deferential review of a Member's challenged risk assessment in SPS dispute settlement. This is the view of commentators such as Gruszczynski, who considers that the ‘standard is much closer to the deferential end of the spectrum’ although ‘it would be too far to conclude that it is a fully procedural one’.Footnote 101 He concludes that the new standard
leaves WTO Members with a great degree of discretion as to how to assess scientific data and what kind of inferences to make on this basis. It also greatly reduces the need of the panel to engage in a detailed examination of scientific evidence and deciding which scientific view is better.Footnote 102
Accepting Gruszczynski's analysis, we might expect that the standard of review articulated in Continued Suspension would allow greater scope for the adoption of legitimate SPS measures in categories of risk situations exhibiting characteristics of scientific uncertainty and some degree of socio-political contestation.
There are certainly a number of indicators in the Appellate Body's judgment in Continued Suspension that suggest it favoured a more deferential approach to evaluating risk assessment and the supporting scientific evidence than had become the practice in SPS jurisprudence. In particular, in respect of assessing the ‘scientific basis’ of an SPS measure, the Appellate Body directed panels to focus on the respectability and reputability of the source of the scientific evidence rather than evaluating the cogency of the scientific evidence itself. This approach has some similarities with that taken by the US Supreme Court in the well-known Daubert case dealing with the question of which evidence qualifies as expert testimony under federal evidence rules.Footnote 103 In its decision, the Supreme Court relied on a flexible set of factors designed to ensure that admitted scientific evidence is both relevant and reliable. Since the Supreme Court's decision, however, the Daubert test has proven no easier for courts to apply in practice than the previous standard,Footnote 104 which relied on assessing whether a particular view or technique had gained ‘general acceptance’ in the particular scientific field to which it belonged.Footnote 105 Moreover, the Supreme Court's emphasis on the ‘scientific validity’ of evidence has tended to push reviewing courts to impose stricter requirements on the admission of expert opinion.
The US experience under Daubert suggests the need for caution before hailing the Continued Suspension decision as ‘a turning point’ with respect to the standard of review.Footnote 106 In particular, the Appellate Body's review methodology outlined in Continued Suspension still contains a substantial emphasis on scientific factors (eg the ‘scientific basis’ for measures, which must display ‘scientific and methodological rigour’ sufficient to be considered ‘legitimate science’ by reference to ‘the standards of the relevant scientific community’).Footnote 107 Although panels are instructed not to disregard scientific evidence that comes from a minority of scientists, they will still be relying heavily on their advising experts to help them identify what evidence should or should not be considered ‘legitimate science’, an evaluation which may well gravitate towards peer reviewed studies or data that has widespread acceptance in the relevant scientific community. It is also worth noting that the Appellate Body cast its review methodology in the context of evaluation of risk assessment and so it may not be applicable to assessments of the ‘sufficiency’ or ‘insufficiency’ of scientific evidence under other provisions of the SPS Agreement. It has frequently been in assessing this element of SPS measures that the heaviest scrutiny of the disputed science has been evident.
V. Australia–apples—a double standard?
The Australia–Apples case offered the first opportunity for application of the new standard of review requirements in WTO SPS dispute settlement. Like the previous Japan–Apples case, the Australia–Apples dispute involved quarantine measures designed to prevent the introduction of various pests and diseases, including fire blight, into the territory of the Member concerned (Australia) from other countries where these diseases and pests were established. New Zealand challenged the Australian measures, arguing that there was ‘no scientific support’ for Australia's contention that imported apples are a pathway for transmitting various plant diseases and pests, and hence that the Australian measures were inconsistent with Article 2.2 of the SPS Agreement. New Zealand also argued that the Import Risk Analysis (IRA) produced by Australian authorities to justify and elaborate quarantine restrictions on New Zealand apples was inconsistent with the risk assessment requirements under Article 5.1 of the SPS Agreement. These arguments were upheld at first instance by the panel and on appeal by the Appellate Body.
This result might have seemed inevitable given the Japan–Apples precedent. However, the risk assessment put forward by the Australian quarantine authorities differed significantly from that struck down as inadequate in the Japanese case. The Australian IRA adopted what was termed a ‘semi-quantitative’ approach that combined a quantitative estimation of the probability of entry, establishment and spread of particular diseases or pests with a qualitative evaluation of the economic and environmental consequences.Footnote 108 This approach was adopted because of the concerns expressed by stakeholders during the domestic decision-making process over the transparency and objectivity of using a purely qualitative risk assessment.
The quantitative estimates of probabilities made by the IRA were expressed as probability intervals, rather than as single numbers. For instance, a probability interval of zero to 10⁻⁶ (1 in a million) was used for a negligible likelihood of occurrence of a particular event. The IRA also employed probability distributions to pinpoint a value within each interval on a per apple basis for each importation step of an importation scenario and the different pathways of distribution, utilization, waste generation and disposal of apples in Australia. Where insufficient information was available to determine a most likely value within a given probability interval, the IRA applied a uniform distribution approach on the basis of ‘expert judgment’. A uniform distribution model assumes that each value in the probability interval range occurs with equal probability. In the case of a negligible likelihood with a probability interval of 0 to 10⁻⁶, application of a uniform distribution rule yields a midpoint of 0.5 in a million.
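The arithmetic behind the midpoint figure is straightforward: under a uniform distribution the expected value is simply the average of the two ends of the interval. The Python sketch below illustrates this for the ‘negligible’ interval of zero to 1 in a million described above. It is not drawn from the IRA itself, and the function name `uniform_mean` is a hypothetical label.

```python
import math


def uniform_mean(lower: float, upper: float) -> float:
    """Expected value of a uniform distribution on [lower, upper]."""
    if lower > upper:
        raise ValueError("lower bound must not exceed upper bound")
    return (lower + upper) / 2


# The 'negligible' interval described in the text: 0 to 1 in a million.
NEGLIGIBLE_INTERVAL = (0.0, 1e-6)

midpoint = uniform_mean(*NEGLIGIBLE_INTERVAL)
per_million = midpoint * 1_000_000  # re-express as 'events per million'

if __name__ == "__main__":
    # Approximately 0.5 in a million (floating-point rounding aside).
    print(f"midpoint is about {per_million:.1f} in a million")
    assert math.isclose(per_million, 0.5)
```

The calculation shows why the choice of interval matters: whatever the true likelihood of an event classed as ‘negligible’, the uniform assumption assigns it an expected value half-way along the interval.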
This IRA methodology was heavily criticized by the panel in Australia–Apples. The panel focused particularly on the choice of a probability interval of 0 to 10⁻⁶ and use of uniform distribution to model events with a ‘negligible’ likelihood of occurring, ruling that these choices were not properly justified in the IRA and led to an overestimation of the probability of entry, establishment and spread of the diseases and pests at issue. The panel identified these ‘flaws’ as ‘magnify[ing] the assessment of risk, turning what are often the remotest of possibilities into events that are assessed as occurring with some frequency.’Footnote 109 This finding formed a major plank of the panel's overall ruling that the IRA was not a proper SPS risk assessment. Arguably the panel's finding came very close to saying that it disagreed with the risk assessment methodology chosen by the WTO Member concerned and that it (or its advising experts) would have done the risk assessment differently.
Consequently, on appeal, a central part of Australia's arguments was that the panel had misinterpreted and misapplied the standard of review articulated in Continued Suspension in its assessment of the IRA under Article 5.1.Footnote 110 However, the Appellate Body upheld the panel's application of the standard of review and its finding that the IRA's conclusions were not objective or coherent because they exaggerated or overestimated certain risks and consequences and did not find sufficient support in the scientific evidence relied upon. In this regard, the Appellate Body drew a distinction between the scrutiny to be applied to the underlying scientific basis of a measure, and scrutiny of the reasoning of a risk assessor on the basis of the scientific evidence. In the case of the former, the standard to be applied was more deferential reflecting the fact that panels are not well-equipped to undertake their own scientific assessment.Footnote 111 Accordingly, the applicable standard was whether the scientific material constitutes ‘legitimate science according to the standards of the relevant scientific community’.Footnote 112
By contrast, when reviewing the reasoning of risk assessors, the apparently more stringent standard of objectivity, coherence and assessment of the sufficiency of scientific evidentiary support applied.Footnote 113 The Appellate Body found that the IRA's reasoning and conclusions fell into this second category, even where expert judgment was exercised by risk assessors to address alleged scientific uncertainty. Indeed, the Appellate Body was of the view that ‘when the exercise of expert judgment forms an integral part of the risk assessor's analysis, then it should be subject to the same type of [more stringent] scrutiny by the panel as all other reasoning and conclusions in the risk analysis.’Footnote 114 The Appellate Body did not view the phrase ‘as appropriate to the circumstances’ in Article 5.1 of the SPS Agreement as affording such flexibility as to excuse a risk assessor from properly performing the risk assessment.Footnote 115 Hence although recourse to expert judgment in the IRA was not in itself objectionable, ‘it must be reasoned and explained consistently with Articles 5.1 and 5.2 of the SPS Agreement so that the risk assessment can still be considered a scientific process that is based on the “available scientific evidence”.’Footnote 116
The Appellate Body's findings in Australia–Apples raise an important question about the current interpretation of the SPS standard of review, which has been succinctly summarized by Gruszczynski:
One may consider US/Canada – Continued Suspension as an anomaly in an otherwise rather consistent line of cases that subscribed to de novo review. However, it is also possible that the Appellate Body wants to apply a more deferential standard of review in human health related trade disputes (such as Continued Suspension), while in traditional phytosanitary cases the applicable standard will remain intrusive.Footnote 117
If the Appellate Body were indeed seeking to distinguish between different types of SPS disputes in terms of the stringency of the review applicable, what would be the basis for that distinction? As discussed in Part II, the Appellate Body has relied in the past on the text of the SPS Agreement to distinguish between two categories of risk situation: 1) food or feed safety risks where it is only necessary in risk assessment to evaluate possible adverse effects on human or animal health; and 2) quarantine risks associated with the introduction of pests or diseases via trade in a particular commodity in which case an evaluation of the probability of harm is required.
At a normative level, however, there is no compelling rationale for distinguishing, on this basis, the applicable risk assessment standard and associated stringency of judicial review. For one, both categories of risk (food-borne and quarantine) may relate to human health,Footnote 118 so that if there was some desire to elevate the value of human health protection above other regulatory purposes it is not served by the Appellate Body's textual interpretation. Even if the intention was to require a higher standard of justification for measures addressing quarantine risks, the Appellate Body's tolerance of both qualitative and quantitative forms of risk assessment blurs the distinction where low likelihood risks are at issue. As Alessandra Arcuri has astutely noted, there may be little difference in practice between assessing whether an event is possible and assessing whether it is of low or negligible probability.Footnote 119 The difficulties experienced with the IRA in Australia–Apples bear this out. One can speculate that the IRA may have fared better in WTO dispute settlement had the Australian quarantine authorities opted for a purely qualitative (but less transparent) assessment of the probabilities of introduction, establishment and spread of plant pests and diseases via apple imports.
Arcuri seeks to explain the different approaches evident in the SPS case law as the result of the Appellate Body ‘juggling’ between two different visions of risk put forward by competing clusters of experts and scholars in the field of risk analysis. The first cluster, which she says subscribes to a ‘quantitative-risk logic’, adopts the conventional technical risk perspective in which risk is a calculable entity assessed via scientific means. The second group, falling into the constructivist school, employs a ‘holistic-risk logic’ that recognizes the role of values and socio-cultural context in the process of risk assessment. Arcuri hypothesizes that as WTO panels and the Appellate Body ‘juggle’ between these two visions of risk in SPS disputes they end up favouring different standards of risk assessment and review.Footnote 120
VI. Operationalizing a differentiated standard of review
While Arcuri's juggling hypothesis provides a useful analytical lens for explaining the changing stringency of review applied in different SPS disputes, continued ‘juggling’ of this kind by the Appellate Body does not provide a sustainable, principled solution for application of the standard of review in SPS dispute settlement. Moreover, implicit in the Appellate Body's ‘juggling’ between different stringencies of review is the suggestion that there exists a hierarchy of SPS risks. It is striking that the cases where the Appellate Body has adopted a more deferential stance have been those involving human health risks (cancer potentially caused by residues of growth hormones in beef), whereas the standard of review has uniformly been more stringently applied in phytosanitary quarantine cases.
Some might argue that the solution to the standard of review question therefore lies in the Appellate Body and panels making explicit the value judgments they are exercising regarding the importance of different SPS risks.Footnote 121 There is some precedent for this approach in a number of cases decided by the Appellate Body under the General Agreement on Tariffs and Trade (GATT) involving health and environmental risks. In its jurisprudence on the ‘necessity’ test applicable under the GATT health exception, Article XX(b), the Appellate Body has sought to ‘weigh and balance’ a series of factors which include ‘the importance of the common interests or values protected’ by a particular measure.Footnote 122 In the case of Asbestos, concerning the GATT legality of a health-related ban on asbestos products, the Appellate Body gave great weight to the health objective pursued by the measure in evaluating the necessity for its adoption. It described the ‘value’ of ‘the preservation of human life and health through the elimination, or reduction, of the well-known, and life-threatening health risks posed by asbestos fibres’ as ‘both vital and important in the highest degree’.Footnote 123 Similarly, in the later case of Retreaded Tyres, which also involved arguments under Article XX(b), both the panel and the Appellate Body characterized the objective of environmental protection pursued by the disputed measure as important.Footnote 124 Given the importance of the interests protected by the import ban (which also included health objectives), the ban was judged necessary despite its trade-restrictiveness.Footnote 125
There are a number of reasons, however, to be cautious about transferring this approach to the SPS context as a way of supporting the application of different standards of review in different cases. One immediate concern that arises is how international decision-makers involved in SPS dispute settlement might legitimately go about evaluating the importance of different SPS risks and the values attached to them. Even in the national sphere there are few established guidelines for determining the relative substantive importance of interests or values in the health or environmental sphere.Footnote 126 In addition, SPS measures have a particular focus on risks arising within a Member's territory making questions of local environmental conditions and the risk preferences of local populations more pertinent.Footnote 127 Universal value commitments are thus difficult to discern in these circumstances.
It also remains unclear from the GATT health and environmental jurisprudence to date whether an approach that looks to the importance of the values protected by a measure would lead to a more deferential review approach where the risks at issue extend beyond situations of well-founded harms. In Asbestos, the willingness of the Appellate Body to find the values at stake were vital seemed to be predicated on a finding of scientifically well-supported health risk.Footnote 128 In other words, it appeared to be the strong scientific consensus surrounding the existence of asbestos-related health risks that founded the Appellate Body's rulings that there was also substantial social consensus on the importance of the interests at stake. Faced with a situation where disputes remain over the scientific basis of risk concerns (the issue of genetically modified organisms is a good example here), WTO decision-makers might be more hesitant to apply a deferential standard of review on the basis of value concerns alone.Footnote 129
This article has argued that a better approach to recognizing the diversity of risk situations presented by SPS cases, and potentially also other international environmental disputes, would be to differentiate between risk situations according to their characteristics of uncertainty and degree of socio-political contestation, and to tailor the stringency of the standard of review applied accordingly. The parameters of uncertainty and socio-political contestation offer the potential for a more transparent, reasoned assessment process than in the case of different values. As regards the criterion of uncertainty there are now numerous analyses that describe the different types of uncertainty encountered in scientific research and put forward indicia for recognizing whether uncertainty arises in a particular situation.Footnote 130 Discerning differing degrees of socio-political contestation regarding risks may be more problematic. However, Winickoff et al suggest one approach would be to look to evidence sourced in the risk literature, regulatory experience or public dialogues which suggests a lack of consensus as to the nature, sources and extent of the risks involved.Footnote 131 Using such evidence, the task of a decision-maker would be to assign a ranking to each parameter of ‘high’ or ‘low’. The combination of the parameters assigned (eg high/high, low/high etc) could then guide the applicable stringency of review, as illustrated in Figure 1.
While in theory this approach seems straightforward, more difficult issues of interpretation are likely to be encountered in SPS dispute settlement practice. Concerns already exist that legally-trained decision-makers such as WTO panelists are ill-equipped to grapple with scientific and technical evidence.Footnote 132 While the Appellate Body in a number of its decisions has demonstrated considerable sophistication in its understanding of different perspectives on science and risk assessment,Footnote 133 it is panels that continue to bear the brunt of the task of evaluating the factual information, including the ever-growing volume of scientific and technical evidence in SPS disputes.Footnote 134 Would they fare any better if asked to discern levels of uncertainty pertaining to a risk and the degree to which there are different risk framings, value judgments and assumptions at stake? Perhaps not. However, recognizing these matters as relevant factors for assessment by a panel might at least permit the involvement of a broader range of perspectives and evidence about the risks under consideration, potentially also extending beyond the views of experts to include anecdotal and regulatory experience, and public opinion.
Another way of approaching the problem, familiar to lawyers, is to reason by analogy from existing examples. If we go back over the past SPS case law (and this could also be done for disputed issues that have come before the WTO SPS committee) we might categorize the types of risk situations that have arisen as shown in Figure 3 below. In evaluating the characteristics of a risk situation in a later SPS dispute, decision-makers might consider whether that situation is more or less like those that have been encountered in the past.Footnote 135
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20151127071912487-0420:S0020589312000024_fig3g.gif?pub-status=live)
Figure 3. Differentiating risk situations in the SPS case law
Finally, decision-makers might make use of other tools in combination with differentiations in the stringency of review to cater for the particular characteristics of different risk situations. This may be especially useful in cases that occur at the boundary between different categories of risk situations. For instance, in situations of low to moderate uncertainty but high socio-political contestation, a key issue will most likely be that parties dispute the range of possible adverse outcomes that may occur. In such cases, a relatively stringent, ‘hard look’ style of review might be coupled with a proportionality-style analysis that looks to the coherence of the relationship between the assessed level of risk and the SPS measures adopted in response.Footnote 136 Cost–benefit analysis could potentially also be a useful tool in this instance in clarifying the financial costs and social benefits associated with different regulatory options that respond to different understandings of the potential hazards.Footnote 137
Conversely, in risk situations involving SPS concerns characterized by high levels of scientific uncertainty but low to moderate levels of socio-political contestation (ie the risk is generally regarded as being important and framings converge), it may be possible to reduce the uncertainties over time. In such cases, a more deferential standard of review that focuses on the adequacy of procedures adopted by the Member in assessing risks (ie their openness, transparency, use of qualified expertise, etc) might be supplemented by the acceptance of measures based on that assessment as a precautionary response applying on a provisional basis only.Footnote 138
The approach described above, based upon differentiating SPS risk situations and adjusting the stringency of the standard of review applied, is unlikely to satisfy all stakeholders: some will feel that it still allows too much scope for protectionism; others that it allows insufficient scope for including public risk perceptions. It may also be seen as introducing unacceptable levels of discretion into the task of international adjudication.Footnote 139 However, judicial review of science-based risk regulation will always remain a normative exercise at some level.Footnote 140 Arguably, a methodology for adjusting the stringency of the standard of review according to qualities of the risk situation at hand provides a more principled and transparent basis for the science-law interaction in SPS disputes than the current ‘juggling’ evident in Appellate Body decision-making. It also has the advantage that it can be relatively easily accommodated within the existing provisions and jurisprudence under the SPS Agreement, for example, by relying on the criterion of sufficiency/insufficiency of scientific evidence in Articles 2.2 and 5.7, as well as the Appellate Body's recognition in Hormones of the necessity for the inclusion of ‘real world’ considerations in SPS risk assessment.
VII. Conclusion
This article has examined a seemingly technical question—the standard of review applicable in WTO SPS disputes—but in so doing has sought to demonstrate the broader implications of this issue for domestic risk regulation, and questions surrounding the use of science and expertise in WTO dispute settlement. The Appellate Body case law on the SPS standard of review has flip-flopped between deferential and stringent applications, but without articulating a clear normative rationale for this approach. In this article an alternative approach has been put forward, based upon differentiating SPS risk situations according to key parameters of associated levels of scientific uncertainty and socio-political contestation around risks, and adjusting the stringency of judicial review applied to reflect different combinations of these two parameters. It is argued that this manner of approaching the standard of review in SPS cases would allow for a more coherent and principled application in practice that recognizes the diversity of risk situations falling within the scope of the SPS Agreement.
In the first WTO dispute to come before it, the Appellate Body declared that WTO law should not be interpreted ‘in clinical isolation’ from the rest of public international law.Footnote 141 Equally, it should not be assumed that developments occurring in WTO SPS law are of no relevance to the broader sphere of international dispute settlement. Highly technical disputes, raising difficult questions over the methodology to be applied by international courts and tribunals in evaluating scientific evidence, are increasingly a feature of international environmental law.Footnote 142 As the Pulp Mills case indicates, other international judicial tribunals may look to WTO SPS dispute settlement as a site of international ‘best practice’ on questions relating to the appropriate treatment and use of science and expertise in decision-making. To the extent that international legal tribunals dealing with complex, fact-intensive environmental disputes seek to model their own practices on that of WTO SPS dispute settlement through resort to independent expert advice and science-based standards of evaluation, they are likely to face similar challenges and questions about how experts are selected, how their independence is determined, how much weight is placed on the experts’ views, and ultimately, the appropriate standard of review. There is thus considerable potential for innovations in the application of the standard of review in SPS dispute settlement to provide a useful model for environmental dispute resolution in international adjudicatory fora beyond the realm of the WTO.