When setting priorities for health, there is broad agreement that a range of social values and ethical principles beyond clinical and cost-effectiveness matter, but exactly how health technology assessment (HTA) should account for a broader set of criteria remains an area of ongoing debate (Reference Hofmann1–Reference Bellemare, Dagenais, K-Bédard, Béland, Bernier and Daniel3). In light of this, we welcome a recent review paper by Baltussen et al. that evaluates the potential of multi-criteria decision analysis (MCDA) to enable HTA agencies to incorporate a broader set of values in their appraisals (Reference Baltussen, Marsh, Thokala, Vakaramoko, Castro and Cleemput4). The authors describe three approaches to MCDA—qualitative MCDA, quantitative MCDA, and MCDA with decision rules—laying out their relative advantages and disadvantages with respect to improving the quality, consistency, and transparency of HTA recommendations and providing recommendations for how each can best be implemented. This contribution helpfully extends an earlier effort by Devlin and Sussex (Reference Devlin and Sussex5) by offering a more systematic comparison of different types of MCDA in terms of their ability to improve HTA recommendations.
We endorse many of the authors' assessments and conclusions, including the critical role of deliberation in any MCDA approach and their view that qualitative MCDA should be undertaken at a minimum. However, we take a stronger position regarding the flaws of quantitative MCDA and, building on their own critical assessment, strongly caution against it. We find the quantitative approach antithetical to at least two of the three ways that the authors believe MCDA can improve HTA recommendations: (i) enhancing quality and (ii) promoting transparency. Below we further examine the ways in which quantitative approaches to MCDA are flawed and unsuited to realizing the intended aims of MCDA. We instead advocate for a predominantly qualitative approach to MCDA appraisal that relies on deliberation with multiple informational inputs, including decision rules or aids to help account for health opportunity costs.
On Quality: The Whole is More Than the Sum of Its Parts
As Baltussen et al. define it, the quality of an MCDA approach rests on the extent to which it takes into account relevant values and enables appropriate tradeoffs between them (Reference Baltussen, Marsh, Thokala, Vakaramoko, Castro and Cleemput4). We are concerned that quantitative MCDA cannot fulfill these functions. The authors already recognize a number of methodological challenges that threaten the quality of quantitative MCDA. However, assigning aggregated, weighted scores to technologies presents a more pernicious threat to high-quality HTA recommendations because it oversimplifies complex concepts and tradeoffs both within and across criteria, thus obscuring potentially important considerations that should be explicitly addressed in decision making. The result is that, at each step of scoring, weighting, and aggregating, there is significant information loss. This limits the ability of those involved in the process to both engage with the underlying assumptions and considerations inherent in complex criteria and make reasoned judgments about which tradeoffs are appropriate. In short, quantitative MCDA risks reducing what should be a difficult decision worth wrestling with to a technical exercise that algorithmically and unreflectively produces a recommendation.
To see why, let us first take on the issue of information loss within a particular criterion, when scores and weights are assigned to individual attributes of a health technology. Imagine that, in an attempt to reduce cognitive load, a single criterion for “equity” was adopted in the performance matrix. However, equity is too complex a concept to be reducible to a single score in this way. Equity encompasses competing accounts of distributive justice (e.g., prioritizing the worst off vs. ensuring sufficiency for all) that can be assessed along various dimensions (e.g., equity by geography, age, gender, socioeconomic status, ethnicity, etc.) and through various measures (e.g., healthcare access, quality, outcomes, etc.). Even if it were possible to compute a single composite score for equity that accommodates these various kinds of relevant considerations, the rating would tell us very little about which types of inequities the intervention addresses or whether any negative equity impacts are introduced amidst net equity gains. For example, offering a new breast cancer drug could simultaneously help address gender inequity between male and female cancer patients while exacerbating inequities between urban and rural women if the treatment is only accessible to those near urban hospitals. The nuance and competing considerations are lost when complex criteria are boiled down to a single score.
If an MCDA performance matrix were expanded to include more granular, distinct criteria related to equity—as is more common in practice—such as disease severity and prevalence, and treatment impacts on poverty (Reference Youngkong, Baltussen, Tantivess, Mohara and Teerawattananon6), critical considerations would still go unacknowledged. First, multiple considerations underlying each individual criterion would still need to be assessed. For example, in the case of impacts on poverty, decision makers may consider impoverishment due to direct out-of-pocket health expenditures, lost wages due to ill health and absenteeism, or long-term impacts on earning potential related to missed schooling. There are also considerations of the prevalence of particular conditions among the poor and how much preference should be given to interventions that disproportionately affect the indigent. A single score related to poverty impacts could obscure tradeoffs between prioritizing the health needs of those already below the poverty line versus allocating to reduce catastrophic health expenditure among those comparatively better off at baseline. This brings us to our second point. As discussed above, promoting equity requires both a theory of the good (i.e., what aspect[s] of well-being should be the focus) and a theory of how to distribute that good (i.e., starting with those who are worst off, realizing some level of sufficient well-being for all, or striving for equality) (Reference Persad, Mastroianni, Kahn and Kass7). Without a clear conception of the underlying distributive justice theory, it is very possible that different quantitative MCDA appraisals could assign quite divergent “equity” scores to something like a rare disease intervention, with one scoring such an intervention highly on the basis of an implicitly prioritarian approach (Reference Youngkong, Baltussen, Tantivess, Mohara and Teerawattananon6), and another giving a lower equity score to interventions that would bring comparatively few closer to sufficient well-being (Reference Endrei, Molics and Ágoston8). As such, trying to reduce complex considerations like equity to a set of scores at any level of granularity inevitably obscures related considerations or key components of distributive theory, and may also give a false sense of parity between two interventions that rate similarly on a single dimension of the performance matrix but for very different reasons.
Finally, the issue of information loss is further compounded when aggregating weighted scores across multiple criteria. This practice makes it difficult to identify specific tradeoffs between different types of criteria, precluding thoughtful and informed discussion by an appraisal committee about whether and when those tradeoffs are acceptable.
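To make this information loss concrete, consider the following minimal sketch of the weighted additive aggregation commonly used in quantitative MCDA. The criteria, weights, and scores are entirely hypothetical and invented for illustration; they are not drawn from any appraisal cited here.

```python
# Hypothetical sketch of weighted-sum aggregation in quantitative MCDA:
# V(a) = sum_i w_i * s_i(a). The criteria, integer weights (out of 10),
# and 0-10 performance scores below are invented purely for illustration.

weights = {"health_benefit": 5, "equity": 3, "poverty_impact": 2}

# Intervention A: large health gains but weak equity performance.
scores_a = {"health_benefit": 8, "equity": 2, "poverty_impact": 4}
# Intervention B: modest health gains but strong equity performance.
scores_b = {"health_benefit": 4, "equity": 8, "poverty_impact": 5}

def aggregate(scores: dict, weights: dict) -> int:
    """Weighted additive value across all criteria."""
    return sum(weights[c] * scores[c] for c in weights)

print(aggregate(scores_a, weights))  # 54
print(aggregate(scores_b, weights))  # 54 -- identical totals, opposing profiles
```

Once both interventions collapse to the same total of 54, the opposing equity and health-benefit profiles that produced that tie are no longer visible, and the very tradeoff a committee should deliberate about has been erased from the output.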
On Transparency: More Than Disclosing Methods and Scores
Baltussen et al. note that all three approaches to MCDA can improve the transparency of priority setting by providing explicit criteria against which health interventions are evaluated (Reference Baltussen, Marsh, Thokala, Vakaramoko, Castro and Cleemput4). Furthermore, they add that quantitative MCDA can enhance transparency compared to qualitative MCDA when the numerical weights and scores for these criteria and the mathematical aggregation function are explicitly reported to the public. However, transparency cannot plausibly be understood as simply making more information available, especially when transparency is viewed as instrumental to legitimacy, accountability, and public trust. A more appropriate account of transparency surrounding government decision making understands it as a means of demonstrating respect to members of the public who will be impacted by a decision and of providing them with the information needed to understand decisions and their underlying rationale, so that they can accept, act on, or challenge those decisions (Reference Daniels and Sabin9,Reference Gutmann and Thompson10). This not only helps legitimize government decisions, particularly in democratic societies, but can also provide an additional check on the quality of decision making. To fulfill this function, transparency requires providing access to understandable and relevant information, not simply more information, an understanding consistent with other discussions of transparency that emphasize accessibility, relevance, and comprehension (Reference Persad, Cohen, Evans, Lynch and Shachar11,Reference Naurin12).
In this regard, quantitative MCDA falls short. An overly technical and algorithmic approach will likely fail to help the public meaningfully understand why and how a health coverage decision was made. For example, in a study of the increasing formalization of HTA at the National Institute for Health and Care Excellence (NICE)—the HTA body that serves England and Wales—Charlton describes how particular normative judgments have become “embedded” in technical analyses that are largely removed from the more public committee appraisal process (Reference Charlton13). Furthermore, the methods used in quantitative MCDA to develop weights for each criterion, score interventions across criteria, and aggregate scores to obtain an overall ranking of interventions can be technically complex, and it is unlikely that a typical member of the public would be able to understand and engage with the information provided about the methodology, let alone challenge or disagree with the underlying approach. In fact, reliance on this type of approach may further erode public trust in health priority-setting endeavors given the climate of mistrust of experts (Reference Johnston and Ballard14) and negative conceptions about the fairness of algorithmic decision making (Reference Lee15,Reference Woodruff, Fox, Rousso-Schindler and Warshaw16).
Perhaps more concerning than the inaccessibility of quantitative MCDA methods is the way in which aggregating across multiple and complex criteria obscures public acknowledgment of competing considerations in favor of or against a health intervention, thereby failing to help the public develop good judgment (Reference Johri and Norheim17) or an appreciation of the moral stakes of resource allocation in a context of competing needs (Reference Nussbaum18). Masking the difficult tradeoffs that are inherent in health priority setting not only affects the quality of the recommendations (as discussed above), but may inhibit broader public understanding of what is morally at stake when health coverage decisions are being made, why it is important and necessary to set priorities amidst resource constraints, and the opportunity costs associated with each investment decision (Reference Goold, Biddle, Klipp, Hall and Danis19).
A Greater Role for Deliberation: The Promise of Qualitative MCDA
Like Baltussen et al. (Reference Baltussen, Marsh, Thokala, Vakaramoko, Castro and Cleemput4), we view deliberation as a crucial component of HTA. In our view, however, qualitative MCDA is far better suited than quantitative MCDA to promoting the type of deliberative decision making that can enhance the quality and transparency of appraisals, as discussed above. In qualitative MCDA deliberation, committee members openly discuss and debate competing considerations both within and across criteria that comprise the performance matrix. These deliberations help to ensure that competing moral claims on health resources remain explicit and distinct rather than collapsed into summary criterion scores or aggregated ranks, thus improving the quality of decision making. These discussions are also more likely to avoid highly technical subjects and instead rely on a common vernacular, especially when appraisal committees are composed of diverse stakeholders who must find a shared language to communicate effectively across disciplinary boundaries. As a result, the output of qualitative MCDA deliberations will provide appropriate details about the final recommendation, the information taken into account, the principal reasons for and against coverage, and how the final position was decided upon, such that the lay public can more easily understand and engage with the substance of recommendations.
To be sure, Baltussen et al. argue that quantitative MCDA should always include a deliberative component, so supporters of quantitative MCDA might respond that it offers the best of both approaches. However, it is critical to consider the stage at which deliberation occurs. Baltussen et al. describe deliberation as the final step in quantitative MCDA—as was the case for several quantitative MCDA studies identified in their review (Reference Goetghebeur, Wagner, Khoury, Rindress, Grégoire and Deal20–Reference Youngkong, Teerawattananon, Tantivess and Baltussen23)—meaning that deliberation occurs only after scoring, weighting, and aggregation have been completed, in order to raise any considerations not yet adequately captured. We argue, by contrast, that deliberation should be central to the analysis and appraisal of each decision criterion, as well as to the final recommendation. Deliberation may not meaningfully contribute to higher-quality MCDA if it follows a quantitative process that involves the sort of information loss discussed above. Deliberation pursued as the final step of quantitative MCDA risks converting a decision problem that ought to involve the consideration of multiple tradeoffs both within and between relevant decision criteria into one that involves a single tradeoff between the result of aggregation and the ad hoc considerations introduced during deliberation. In cases where evidence, scores, and weights are presented separately to the appraisal committee ahead of deliberation, there remains the risk that framing effects will inappropriately bias decision makers toward these prior assessments at the expense of considerations raised during deliberation (Reference Devlin and Sussex5).
Additionally, such late-stage deliberation cannot address the shortcomings of the quantitative approach with respect to transparency. When deliberation happens largely after the scoring and aggregating, the bulk of the input and reasoning behind the final position is still likely to reside in the opaque technical exercise, and deliberation will do little to improve transparency and public understanding of the overall approach and final rationale. As Gutmann and Thompson write, “A deliberative justification does not even get started if those to whom it is addressed cannot understand its essential content” (Reference Gutmann and Thompson10).
Finally, MCDA appraisal committees may simply overlook or shortchange deliberation when it is the last step of a long and complex process. Indeed, Baltussen et al. found that only a minority of quantitative MCDA studies reported engaging in deliberation at all (Reference Baltussen, Marsh, Thokala, Vakaramoko, Castro and Cleemput4). Thus, to better promote the quality and transparency of appraisals, deliberation should be given a much more prominent role in appraisals through the adoption of qualitative MCDA.
One potential objection to the adoption of qualitative MCDA is that expanding the role of deliberation can result in less consistent decision making (Reference Baltussen, Marsh, Thokala, Vakaramoko, Castro and Cleemput4). However, the value of consistency depends on the type of inconsistency in question. While we should strive to limit arbitrary or unjustified variation across MCDA decisions, the relative importance of criteria may justifiably differ across particular decisions. With respect to NICE, Charlton has raised the concern that increased formalization of HTA may restrict the ability of an appraisal committee to consider such normatively relevant differences in context across decisions (Reference Charlton13). Additionally, some degree of flexibility in the identification and application of decision criteria over time and across geographic space is surely desirable given that values and health needs differ and evolve. Deliberative qualitative processes in MCDA may better allow for this flexibility in real time than quantitative approaches. Of course, one type of unjustifiable variation across MCDA decisions would be the outsized influence of dominant voices and its negative impact on the quality of decision making, which Baltussen et al. argue is a greater concern for qualitative MCDA than quantitative MCDA (Reference Baltussen, Marsh, Thokala, Vakaramoko, Castro and Cleemput4); but quantitative MCDA, which often relies on preference elicitation to construct criteria weights, is similarly subject to the influence of the more powerful or privileged individuals who are able to gain access to these data collection activities. Thus, while it is certainly important to adopt deliberation facilitation methods that can effectively mitigate this bias in qualitative MCDA, it is not clear that this concern is a greater problem for qualitative MCDA.
Another set of potential objections is that facilitating high-quality decision making through intensive deliberation and transparently reporting the content of these deliberations may be too resource-intensive to be practicable or could result in public confusion due to information overload (Reference O'Neill24). For example, Baltussen et al. note that speed may be a priority in contexts that face a significant HTA backlog, such as Colombia in 2012–2013. Quantitative MCDA may provide a benefit over qualitative MCDA in such contexts, if in fact it can be carried out more quickly. In our view, however, a potential tradeoff between achieving speed and promoting public trust or acceptance must be considered. Conducting high-quality quantitative MCDA and transparently reporting the results with this tradeoff in mind is likely to raise its own practical difficulties. For instance, the process of eliciting preferences to establish weights for use in quantitative MCDA that are representative of the views of the public or appropriate stakeholders is itself considerably time- and resource-intensive. Moreover, if decision makers are committed to the view defended above, that transparency requires access to understandable and relevant information, then ensuring that relevant results of quantitative MCDA processes are understandable to the public will involve reporting more than the criteria, scores, weights, and aggregation function used. Additional translation of these methods into lay terms, as well as targeted education to help different affected groups understand them, will be needed, both of which may demand substantial resources or result in information overload. It is therefore not clear that quantitative MCDA should be preferred to qualitative MCDA on pragmatic grounds.
A final potential advantage of the quantitative approach with respect to these pragmatic concerns is that it may reduce the cognitive load on the appraisal committee. However, if the committee does not take scores and weights as given, but instead treats them as additional information to be considered during a deliberative phase, this advantage is diminished. Of course, some may worry that the cognitive demand associated with expanded deliberation could result in important considerations being overlooked, ultimately affecting the quality of decision making. However, expanding the role of deliberation does not and should not entail the abandonment of a structured appraisal approach. Relevant criteria and information should always be explicitly stated and discussed, with effective facilitation to ensure they feature adequately during appraisals. More importantly, reducing cognitive load could come at the cost of forgoing necessarily complex deliberation around morally important tradeoffs, thus providing a false sense of security that the moral work inherent in appraisals has been done and failing to meet the standard of high-quality decision making defended above. A final response, then, is simply to bite the bullet. If decision makers are committed to quality and transparency as key elements of MCDA, and if deliberation in the context of qualitative MCDA can better promote quality and transparency—properly understood—than quantitative MCDA, then decision makers must find a way to effectively facilitate deliberative processes and share relevant details with the public.
Conclusion
To summarize, the flaws of quantitative MCDA go beyond the methodological weaknesses helpfully identified by Baltussen et al. and undermine key aims of the approach to deliver high-quality and transparent recommendations. Quantitative MCDA may mask the complex tradeoffs that exist within and between decision criteria and remain generally inaccessible to those who are not well-versed in its technical methods of appraisal. A more central role for deliberation can address these limitations and may also provide better opportunities for policy makers and the public to genuinely grapple with and appreciate the moral stakes of healthcare priority-setting in contexts of scarcity.
To be clear, we do not oppose the use of quantitative methods of analysis as part of an MCDA approach. Criteria that can be appropriately measured and quantified, such as health benefit, provide crucial input on the health gains associated with different interventions. Moreover, validated measures of benefit can be used in cost-effectiveness analysis (CEA) to inform consideration of opportunity costs with respect to quality-adjusted life-years (QALYs) or disability-adjusted life-years (DALYs), and CEA or budget impact analysis can put in perspective tradeoffs between health gains and other criteria of benefit. There are even some novel approaches to CEA that try to disaggregate the measure further or enable a better understanding of certain equity dimensions as they relate to the costs and benefits of an intervention, such as distributional CEA and extended CEA (Reference Cookson, Mirelman, Griffin, Asaria, Dawkins and Norheim25). What we take issue with is an overall quantitative approach to MCDA appraisal that relies on numerical scoring, weighting, or aggregation within and across a number of diverse criteria, regardless of the analytical methods used to measure or describe impacts for each criterion. We therefore endorse qualitative MCDA used in combination with select decision rules or aids. In this approach, a small number of decision aids based on well-established and widely understood quantitative methods, such as CEA and budget impact analysis, can enable the consideration of health-related opportunity costs while appraisal committees ultimately rely on careful deliberation to take stock of the evidence on the wider range of morally important decision criteria.
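To illustrate the kind of decision aid we endorse here, the sketch below computes a standard incremental cost-effectiveness ratio (ICER) and flags it against a willingness-to-pay threshold. All figures, including the costs, the QALY estimates, and the threshold itself, are hypothetical placeholders, and the resulting flag is intended to inform deliberation rather than replace it.

```python
# Minimal sketch of a CEA-based decision aid serving as one input to
# deliberation. All costs, QALY estimates, and the threshold are
# hypothetical placeholders, not data from any cited appraisal.

def icer(cost_new: float, cost_old: float,
         qaly_new: float, qaly_old: float) -> float:
    """Incremental cost-effectiveness ratio: extra cost per QALY gained."""
    return (cost_new - cost_old) / (qaly_new - qaly_old)

WTP_THRESHOLD = 30_000  # hypothetical willingness-to-pay per QALY gained

ratio = icer(cost_new=52_000, cost_old=40_000, qaly_new=5.0, qaly_old=4.5)

print(f"ICER: {ratio:,.0f} per QALY gained")  # ICER: 24,000 per QALY gained
# The flag informs, but does not settle, the committee's deliberation on
# equity, severity, and the other morally relevant criteria.
print("within threshold" if ratio <= WTP_THRESHOLD else "exceeds threshold")
```

A flag of this kind keeps health opportunity costs explicitly on the table during appraisal without requiring that every other criterion be scored, weighted, and aggregated alongside it.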
Given the increasing interest and focus, in this journal and beyond, on both deliberation in MCDA (Reference Baltussen, Jansen, Mikkelsen, Tromp, Hontelez and Bijlmakers26–Reference Oortwijn and Klein28) and how deliberation can and should be used to inform HTA more generally (Reference Bond29), this commentary provides further insights and arguments to guide ongoing experimentation with and evaluation of different deliberative approaches to health priority-setting on the basis of quality, transparency, and other important goals of decision making. To this end, we propose that it may be more constructive to move away from a typology that distinguishes MCDA in terms of quantitative and qualitative assessment and instead focus on how well different deliberative, multi-criteria approaches are able to incorporate and balance the various types of qualitative and quantitative informational inputs that are morally relevant to the decisions at hand. This would also entail closer examination of how and when deliberation is included in the process, from the specification of the evaluative criteria, to the appropriate methods for generating evidence of an intervention's performance on those criteria, to the final appraisal and recommendation about coverage—all with an eye to promoting the quality and transparency of decisions (Reference Bond29). In fact, we already see convergence on this issue among the different MCDA approaches, as evidenced by Baltussen et al.'s (Reference Baltussen, Marsh, Thokala, Vakaramoko, Castro and Cleemput4) claim that deliberation should play a role in any approach. Going forward, researchers and decision makers should remain mindful of how quantitative inputs into deliberation, while useful, can be overextended and undermine quality and transparency. Future work should ultimately emphasize strengthening deliberative processes alongside the generation of appropriate evidence that enables rich discussion of tradeoffs, rather than the further advancement of oversimplified quantitative tools to support decision making.
Acknowledgements
We thank Rob Baltussen, Kalipso Chalkidou, and the anonymous reviewers for their helpful comments.
Funding
This research received no specific funding from any agency, commercial or not-for-profit sectors.