Economic evaluation in health care can be defined as the comparison of alternative options in terms of their costs and consequences (Reference Drummond, Sculpher, Torrance, O'Brien and Stoddart6). Its purpose is to inform the efficient allocation of scarce resources (Reference Briggs5). Two main approaches exist: those using patient level data and those using decision analytic modeling.
When patient level data are used, economic outcomes are the result of a single sample drawn from the population (Reference Ramsey, Willke and Briggs19). However, decisions are made at the population level. Consequently, uncertainty arises from using limited samples to estimate the true (population) value of costs and effects. This source of uncertainty can be referred to as sampling variation (Reference Johnson-Masotti, Laud, Hoffmann, Hayat and Pinkerton13). Two methods have been used regularly in the applied literature to incorporate sampling variation: the nonparametric bootstrap method and Fieller's method (Reference Drummond, Sculpher, Torrance, O'Brien and Stoddart6). Both methods propagate uncertainty using only the information contained in the original data.
The past 25 years have seen an increase in the prevalence of Bayesian statistics (Reference Ashby2). In particular, the Bayesian Initiative in Health Economics & Outcomes Research was established, “to explore the extent to which formal Bayesian statistical analysis can and should be incorporated into the field of health economics and outcomes research for the purpose of assisting rational health care decision making” (Reference O'Hagan and Luce15).
Under a Bayesian interpretation, parameters of interest are ascribed a distribution reflecting uncertainty concerning the true value of the parameter (Reference Briggs4). A Bayesian analysis synthesizes two sources of information about the unknown parameters of interest. One source is the prior distribution, which represents information that is available before (or, more generally, in addition to) the data (e.g., previous trials, literature, expert opinion). In the absence of prior information, vague or uninformative prior distributions can be used. The less informative the prior, the more weight is given to the data in the analysis. The other source of information is the data, which contribute to the analysis through the likelihood function (Reference Stevens and O'Hagan23). The likelihood summarizes all of the information about the unknown parameters that is contained in the data (Reference Spiegelhalter, Abrams and Myles22). These two sources of information are combined through the use of Bayes’ theorem. Bayes’ theorem updates the prior information by taking into account, by means of the likelihood, the newly observed data. The result is a posterior distribution that represents what is now known about the unknown parameters based both on the data and the prior information (Reference Stevens and O'Hagan23). Posterior distributions can be generated using simulation techniques such as Markov Chain Monte Carlo.
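As a minimal, hypothetical sketch of this prior-to-posterior updating (not taken from any of the reviewed studies, and with made-up numbers), consider a conjugate normal model for a mean incremental cost with known sampling variance; the posterior mean is a precision-weighted compromise between the prior mean and the sample mean:

```python
import numpy as np

# Hypothetical prior belief about a mean incremental cost: N(1000, 500^2).
prior_mean, prior_sd = 1000.0, 500.0
# Hypothetical observed patient level data, with an assumed known sampling SD.
data = np.array([1500.0, 1200.0, 1800.0, 900.0, 1600.0])
sigma = 600.0

# Conjugate normal update: posterior precision is the sum of the prior
# precision and the data precision; the posterior mean is their
# precision-weighted average.
n = len(data)
post_prec = 1 / prior_sd**2 + n / sigma**2
post_mean = (prior_mean / prior_sd**2 + data.sum() / sigma**2) / post_prec
post_sd = post_prec**-0.5
```

With an informative prior the posterior is pulled toward the prior mean; as the prior variance grows (a vaguer prior), the posterior approaches the estimate implied by the data alone. In non-conjugate models the same updating is typically carried out by simulation, such as Markov chain Monte Carlo.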
The inferential outputs from a Bayesian analysis and the ability to make direct probability statements regarding unknown quantities provide a natural way of informing policy makers. The ability to take into account all available evidence, through the combination of the prior and the likelihood, speaks to another potential advantage. By focusing on the vital question of how a new piece of evidence changes what we currently believe, Bayesian methods present a more iterative approach to evaluation, because prior beliefs can be updated as new evidence becomes available (Reference Spiegelhalter, Abrams and Myles22). Another potential advantage is the use of a likelihood function to model the underlying distribution of the data (Reference O'Hagan and Stevens17).
In the wake of a renewed interest in Bayesian statistics, the primary objective of this review is to describe how Bayesian methods have been used to handle uncertainty due to sampling variation in patient level economic evaluations. The results serve as a reference, detailing how these methods have been used to evaluate healthcare interventions. Specifically, the review focuses on describing the priors, the likelihoods, the presentation of uncertainty, and sensitivity analyses in these studies. Concentrating on these aspects gives a sense of how Bayesian methods have been used to incorporate additional information, accurately model the data, communicate the impact of uncertainty, and assess the robustness of the results. Findings and implications from the review are discussed.
METHODS
Data Sources and Search Strategy
We conducted a comprehensive search to identify all relevant published Bayesian analyses (to the second week of November 2007). We developed the search strategy in MEDLINE and modified it for the other databases. Only articles in English were considered. Ovid MEDLINE In-Process & Other Nonindexed Citations and Ovid MEDLINE (1950 to Present), EMBASE (1980 to 2007 Week 45), and Cochrane Library NHS Economic Evaluation (Issue 4, 2007) databases were searched. In addition, we searched the reference sections of relevant papers for potentially eligible studies.
Search terms were derived based on mapping keywords for Bayesian analysis (e.g., Bayesian, WinBUGS) and economic evaluation (e.g., cost, economic) to indexed subject headings within the respective databases. Terms were also derived based on investigator-nominated terms and keywords from the titles and abstracts of potentially relevant studies. Relevant keywords and subject headings were then combined allowing for alternative spellings and suffixes. Operators denoting the proximity of various search terms in relation to others were also used to derive a comprehensive retrieval strategy. The search strategy is provided in Supplementary Table 1 (which can be viewed online at www.journals.cambridge.org/thc).
Study Selection
We screened citation records in two stages. In the first stage, the titles and abstracts of retrieved articles were screened for potential inclusion or exclusion. In the second stage, those records not excluded at the first stage underwent a full-text review. Included studies met the following criteria: (i) the study conducted an economic evaluation comparing two or more healthcare interventions, (ii) the impact of uncertainty (sampling variation) on the results of the economic evaluation was incorporated using Bayesian methods, and (iii) the likelihood function was informed by patient level data from a single source (e.g., trial, study).
Excluded studies involved only patient level costs or only patient level effects, incorporated any sort of decision analytic modeling, or used Bayesian methods for purposes other than the incorporation and assessment of sampling variation (e.g., evidence synthesis, value of information analysis, heterogeneity). In both stages, a single reviewer (C.E.M.) selected articles for inclusion.
Data Synthesis and Analysis
In the context of the current analysis, a descriptive synthesis of the included studies was undertaken. An abstraction form was developed to collect information on key study characteristics. To get a sense of how Bayesian methods were used to combine additional information with the data, the type of prior distribution was recorded. To illustrate how the underlying data were modeled and whether these distributions allowed for issues such as the potential dependence between costs and effects or skewness in costs, information was collected on the likelihood functions. To understand how Bayesian methods were used to inform decision makers, the presentation of uncertainty was documented. Attention was also given to whether the studies explored the sensitivity of the results to changes in the priors and the likelihoods, as this could have implications for the results. The data were then synthesized to provide an overall description of the use of Bayesian methods to handle uncertainty in economic evaluations of patient level data.
RESULTS
Literature Review
The literature search yielded 366 potentially relevant bibliographic records. From the 366 citations, 103 articles were retrieved for relevance assessment. The selection of included studies is presented in the QUOROM diagram given in Figure 1. Sixteen studies met the final inclusion criteria (Reference Al and Van Hout1;Reference Bachmann, Fairall, Clark and Mugford3;Reference Briggs4;Reference Fenwick, Wilson, Sculpher and Claxton7;Reference Hahn and Whitehead9–Reference Heitjan, Moskowitz and Whang12;Reference Negrin and Vazquez-Polo14;Reference O'Hagan and Stevens16–Reference O'Hagan, Stevens and Montmartin18;Reference Shih, Bekele and Xu20;26–Reference Vazquez-Polo, Negrin, Badia and Roset28). Thirteen of these studies were classified as methodological papers with applications (Reference Al and Van Hout1;Reference Bachmann, Fairall, Clark and Mugford3;Reference Briggs4;Reference Hahn and Whitehead9–Reference Heitjan, Moskowitz and Whang12;Reference Negrin and Vazquez-Polo14;Reference O'Hagan and Stevens16–Reference O'Hagan, Stevens and Montmartin18;Reference Vazquez-Polo, Hernandez and Lopez-Valcarcel27;Reference Vazquez-Polo, Negrin, Badia and Roset28) and three were classified as application papers (Reference Fenwick, Wilson, Sculpher and Claxton7;Reference Shih, Bekele and Xu20;26). For the purpose of this review, the former classification pertains to those papers that used applications merely for illustrative or pedagogic purposes. The latter refers to those papers whose primary objective was an economic evaluation, where Bayesian methods were used to incorporate sampling uncertainty. Supplementary Table 2 (www.journals.cambridge.org/thc) describes the included studies.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20170127033741-72058-mediumThumb-S0266462309990316_fig1g.jpg?pub-status=live)
Figure 1. QUOROM diagram of studies considered for inclusion.
Assessment of Bayesian Methods
Prior Distributions. The most common type of prior used for either costs or effects was a vague or uninformative prior. Uninformative priors for costs were used in fourteen studies (Reference Al and Van Hout1;Reference Bachmann, Fairall, Clark and Mugford3;Reference Briggs4;Reference Fenwick, Wilson, Sculpher and Claxton7;Reference Hahn and Whitehead9;Reference Heitjan and Li11;Reference Heitjan, Moskowitz and Whang12;Reference Negrin and Vazquez-Polo14;Reference O'Hagan and Stevens16–Reference O'Hagan, Stevens and Montmartin18;Reference Shih, Bekele and Xu20;26;Reference Vazquez-Polo, Hernandez and Lopez-Valcarcel27) and for effects in thirteen studies (Reference Al and Van Hout1;Reference Bachmann, Fairall, Clark and Mugford3;Reference Briggs4;Reference Fenwick, Wilson, Sculpher and Claxton7;Reference Hahn and Whitehead9;Reference Heitjan, Moskowitz and Whang12;Reference Negrin and Vazquez-Polo14;Reference O'Hagan and Stevens16–Reference O'Hagan, Stevens and Montmartin18;Reference Shih, Bekele and Xu20;26;Reference Vazquez-Polo, Hernandez and Lopez-Valcarcel27). These priors were incorporated either exclusively (Reference Bachmann, Fairall, Clark and Mugford3;Reference Fenwick, Wilson, Sculpher and Claxton7;Reference Hahn and Whitehead9;Reference Heitjan, Moskowitz and Whang12;Reference Negrin and Vazquez-Polo14;Reference O'Hagan and Stevens17;26;Reference Vazquez-Polo, Hernandez and Lopez-Valcarcel27) or as part of a sensitivity analysis (Reference Al and Van Hout1;Reference Briggs4;Reference Heitjan and Li11;Reference O'Hagan and Stevens16;Reference O'Hagan, Stevens and Montmartin18;Reference Shih, Bekele and Xu20). 
Informative priors (empirical, subjective, or structural) were included in half of the studies (Reference Al and Van Hout1;Reference Briggs4;Reference Heitjan, Kim and Li10;Reference Heitjan and Li11;Reference O'Hagan and Stevens16;Reference O'Hagan, Stevens and Montmartin18;Reference Shih, Bekele and Xu20;Reference Vazquez-Polo, Negrin, Badia and Roset28). Priors based on empirical data were used for effects in five studies (Reference Al and Van Hout1;Reference Briggs4;Reference Heitjan, Kim and Li10;Reference Heitjan and Li11;Reference Shih, Bekele and Xu20) and for costs in three studies (Reference Al and Van Hout1;Reference Briggs4;Reference Shih, Bekele and Xu20). Data sources for the empirical priors included previous trials (Reference Al and Van Hout1), pilot studies (Reference Heitjan, Kim and Li10;Reference Heitjan and Li11), the literature (Reference Briggs4), and individual Medicare claims data (Reference Shih, Bekele and Xu20).
Priors based on subjective opinion were applied equally to costs (Reference Al and Van Hout1;Reference Heitjan, Kim and Li10;Reference Heitjan and Li11;Reference Vazquez-Polo, Negrin, Badia and Roset28) and effects (Reference Al and Van Hout1;Reference O'Hagan and Stevens16;Reference O'Hagan, Stevens and Montmartin18;Reference Vazquez-Polo, Negrin, Badia and Roset28). Subjective priors most often reflected informal reasoning (Reference Al and Van Hout1;Reference Heitjan, Kim and Li10;Reference Heitjan and Li11;Reference O'Hagan and Stevens16;Reference O'Hagan, Stevens and Montmartin18). However, one study (Reference Vazquez-Polo, Negrin, Badia and Roset28) referred to a process of eliciting expert opinion. Experts who participated in the study were asked about the mean and the probability interval to obtain the prior mean and variance of the parameters of interest. Structural priors, denoting the relative relationship between parameters as opposed to the actual numerical values, appeared in one of the studies (Reference O'Hagan and Stevens16). In this case, the prior represented the belief that the variances of costs should not be too different between patient groups. The effect of this prior information was to moderate the influence of the extreme costs. Table 1 describes the prior distributions.
Table 1. Description of Priors for Effects and Costs
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20170127033741-16422-mediumThumb-S0266462309990316_tab1.jpg?pub-status=live)
a Uninformative: no information.
b Empirical: data based.
c Subjective: opinion based.
d Structural: relationship based.
Total number of priors for effects = 22 [uninformative = 13(59%), empirical = 5(23%), subjective = 4(18%)]. Total number of priors for costs = 22 [uninformative = 14(64%), empirical = 3(14%), subjective = 4(18%), structural = 1(5%)]. Percentages rounded to nearest whole number.
Likelihood Functions. Two of the applied studies (Reference Fenwick, Wilson, Sculpher and Claxton7;26) did not specify the distributional form of their likelihood functions, and one of the methodological papers (Reference Bachmann, Fairall, Clark and Mugford3) analyzed the individual level data using two different approaches. Therefore, there are thirteen examples where costs and effects are modeled directly (Reference Al and Van Hout1;Reference Bachmann, Fairall, Clark and Mugford3;Reference Briggs4;Reference Hahn and Whitehead9–Reference Heitjan, Moskowitz and Whang12;Reference Negrin and Vazquez-Polo14;Reference O'Hagan and Stevens16–Reference O'Hagan, Stevens and Montmartin18;Reference Vazquez-Polo, Hernandez and Lopez-Valcarcel27;Reference Vazquez-Polo, Negrin, Badia and Roset28), and two examples using regression-based modeling of net benefits (Reference Bachmann, Fairall, Clark and Mugford3;Reference Shih, Bekele and Xu20).
The majority of studies incorporated the potential dependence between the cost and effect data. This was achieved through the use of both multivariate normal distributions and regression analysis. Two of the studies (Reference Vazquez-Polo, Hernandez and Lopez-Valcarcel27;Reference Vazquez-Polo, Negrin, Badia and Roset28) that applied regression analysis directly to costs and effects included covariates in their likelihood functions and assessed the resulting impact on uncertainty.
For three of the studies (Reference Al and Van Hout1;Reference Briggs4;Reference Heitjan, Moskowitz and Whang12), the use of multivariate normal distributions was based on large sample approximations for the means of costs and effects. Where the likelihood functions allowed for a specific relationship between the cost and effect data, costs most often depended on effects: four studies (Reference Heitjan, Kim and Li10;Reference Heitjan and Li11;Reference O'Hagan and Stevens16;Reference Vazquez-Polo, Hernandez and Lopez-Valcarcel27) allowed costs to depend on effects, whereas only one study (Reference Bachmann, Fairall, Clark and Mugford3) allowed effects to depend on costs. Different distributions were used for effects based on whether the outcome measure was a continuous or discrete random variable.
Six studies (Reference Bachmann, Fairall, Clark and Mugford3;Reference Hahn and Whitehead9;Reference Heitjan, Kim and Li10;Reference Heitjan and Li11;Reference Negrin and Vazquez-Polo14;Reference O'Hagan and Stevens16) incorporated the potential skewness in the cost data. Three of these studies used gamma distributions (Reference Bachmann, Fairall, Clark and Mugford3;Reference Heitjan, Kim and Li10;Reference Heitjan and Li11) and two (Reference Negrin and Vazquez-Polo14;Reference O'Hagan and Stevens16) used lognormal distributions. One study (Reference Hahn and Whitehead9) divided total cost into three components and applied distributions (e.g., lognormal) to each of the cost components. Table 2 describes the likelihood functions.
Table 2. Description of Likelihoods
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20170127033741-37110-mediumThumb-S0266462309990316_tab2.jpg?pub-status=live)
a (effects, costs): effects and costs determined simultaneously.
b cost|effects: costs depend on effects.
c effects|costs: effects depend on costs. Total number of distributions = 29 [multivariate normal = 9(31%), normal = 9(31%), binomial = 3(10%), gamma = 3(10%), Weibull = 2(7%), lognormal = 1(3%), other = 1(3%), nonparametric = 1(3%)]. Percentages rounded to nearest whole number.
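To illustrate why skewness matters for the choice of likelihood (a hypothetical sketch with simulated data, not an analysis from any of the reviewed studies): under a lognormal model the economically relevant quantity is the arithmetic mean cost, exp(mu + sigma^2/2), which exceeds the back-transformed median exp(mu) that a naive log-scale analysis would report.

```python
import numpy as np

rng = np.random.default_rng(42)
# Simulated right-skewed patient level costs (hypothetical parameters).
costs = rng.lognormal(mean=7.0, sigma=1.0, size=5000)

log_costs = np.log(costs)
mu_hat = log_costs.mean()
s2_hat = log_costs.var(ddof=1)

# Back-transformed estimate of the median: underestimates the mean
# when costs are right-skewed.
naive = np.exp(mu_hat)
# Arithmetic mean implied by the lognormal model: the decision-relevant
# quantity, since budgets are driven by mean (total) costs.
lognormal_mean = np.exp(mu_hat + s2_hat / 2)
```

Here `lognormal_mean` tracks the sample mean of the costs, while `naive` sits near the sample median; because decision makers pay for mean costs, ignoring the skewness term can materially understate incremental costs.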
Presentation of Uncertainty. The predominant approaches to the presentation of uncertainty were Bayesian 95 percent credibility intervals (0.95 posterior probability that the true value lies in the interval), and cost-effectiveness acceptability curves (CEAC) (posterior probability that the intervention is cost-effective given the data and willingness to pay). Almost all of the studies presented cost-effectiveness acceptability curves (Reference Al and Van Hout1;Reference Bachmann, Fairall, Clark and Mugford3;Reference Briggs4;Reference Fenwick, Wilson, Sculpher and Claxton7;Reference Hahn and Whitehead9–Reference Heitjan and Li11;Reference O'Hagan and Stevens16–Reference O'Hagan, Stevens and Montmartin18;Reference Shih, Bekele and Xu20;26–Reference Vazquez-Polo, Negrin, Badia and Roset28). Six studies (Reference Al and Van Hout1;Reference Bachmann, Fairall, Clark and Mugford3;Reference Heitjan, Kim and Li10;Reference Heitjan and Li11;Reference Heitjan, Moskowitz and Whang12;Reference Vazquez-Polo, Hernandez and Lopez-Valcarcel27) presented Bayesian 95 percent credibility intervals for the incremental cost-effectiveness ratio (ICER), two studies (Reference Bachmann, Fairall, Clark and Mugford3;Reference Heitjan, Kim and Li10) presented credibility intervals for the incremental net monetary benefit (INMB), and one study (Reference Heitjan and Li11) presented credibility intervals for the incremental net health benefit (INHB). Another study (Reference Negrin and Vazquez-Polo14) that compared multiple treatment options and incorporated two measures of effectiveness proposed the cost-effectiveness acceptability plane frontier (CEAPF) as an alternative to the cost-effectiveness acceptability curve. Table 3 summarizes the presentation of uncertainty in each study.
Table 3. Presentation of Uncertainty and Sensitivity Analysis
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20170127033741-47040-mediumThumb-S0266462309990316_tab3.jpg?pub-status=live)
a Refer to paper for description of priors and likelihoods. Number of presentations of uncertainty = 24 [cost-effectiveness acceptability curves (CEAC) = 14 (58%), 95% credibility interval for incremental cost-effectiveness ratio (ICER) = 6 (25%), 95% credibility interval for incremental net monetary benefit (INMB) = 2(8%), 95% credibility interval for incremental net health benefit (INHB) = 1(4%), cost-effectiveness acceptability plane frontier (CEAPF) = 1(4%)]. Number of sensitivity analyses = 10 [Prior sensitivity = 4(40%), Likelihood sensitivity = 4(40%), Prior and Likelihood sensitivity = 2(20%)]. Percentages rounded to nearest whole number.
b Proposed as an alternative to the cost-effectiveness acceptability curve when considering more than one measure of effect.
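The cost-effectiveness acceptability curves tabulated above are straightforward to compute from posterior simulation output: for each willingness to pay lambda, the CEAC is the posterior probability that the incremental net monetary benefit, lambda times the incremental effect minus the incremental cost, is positive. A minimal sketch using hypothetical posterior draws (the distributions and numbers are illustrative assumptions, not results from the reviewed studies):

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical posterior draws (e.g., from MCMC) of the incremental
# effect (QALYs) and incremental cost.
delta_e = rng.normal(0.05, 0.02, size=10_000)
delta_c = rng.normal(1500.0, 400.0, size=10_000)

def ceac(lam):
    """Posterior probability the intervention is cost-effective at WTP lam."""
    return float(np.mean(lam * delta_e - delta_c > 0))

# Evaluate the curve over a grid of willingness to pay values.
wtp_grid = np.arange(0, 100_001, 10_000)
curve = [ceac(lam) for lam in wtp_grid]
```

At a willingness to pay of zero the curve reduces to the posterior probability that the intervention is cost-saving; as the willingness to pay grows, the effect difference dominates the calculation.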
Sensitivity Analysis. In addition to assessing the impact of sampling variation on the results, ten studies (Reference Al and Van Hout1;Reference Bachmann, Fairall, Clark and Mugford3;Reference Briggs4;Reference Hahn and Whitehead9;Reference Heitjan and Li11;Reference O'Hagan and Stevens16;Reference O'Hagan, Stevens and Montmartin18;Reference Shih, Bekele and Xu20;Reference Vazquez-Polo, Hernandez and Lopez-Valcarcel27;Reference Vazquez-Polo, Negrin, Badia and Roset28) considered the sensitivity of the results to changes in the prior distributions and the likelihood functions. Of those studies, four (Reference Al and Van Hout1;Reference Briggs4;Reference Heitjan and Li11;Reference O'Hagan and Stevens16) used different priors, four (Reference Bachmann, Fairall, Clark and Mugford3;Reference Hahn and Whitehead9;Reference Vazquez-Polo, Hernandez and Lopez-Valcarcel27;Reference Vazquez-Polo, Negrin, Badia and Roset28) used different likelihoods, and two (18,20) changed both the priors and the likelihoods. Table 3 describes the sensitivity analyses that were conducted. The following summarizes the findings of those studies.
Priors. The study by Al and Van Hout (Reference Al and Van Hout1) assessed the sensitivity of the results to three different prior distributions for costs and effects: an uninformative prior disregarding all information from a previous trial, an empirical prior equal to the posterior of the previous trial, and a subjective prior that uses only 50 percent of the information from the previous trial. The authors concluded that different prior distributions may lead to different decisions. For example, given a specific willingness to pay, the probability of cost-effectiveness was 0.65 for the uninformative prior, 0.80 for the subjective prior, and 0.90 for the empirical prior.
Another study (Reference O'Hagan and Stevens16) that assessed the impact of different priors found that varying the prior information on effects made negligible difference to conclusions, because the data quite strongly indicated an improvement in effectiveness. However, the impact of the prior information on costs was much more substantial. For smaller willingness to pay values, where cost is a real consideration, the different priors produced quite different probabilities of cost-effectiveness. When weak prior information was used, the probability of cost-effectiveness never went below 0.70. When structural prior information was used, the probability of cost-effectiveness went from 0.45 to 0.65 for smaller willingness to pay values. The authors argued that this difference was primarily being driven by two outlying observations. The use of structural prior information, representing the belief that the variances of costs should not be too different between patient groups, effectively mitigated the impact of the outliers and resulted in a correspondingly lower cost-effectiveness acceptability curve for small willingness to pay values. In the remaining studies (Reference Briggs4;Reference Heitjan and Li11) the priors did not appear to be a source of sensitivity.
Likelihoods. Hahn and Whitehead (Reference Hahn and Whitehead9) compared five different likelihood functions for the cost and effect data. For two of the likelihoods, only one cost was considered, namely total cost. In the other three, the total cost was broken down into three components. Four of the likelihoods used normal or multivariate normal distributions for the cost and effect data. The remaining likelihood used other distributions (e.g., lognormal) for the cost components. The cost-effectiveness acceptability curve associated with this likelihood was different from those based on the other four likelihoods. In particular, the willingness to pay value for which the probability of cost-effectiveness is 0.50 was greater than that suggested when the other likelihoods were used.
In the regression framework presented in Vazquez-Polo et al. (Reference Vazquez-Polo, Negrin, Badia and Roset28), the authors assessed the sensitivity of the results to the inclusion of covariates, first using a continuous outcome and then a binary outcome. When the continuous outcome was used, the willingness to pay value at which the probability of cost-effectiveness is 0.50 was approximately 75 percent greater without covariates than when covariates were included. When the binary measure of effect was used, the control treatment dominated the new treatment. Similar results were found in the study by Vazquez-Polo et al. (Reference Vazquez-Polo, Hernandez and Lopez-Valcarcel27) that used only a single continuous outcome. In this study, the cost-effectiveness acceptability curve was also higher when covariates were included in the likelihood. The study by Bachmann et al. (Reference Bachmann, Fairall, Clark and Mugford3) compared the joint modeling of costs and effects using a binomial-gamma likelihood and a regression-based model of net benefits. Both likelihood functions produced similar results; however, the point estimate for the incremental cost-effectiveness ratio was approximately 20 percent higher for the binomial-gamma likelihood.
Priors and Likelihoods. One study (Reference Shih, Bekele and Xu20) examined the cost-effectiveness impact of generic drug entry using two approaches to Bayesian net benefit regression analysis. One approach pooled the data from the pre- and post-entry periods and used uninformative priors to estimate the regression parameters. The second approach proceeded in two steps. In the first step, the authors assumed uninformative priors for the regression parameters and updated these with data from the pre-entry period. In the second step, the authors used the posterior distributions generated in the first step as empirical priors for the regression parameters. Information from the post-entry period formed the likelihood data and was used to update the parameter values. At a willingness to pay of $US5,000, the probabilities of cost-effectiveness for the four non-generic drugs were 96.7 percent, 77.6 percent, 96.3 percent, and 97.0 percent, respectively, in the pre-entry period in the pooled analysis. These probabilities reduced to 36.7 percent, 62.7 percent, 33.0 percent, and 60.1 percent, respectively, in the post-entry period. The probabilities became 94.1 percent, 71.9 percent, 89.1 percent, and 92.1 percent in the analysis using the pre-entry data as a prior to update the post-entry data.
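The two-step procedure, in which the posterior from the pre-entry period becomes the prior for the post-entry period, can be sketched with a conjugate normal model (the data and variances below are hypothetical, not the figures from the study):

```python
import numpy as np

def normal_update(prior_mean, prior_var, data, sigma2):
    """Conjugate update of a normal mean with known sampling variance sigma2."""
    n = len(data)
    post_var = 1 / (1 / prior_var + n / sigma2)
    post_mean = post_var * (prior_mean / prior_var + np.sum(data) / sigma2)
    return post_mean, post_var

sigma2 = 100.0
pre = np.array([9.0, 11.0, 10.5, 9.5])   # pre-entry period (hypothetical)
post = np.array([7.0, 8.0, 6.5, 7.5])    # post-entry period (hypothetical)

# Step 1: a very diffuse (effectively uninformative) prior is updated
# with the pre-entry data.
m1, v1 = normal_update(0.0, 1e6, pre, sigma2)
# Step 2: the step-1 posterior serves as the empirical prior for the
# post-entry data.
m2, v2 = normal_update(m1, v1, post, sigma2)
```

A convenient check on such an implementation: under conjugacy, the sequential result coincides with a single update that pools all the data under the original diffuse prior, so any difference between pooled and two-step analyses in practice comes from how the model treats the two periods, not from the updating mechanics.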
In O'Hagan et al. (Reference O'Hagan, Stevens and Montmartin18), the only substantial aspect of prior information was in regard to the true mean effect. When an informative prior was used for the effect measure, the cost-effectiveness acceptability curve was uniformly higher than when a weak prior was used. On the basis of the weak prior, the uncertainty associated with the decision was much greater, although in both cases the probability of cost-effectiveness was greater than 0.50 for all willingness to pay values. To test the robustness of the conclusions to changes in the likelihood, the authors replaced the assumption of normally distributed costs with lognormal distributions. Quite substantial differences in the cost-effectiveness acceptability curve were observed, especially for small willingness to pay values. While the probability of cost-effectiveness still exceeded 0.70 for almost all willingness to pay values, it never went beyond the level of 0.90 that was reached when the informative prior was used.
The sensitivity of the results to changes in the priors and the likelihoods is discussed in terms of changes in the probability of cost-effectiveness, as represented by the cost-effectiveness acceptability curve. This measure is chosen based on the frequency of its use among the studies as well as the relevancy of the information it imparts to decision makers. However, when considering the impact of using a more informative prior distribution, estimates of the mean difference in costs and effects might be more revealing. In general, one would expect the probability of cost-effectiveness to change when using an informative prior, even if the point estimates of the mean differences in costs and effects stayed exactly the same, because the probability of cost-effectiveness is a function of both the point estimates and the uncertainties. Al and Van Hout (Reference Al and Van Hout1) reported posterior mean differences in costs and effects of NLG 2149 and 0.098, NLG 2567 and 0.137, and NLG 2564 and 0.158 (NLG = Netherlands guilders), for increasingly informative priors. The only other study to do so was O'Hagan et al. (Reference O'Hagan, Stevens and Montmartin18), which presented posterior estimates of the mean differences in costs and effects of −£574 and 1.24 (weak prior), and −£626 and 2.03 (informative prior). From the perspective of a decision maker, the issue, therefore, becomes one of whether the primary impact of the more informative prior is to reduce uncertainty, or whether it actually changes the estimated differences in costs and effects in such a way as to alter the relative cost-effectiveness.
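The distinction drawn here can be made concrete with a small, hypothetical calculation: holding the posterior mean incremental net benefit fixed, a more informative prior that tightens the posterior raises the probability of cost-effectiveness even though the point estimate is unchanged.

```python
from statistics import NormalDist

# Hypothetical posterior mean of the incremental net monetary benefit (INMB).
mean_inb = 500.0

# Same mean, different posterior spread: a diffuse prior leaves a wide
# posterior; an informative prior yields a tighter one.
p_vague = 1 - NormalDist(mean_inb, 1000.0).cdf(0.0)
p_informative = 1 - NormalDist(mean_inb, 400.0).cdf(0.0)
```

Roughly 0.69 versus 0.89: the decision looks considerably less uncertain under the informative prior despite identical mean differences, which is exactly why reporting the posterior means alongside the probability of cost-effectiveness is informative.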
DISCUSSION AND LIMITATIONS
The Bayesian approach makes it possible to model the data accurately and to incorporate additional information in the form of prior distributions. Priors based on previous data may be less susceptible to accusations of subjectivity than opinion based priors, but they may also fail to subscribe to the notion of a fully Bayesian analysis. Some Bayesians would argue that such an approach is not in fact Bayesian at all because no subjective beliefs are used (Reference Briggs4). Despite the use of more informative priors among some of the included studies, the most common type of prior found in this review remains the vague or uninformative prior. This may reflect a deliberate attempt to give more weight to the data in the analysis. However, if prior information exists, the use of uninformative priors seemingly negates a fundamental feature of the Bayesian approach. The ability to incorporate genuine prior information in addition to the data in the final analysis is compromised when uninformative priors are used (Reference Stevens and O'Hagan23).
The rationale for choosing certain likelihoods and priors reflects the need to accurately model the data and to include all relevant prior information in the analysis. Likelihood functions, chosen based on the need to accommodate specific characteristics of the data (e.g., skewness, dependence), together with prior distributions, are intended to represent the totality of available evidence. Where the studies gave a reason for using uninformative priors (Reference Al and Van Hout1;Reference Heitjan and Li11;Reference Negrin and Vazquez-Polo14;Reference O'Hagan and Stevens16;Reference O'Hagan, Stevens and Montmartin18;Reference Shih, Bekele and Xu20;26), most stated a lack of genuine prior information. However, one study (Reference Fenwick, Wilson, Sculpher and Claxton7) commented, “vague priors ensured that the trial results had a larger influence upon the analysis than the prior beliefs.” Reasons for using informative priors included the presence of preceding trial or study results, which, though the populations might differ, were viewed as being informative. Shih et al. (Reference Shih, Bekele and Xu20) justified their use of prior information on the basis of preserving some of the original cost-effectiveness information in decision making.
In a Bayesian analysis of patient level data, any estimate of the impact of sampling variation is conditional on both the prior distribution and the likelihood function. The sensitivity of the results to changes in the priors and the likelihoods was considered in ten of the included studies (Reference Al and Van Hout1;Reference Bachmann, Fairall, Clark and Mugford3;Reference Briggs4;Reference Hahn and Whitehead9;Reference Heitjan and Li11;Reference O'Hagan and Stevens16;Reference O'Hagan, Stevens and Montmartin18;Reference Shih, Bekele and Xu20;Reference Vazquez-Polo, Hernandez and Lopez-Valcarcel27;Reference Vazquez-Polo, Negrin, Badia and Roset28). The results suggest that failing to conduct sensitivity analyses could misrepresent the estimated uncertainty and potentially lead to inappropriate inferences. Several authors (Reference Spiegelhalter, Abrams and Myles22;Reference Sung, Hayden and Greenberg24;25) have recommended the use of sensitivity analysis when reporting the results of Bayesian analyses.
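Such a sensitivity analysis can be illustrated with a hypothetical example (the model and numbers are our own, not taken from the included studies): the posterior probability that an incremental net benefit is positive is recomputed under a vague and a skeptical prior, showing how the conclusion can shift with the prior.

```python
import math

def posterior_normal(prior_mean, prior_var, sample_mean, sample_var, n):
    """Conjugate normal update for a mean with known sampling variance."""
    post_var = 1.0 / (1.0 / prior_var + n / sample_var)
    post_mean = post_var * (prior_mean / prior_var +
                            n * sample_mean / sample_var)
    return post_mean, post_var

def prob_positive(mean, var):
    """P(theta > 0) under a normal posterior."""
    return 0.5 * (1.0 + math.erf(mean / math.sqrt(2.0 * var)))

# Hypothetical trial data: mean incremental net benefit 100, SD 500, n = 25.
sample_mean, sample_var, n = 100.0, 500.0**2, 25

# Re-run the analysis under a vague and a skeptical prior.
priors = {"vague": (0.0, 1e12), "skeptical": (-200.0, 150.0**2)}
results = {}
for label, (m0, v0) in priors.items():
    post_mean, post_var = posterior_normal(m0, v0, sample_mean, sample_var, n)
    results[label] = prob_positive(post_mean, post_var)

# The probability that the net benefit is positive differs noticeably
# between the two priors, which is exactly the kind of divergence a
# reported sensitivity analysis would reveal to decision makers.
```

Reporting such a range, rather than a single prior's result, makes explicit how much of the estimated uncertainty is driven by the prior specification.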
The results of this review are intended to provide, for the first time, a comprehensive description of the use of Bayesian methods to handle uncertainty due to sampling variation in patient level economic evaluations. The review was limited to published studies identified from three databases and relied on a single reviewer. However, the search strategy covered the largest databases and was designed in consultation with a trained research librarian. The review was limited to patient level economic evaluations using information from a single source and did not consider decision analytic models using several data sources (e.g., Fryback et al. (Reference Fryback, Chinnis and Ulvila8)).
Despite these limitations, we believe that this review serves as a reference for those engaged in, or considering, Bayesian analysis of patient level data. The decision to use Bayesian methods, rather than more traditional approaches, requires consideration of their relative advantages and disadvantages in terms of informing health care policy decisions.
Potential disadvantages of Bayesian methods in health care evaluation include a lack of expertise, difficulties in specifying priors, the potential subjectivity of priors, and the additional analytical complexity (Reference Spiegelhalter, Abrams and Myles22). Future research on the choice and elicitation of prior distributions in practical applications would seem critical to ensuring that the ability of the Bayesian approach to synthesize all available evidence is fully exploited.
POLICY IMPLICATIONS
To the extent that important health policy decisions are informed by the results of economic evaluations, and that these results are subject to uncertainty, a comprehensive and robust approach is required. This would include the use of all relevant evidence to inform decision makers. The ability to combine informative priors with the data, together with a natural way of handling uncertainty, suggests that Bayesian methods may offer certain advantages over traditional methods.
SUPPLEMENTARY MATERIAL
Supplementary Tables 1 and 2: www.journals.cambridge.org/thc
CONTACT INFORMATION
C. Elizabeth McCarron, MA, MSc, (mccarrce@mcmaster.ca), PhD Candidate, Department of Clinical Epidemiology & Biostatistics, McMaster University, Programs for Assessment of Technology in Health (PATH) Research Institute, 25 Main Street West, Suite 2000, Hamilton, Ontario L8P 1H1, Canada
Eleanor M. Pullenayegum, PhD, (pullena@mcmaster.ca), Biostatistician, Biostatistics Unit, St. Joseph's Healthcare Hamilton, 50 Charlton Avenue E, Hamilton, Ontario L8N 4A6, Canada; Assistant Professor, Department of Clinical Epidemiology & Biostatistics, McMaster University, 1200 Main Street W, Hamilton, Ontario, Canada
Deborah A. Marshall, MHSA, PhD, (damarsha@ucalgary.ca), Canada Research Chair, Health Services and Systems Research, Associate Professor, Department of Community Health Sciences, Director, Health Technology Assessment, Alberta Bone Joint Health Institute, Faculty of Medicine, University of Calgary, Room 3C56, Health Research Innovation Centre, 3280 Hospital Drive NW, Calgary, Alberta T2N 4Z6, Canada
Ron Goeree, MA, (goereer@mcmaster.ca), Associate Professor, Department of Clinical Epidemiology & Biostatistics, McMaster University, 1200 Main Street West, Hamilton, Ontario; Director, Programs for Assessment of Technology in Health (PATH) Research Institute, St. Joseph's Hospital Hamilton, 25 Main Street West, Suite 2000, Hamilton, Ontario L8P 1H1, Canada
Jean-Eric Tarride, PhD, (tarride@mcmaster.ca), Assistant Professor, Department of Clinical Epidemiology & Biostatistics, McMaster University, Programs for Assessment of Technology in Health (PATH) Research Institute, 25 Main Street West, Suite 2000, Hamilton, Ontario L8P 1H1, Canada