There is an ongoing discussion in academia and industry on the challenges of conducting economic evaluations of genomic/genetic tests (Reference Buchanan, Wordsworth and Schuh1–Reference Phillips, Deverka and Marshall3). However, the literature is developing in different domains, depending on test types and platforms, resulting in inconsistencies such as variable terminology that make it difficult for health economists and business practitioners working in this clinical area to gain a comprehensive overview. In this article, we address this issue through a rapid literature review.
By genomic/genetic tests, we mean tests based on the analysis of DNA or RNA samples involving the examination of cell material in a test-tube using techniques to isolate and/or amplify and sequence or otherwise identify the therapeutic targets of the test. This approach may be contrasted with more traditional pathology approaches, such as immunohistochemistry (IHC), where cell material is stained and examined under a microscope by a pathologist. Often the same test may be carried out using either traditional pathology or genomic/genetic testing approaches. We distinguish four types of genomic/genetic test for the purposes of economic evaluation: single gene tests, multiple gene tests (or panels), multigene assays with risk scores, and whole genome/exome/transcriptome analysis. For simplicity, in this article, we will refer to the latter category as whole genome sequencing (WGS) but all comments apply equally to whole exome or whole transcriptome unless specifically stated. More detail on these test types, with examples, is provided in the Supplementary Materials.
Methodologically, single gene tests are straightforward to evaluate as the test is generally conducted for a specific reason with defined results and a “single trajectory” of costs and outcomes (Reference Phillips, Deverka and Marshall3). Multigene assays with risk scores, similar to single gene tests, are straightforward methodologically as, although multiple results are produced, they are interpreted by the algorithm into a single test result to inform a single decision in a specific indication. Multiple-gene tests or panels and WGS are, potentially, more complex to evaluate, as they produce multiple results, each of which may have distinct clinical and economic trajectories (Reference Phillips, Deverka and Marshall3). There may be circumstances where the incidental findings (IFs) from a panel test or WGS provide information which is not immediately clinically actionable, but which may be useful in the future or may have implications for either the patient or a member of his or her family. Genomic/genetic tests may function as companion diagnostics (CDx), which are tests used to help match a patient to a specific drug or therapy (4). CDx may be assessed with a target therapeutic or as a stand-alone test. For example, in typical applications in cancer, a CDx is administered to all patients who may be eligible to receive a drug and only those whose tumor sample has a given mutation (or, alternatively, lacks a given mutation) receive the treatment. Economic evaluation may compare a testing strategy such as this to a strategy where no-one is tested and either all patients receive the treatment or no patients receive the treatment. Health outcomes and costs are compared across the strategies. The economic evaluation focuses on the treatment strategy or preventive actions taken as a result of the test rather than the test as a technology in its own right (Reference Payne, Gavan, Wright and Thompson2).
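The strategy comparison described above can be sketched as a simple expected-value calculation. This is a minimal illustration only: all prevalence, test-performance, cost, and QALY figures below are hypothetical assumptions, not data from any real companion diagnostic.

```python
# Minimal sketch of comparing three strategies for a hypothetical CDx:
# test-and-treat-positives, treat-all, and treat-none.
# Every number here is an illustrative assumption.

def expected_cost_qaly(strategy, prev=0.3, sens=0.95, spec=0.9,
                       c_test=500, c_drug=20000,
                       q_treated_pos=2.0, q_untreated_pos=1.0, q_neg=1.5):
    """Return (expected cost, expected QALYs) per patient for a strategy.

    prev: prevalence of the actionable mutation.
    q_*:  QALYs for mutation-positive patients with/without the drug and
          for mutation-negative patients (drug assumed inert in negatives).
    """
    if strategy == "treat_none":
        return 0.0, prev * q_untreated_pos + (1 - prev) * q_neg
    if strategy == "treat_all":
        return float(c_drug), prev * q_treated_pos + (1 - prev) * q_neg
    if strategy == "test_and_treat":
        # True positives and false positives receive the drug.
        p_treated = prev * sens + (1 - prev) * (1 - spec)
        cost = c_test + p_treated * c_drug
        qalys = (prev * sens * q_treated_pos
                 + prev * (1 - sens) * q_untreated_pos
                 + (1 - prev) * q_neg)
        return cost, qalys
    raise ValueError(strategy)

for s in ("treat_none", "treat_all", "test_and_treat"):
    cost, qalys = expected_cost_qaly(s)
    print(f"{s}: cost={cost:,.0f}, QALYs={qalys:.3f}")
```

Under these assumed inputs, testing recovers most of the health gain of treating everyone at a fraction of the drug cost, which is the trade-off the economic evaluation of the testing strategy quantifies.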
This article aims to provide a simplified categorization of challenges in the economic evaluation of genomic/genetic tests identified from a systematic rapid review. Our categories distinguish challenges common to all economic evaluation, common to all diagnostics and those challenges pertinent for genomic/genetic tests. We provide a commentary on the challenges identified from the literature and offer our own suggested solutions to these challenges. In order to retain clarity, we consider challenges separately and take no account of normative frameworks, which may constrain analysts in particular jurisdictions. We used the twelve categories of challenge identified by Buchanan et al. (Reference Buchanan, Wordsworth and Schuh1) as a starting point for our review. We amplified each point based on our existing experience and a review of published literature, which identified forty-one papers. We included any papers, methodological or applied, which discussed challenges in the economic evaluation of genomic or genetic tests. A list of papers found and details of the rapid review methodology including a PRISMA diagram and search terms can be found in the Supplementary Materials. We extended the search terms in Buchanan et al. (Reference Buchanan, Wordsworth and Schuh1) to include “omics” in order to ensure we were capturing tests which used the broader categories of transcriptomics and proteomics. Dates of the papers identified by Buchanan et al. (Reference Buchanan, Wordsworth and Schuh1) and in our study are shown in Figure 1. It is evident from the figure that there continues to be a steady stream of papers addressing the challenges of economic evaluation of genomic/genetic tests.
Categorization of challenges identified in the literature
We retained the twelve challenges from Buchanan et al. (Reference Buchanan, Wordsworth and Schuh1) as our review identified no additional challenges. They were categorized into challenges common to the economic evaluation of all technologies, challenges common to diagnostic technologies and challenges pertinent for genomic/genetic tests. The categorized challenges are set out in Table 1 under four headings used by Buchanan et al. (Reference Buchanan, Wordsworth and Schuh1): analytical approach; cost and resource use; measuring effectiveness; and measuring outcomes. The challenges are then discussed in further detail in the sections which follow the table. A table showing which papers identified which challenges is included in the Supplementary Materials.
Table 1. Categorization of Challenges for Economic Evaluation of Genomic/Genetic Tests
Challenges of Economic Evaluation Common to All Technologies
Choice of Perspective and Time-Horizon
There was an interesting contrast between authors arguing for a wider perspective (Reference Buchanan, Wordsworth and Schuh1;Reference Fugel, Nuijten, Postma and Redekop5) and those arguing for a narrower one (Reference Oosterhoff, van der Maas and Steuten6;Reference Hart and Spencer7). Buchanan et al. (Reference Buchanan, Wordsworth and Schuh1) argue for a societal perspective as testing can affect both healthcare and life decisions (e.g., regarding family planning or schooling) and suggest that multiple analytical perspectives are adopted. Oosterhoff et al. (Reference Oosterhoff, van der Maas and Steuten6) and Hart and Spencer (Reference Hart and Spencer7) argue that healthcare or societal perspectives fail to reflect the position of decision makers in specific parts of a healthcare system, such as private payers or those in financial silos. Hart and Spencer (Reference Hart and Spencer7) claim that societal or healthcare perspective analyses are not useful for self-insuring employers who cover 49 percent of the US population. Similarly, for time-horizon, authors take opposing positions. Some authors argue for a full lifetime horizon given that impacts from genomic/genetic tests may occur far into the future and adopting shorter timeframes risks misestimating cumulative costs and effects (Reference Buchanan, Wordsworth and Schuh1;Reference Phillips, Deverka and Marshall3). Other authors argue that a shorter time-horizon is appropriate either because a shorter horizon reflects the time members typically stay in an insurance scheme (Reference Hart and Spencer7) or because biomarker tests may quickly become obsolete (Reference Doble, Harris, Thomas, Fox and Lorgelly8).
Given these differences, we suggest the perspective and time-horizon chosen should be what matters to or is mandated by the decision maker to whom the analysis is addressed. Although it would be ideal if all analyses were useful to all decision makers, the time and resources required may make this impractical and reduce the likelihood of timely information being available to inform decisions. For early evaluation, a shorter time-horizon may be chosen to simplify the analysis. The limitations of such an approach should be made clear to the decision maker.
Challenges of Economic Evaluation Common to All Diagnostic Technologies
Complexity of Analysis
Various factors contribute to make the economic evaluation of diagnostic technologies complex. The decision space can rapidly become unwieldy as different positions in the clinical pathway, multiple indications (Reference Doble, Harris, Thomas, Fox and Lorgelly8;Reference Fleeman, Payne, Newman, Howell and Boland9) and different settings are explored (Reference Annemans, Redekop and Payne10). Comparators may vary by setting (Reference Oosterhoff, van der Maas and Steuten6;Reference Annemans, Redekop and Payne10) with not all comparators potentially being known (Reference Fleeman, Payne, Newman, Howell and Boland9). Setting and position in the pathway impact on prevalence and test performance (Reference Annemans, Redekop and Payne10). Different thresholds for positivity may be possible (Reference Doble11). There may also be interdependencies between the results of the different tests and different combinations of sensitivity and specificity may be preferred dependent upon where the test is placed in a clinical pathway (Reference Phillips, Deverka and Marshall3;Reference Doble, Harris, Thomas, Fox and Lorgelly8). Increased complexity leads to greater uncertainty (Reference Annemans, Redekop and Payne10) which includes parameter uncertainty (assessed in probabilistic sensitivity analysis) and also structural uncertainty which can be addressed through scenario analyses (Reference Payne, Gavan, Wright and Thompson2;Reference Fugel, Nuijten, Postma and Redekop5;Reference Oosterhoff, van der Maas and Steuten6;Reference Annemans, Redekop and Payne10). The level of complexity and heterogeneity makes it difficult to synthesize evidence using meta-analysis following systematic review thus compounding issues around lack of clinical evidence (Reference Buchanan, Wordsworth and Schuh1).
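The dependence of test performance on setting and pathway position noted above can be illustrated with a short calculation: the same assay, with fixed sensitivity and specificity, yields very different positive predictive values as prevalence changes along the pathway. The sensitivity, specificity, prevalence figures and setting labels are illustrative assumptions only.

```python
# Hedged illustration: predictive value of one fixed assay at three
# hypothetical positions in a clinical pathway. Only prevalence (driven
# by prior triage) varies; all numbers are assumptions.

def ppv(prev, sens=0.95, spec=0.95):
    """Positive predictive value from prevalence and test performance."""
    true_pos = prev * sens
    false_pos = (1 - prev) * (1 - spec)
    return true_pos / (true_pos + false_pos)

for setting, prev in [("population screen", 0.01),
                      ("symptomatic referral", 0.10),
                      ("post-triage confirmation", 0.50)]:
    print(f"{setting}: prevalence {prev:.0%} -> PPV {ppv(prev):.1%}")
```

With these assumed inputs, a positive result means something quite different at the front of the pathway than after triage, which is one reason comparators, thresholds, and preferred sensitivity/specificity trade-offs vary by placement.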
Rather than new methods being required, we believe that existing methods should be more consistently and appropriately applied. Early in the lifecycle, methods from early health technology assessment (HTA) such as simple models with test performance based on assumptions and scenario analysis could be used to explore the potential of a technology and drive evidence generation strategy (Reference D’Andrea, Marzuillo, Pelone, De Vito and Villari12). In later analysis, test performance based on evidence and behavioral aspects should be routinely incorporated. There is a tension between the desire to make the analysis generalizable and the usefulness of an analysis tailored to a particular setting. The former is potentially useful to more decision makers but may be so complex that the findings are impenetrable; it may also be expensive and take too long. The latter approach, with a focused decision problem considering only the options believed to be feasible from a clinical perspective in a specific setting, may be more timely and less resource intensive (Reference Payne, Gavan, Wright and Thompson2;Reference Fugel, Nuijten, Postma and Redekop5;Reference Annemans, Redekop and Payne10;Reference Payne, Eden, Davison and Bakker13).
Range of Costs
Rather than just considering the cost of the test, economic evaluations of diagnostic technologies need to include the full range of costs, both upstream and downstream, that result from the introduction of the test. This may include laboratory set-up costs (Reference Payne, Eden, Davison and Bakker13) and, if there is a large capital spend, such as sequencing machinery, the result of any analysis is likely to be sensitive to assumptions made about volumes of use (potentially across indications) and extensive sensitivity analysis is recommended (Reference Fugel, Nuijten, Postma and Redekop5;Reference Mistry and Mason14). It may be useful to think of a diagnostic test strategy as a complex intervention where the test needs to be assessed in its full context (Reference Payne, McAllister and Davies15). Where a testing strategy involves genetic counseling, its costs should be included, as should the costs of identifying the individuals to be tested (Reference Buchanan, Wordsworth and Schuh1;Reference D’Andrea, Marzuillo, Pelone, De Vito and Villari12).
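The sensitivity to volume assumptions described above can be made concrete with a toy amortization calculation. All figures (capital cost, machine life, consumables, staff cost) are hypothetical; an actual evaluation would use setting-specific data and a more refined costing approach.

```python
# Sketch of why per-sample cost is sensitive to volume assumptions when
# a large capital spend (e.g., a sequencer) is involved.
# All figures are illustrative assumptions.

def cost_per_sample(annual_samples, capital=500_000, life_years=5,
                    consumables=150, staff_per_sample=50):
    """Per-sample cost with straight-line amortization of capital
    over the machine's assumed useful life."""
    fixed_per_sample = capital / (life_years * annual_samples)
    return fixed_per_sample + consumables + staff_per_sample

for n in (500, 2000, 10000):
    print(f"{n:>6} samples/year -> {cost_per_sample(n):,.0f} per sample")
```

Halving or doubling assumed throughput shifts the per-sample cost substantially at low volumes, which is why extensive sensitivity analysis on volumes is recommended in the cited literature.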
There is no methodological difficulty with the inclusion of a full range of costs. In a comparative analysis, costs need only be included if they differ between arms (so it may not be necessary to include the costs of tissue acquisition, for example), although some decision makers may find a more complete cost analysis useful.
Evidence Base
Evidence of clinical utility is not incentivized for diagnostic tests as it is not required for regulatory approval (Reference Payne, Gavan, Wright and Thompson2). Evidence requirements for assessment and adoption are often not transparent (Reference Mistry and Mason14) and are extensive given complexity and the need to consider all costs and health outcomes stemming from the test. “End to end” studies are the gold standard for the evaluation of diagnostic tests, but these are rarely available (Reference Payne, Gavan, Wright and Thompson2;Reference Mistry and Mason14) with clinical evidence often derived from retrospective, observational data (Reference Annemans, Redekop and Payne10;Reference Garfield, Polisena and Spinner16) which is prone to bias (Reference Payne, Eden, Davison and Bakker13). Evidence may not link biomarker levels to phenotype (Reference Fleeman, Payne, Newman, Howell and Boland9) and may not consider the consequences of false negatives and false positives particularly in subgroups or real-world treatment patterns (Reference Annemans, Redekop and Payne10). It has been suggested that the under-developed evidence base is the biggest challenge in the economic evaluation of diagnostic technologies (Reference Gavan, Thompson and Payne17). The under-developed evidence base risks fundamentally undermining the credibility of economic evaluation and may lead to the rejection of potentially cost-effective diagnostic technologies by decision makers due to the level of uncertainty (Reference Phillips, Deverka and Marshall3;Reference Fleeman, Payne, Newman, Howell and Boland9;Reference Mistry and Mason14;Reference Garfield, Polisena and Spinner16;Reference Grosse18). As well as solutions to improve the evidence base such as novel trial design and real-world evidence collection (Reference Fugel, Nuijten, Postma and Redekop5), process improvements have been suggested. 
This may involve clearer definition of responsibility for generating evidence (Reference Fleeman, Payne, Newman, Howell and Boland9), incentivizing developers to produce evidence through improved intellectual property protection or matched funding (Reference Fugel, Nuijten, Postma and Redekop5;Reference Doble11) and decision makers supporting evidence development (Reference Fugel, Nuijten, Postma and Redekop5;Reference Annemans, Redekop and Payne10). Several authors suggest a role for early HTA or a two-stage process where evidence requirements are identified early and a collaborative approach between developer and decision maker is taken to developing the evidence (Reference Buchanan, Wordsworth and Schuh1;Reference Phillips, Deverka and Marshall3;Reference Mistry and Mason14).
This challenge requires process change rather than methods development. Early HTA involving iterative economic evaluation could be extensively used as part of a transparent regulatory and adoption process for diagnostic technologies. This should allow the identification of promising diagnostic technologies and facilitate collaborative evidence generation which is sufficient for the decision-makers’ needs and situated in a relevant context.
Behavioral Aspects
As diagnostic technologies do not directly impact health outcomes, economic evaluation must take account of what clinicians and patients do when they receive the results of a test (Reference Fugel, Nuijten, Postma and Redekop5;Reference Oosterhoff, van der Maas and Steuten6; Reference Fleeman, Payne, Newman, Howell and Boland9;Reference Annemans, Redekop and Payne10). This may require the generation of specific evidence as clinicians do not necessarily behave in predictable ways upon receipt of test results (Reference Buchanan, Wordsworth and Schuh1;Reference Thompson, Newman and Elliott19) particularly if results are discordant (Reference Annemans, Redekop and Payne10). Such evidence generation may lead to the redesign of the intervention such as the addition of training for clinicians on the interpretation of results (Reference Garfield, Polisena and Spinner16).
Behavioral uncertainty should be incorporated into economic evaluation and evidence generation strategies from the earliest stage of development of a diagnostic technology. This does not require any new methods development, rather a recognition of the issue and a consistent approach to inclusion.
Choice of Outcome Metrics
Cost-utility analysis, using the quality-adjusted life year (QALY) as an outcome measure and incremental cost-effectiveness ratios (ICERs), is prominent in the HTA of pharmaceuticals and other medical technologies. However, decision makers are likely to find other outcome measures useful, particularly budget impact (Reference Doble11); the ability of patients to enter clinical trials on a timely basis, turnaround time or preservation of tissues (Reference Burris, Saltz and Yu20); impact on capacity constraints (Reference Payne, Eden, Davison and Bakker13); and the creation of a market for a drug which would not exist without the test (Reference Doble11). For US self-insured employers the most appropriate metric may be cost per member per month, requiring information about the budget impact of any new test and resulting cost-offsets further down the clinical pathway (Reference Hart and Spencer7). Diagnostic yield is frequently used as an outcome in economic evaluation but its usefulness to decision makers is limited by the lack of a threshold valuation for a diagnosis (Reference Alam and Schofield21) and the fact that additional diagnoses may have unpredictable impacts on costs (Reference Payne, Gavan, Wright and Thompson2).
Decision makers may value the presentation of a wide range of outcome metrics. The analyst should determine which metrics are important to the specific decision maker. This may impact upon the methods chosen (e.g., cost consequence analysis or budget impact analysis may replace cost-utility analysis).
Challenges Pertinent to the Economic Evaluation of Genomic/Genetic Tests
Heterogeneity of Tests and Platforms
Variation in costs is typical across geographic settings. For genomic/genetic tests, there are some additional challenges due to laboratories using a range of technologies, test configurations and platforms which all impact on costs and may make the synthesis of clinical effectiveness difficult to achieve (Reference Buchanan, Wordsworth and Schuh1;Reference Payne, Gavan, Wright and Thompson2;Reference Hart and Spencer7;Reference Fleeman, Payne, Newman, Howell and Boland9;Reference Payne, Eden, Davison and Bakker13;Reference Garfield, Polisena and Spinner16;Reference Burris, Saltz and Yu20). For test cost, there may be large differences between laboratory developed tests and commercial kits (Reference Buchanan, Wordsworth and Schuh1), no national tariffs or published price lists may exist (Reference Phillips, Deverka and Marshall3;Reference Fleeman, Payne, Newman, Howell and Boland9) and costs have changed over time (Reference Buchanan, Wordsworth and Schuh1;Reference Alam and Schofield21). Costing studies are starting to emerge (Reference Payne, Eden, Davison and Bakker13;Reference Burris, Saltz and Yu20;Reference Siamoglou, Karamperis, Mitropoulou and Patrinos22–Reference Marino, Touzani and Perrier26) and platform Web sites such as Genohub.com may be a useful source of a range of prices for WGS and multiple gene tests (Reference Payne, Gavan, Wright and Thompson2).
Difficulty in estimating costs is a practical challenge for economic evaluation rather than one requiring methods development (Reference Fugel, Nuijten, Postma and Redekop5). Calls for a national price list (Reference Phillips, Deverka and Marshall3;Reference Fleeman, Payne, Newman, Howell and Boland9) risk the evaluation missing important differences between testing carried out in different locations. Costs per sample are particularly sensitive to the throughput achieved on certain platforms, and an important finding of economic evaluation may be that the method used in a specific setting is not an efficient use of resources. Heterogeneity in test performance is another practical problem which may require a different approach to be taken by analysts. For example, Gavan et al. (Reference Gavan, Thompson and Payne17) describe undertaking an HTA of EGFR testing in the UK, where the team failed to develop a model as a result of uncertainties in model structure and lack of data for the range of tests evaluated. Here, it may be appropriate to evaluate an “exemplar” test akin to a Target Product Profile. The analysis could identify an exemplar test configuration, cost and test performance at which the test achieved the goal desired by the decision maker. Individual settings within the jurisdiction could compare their configuration, test performance and cost with the exemplar. This compromise may enable timely (albeit simplified) analyses to be provided to decision makers. An alternative approach may be to have a focused decision problem appropriate to a specific decision maker and setting (Reference Annemans, Redekop and Payne10).
Increasing Stratification
The under-developed evidence base and the complexity of analysis in the evaluation of diagnostic technologies are compounded by genetic stratification of disease, particularly cancer, which increases the level of uncertainty in the evidence base due to small samples and slow recruitment to clinical trials (Reference Buchanan, Wordsworth and Schuh1;Reference Fugel, Nuijten, Postma and Redekop5;Reference Fleeman, Payne, Newman, Howell and Boland9;Reference Alam and Schofield21). New trial designs and observational data may form part of a solution to this issue and new analytical approaches may be required (Reference Payne, Gavan, Wright and Thompson2;Reference Alam and Schofield21).
As discussed under the evidence base challenge, a change in process in the assessment of diagnostics may be required.
Personal Utility (The “Value of Knowing”)
The use of the QALY metric allows comparability across disease areas. However, the tools used to estimate preference-weighted utilities used to calculate QALYs may not be sufficiently sensitive to detect the impact of diagnostic and psychological consequences of testing (Reference Annemans, Redekop and Payne10;Reference Payne, McAllister and Davies15). Where results give rise to clinical actions or a new testing strategy replaces an existing one (e.g., a panel test or WGS replacing serial single gene tests), the QALY may be sufficient to capture value. Where no treatment exists, there is evidence that knowledge of diagnosis alone (or even knowledge that all avenues have been pursued) is valued by some tested individuals and/or their families (Reference Mollison, O’Daniel, Henderson, Berg and Skinner27;Reference Regier, Weymann, Buchanan, Marshall and Wordsworth28). Note that not all patients and their families place a positive value on information itself (Reference Regier, Weymann, Buchanan, Marshall and Wordsworth28). Here, the choice of whether or not to receive the information could be valued, or a disutility could be included for information which was not wanted. Some studies have started to explore ways in which the value of knowing and other nonhealth benefits (termed “personal utility”) could be incorporated in a cost-utility framework (Reference Regier, Weymann, Buchanan, Marshall and Wordsworth28;Reference Kuppermann, Wang and Wong29).
Methodological development may be required here but if alternative metrics are developed (such as ICECAP (Reference Payne, McAllister and Davies15), discrete choice experiments (Reference Doble11), or cost benefit analysis (Reference Payne, Gavan, Wright and Thompson2)), then the problem of how to incorporate these into an evaluation framework where cost utility and the QALY are the norm remains. Work has been carried out in Canada to develop a measure incorporating the value of both clinical and personal utility (Reference Hayeems, Luca, Pullenayegum, Stephen Meyn and Ungar30). Australian and US bodies have suggested that quantification of health and nonhealth outcomes are necessary for decision making (Reference Fugel, Nuijten, Postma and Redekop5). In the UK, genetic testing is in place which has not, to the best of our knowledge, been evaluated using formal metrics, however, decision makers have been able to reach a decision about the value of the testing (31). Prior to continued methodological development it may be worth determining the extent of decision makers’ need for formal quantification of nonhealth outcomes.
Incidental Findings
Multigene tests and WGS may return IFs in addition to the results sought when the test was ordered (Reference Doble11;Reference Phillips, Douglas, Trosman and Marshall32). IFs which are actionable may incur additional diagnostic or treatment costs (Reference Payne, Gavan, Wright and Thompson2;Reference Kuppermann, Wang and Wong29). There may also be an increased risk of treatment with unproven therapies (Reference Phillips, Pletcher and Ladabaum33). Patients are likely to have different preferences for information from IFs, which may require development of methods to educate those undergoing testing and to support decision making (Reference Kuppermann, Wang and Wong29;Reference Bennette, Trinidad and Fullerton34;Reference Bennette, Gallego, Burke, Jarvik and Veenstra35). Multiple actionable results from multigene or WGS testing may require development of methods to aggregate results. This may not be straightforward as there may be interactive effects (e.g., on survival) among multiple results and some IFs may not be used until a later time in a patient’s life (Reference Phillips, Deverka and Marshall3;Reference Phillips, Douglas, Trosman and Marshall32).
Several methodological approaches have been suggested to incorporate IF in economic evaluations including backwards induction (Reference Doble11), weighting according to the incidence of actionable results (Reference Plumpton, Pirmohamed and Hughes36) and simplifying the analysis by selecting the most penetrant mutations (Reference Bennette, Gallego, Burke, Jarvik and Veenstra35;Reference Gallego, Shirts and Bennette37). Aggregating results may be more of a theoretical problem than a practical one at present although this may change in time. Payne et al. (Reference Payne, Gavan, Wright and Thompson2) report the use of multidisciplinary reporting committees comprising geneticists, counselors and molecular scientists. Given test results will only be actionable if reported to patients, the reporting effectively frames the intervention for evaluation purposes.
Spillover Effects
Results from genomic/genetic tests may impact on other family members or future generations (Reference Payne, Gavan, Wright and Thompson2;Reference Kuppermann, Wang and Wong29) and upon reproductive decisions (Reference Buchanan, Wordsworth and Schuh1). Such downstream impacts are a challenge in economic evaluations as it is unclear how many generations and how many family members may be affected (Reference Phillips, Douglas, Trosman and Marshall32). Results of an economic evaluation may be sensitive to assumptions around the number of family members impacted by the initial testing (Reference Bennette, Gallego, Burke, Jarvik and Veenstra35).
The number of family members can be established empirically. The number of generations impacted is unknowable. However, methodologically, incorporating benefits accruing to future generations is not challenging: the benefit of extending beyond a certain point will be eroded by discounting, and extensive sensitivity analysis can be undertaken. Reproductive decisions are more challenging and raise issues such as those discussed under the Personal Utility heading.
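The erosion by discounting noted above can be shown with a short calculation: at a conventional discount rate, the present value of a fixed QALY gain accruing to each successive generation shrinks quickly, so extending the horizon beyond a few generations adds little. The 3.5 percent rate and 30-year generation gap below are illustrative assumptions, not recommendations.

```python
# Hedged sketch: present value of a fixed QALY gain accruing once per
# generation, under an assumed discount rate and generation gap.

def discounted_generation_benefit(qaly_gain=1.0, rate=0.035,
                                  gen_gap_years=30, generations=5):
    """Present value of qaly_gain for generations 0..generations-1,
    discounted at `rate` per year with `gen_gap_years` between them."""
    return [qaly_gain / (1 + rate) ** (g * gen_gap_years)
            for g in range(generations)]

for g, pv in enumerate(discounted_generation_benefit()):
    print(f"generation {g}: present value {pv:.3f} QALYs")
```

Under these assumptions the second generation's gain is already worth roughly a third of the first generation's, supporting the point that the choice of how many generations to model can be bounded by sensitivity analysis rather than resolved in principle.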
Discussion
We found that the twelve challenges in the economic evaluation of genomic/genetic tests described by Buchanan et al. (Reference Buchanan, Wordsworth and Schuh1) still apply. Choice of perspective and time-horizon are common to all economic evaluation. Five challenges are relevant for all diagnostic technologies (complexity, range of costs, evidence base, behavioral aspects and choice of outcome metric). A further five are particularly pertinent in the evaluation of genomic/genetic tests (heterogeneity of tests and platforms, increasing stratification of disease, personal utility, IFs and spillover effects). Current methods of economic evaluation are generally able to cope with all challenges, apart from those pertinent to genomic/genetic tests where some methodological development may be required. In particular, methods may be required to: improve the balance between timeliness and generalizability of economic evaluations given heterogeneity of tests and platforms; facilitate the inclusion of observational data given increasing stratification of disease; incorporate evidence of personal utility into cost-utility analyses; aggregate the impacts of IFs; and incorporate a utility for reproductive decision making.
This is, to our knowledge, the first study to identify challenges in economic evaluation for all types of genomic/genetic tests and to distinguish challenges pertinent to genomic/genetic tests from those relevant for all diagnostics or all health technologies. Numerous papers have identified challenges to economic evaluation of genomic/genetic tests which have been referenced in the main body of this manuscript. Our contribution is to bring previously identified challenges together across all types of genomic/genetic tests and set them out in an accessible manner. A limitation of this study is that, due to inconsistencies in search terminology (Reference Payne, Gavan, Wright and Thompson2) and the use of rapid systematic review methods, it cannot be ruled out that relevant papers will have been missed. However, it is unlikely that a relevant challenge will have been missed as there is considerable overlap between studies.
This article aims to provide a simplified categorization of challenges in the economic evaluation of genomic/genetic tests identified from a systematic rapid review. Our categories distinguish challenges common to all economic evaluation, common to all diagnostics and those challenges pertinent for genomic/genetic tests. We provide a commentary on the challenges identified from the literature and offer our own suggested solutions to these challenges. In order to retain clarity, we consider challenges separately and take no account of normative frameworks, which may constrain analysts in particular jurisdictions. We used the twelve categories of challenge identified by Buchanan et al. (Reference Buchanan, Wordsworth and Schuh1) as a starting point for our review. We amplified each point based on our existing experience and a review of published literature, which identified forty-one papers. We included any papers, methodological or applied, which discussed challenges in the economic evaluation of genomic or genetic tests. A list of papers found and details of the rapid review methodology including a PRISMA diagram and search terms can be found in the Supplementary Materials. We extended the search terms in Buchanan et al. (Reference Buchanan, Wordsworth and Schuh1) to include “omics” in order to ensure we were capturing tests which used the broader categories of transcriptomics and proteomics. Dates of the papers identified by Buchanan et al. (Reference Buchanan, Wordsworth and Schuh1) and in our study are shown in Figure 1. It is evident from the figure that there continues to be a steady stream of papers addressing the challenges of economic evaluation of genomic/genetic tests.
Figure 1. Dates of papers identified by Buchanan et al. (Reference Buchanan, Wordsworth and Schuh1) and by the present review.
Categorization of challenges identified in the literature
We retained the twelve challenges from Buchanan et al. (Reference Buchanan, Wordsworth and Schuh1) as our review identified no additional challenges. They were categorized into challenges common to the economic evaluation of all technologies, challenges common to diagnostic technologies and challenges pertinent for genomic/genetic tests. The categorized challenges are set out in Table 1 under four headings used by Buchanan et al. (Reference Buchanan, Wordsworth and Schuh1): analytical approach; cost and resource use; measuring effectiveness; and measuring outcomes. The challenges are then discussed in further detail in the sections which follow the table. A table showing which papers identified which challenges is included in the Supplementary Materials.
Table 1. Categorization of Challenges for Economic Evaluation of Genomic/Genetic Tests
DCE, discrete choice experiment; HTA, health technology assessment; ICER, incremental cost effectiveness ratio; QALY, quality adjusted life year; WGS, whole genome sequencing (used in this article and table to represent whole exome and whole transcriptome analysis in addition to whole genome sequencing).
Challenges of Economic Evaluation Common to All Technologies
Choice of Perspective and Time-Horizon
There was an interesting contrast between authors arguing for a wider perspective (Reference Buchanan, Wordsworth and Schuh1;Reference Fugel, Nuijten, Postma and Redekop5) and those arguing for a narrower one (Reference Oosterhoff, van der Maas and Steuten6;Reference Hart and Spencer7). Buchanan et al. (Reference Buchanan, Wordsworth and Schuh1) argue for a societal perspective as testing can affect both healthcare and life decisions (e.g., regarding family planning or schooling) and suggest that multiple analytical perspectives are adopted. Oosterhoff et al. (Reference Oosterhoff, van der Maas and Steuten6) and Hart and Spencer (Reference Hart and Spencer7) argue that healthcare or societal perspectives fail to reflect the position of decision makers in specific parts of a healthcare system, such as private payers or those in financial silos. Hart and Spencer (Reference Hart and Spencer7) claim that societal or healthcare perspective analyses are not useful for self-insuring employers, who cover 49 percent of the US population. Similarly, for time-horizon, authors take opposing positions. Some argue for a full lifetime horizon given that impacts from genomic/genetic tests may occur far into the future, and adopting shorter timeframes risks misestimating cumulative costs and effects (Reference Buchanan, Wordsworth and Schuh1;Reference Phillips, Deverka and Marshall3). Others argue that a shorter time-horizon is appropriate, either because it reflects the time members typically stay in an insurance scheme (Reference Hart and Spencer7) or because biomarker tests may quickly become obsolete (Reference Doble, Harris, Thomas, Fox and Lorgelly8).
Given these differences, we suggest that the perspective and time-horizon chosen should be those that matter to, or are mandated by, the decision maker to whom the analysis is addressed. Although it would be ideal if all analyses were useful to all decision makers, the time and resource required may make this impractical and reduce the likelihood of timely information being available to inform decisions. For early evaluation, a shorter time-horizon may be chosen to simplify the analysis. The limitations of such an approach should be made clear to the decision maker.
Challenges of Economic Evaluation Common to All Diagnostic Technologies
Complexity of Analysis
Various factors combine to make the economic evaluation of diagnostic technologies complex. The decision space can rapidly become unwieldy as different positions in the clinical pathway, multiple indications (Reference Doble, Harris, Thomas, Fox and Lorgelly8;Reference Fleeman, Payne, Newman, Howell and Boland9) and different settings are explored (Reference Annemans, Redekop and Payne10). Comparators may vary by setting (Reference Oosterhoff, van der Maas and Steuten6;Reference Annemans, Redekop and Payne10), and not all comparators may be known (Reference Fleeman, Payne, Newman, Howell and Boland9). Setting and position in the pathway affect prevalence and test performance (Reference Annemans, Redekop and Payne10). Different thresholds for positivity may be possible (Reference Doble11). There may also be interdependencies between the results of different tests, and different combinations of sensitivity and specificity may be preferred depending upon where the test is placed in a clinical pathway (Reference Phillips, Deverka and Marshall3;Reference Doble, Harris, Thomas, Fox and Lorgelly8). Increased complexity leads to greater uncertainty (Reference Annemans, Redekop and Payne10), including parameter uncertainty (assessed in probabilistic sensitivity analysis) and structural uncertainty, which can be addressed through scenario analyses (Reference Payne, Gavan, Wright and Thompson2;Reference Fugel, Nuijten, Postma and Redekop5;Reference Oosterhoff, van der Maas and Steuten6;Reference Annemans, Redekop and Payne10). The level of complexity and heterogeneity makes it difficult to synthesize evidence using meta-analysis following systematic review, thus compounding issues around the lack of clinical evidence (Reference Buchanan, Wordsworth and Schuh1).
Rather than new methods being required, we believe that existing methods should be more consistently and appropriately applied. Early in the lifecycle, methods from early health technology assessment (HTA), such as simple models with test performance based on assumptions and scenario analysis, could be used to explore the potential of a technology and drive the evidence generation strategy (Reference D’Andrea, Marzuillo, Pelone, De Vito and Villari12). In later analysis, test performance based on evidence and behavioral aspects should be routinely incorporated. There is a tension between the desire to make the analysis generalizable and the usefulness of an analysis tailored to a particular setting. The former is potentially useful to more decision makers but may be so complex that the findings are impenetrable; it may also be expensive and take too long. The latter approach, with a focused decision problem considering only the options believed to be feasible from a clinical perspective in a specific setting, may be more timely and less resource intensive (Reference Payne, Gavan, Wright and Thompson2;Reference Fugel, Nuijten, Postma and Redekop5;Reference Annemans, Redekop and Payne10;Reference Payne, Eden, Davison and Bakker13).
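As a concrete illustration of the kind of simple early-HTA model described above, the sketch below compares a hypothetical test-and-treat strategy with treat-all and treat-none comparators under assumed test performance. Every parameter value (prevalence, costs, QALYs, sensitivity, specificity) is an invented assumption for demonstration only, not data from any study.

```python
# Minimal early-HTA decision-tree sketch for a hypothetical companion
# diagnostic (CDx). All parameter values are illustrative assumptions.

def evaluate(strategy, prevalence, sens, spec, c_test, c_drug,
             q_treated_pos, q_untreated_pos, q_neg):
    """Return (expected cost, expected QALYs) per patient for a strategy."""
    if strategy == "test":
        treated_pos = prevalence * sens              # true positives treated
        treated_neg = (1 - prevalence) * (1 - spec)  # false positives treated
        cost = c_test + (treated_pos + treated_neg) * c_drug
        # Simplification: the drug is assumed to neither help nor harm
        # marker-negative patients.
        qalys = (treated_pos * q_treated_pos
                 + (prevalence - treated_pos) * q_untreated_pos
                 + (1 - prevalence) * q_neg)
    elif strategy == "treat_all":
        cost = c_drug
        qalys = prevalence * q_treated_pos + (1 - prevalence) * q_neg
    else:  # "treat_none"
        cost = 0.0
        qalys = prevalence * q_untreated_pos + (1 - prevalence) * q_neg
    return cost, qalys

base = dict(prevalence=0.2, c_test=500, c_drug=30_000,
            q_treated_pos=2.0, q_untreated_pos=1.0, q_neg=1.5)

# Scenario analysis over assumed test performance, as suggested above.
for sens, spec in [(0.95, 0.90), (0.85, 0.95), (0.99, 0.80)]:
    c_t, q_t = evaluate("test", sens=sens, spec=spec, **base)
    c_a, q_a = evaluate("treat_all", sens=sens, spec=spec, **base)
    icer = (c_a - c_t) / (q_a - q_t)  # treat-all vs. test-and-treat
    print(f"sens={sens}, spec={spec}: ICER of treat-all vs test = {icer:,.0f}")
```

Even at this level of simplification, the scenarios show how assumed sensitivity and specificity drive the comparison between strategies, which is exactly the kind of signal an early analysis can use to prioritize evidence generation.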
Range of Costs
Rather than just considering the cost of the test, economic evaluations of diagnostic technologies need to include the full range of upstream and downstream costs that result from the introduction of the test. This may include laboratory set-up costs (Reference Payne, Eden, Davison and Bakker13) and, if there is a large capital spend such as sequencing machinery, the results of any analysis are likely to be sensitive to assumptions made about volumes of use (potentially across indications), so extensive sensitivity analysis is recommended (Reference Fugel, Nuijten, Postma and Redekop5;Reference Mistry and Mason14). It may be useful to think of a diagnostic test strategy as a complex intervention where the test needs to be assessed in its full context (Reference Payne, McAllister and Davies15). Where a testing strategy involves genetic counseling, this should be included, as should the costs of identifying individuals for testing (Reference Buchanan, Wordsworth and Schuh1;Reference D’Andrea, Marzuillo, Pelone, De Vito and Villari12).
There is no methodological difficulty with the inclusion of a full range of costs. In a comparative analysis, costs only need to be included if they differ between arms (so it may not be appropriate to include the costs of tissue acquisition, for example), although some decision makers may find a more complete cost analysis useful.
Evidence Base
Evidence of clinical utility is not incentivized for diagnostic tests as it is not required for regulatory approval (Reference Payne, Gavan, Wright and Thompson2). Evidence requirements for assessment and adoption are often not transparent (Reference Mistry and Mason14) and are extensive given complexity and the need to consider all costs and health outcomes stemming from the test. “End to end” studies are the gold standard for the evaluation of diagnostic tests, but these are rarely available (Reference Payne, Gavan, Wright and Thompson2;Reference Mistry and Mason14) with clinical evidence often derived from retrospective, observational data (Reference Annemans, Redekop and Payne10;Reference Garfield, Polisena and Spinner16) which is prone to bias (Reference Payne, Eden, Davison and Bakker13). Evidence may not link biomarker levels to phenotype (Reference Fleeman, Payne, Newman, Howell and Boland9) and may not consider the consequences of false negatives and false positives particularly in subgroups or real-world treatment patterns (Reference Annemans, Redekop and Payne10). It has been suggested that the under-developed evidence base is the biggest challenge in the economic evaluation of diagnostic technologies (Reference Gavan, Thompson and Payne17). The under-developed evidence base risks fundamentally undermining the credibility of economic evaluation and may lead to the rejection of potentially cost-effective diagnostic technologies by decision makers due to the level of uncertainty (Reference Phillips, Deverka and Marshall3;Reference Fleeman, Payne, Newman, Howell and Boland9;Reference Mistry and Mason14;Reference Garfield, Polisena and Spinner16;Reference Grosse18). As well as solutions to improve the evidence base such as novel trial design and real-world evidence collection (Reference Fugel, Nuijten, Postma and Redekop5), process improvements have been suggested. 
This may involve clearer definition of responsibility for generating evidence (Reference Fleeman, Payne, Newman, Howell and Boland9), incentivizing developers to produce evidence through improved intellectual property protection or matched funding (Reference Fugel, Nuijten, Postma and Redekop5;Reference Doble11) and decision makers supporting evidence development (Reference Fugel, Nuijten, Postma and Redekop5;Reference Annemans, Redekop and Payne10). Several authors suggest a role for early HTA or a two-stage process where evidence requirements are identified early and a collaborative approach between developer and decision maker is taken to developing the evidence (Reference Buchanan, Wordsworth and Schuh1;Reference Phillips, Deverka and Marshall3;Reference Mistry and Mason14).
This challenge requires process change rather than methods development. Early HTA involving iterative economic evaluation could be extensively used as part of a transparent regulatory and adoption process for diagnostic technologies. This should allow the identification of promising diagnostic technologies and facilitate collaborative evidence generation which is sufficient for the decision-makers’ needs and situated in a relevant context.
Behavioral Aspects
As diagnostic technologies do not directly impact health outcomes, economic evaluation must take account of what clinicians and patients do when they receive the results of a test (Reference Fugel, Nuijten, Postma and Redekop5;Reference Oosterhoff, van der Maas and Steuten6; Reference Fleeman, Payne, Newman, Howell and Boland9;Reference Annemans, Redekop and Payne10). This may require the generation of specific evidence as clinicians do not necessarily behave in predictable ways upon receipt of test results (Reference Buchanan, Wordsworth and Schuh1;Reference Thompson, Newman and Elliott19) particularly if results are discordant (Reference Annemans, Redekop and Payne10). Such evidence generation may lead to the redesign of the intervention such as the addition of training for clinicians on the interpretation of results (Reference Garfield, Polisena and Spinner16).
Behavioral uncertainty should be incorporated into economic evaluation and evidence generation strategies from the earliest stage of development of a diagnostic technology. This does not require new methods development, but rather a recognition of the issue and a consistent approach to its inclusion.
Choice of Outcome Metrics
Cost-utility analysis, using the quality-adjusted life year (QALY) as an outcome measure and reporting incremental cost-effectiveness ratios (ICERs), is prominent in the HTA of pharmaceuticals and other medical technologies. However, decision makers are likely to find other outcome measures useful, particularly budget impact (Reference Doble11); the ability of patients to enter clinical trials on a timely basis, turnaround time or preservation of tissues (Reference Burris, Saltz and Yu20); impact on capacity constraints (Reference Payne, Eden, Davison and Bakker13); and the creation of a market for a drug which would not exist without the test (Reference Doble11). For US self-insured employers, the most appropriate metric may be cost per member per month, requiring information about the budget impact of any new test and resulting cost-offsets further down the clinical pathway (Reference Hart and Spencer7). Diagnostic yield is frequently used as an outcome in economic evaluation, but its usefulness to decision makers is limited by the lack of a threshold valuation for a diagnosis (Reference Alam and Schofield21) and the fact that additional diagnoses may have unpredictable impacts on costs (Reference Payne, Gavan, Wright and Thompson2).
Decision makers may value the presentation of a wide range of outcome metrics. The analyst should determine which metrics are important to the specific decision maker. This may impact upon the methods chosen (e.g., cost consequence analysis or budget impact analysis may replace cost-utility analysis).
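The contrast between a cost-utility metric and a budget-impact metric such as cost per member per month can be made concrete with a small sketch. The plan size, test volume and cost figures below are invented for illustration only.

```python
# Two outcome metrics side by side: an ICER (cost-utility framing) and a
# per-member-per-month (PMPM) budget impact for a hypothetical US plan.
# All figures are invented for illustration.

def icer(cost_new, qaly_new, cost_old, qaly_old):
    """Incremental cost-effectiveness ratio: extra cost per extra QALY."""
    return (cost_new - cost_old) / (qaly_new - qaly_old)

def pmpm_budget_impact(members, tests_per_year, cost_per_test,
                       downstream_offset_per_test):
    """Net annual budget impact of a test, spread per member per month."""
    net_annual = tests_per_year * (cost_per_test - downstream_offset_per_test)
    return net_annual / (members * 12)

# Testing strategy costing 58,000 for 7.1 QALYs vs. 50,000 for 6.9 QALYs:
print(f"ICER: {icer(58_000, 7.1, 50_000, 6.9):,.0f} per QALY")

# Plan of 100,000 members, 400 tests/year at 3,000 each, 1,200 offset each:
print(f"PMPM impact: {pmpm_budget_impact(100_000, 400, 3_000, 1_200):.2f}")
```

The two numbers answer different questions: the ICER speaks to value for money per patient, while the PMPM figure speaks to affordability for the payer, which is why a decision maker may want both.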
Challenges Pertinent to the Economic Evaluation of Genomic/Genetic Tests
Heterogeneity of Tests and Platforms
Variation in costs is typical across geographic settings. For genomic/genetic tests, there are some additional challenges due to laboratories using a range of technologies, test configurations and platforms, all of which impact on costs and may make the synthesis of clinical effectiveness evidence difficult to achieve (Reference Buchanan, Wordsworth and Schuh1;Reference Payne, Gavan, Wright and Thompson2;Reference Hart and Spencer7;Reference Fleeman, Payne, Newman, Howell and Boland9;Reference Payne, Eden, Davison and Bakker13;Reference Garfield, Polisena and Spinner16;Reference Burris, Saltz and Yu20). For test cost, there may be large differences between laboratory-developed tests and commercial kits (Reference Buchanan, Wordsworth and Schuh1), no national tariffs or published price lists may exist (Reference Phillips, Deverka and Marshall3;Reference Fleeman, Payne, Newman, Howell and Boland9) and costs have changed over time (Reference Buchanan, Wordsworth and Schuh1;Reference Alam and Schofield21). Costing studies are starting to emerge (Reference Payne, Eden, Davison and Bakker13;Reference Burris, Saltz and Yu20;Reference Siamoglou, Karamperis, Mitropoulou and Patrinos22–Reference Marino, Touzani and Perrier26) and platform Web sites such as Genohub.com may be a useful source of a range of prices for WGS and multiple gene tests (Reference Payne, Gavan, Wright and Thompson2).
Difficulty in estimating costs is a practical challenge for economic evaluation rather than one requiring methods development (Reference Fugel, Nuijten, Postma and Redekop5). Calls for a national price list (Reference Phillips, Deverka and Marshall3;Reference Fleeman, Payne, Newman, Howell and Boland9) risk the evaluation missing important differences between testing carried out in different locations. Costs per sample are particularly sensitive to the throughput achieved on certain platforms, and an important finding of an economic evaluation may be that the method used in a specific setting is not an efficient use of resources. Heterogeneity in test performance is another practical problem which may require a different approach from analysts. For example, Gavan et al. (Reference Gavan, Thompson and Payne17) describe undertaking an HTA of EGFR testing in the UK in which the team failed to develop a model as a result of uncertainties in model structure and a lack of data for the range of tests evaluated. Here, it may be appropriate to evaluate an “exemplar” test akin to a Target Product Profile. The analysis could identify an exemplar test configuration, cost and test performance at which the test achieved the goal desired by the decision maker. Individual settings within the jurisdiction could then compare their configuration, test performance and cost with the exemplar. This compromise may enable timely (albeit simplified) analyses to be provided to decision makers. An alternative approach may be to have a focused decision problem appropriate to a specific decision maker and setting (Reference Annemans, Redekop and Payne10).
Increasing Stratification
The under-developed evidence base and the complexity of analysis in the evaluation of diagnostic technologies are compounded by the genetic stratification of disease, particularly cancer, which increases the level of uncertainty in the evidence base due to small samples and slow recruitment to clinical trials (Reference Buchanan, Wordsworth and Schuh1;Reference Fugel, Nuijten, Postma and Redekop5;Reference Fleeman, Payne, Newman, Howell and Boland9;Reference Alam and Schofield21). New trial designs and observational data may form part of a solution to this issue, and new analytical approaches may be required (Reference Payne, Gavan, Wright and Thompson2;Reference Alam and Schofield21).
As discussed under the evidence base challenge, a change in process in the assessment of diagnostics may be required.
Personal Utility (The “Value of Knowing”)
The use of the QALY metric allows comparability across disease areas. However, the tools used to estimate the preference-weighted utilities from which QALYs are calculated may not be sufficiently sensitive to detect the diagnostic and psychological consequences of testing (Reference Annemans, Redekop and Payne10;Reference Payne, McAllister and Davies15). Where results give rise to clinical actions or a new testing strategy replaces an existing one (i.e., a panel test or WGS replacing serial single gene tests), the QALY may be sufficient to capture value. Where no treatment exists, there is evidence that knowledge of diagnosis alone (or even knowledge that all avenues have been pursued) is valued by some tested individuals and/or their families (Reference Mollison, O’Daniel, Henderson, Berg and Skinner27;Reference Regier, Weymann, Buchanan, Marshall and Wordsworth28). Note that not all patients and their families place a positive value on information itself (Reference Regier, Weymann, Buchanan, Marshall and Wordsworth28). Here, the choice of whether or not to receive the information could be valued, or a disutility could be included for information which was not wanted. Some studies have started to explore ways in which the value of knowing and other nonhealth benefits (termed “personal utility”) could be incorporated in a cost-utility framework (Reference Regier, Weymann, Buchanan, Marshall and Wordsworth28;Reference Kuppermann, Wang and Wong29).
Methodological development may be required here, but if alternative metrics are developed (such as ICECAP (Reference Payne, McAllister and Davies15), discrete choice experiments (Reference Doble11), or cost-benefit analysis (Reference Payne, Gavan, Wright and Thompson2)), the problem remains of how to incorporate them into an evaluation framework where cost-utility analysis and the QALY are the norm. Work has been carried out in Canada to develop a measure incorporating the value of both clinical and personal utility (Reference Hayeems, Luca, Pullenayegum, Stephen Meyn and Ungar30). Australian and US bodies have suggested that quantification of health and nonhealth outcomes is necessary for decision making (Reference Fugel, Nuijten, Postma and Redekop5). In the UK, genetic testing is in place which has not, to the best of our knowledge, been evaluated using formal metrics; however, decision makers have been able to reach a decision about the value of the testing (31). Prior to continued methodological development, it may be worth determining the extent of decision makers’ need for formal quantification of nonhealth outcomes.
Incidental Findings
Multigene tests and WGS may return IFs in addition to the results sought when the test was ordered (Reference Doble11;Reference Phillips, Douglas, Trosman and Marshall32). IFs which are actionable may incur additional diagnostic or treatment costs (Reference Payne, Gavan, Wright and Thompson2;Reference Kuppermann, Wang and Wong29). There may also be an increased risk of treatment with unproven therapies (Reference Phillips, Pletcher and Ladabaum33). Patients are likely to have different preferences for information from IFs, which may require the development of methods to educate those undergoing testing and to support decision making (Reference Kuppermann, Wang and Wong29;Reference Bennette, Trinidad and Fullerton34;Reference Bennette, Gallego, Burke, Jarvik and Veenstra35). Multiple actionable results from multigene or WGS testing may require the development of methods to aggregate results. This may not be straightforward, as there may be interactive effects (e.g., on survival) among multiple results and some IFs may not be used until later in a patient’s life (Reference Phillips, Deverka and Marshall3;Reference Phillips, Douglas, Trosman and Marshall32).
Several methodological approaches have been suggested to incorporate IFs in economic evaluations, including backwards induction (Reference Doble11), weighting according to the incidence of actionable results (Reference Plumpton, Pirmohamed and Hughes36) and simplifying the analysis by selecting the most penetrant mutations (Reference Bennette, Gallego, Burke, Jarvik and Veenstra35;Reference Gallego, Shirts and Bennette37). Aggregating results may be more of a theoretical problem than a practical one at present, although this may change in time. Payne et al. (Reference Payne, Gavan, Wright and Thompson2) report the use of multidisciplinary reporting committees comprising geneticists, counselors and molecular scientists. Given that test results will only be actionable if reported to patients, the reporting effectively frames the intervention for evaluation purposes.
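One of the suggested approaches, weighting according to the incidence of actionable results, amounts to an incidence-weighted expectation over the possible findings. The sketch below shows the arithmetic; the finding categories, incidences, follow-up costs and QALY gains are all invented for illustration, not drawn from any study.

```python
# Incidence-weighted expectation over actionable incidental findings (IFs):
# the expected downstream cost and QALY impact per tested patient is the
# sum over findings of (incidence x consequence). All values are invented.

actionable_ifs = [
    # (label, incidence per tested patient, follow-up cost, QALY gain)
    ("hereditary cancer variant", 0.010, 4_000, 0.30),
    ("cardiac risk variant",      0.005, 2_500, 0.20),
    ("pharmacogenomic variant",   0.020,   500, 0.05),
]

expected_cost = sum(p * c for _, p, c, _ in actionable_ifs)
expected_qalys = sum(p * q for _, p, _, q in actionable_ifs)

print(f"Expected IF cost per patient:  {expected_cost:.2f}")
print(f"Expected IF QALYs per patient: {expected_qalys:.4f}")
```

This simple expectation ignores the interactive effects and delayed use of findings noted above, which is precisely where the aggregation methods discussed in the text would be needed.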
Spillover Effects
Results from genomic/genetic tests may impact on other family members or future generations (Reference Payne, Gavan, Wright and Thompson2;Reference Kuppermann, Wang and Wong29) and upon reproductive decisions (Reference Buchanan, Wordsworth and Schuh1). Such downstream impacts are a challenge in economic evaluations as it is unclear how many generations and how many family members may be affected (Reference Phillips, Douglas, Trosman and Marshall32). Results of an economic evaluation may be sensitive to assumptions around the number of family members impacted by the initial testing (Reference Bennette, Gallego, Burke, Jarvik and Veenstra35).
The number of family members can be established empirically; the number of generations impacted is unknowable. However, incorporating benefits to future generations is not methodologically challenging: discounting erodes the benefit of extending the analysis beyond a certain point, and extensive sensitivity analysis can be undertaken. Reproductive decisions are more challenging and raise issues such as those discussed under the Personal Utility heading.
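The point about discounting can be shown numerically: at a conventional annual discount rate, benefits accruing to later generations shrink rapidly. The sketch below assumes, purely for illustration, a 3.5 percent discount rate and generations 30 years apart.

```python
# How discounting erodes a fixed QALY gain accruing to successive future
# generations. The 3.5% rate and 30-year generation gap are assumptions
# for illustration only.

def discounted_gain(qaly_gain, annual_rate, years):
    """Present value of a QALY gain realized `years` from now."""
    return qaly_gain / (1 + annual_rate) ** years

for generation in range(5):
    years = generation * 30
    pv = discounted_gain(1.0, 0.035, years)
    print(f"generation {generation} ({years:3d} years ahead): {pv:.3f} QALYs")
```

By the third generation the present value of a full QALY has fallen below 5 percent of its undiscounted value under these assumptions, which is why extending the horizon across many generations adds little to the result.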
Discussion
We found that the twelve challenges in the economic evaluation of genomic/genetic tests described by Buchanan et al. (Reference Buchanan, Wordsworth and Schuh1) still apply. Choice of perspective and time-horizon are common to all economic evaluation. Five challenges are relevant for all diagnostic technologies (complexity, range of costs, evidence base, behavioral aspects and choice of outcome metric). A further five are particularly pertinent in the evaluation of genomic/genetic tests (heterogeneity of tests and platforms, increasing stratification of disease, personal utility, IFs and spillover effects). Current methods of economic evaluation are generally able to cope with all challenges, apart from those pertinent to genomic/genetic tests where some methodological development may be required. In particular, methods may be required to: improve the balance between timeliness and generalizability of economic evaluations given heterogeneity of tests and platforms; facilitate the inclusion of observational data given increasing stratification of disease; incorporate evidence of personal utility into cost-utility analyses; aggregate the impacts of IFs; and incorporate a utility for reproductive decision making.
This is the first study, to our knowledge, to identify challenges in economic evaluation for all types of genomic/genetic tests and to distinguish challenges pertinent to genomic/genetic tests from those relevant for all diagnostics or all health technologies. Numerous papers have identified challenges to the economic evaluation of genomic/genetic tests, and these are referenced in the main body of this manuscript. Our contribution is to bring previously identified challenges together across all types of genomic/genetic tests and set them out in an accessible manner. A limitation of this study is that, due to inconsistencies in search terminology (Reference Payne, Gavan, Wright and Thompson2) and the use of rapid systematic review methods, it cannot be ruled out that relevant papers may have been missed. However, it is unlikely that a relevant challenge has been missed, as there is considerable overlap between studies.
This study suggests that although some methodological development may be required many challenges require a change of focus or process. Challenges in choice of perspective and time-horizon, complexity, range of costs and choice of outcome metrics can all be tackled by defining the decision problem more closely and focusing on a specific setting and decision maker. The key challenge of under-developed evidence may require process change. More focus on early economic evaluation and more resource for shared evidence generation would appear to be required. Future research in the methodological areas identified would be useful as would process development and evaluation to help the evidence base around genomic tests to be sufficient and relevant to establish both clinical and cost effectiveness.
This article has also set out potential solutions to challenges in the economic evaluation of genomic tests. With the possible exception of the solution suggested to deal with heterogeneity of test costs and platforms, the solutions suggested are not new. Rather, the novelty of our article lies in presenting those solutions together with an assessment of whether methods development in economic evaluation is required. It is important to recognize that certain solutions may not be available to analysts working within the confines of a reference case set by a particular reimbursement agency. Reference cases were often developed primarily for the assessment of pharmaceuticals, and adaptations to the challenges of assessing diagnostic technologies may not have been made. The National Institute for Health and Care Excellence (NICE) in the UK is currently undertaking a wide-ranging review of methods which may go some way toward addressing the challenges presented here (38). In particular, there are proposals for manufacturers to provide schedules of evidence gaps, for an extension of coverage with evidence development (CED), and for the ability to move directly to CED, bypassing a first full assessment. We also recognize that the combination of several of the challenges presented here may create difficulties greater than the sum of their parts. Although analysts may be constrained by a reference case, we would urge careful consideration of the scope of any assessment to ensure both that the analysis is manageable and that the results are comprehensible for the decision maker.
Supplementary Material
To view supplementary material for this article, please visit https://doi.org/10.1017/S0266462322000484.
Conflicts of Interest
H.V. and A.R. are employees of BioClavis Limited, a company which is currently developing molecular diagnostic tests for clinical use. K.O., S.D., N.H., and J.B. are partially funded by BioClavis Limited. R.H. declares no conflicts of interest.
Funding Statement
J.B. is funded by a Knowledge Transfer Partnership 12310 between the University of Glasgow and BioClavis Limited. N.H., S.D., and K.O. are partly funded by Knowledge Transfer Partnership 12310 between the University of Glasgow and BioClavis Limited. H.V. and A.R. are employees of BioClavis Limited. R.H. received no funding associated with this article.