
CAN WE RELIABLY BENCHMARK HEALTH TECHNOLOGY ASSESSMENT ORGANIZATIONS?

Published online by Cambridge University Press:  13 April 2012

Michael Drummond
Affiliation:
University of York. Email: mike.drummond@york.ac.uk
Peter Neumann
Affiliation:
Tufts Medical Center
Bengt Jönsson
Affiliation:
Stockholm School of Economics
Bryan Luce
Affiliation:
United BioSource Corporation
J. Sanford Schwartz
Affiliation:
University of Pennsylvania
Uwe Siebert
Affiliation:
University for Health Sciences, Medical Informatics and Technology
Sean D. Sullivan
Affiliation:
University of Washington

Abstract

Objectives: In recent years, there has been growth in the use of health technology assessment (HTA) for making decisions about the reimbursement, coverage, or guidance on the use of health technologies. Given this greater emphasis on the use of HTA, it is important to develop standards of good practice and to benchmark the various HTA organizations against these standards.

Methods: This study discusses the conceptual and methodological challenges associated with benchmarking HTA organizations and proposes a series of audit questions based on a previously published set of principles of good practice.

Results and Conclusions: It is concluded that a benchmarking exercise would be feasible and useful, although the question of who should do the benchmarking requires further discussion. Key issues for further research are the alternative methods for weighting the various principles and for generating an overall score, or summary statement of adherence to the principles. Any weighting system, if developed, would need to be explored in different jurisdictions to assess the extent to which the relative importance of the principles is perceived to vary. Finally, the development and precise wording of the audit questions requires further study, with a view to making the questions as unambiguous as possible, and the reproducibility of the assessments as high as possible.

Type: METHODS

Copyright © Cambridge University Press 2012

In recent years, there has been growth in the use of health technology assessment (HTA) for making decisions about the reimbursement and coverage of, and guidance on the use of, health technologies. Given this greater emphasis on the use of HTA, it is important to develop standards of good practice and to benchmark the various HTA organizations against these standards. One of the first groups to tackle the issue of standards was EUR-ASSESS, but it did not attempt benchmarking (3). More recently, the EUnetHTA project has addressed the standardization of HTA, both through its work package on the “core HTA” and through that concerned with the transferability of HTA results from setting to setting. However, the main emphasis has been on the harmonization of approaches to HTA, as opposed to benchmarking (9).

Building on these efforts, in an earlier study we outlined a set of fifteen key principles for the conduct of HTA for resource allocation decisions (8). These principles described and discussed elements of good practice in developing the structure and remit of HTA organizations, the methods of HTA, the processes for conducting HTA (e.g., the engagement of stakeholders), and the use of HTA in decision making (e.g., the timeliness of assessments and the link between the analysis and the decision).

Then, in a second study, we made a first, high-level assessment of the extent to which the various principles were supported and used by a sample of HTA organizations (15). Although the second study was intended to focus on whether the principles were widely followed, several observers noted that the data presented could be interpreted by the reader as an attempt to benchmark the various HTA organizations themselves. A third study used the fifteen principles to inform a discussion of their relevance and application in HTA activities in the Central and Latin American region (18).

These two assessments of the application of the fifteen key principles as gross benchmarking metrics raise the question of whether the methodology for benchmarking organizations is robust enough for that purpose (10;16). Therefore, the objectives of this study are to extend our previous work and to explore the methodological challenges associated with benchmarking HTA organizations, using the fifteen key principles as the starting point. In doing so, we also discuss the potential for developing a set of audit criteria and approaches for producing an overall score, or other summary measure of adherence to the principles.

CONCEPTUAL AND METHODOLOGICAL CHALLENGES ASSOCIATED WITH BENCHMARKING HTA ORGANIZATIONS AND SYSTEMS

Accommodating the Varying Role or Remit of HTA Organizations

In discussing the application of the fifteen key principles by HTA organizations, it immediately became apparent that the role, or remit, given to a particular organization or agency could limit its potential to support or use a given principle. A good example is Principle 3, which states that “HTA should include all relevant technologies.” Clearly, if the remit of the HTA organization limits its focus to pharmaceuticals, it would not be able to adopt the principle. Of course, adherence to this principle is important within a given jurisdiction if the benefits from the resources devoted to the assessment of health technologies are to be maximized. But adherence is probably not the responsibility of a given HTA organization. Rather, it represents an issue that needs to be resolved at the level of a given jurisdiction, either by giving a single organization a wider remit, or by establishing a range of organizations or agencies, each covering different types of health technologies. For example, in Washington State in the United States there are at least three separate HTA bodies with limited statutory authority to evaluate a specific set of health technologies. This raises the question of whether it is better to locate all HTA activities in a single organization with common procedures.

Locating all activities in a single organization increases the chances of securing a common approach to the scrutiny of all health technologies; where this is not possible, extra efforts should be made to ensure that all the organizations within a given jurisdiction adopt the principles to a similar extent.

In Sweden, the Swedish Council on Health Technology Assessment (SBU) undertakes HTA studies for a variety of technologies, while the National Board of Health and Welfare is responsible for therapeutic guidelines for specific diseases. The Dental and Pharmaceutical Benefits Agency (TLV) makes reimbursement decisions for prescription drugs (and, from 1 January 2011, in a pilot project, for drugs used in hospitals) and dental procedures. Increasingly, there is collaboration between the agencies in carrying out studies of specific technologies. There is also an ongoing discussion of establishing a “Treatment Benefits Board” in Sweden (4). Thus, our principles may be used for evaluating a specific agency or the “system” of HTA organizations in a given jurisdiction. For example, Principle 3 would be relevant to the ongoing discussion in Sweden about the pros and cons of including medical devices in a formal HTA-based reimbursement process.

Indeed, although the role or remit given to HTA organizations may sometimes be governed by legal or political factors, on occasion it may simply be a result of local convention. In such cases, consideration of the principles could be a vehicle for stimulating local debate on the wisdom, or otherwise, of a particular methodological approach or operating procedure. For example, in the United Kingdom there is currently a debate about whether the National Institute for Health and Clinical Excellence (NICE) should adopt a broader, societal perspective (Principle 7) (6;12).

Another factor that might limit an organization's ability to adhere to the various principles is the nature of its incorporation (e.g., whether public or private). The vast majority of organizations conducting or using HTAs are publicly funded, but some, particularly in the United States, are private (e.g., some insurance plans). In considering Principle 2, “HTA should be an unbiased and transparent exercise,” we would normally expect publicly funded bodies to be transparent in their procedures. However, private organizations, because they are privately financed and governed and because they operate in a competitive environment, might be reluctant, or unable, to reveal certain items of information, such as the price they pay for acquiring technologies.

Therefore, the potential for the role or remit of an HTA organization to constrain its adoption of certain key principles suggests that, in any judgment of the organization's performance against a set of standards, a distinction might be made between “maximum score” and “maximum attainable score.” For example, if the maximum score obtainable through adherence to all the principles were 100, and adherence to Principle 3 (“HTA should include all relevant technologies”) attracted a score of 5, an organization whose remit restricted its ability to score on that principle would have a maximum attainable score of 95, not 100.
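To make this concrete, the following minimal sketch (in Python) shows one way a “maximum attainable score” might be computed. The weights are purely illustrative, echoing the 5-out-of-100 example above, and the function names are our own, not part of any established benchmarking tool.

```python
# Illustrative weights only: Principle 3 worth 5 points, the remaining
# fourteen principles sharing the other 95, as in the example above.
weights = {p: (5.0 if p == 3 else 95.0 / 14) for p in range(1, 16)}

def attainable_max(weights, outside_remit):
    """Maximum attainable score: principles outside the remit are excluded."""
    return sum(w for p, w in weights.items() if p not in outside_remit)

def adherence_pct(scores, weights, outside_remit=frozenset()):
    """Achieved score as a percentage of the attainable, not absolute, maximum.

    scores: {principle: adherence in [0, 1]}
    """
    achieved = sum(scores.get(p, 0.0) * weights[p]
                   for p in weights if p not in outside_remit)
    return 100.0 * achieved / attainable_max(weights, outside_remit)

# A pharmaceuticals-only agency cannot score on Principle 3, so its
# denominator is 95, not 100:
print(attainable_max(weights, outside_remit={3}))  # -> 95.0 (up to rounding)
```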

Thus, it is important to distinguish between whether a specific organization performs well according to its role or remit, and whether it performs well against each principle. A low score is worth observing and discussing, but it does not necessarily indicate that the organization is performing sub-optimally within its remit. There may be many reasons (experience, resources, or the specific remit) why an organization performs better or worse, and the purpose of the benchmarking exercise is to identify good and bad practices and opportunities for improvement, not to criticize individual agencies.

Producing a Summary Statement of Adherence to the Principles

There are various ways in which one might produce a summary statement of an HTA organization's adherence to the principles. The simplest approach would be to present the scores against each of the principles, without necessarily seeking to amalgamate them. For example, the “balanced scorecard” presents achievements in four dimensions and has been recommended as a strategic management system for organizations operating in the private sector (13). Within the healthcare field, a common representation of improvements in health-related quality of life using the SF-36 instrument is to produce a profile, showing improvements in each of the separate dimensions (19).

An alternative would be to attempt to derive a single index or overall score. This approach has both strengths and limitations. In producing an overall score, or “index of attainment,” against the principles, it would also be necessary to consider whether all the principles should have equal weight. A simple addition of scores across the fifteen principles would imply equal weighting. Derivation of weights is important, not only to determine the relative importance of the various principles, but also to explore the trade-offs between and among them. For example, an HTA organization performing well on Principle 10 (“Those conducting HTAs should actively engage all key stakeholder groups”) might struggle to perform well on Principle 13 (“HTA should be timely”), because meaningful engagement with stakeholders can take time. Therefore, the weights attached to the various principles signify the trade-offs being applied within a given jurisdiction. Methods such as discrete choice experiments (DCEs) exist to explore these trade-offs and have been widely applied to analyze choices between healthcare interventions (2) and choices between and among the various attributes of economic evaluations (5).
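As a purely illustrative sketch, assuming per-principle scores scaled to [0, 1] and weights of our own invention, an “index of attainment” might be computed as a weighted mean; with equal weights it reduces to the simple averaging described above.

```python
def index_of_attainment(scores, weights=None):
    """Weighted mean of per-principle adherence scores.

    scores:  {principle: adherence in [0, 1]}
    weights: {principle: relative importance}; equal weights if omitted.
    """
    if weights is None:
        weights = {p: 1.0 for p in scores}  # simple addition = equal weighting
    total = sum(weights[p] for p in scores)
    return sum(scores[p] * weights[p] for p in scores) / total

# A hypothetical trade-off: strong stakeholder engagement (Principle 10)
# bought at the price of slower assessments (Principle 13).
profile = {p: 0.5 for p in range(1, 16)}
profile[10], profile[13] = 0.9, 0.3

print(index_of_attainment(profile))  # equal weights: ~0.51
print(index_of_attainment(profile,
                          {p: 2.0 if p == 13 else 1.0 for p in profile}))  # ~0.50
```

Doubling the weight on timeliness, as in the second call, lowers the index for the same profile, making the jurisdiction's trade-off explicit.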

We do not explore this issue further here, because such trade-offs among the fifteen key principles probably need to be resolved at the local level. For example, different jurisdictions might assign different levels of importance to stakeholder involvement, or reach different conclusions on the balance between methodological rigor and transparency in conducting HTAs, on the one hand, and the timeliness of assessments, on the other. However, a benchmarking exercise makes these choices explicit and transparent across different jurisdictions. Our view is that all fifteen principles are relevant to an international assessment of HTA organizations or systems. Therefore, if a given principle is not deemed relevant in a given jurisdiction, a full justification should be given for adopting this position.

Of course, this raises the related issue of who should assign the weights. Initially, it would make sense to conduct research in several jurisdictions to assess whether a representative set of international weights could be developed. If not, weights could be assigned by a representative sample of the general population in each jurisdiction or, more likely, their elected representatives, although this may also be a matter for local debate. Alternatively, weights could be assigned by different stakeholders, thereby providing information about differences in preferences and priorities within a given jurisdiction.

Developing Unambiguous Audit Criteria

One of the weaknesses of any assessment process is that the assessments themselves may be open to interpretation. For example, in the earlier study on application of the key principles (15), we considered the assessment of whether the HTA organization supported the principle to be reasonably sound, because it was usually based on the organization's own documentation. (For example, it would be apparent whether the organization had developed methodological guidelines for the assessment of costs and benefits of technologies, or had established procedures for engaging with stakeholders.)

However, the assessments of use of the key principles were generally more open to interpretation. For example, judgment was required in assessing whether an HTA organization “actively engaged all key stakeholder groups” (Principle 10). Does “active engagement” mean that draft reports are circulated for comment, or something more substantial, such as stakeholder participation in committees?

To benchmark HTA entities reliably, it is necessary to convert the general principles into a series of valid audit questions that can be answered as unambiguously as possible (possibly in terms of “yes” or “no,” or, more likely, on a Likert scale such as “Never,” “Some of the Time,” “Most of the Time,” “Always”). Because the development of a series of audit questions is an important step in a benchmarking process, some examples are given below.
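For illustration only, responses on the four-point scale quoted above might be mapped to numeric scores as follows; the equal spacing of the scale points is an assumption on our part, not something the principles prescribe.

```python
# Hypothetical mapping of the four-point scale to the unit interval;
# equal spacing between scale points is an assumption.
LIKERT = {"Never": 0, "Some of the Time": 1, "Most of the Time": 2, "Always": 3}

def question_score(response: str) -> float:
    """Normalize a Likert response to [0, 1]."""
    return LIKERT[response] / (len(LIKERT) - 1)

print(question_score("Most of the Time"))  # -> 0.666...
```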

Recognizing the Stage of Development of the HTA Organization

In our earlier study considering the extent of application of the principles (15), we noted that the different HTA organizations considered had been in existence for different periods of time. This factor should probably be taken into account in any benchmarking exercise, because it would be unreasonable to expect a newly formed HTA organization either to have formed a view on various components of its practice, or to have had time to demonstrate that it adhered to all of the principles. For example, it may take time for a new organization to develop a set of methods guidelines or to determine the precise details of its interactions with stakeholder groups. It may also take time for a new organization to argue that some elements of its remit are too constraining on its operations. Also, if an HTA organization has not had the time to build up competence in health economics, it may find it difficult to apply the correct methods for assessing costs and benefits (Principle 5), and without adequate statistical expertise it may find it difficult to characterize uncertainty adequately (Principle 8). However, the fact that there are often reasonable explanations for the performance of a specific organization does not diminish the value of the exercise. The purpose is to provide useful suggestions and arguments for improvement.

AUDIT CRITERIA FOR BENCHMARKING HTA ORGANIZATIONS AND SYSTEMS

Table 1 provides a series of audit questions developed for each of the fifteen key principles. An initial list of questions was developed by one of the authors (MD), based on the text of the original study. This initial list was then discussed at a full meeting of the group, and members were invited to suggest amendments or to propose supplementary questions. The consolidated list was then reviewed and agreed by all members of the group. While the true meaning of adherence to any principle is debatable, most of the questions can be answered unambiguously. With more thought, more such questions could probably be generated. For example, a previous exercise by Schwarzer and Siebert (20) identified ninety characteristics in eight domains in a comparison of HTA agencies in Germany, the United Kingdom, France, and Sweden.

Table 1. Audit Questions Based on the Key Principles

In this preliminary exercise, it was possible to generate more audit questions in relation to some principles as opposed to others. Depending on the scoring system used, an imbalance in the number of questions could lead to more weight being given to some principles relative to others. For example, under a simple scoring system whereby a positive response to any audit question generated the same score, a principle for which more audit questions had been defined would contribute more to the overall score. Factors such as this need to be taken into account when generating any overall score.
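One simple safeguard, sketched below under the assumption that each audit question is scored in [0, 1], is to average within each principle before aggregating, so that every principle contributes a single number regardless of how many audit questions it attracted.

```python
from statistics import mean

def principle_scores(question_scores):
    """Collapse per-question scores so that each principle counts once.

    question_scores: {principle: [per-question scores in [0, 1]]}
    """
    return {p: mean(qs) for p, qs in question_scores.items()}

# Unequal question counts no longer skew the totals:
answers = {3: [1.0, 0.0], 13: [1.0, 1.0, 1.0, 0.0]}
print(principle_scores(answers))  # -> {3: 0.5, 13: 0.75}
```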

Nevertheless, the audit questions presented here suggest that it is possible to move from general principles to a series of focused questions that can be answered in a reasonably unambiguous manner. This would greatly assist in improving the reproducibility of assessments. However, even if the original fifteen principles are accepted as relevant and comprehensive, the development of associated audit questions remains complex and challenging. They may even need to be jurisdiction-specific, although this would limit the ability to make comparisons between entities from different jurisdictions.

Ideally, it would be better to develop and maintain a single set of principles and associated audit questions common across all jurisdictions. Then, depending on the extent to which the importance of the various principles, or of the audit questions that assess adherence to them, varies by jurisdiction, the weights could be varied. Indeed, some principles, or particular audit questions, could receive zero weight if they were inappropriate or unimportant in a given jurisdiction.

On the other hand, it is possible that in some jurisdictions particular principles are viewed as being so important that quantitative targets are considered necessary. An example here is Principle 13 (“HTA should be timely”). The current audit questions ask only whether the organization has a definite time period for reaching decisions and whether this is adhered to. However, in a given jurisdiction it might be considered important to set a particular time limit (e.g., 90 days) for reaching a decision. Indeed, such time limits exist for a variety of regulatory and reimbursement decisions within some jurisdictions. For example, in the United States, the Medicare Modernization Act of 2003 legislated a 6-month review time limit for an initial draft of Centers for Medicare and Medicaid Services (CMS) national coverage decisions (17). In the United Kingdom, the Scottish Medicines Consortium “aims to issue advice on all newly licensed medicines within 12 weeks of products being made available” (21).

DISCUSSION AND CONCLUSIONS

This study discusses conceptual and methodological challenges of benchmarking HTA entities, using the fifteen key principles (8) as a starting point. A benchmarking exercise seems feasible and, with appropriate attention to the points identified above, may be reliable. However, several other issues merit discussion.

First, is a benchmarking exercise worthwhile? It is clear from reactions to our earlier studies that some parties have concerns about setting standards for HTA organizations and assessing adherence to them. These include the fact that some of the standards could be considered idealistic, or in part contradictory (1;11). In addition, some parties question whether appropriate comparisons can be made between agencies, given their varying roles and responsibilities (10). Our view remains that, despite the methodological and practical challenges, such an exercise is warranted, and even necessary, given the extent of public and private resources invested in HTA and its increasing impact on policy and clinical decision making and access to care. Indeed, in several fields benchmarking is regarded as a natural and essential process. In the United States, for instance, hospitals, health plans, physicians, nursing homes, and pharmacies are routinely benchmarked against standards and against each other; the National Committee for Quality Assurance (NCQA) performs quality audits on health insurance and other payers, and produces the Healthcare Effectiveness Data and Information Set (HEDIS) (14).

Second, are the fifteen key principles the appropriate starting point? The principles were developed to build upon previous attempts in the literature to develop standards for HTA. However, they were also developed to encourage debate and stimulate closer scrutiny. Benchmarking exercises, and understanding differences among HTA organizations in their acceptance and use of the principles, will be useful in determining the need to reconsider, adapt, revise, or reject individual principles.

Third, who should undertake the benchmarking? The objective of benchmarking any professional activity should be self-improvement rather than naming and shaming. This suggests that any successful benchmarking exercise is likely to require some level of involvement from the HTA entities themselves. However, self-regulation and self-assessment have well-recognized limitations. Although self-benchmarking is clearly an option, it would be preferable also to involve an external, independent body working with the organizations concerned. This is the approach that has been followed in the benchmarking of drug licensing agencies, involving the CMR International Institute for Regulatory Science, an independent not-for-profit organization (7). Several other organizations engage in voluntary accreditation exercises, including the International Society for Quality in Health Care (ISQua), which has certified several organizations as “high quality,” based on a set of principles that all organizations in the field accept as the ones to comply with. The various methodologies should also be reviewed for their suitability for this task.

The question posed in this study was “Can we reliably benchmark HTA organizations?” We have shown how the relative performance of HTA organizations might be assessed against an explicit set of principles and have highlighted some of the shortcomings of such an approach. Of course, this does not prove that this is the most relevant set of principles, or that they should have equal weight in all jurisdictions. Nevertheless, we believe that the transparency and explicitness of the process provides a useful starting point for a discussion of good practice within HTA organizations.

However, our proposals also suggest an agenda for further research. First, research is required into the alternative methods for weighting the various principles and for producing a summary of performance, such as an overall score. A weighting system, once developed, would then need to be tested in different jurisdictions to assess the extent to which the relative importance of the principles is perceived to vary.

Second, the development and precise wording of the audit questions requires further study and refinement, with a view to making the questions as unambiguous as possible, and the reproducibility of the assessments as high as possible.

CONTACT INFORMATION

Michael Drummond, BSc, MCom, DPhil, Professor of Health Economics, Centre for Health Economics, Alcuin A Block, University of York, Heslington, York, United Kingdom

Peter Neumann, ScD, Professor of Medicine, Tufts Medical Center, Boston, Massachusetts

Bengt Jönsson, PhD, Professor of Economics, Stockholm School of Economics, Stockholm, Sweden

Bryan Luce, PhD, MBA, Senior Vice President, United BioSource Corporation, Washington, DC

J. Sanford Schwartz, MD, Professor of Medicine, Health Care Management, and Economics, Wharton School and Leon Hess Professor of Internal Medicine, University of Pennsylvania, Philadelphia, Pennsylvania

Uwe Siebert, MD, MPH, MSc, ScD, Professor of Public Health, University for Health Sciences, Medical Informatics and Technology, Hall, Austria

Sean D. Sullivan, PPh, PhD, Professor of Pharmacy and Health Services, Pharmaceutical Outcomes Research and Policy Program, University of Washington, Seattle, Washington

CONFLICTS OF INTEREST

Michael Drummond, Bryan Luce, Bengt Jönsson, Peter Neumann, Uwe Siebert, and Sean Sullivan have received funding for membership in the International Group for HTA Advancement from Merck and Co., and Michael Drummond has also been a member of an expert committee for NICE. J. Sanford Schwartz has not declared his possible conflicts of interest.

REFERENCES

1. Banta HD. Commentary on the article ‘Key principles for the improved conduct of health technology assessments for resource allocation decisions.’ Int J Technol Assess Health Care. 2008;24:362-365.
2. Bridges JP, Hauber B, Marshall D, et al. Conjoint analysis applications in health: A checklist. A report of the ISPOR Good Research Practices for Conjoint Analysis Task Force. Value Health. 2011;14:403-413.
3. Busse R, Orvain J, Velasco M, et al. Best practice in undertaking and reporting health technology assessments. Int J Technol Assess Health Care. 2002;18:361-422.
4. Carlsson P, Alwin J, Brodtkorb T-H, et al. Nationellt system för utvärdering, prioritering och införandebeslut av icke-farmakologiska sjukvårdsteknologier – en förstudie [National system for evaluation, prioritization and introduction decisions for non-pharmacological health technologies – a pilot study]. CMT Report 2010:1. Linköping: Linköping University; 2010.
5. Chiou C-F, Hay JW, Wallace JF, et al. Development and validation of a grading system for the quality of cost-effectiveness studies. Med Care. 2003;41:32-44.
6. Claxton K, Walker S, Palmer S, Sculpher M. Appropriate perspectives for health care decisions. CHE Research Paper. York: Centre for Health Economics, University of York; 2010.
7. CMR International Institute for Regulatory Science. Agenda 2010. Expediting patients’ access to new therapies. London: CMR International; 2010.
8. Drummond MF, Schwartz JS, Jönsson B, Luce BR, Neumann PJ. Key principles for the improved conduct of health technology assessments for resource allocation decisions. Int J Technol Assess Health Care. 2008;24:244-258.
9. EUnetHTA Joint Action. www.eunethta.org (accessed October 26, 2011).
10. Gibson JM, Little A. Evaluating HTA principles. Int J Technol Assess Health Care. 2010;26:428-429.
11. Hailey D. Commentary on the article ‘Key principles for the improved conduct of health technology assessments for resource allocation decisions.’ Int J Technol Assess Health Care. 2008;24:365-366.
12. Johannesson M, Jönsson B, Jönsson L, Kobelt G, Zethraeus N. Why should economic evaluations of medical innovations have a societal perspective? OHE Occasional Paper. London: Office of Health Economics; 2009.
13. Kaplan RS, Norton DP. Using the balanced scorecard as a strategic management system. Harvard Business Review. 1996;January-February:75-85.
14. NCQA. About NCQA. http://www.ncqa.org/ (accessed May 13, 2011).
15. Neumann PJ, Drummond MF, Jönsson B, et al. Are key principles for improved health technology assessment supported and used by health technology assessment organizations? Int J Technol Assess Health Care. 2010;26:71-78.
16. Neumann PJ, Drummond MF, Jönsson B, et al. Evaluating HTA principles. Letter to the Editor. Int J Technol Assess Health Care. 2010;26:429-430.
17. Neumann PJ, Kamae MS, Palmer JA. Medicare's national coverage decisions for technologies, 1999-2007. Health Aff (Millwood). 2008;27:1620-1631.
18. Pichon-Riviere A, Augustovski F, Rubinstein A, et al. Health technology assessment for resource allocation decisions: Are key principles relevant for Latin America? Int J Technol Assess Health Care. 2010;26:421-427.
19. QualityMetric. The SF-36v2 health survey. www.qualitymetric.com (accessed October 31, 2011).
20. Schwarzer R, Siebert U. Methods, procedures and contextual characteristics of health technology assessment and health policy decision making: Comparison of health technology assessment agencies in Germany, United Kingdom, France and Sweden. Int J Technol Assess Health Care. 2009;25:305-314.
21. Scottish Medicines Consortium. Submission process. http://www.scottishmedicines.org.uk/ (accessed May 13, 2011).