Published online by Cambridge University Press: 28 May 2004
Objectives: To elucidate important differences between a health technology assessment (HTA) and a systematic review, using an HTA of positron emission tomography (PET) as an example.
Methods: Interviews with seventeen individuals who were authors or users of the PET HTA.
Results: Those interviewed identified seven areas in which HTAs often differ from traditional systematic reviews: (i) methodological standards (HTAs may include literature of relatively poor methodological quality if a topic is of importance to decision-makers), (ii) replication of previous studies (relatively common for HTAs but not systematic reviews), (iii) choice of topics (more policy oriented for HTAs, while systematic reviews tend to be driven by researcher interest), (iv) inclusion of content experts and policy-makers as authors (policy-makers more likely to be included in HTAs, although there are potential conflicts of interest), (v) inclusion of economic evaluations (more often with HTAs, although economic evaluations based upon poor clinical data may not be useful), (vi) making policy recommendations (more likely with HTAs, although this must be done with caution), and (vii) dissemination of the report (more often actively done for HTAs).
Conclusions: This case study of an HTA of PET scanning confirms that HTAs are a bridge between science and policy and require a balance between the ideals of scientific rigor and the realities of policy making.
Health technology assessment (HTA) is a multidisciplinary field that studies the medical, social, ethical, and economic implications of the development, use, and diffusion of health technologies (6). It has been described as the “bridge between the world of research and the world of decision making” (1). HTAs are being performed with increasing frequency and influence decision making in many jurisdictions. To effectively influence policy-makers, the authors of HTAs must not only strive for scientific accuracy but must also be aware of other issues such as the optimal timing of the reports' release, their political sensitivity, who the important decision-makers are, and how best to disseminate the results (8).
The Institute for Clinical Evaluative Sciences (ICES) recently conducted a health technology assessment of positron emission tomography (PET) scanning (5). After the report was completed and distributed, key stakeholders were interviewed about their perceptions of the approach that had been adopted. In this study, we describe their specific comments about the ICES HTA of PET and discuss the implications of their comments for HTAs in general, with particular emphasis upon the difference between a systematic review of the literature and an HTA.
In the summer of 2000, ICES was asked by the Committee on Technical Fees (CTF) to perform an HTA to determine the incremental benefit of PET compared with existing diagnostic technologies. ICES is an Ontario-based health services research organization, funded at arm's length by the Ontario Ministry of Health and Long-term Care (MOHLTC) and external grants (5). The CTF was a committee consisting of representatives from the MOHLTC, the Ontario Medical Association (OMA), and the Ontario Hospital Association (OHA). It had been formed recently to evaluate the utilization of diagnostic tests in Ontario because their use was believed by some to be increasing at an inappropriately rapid rate. ICES and the chair of the CTF signed a contract guaranteeing that ICES could publish the results irrespective of their content and committing ICES to complete the report by May 2001.
The core team that conducted the HTA report consisted of eleven individuals, including a chair (a general internist), two oncologists and geriatricians, a neurologist with training in PET, a cardiologist, a pharmacist with health economics training, and research support staff. All clinicians except the neurologist had formal education in epidemiology or health services research but no experience with PET scanning. A systematic review of the peer-reviewed and gray literature was conducted. The search targeted articles and reports from the years 1975 to 2000 that investigated the potential usefulness of PET in oncology, neurology, and cardiology (see full report for details; 5). The findings of those original articles that met predefined methodological standards were tabulated in detail, and previous HTAs by other organizations were each summarized. Approximately three months before the release of the report, stakeholders were invited to a one-day meeting to discuss their opinions of a draft of the report. These stakeholders were chosen to reflect the views of various constituencies, including nuclear medicine, radiology, oncology, cardiology, neurology, general internal medicine, health economics, the MOHLTC, and the public. Stakeholders provided the authors with written comments and verbal input at the time of the meeting, and some subsequently contacted the authors to provide references that they believed had been inappropriately omitted from the draft report.
The comments that the authors considered valid were incorporated into the final report, which was delivered to the CTF on May 31, 2001. The report's main findings were that (i) the methodological quality of PET studies was disappointingly low, (ii) despite the shortcomings of the literature, PET might be useful and cost-effective for patients with selected cancers and intractable seizures, (iii) PET was not currently indicated for the investigation of patients with dementia or cardiac disease, (iv) no studies about the cost-effectiveness of PET in Ontario or Canada could be found, (v) in the future, PET might be found to have a role in determining which patients with severe heart failure would benefit from revascularization, and this possibility should be a major focus of research into PET, and (vi) if PET scanning is introduced in Ontario as a regular service, all patients receiving a PET scan should be asked to enroll in a prospective registry, to evaluate the real-world use of PET (5).
Members of the team that produced the PET HTA (n = 11) and six stakeholders (two nuclear medicine physicians, one health economist, one MOHLTC representative, and two internists) were interviewed in person or by telephone approximately two months after the completion of the report. All interviews were conducted by one individual and were taped. An interview guide was followed, focusing on the following general areas related to the ICES PET report: (i) general impressions, (ii) motivation for the report, (iii) methodology used, (iv) interpretation of available information, and (v) implementation of the findings of the report. All topics in the interview guide were covered with every interviewee. Tapes of the interviews were transcribed in point form, and responses were tabulated into thematic categories.
Analysis of the transcripts revealed that those interviewed had identified seven areas of potential tension between the scientific rigor that is ideal in a systematic review of the literature and the realities of policy making that impact upon the conduct of an HTA (Table 1).
For the ICES report, the authors used a previously published scheme to rate the methodological quality of the primary studies identified by the literature review, which ranged from Grade A to D (5). Before starting the literature review, a decision was made to only summarize Grade A and B studies in the report. A Grade A study was prospective, had broad generalizability, and no significant methodological flaws. A Grade B study was a prospective study with narrower generalizability, but few and well-described flaws. There were several Grade B studies for oncological indications. However, there were no Grade A or B studies in cardiology or for intractable seizures. When this was realized, a decision was made to summarize the “highest quality” Grade C studies for those two indications.
A majority of those interviewed (fourteen of seventeen) labeled the general quality of the literature about PET as poor. Small sample sizes, few Canadian studies, and a dearth of good quality cardiology and cost-effectiveness studies were identified as major drawbacks of the literature. A minority (three of seventeen) believed that poor literature had been inappropriately included in the HTA. Concerns were also expressed that different standards of evaluation had been applied across the different clinical areas. Four respondents believed that cardiology had been held to a higher standard, and two suggested that the report's recommendations about the use of PET for intractable seizures incorporated a lower standard of evidence than was used for oncology and cardiology.
It is universally agreed that methodologically poor studies often overestimate the benefits of a therapy or the value of a diagnostic test. Many methodologists argue that only Grade A studies should be included in a systematic review (9), because to do otherwise would expose the unsuspecting public to unproven technologies, perhaps at the expense of other technologies for which there is better evidence. On the other hand, clinicians and manufacturers of technologies often argue that it is difficult to conduct high-quality studies early in the life of a new technology and that ignoring a large body of literature suggesting a technology is beneficial unethically deprives the public of access to that technology.
Although no Grade A or B studies were found for patients with intractable seizures, this shortcoming was regarded as understandable because of the small number of patients with such disorders, making it difficult to conduct studies with large sample sizes. Thus, some studies of poorer methodological quality in patients with seizures were considered in the HTA. This decision was also likely influenced by the authors' empathy for patients with this condition—many of whom are very young, face a difficult decision about risky surgery, and have few alternatives. The report concluded that PET scanning should have a limited role in the investigation of patients with intractable seizures being considered for surgery. Most authors felt comfortable with this conclusion because there are few such patients, and the use of PET for this indication would not have large overall resource implications.
At the beginning of the study there were no Grade A or B studies for cardiologic indications. Unlike patients with intractable seizures, there are many patients with severe heart failure, so the paucity of good evidence could not be justified by the lack of patients to be studied. Furthermore, the resource implications of using PET for patients with heart failure are potentially enormous. Therefore, the authors believed that evidence from Grade A or B studies was required before PET could be recommended for patients with heart failure. However, there was considerable pressure from some cardiologists to introduce PET based upon the existing literature. It seemed unwise to dismiss these views with one sentence in the report declaring that the literature was of insufficient quality to be considered seriously. Thus, it was decided to review those Grade C studies (all retrospective) that were believed to be of highest quality, but not to endorse PET scanning for patients with heart failure because of the lack of good-quality studies. While the report was being written, a randomized trial in patients with heart failure was published that found no effect of PET upon mortality, compared with conventional imaging (10). Although the trial had some shortcomings, particularly related to patient selection, this study reinforced the authors' decision not to recommend PET on the basis of the Grade C evidence.
Between 1990 and 2000, at least thirty-six other published HTAs had investigated the clinical benefit of PET, many of which used well-established methods (5). One report published in 1999 was from Ontario, authored by the Council of Medical Imaging of the OMA (2). However, it did not meet many of the accepted standards of a high-quality systematic review. In particular, it had no clear literature search strategy and did not rank articles in terms of methodological quality. After considerable discussion, the principal authors of the ICES report decided that they should perform their own systematic review of the literature, while at the same time summarizing the results of the other HTAs in their report.
Some of the interviewed authors of the ICES report were concerned that performing a new HTA led to unnecessary duplication of previous work. Two of the stakeholders argued that the ICES report did not adequately acknowledge the contribution of the OMA study. Conversely, some of the authors believed that the OMA study was so methodologically flawed and inferior to other HTAs that it would have been a mistake to feature it prominently in the ICES report, despite its being an Ontario-based study.
At first blush, it seems a great waste of effort for multiple groups to use similar methods to review the same literature about the same technology. However, the ICES authors believed that their conclusions about PET would inevitably be challenged no matter what their findings—by the government if the report suggested that PET was an established technology and should be considered as an insured service and by physicians and patients if the evidence for PET's beneficial impact upon patient care was not yet established. In both circumstances, it would be crucial for the authors to have a thorough grasp of the literature and, in particular, of the merits of individual studies. The authors believed that they could only attain the appropriate level of understanding if they had reviewed the individual studies themselves and that thorough knowledge of an HTA report produced by others would be insufficient. This approach is quite different from that of a “pure” systematic review, which does not need to be repeated unless the previous review was poorly done, or important new information needs to be added to a previous review.
Authors of an HTA must determine the scope of the material to be reviewed before the study is started. Before the HTA was started, it was clear that the best evidence about the impact of PET upon clinically important outcomes was in oncology. However, decisions had to be made about which cancers to include and whether to include studies that evaluated the use of PET for determining the size of the radiotherapy field (rather than only focusing on its diagnostic role in oncology), the use of PET in patients with possible dementia, and its use in cardiology. The ICES authors chose to evaluate the diagnostic use of PET for those relatively common cancers for which the evidence was the strongest (lung, colon, breast, head and neck, melanoma, and lymphoma). At the suggestion of some of the oncologists at the stakeholders meeting, the use of PET in patients with brain tumors was later added (however, no Grade A or B studies were found). In addition, the authors decided to evaluate PET's use in dementia and heart failure, despite relatively poor-quality evidence, because both are common conditions that were prominently mentioned in the OMA report. It was also recognized that the prevalence of heart failure is increasing and that there are no good diagnostic tests for determining which patients benefit from revascularization. On the other hand, the use of PET to guide radiotherapy and its use to investigate potential coronary artery disease were not included: the former because of the lack of good-quality trials, and the latter because one high-quality economic evaluation had found that PET had an extremely unattractive cost-effectiveness ratio compared with the many diagnostic modalities already available (4).
Some respondents were concerned that not evaluating PET for conditions for which the literature was poor could delay the future clinical use of PET (six individuals) and curtail future research opportunities (seven individuals). On the other hand, four individuals contended that the report would actually encourage new research into PET.
The criteria used to delineate the scope of the study are often different for HTAs and traditional systematic reviews. For systematic reviews, the most important features tend to be the amount and quality of the literature and the interest of the reviewer. For an HTA, these factors are modified by the need for a timely report and the need to address issues of importance to policy-makers who can be pressured by industry, patients, and clinicians to introduce a technology in the absence of good-quality evidence. At the same time, the HTA authors must be wary of the financial interest of payers in minimizing the evidence in favor of a technology, if that technology is expected to add considerably to their budget. The pressure to introduce PET, combined with the ICES authors' desire to include some of the topics considered in the OMA report, led the authors to include intractable seizures, dementia, and heart failure in the ICES HTA, despite the overall poor quality of that literature. On the other hand, the need to complete the report in approximately six months contributed to the authors' decision not to review PET for the evaluation of coronary artery disease.
It is interesting that several interviewees believed that, by not including potential indications for which the literature was poor, the report might inhibit the future use of PET for those indications. This finding seems to reflect a lack of confidence in the degree to which evidence is used in decision making. In the ideal world, the literature would be constantly updated, and once evidence for the usefulness of PET for a “new” indication is available, PET would be introduced for that indication. However, this rarely occurs. Some of those interviewed suggested that they were worried that decision-makers might approve PET for the few indications mentioned in the report and then use the report in the future to deny PET for other indications. Their concerns may be valid, but it is not clear that including poor-quality evidence in the report would remedy this problem. However, there can be a “Catch-22” associated with the evaluation and introduction of an expensive technology such as PET. Payers would like high-quality evidence before PET scanning diffuses into clinical use, yet high-quality evidence cannot be generated without PET scanners! If the government only funds PET scanners for indications that have been established, Ontario researchers will never be able to contribute meaningfully to the “cutting edge” PET literature.
The OMA report was largely written by nuclear medicine specialists and was perceived by some as more of an advocacy document for PET than a systematic review of the literature. On the other hand, in an era of seemingly insatiable demands on limited resources, it is easy to suspect that policy-makers would have the opposite bias. Given the desire to make the ICES report as unbiased as possible (and also for it to be perceived as unbiased), the ICES authors decided that neither PET experts nor decision-makers would be included as authors.
Those interviewed appeared to be satisfied that the report was unbiased: thirteen respondents indicated that there was no or little bias in the selection of the literature. Nine interviewees approved of the decision not to include nuclear medicine physicians as authors, commenting that their omission decreased the potential that the report would be biased in favor of PET. On the other hand, two argued that nuclear medicine physicians should have been more involved. One believed that the authors had been “creative” in their use of terms describing PET, a feature that might lead informed readers to question the authors' expertise in PET. The other person suggested that there was no inherent disadvantage to an advocate's potential contributions:
“I don't think that advocates necessarily always lie. We could have helped in several areas as the drafts went along. It just might have been a shorter production process if there had been someone with the authors' group to explain technical issues. And it's quite clear from reading the document that there was no insider knowledge about PET on the writing team. The terminology is creative.”
Some of the authors indicated that their lack of knowledge about PET had necessitated a “steep learning curve” when writing the report. The majority of those interviewed believed that not including decision-makers as authors had been reasonable. Two individuals believed that policy-makers should have been more available for feedback during the preparation of the report. One worried that having no MOHLTC representation on the committee might decrease the Ministry's commitment to implement the recommendations. Another individual suggested that hospital administrators should have been invited to the stakeholders meeting.
Systematic reviews often include content experts as authors but rarely include policy-makers. HTAs may include neither, one, or both. It is important to recognize that all authors are biased. Some consistently favor methodological purity over clinical common sense, whereas others appear driven by a desire to introduce a new technology no matter what the evidence or cost. Most lie between these two extremes, and there can be no universal rule about whom to include as the authors of an HTA, aside from attempting to have a team of authors that can provide a balanced summary and interpretation of the literature. By including neither decision-makers nor nuclear medicine physicians as authors, ICES reasoned that the two groups with the largest potential biases would be treated equitably. An HTA of PET from Quebec took a different approach from that of ICES by including both nuclear medicine physicians and policy-makers on its authorship team (3).
In retrospect, hospital administrators represent an important constituency that was not included in the stakeholders meeting, because hospitals will have to provide considerable funds to support the capital and operating expenses of a PET scanner. It is also worth noting that the Ontario and federal governments wish to attract high-technology industries to stimulate economic growth. Their interest in PET technology extends far beyond the issues of diagnostic accuracy, impact upon outcomes, and cost-effectiveness considered in the ICES PET report, to issues of societal innovation and economic growth. No government or industry representatives holding these views were asked to comment on the ICES report, which illustrates how HTA can still function in a “health-care silo.”
It is generally agreed that economic evaluations should be part of a complete HTA. The ICES authors found no economic evaluations of PET performed in Canada and, in general, perceived the quality of the published economic evaluations to be poor, largely because there was little convincing evidence about the effect of PET upon patient outcomes. The ICES authors decided to summarize the small number of economic evaluations of reasonable quality, but not to perform their own economic evaluations (5).
The vast majority of those interviewed believed that the ICES report had dealt with the existing cost-effectiveness literature appropriately, although one stakeholder believed that the authors should have conducted their own economic evaluation.
An estimate of cost-effectiveness and overall budget impact is important for decision-makers and, thus, is included in most HTAs. However, this is usually not a focus of systematic reviews. It is generally agreed that evidence about diagnostic accuracy is likely to be generalizable across countries but that clinical practice and the costs of resources are not. However, the major limitation of the published economic evaluations of PET was not the lack of information about Ontario resource use, but the poor quality of the information about PET's effect upon patient outcomes, without which it is impossible to accurately estimate cost-effectiveness.
The authors decided not to conduct their own economic evaluations, because their systematic review of the literature found no convincing evidence about PET's effect upon outcomes. They believed that the results of economic evaluations based upon poor clinical evidence would not be worth the considerable effort needed to produce them.
Authors of an HTA must determine how directive they should be in their conclusions. In this case, the authors considered whether the report should simply be an up-to-date review of the literature, whether it should outline a series of policy options, or whether it should actually suggest how many PET cyclotrons and scanners (if any) should be introduced in Ontario. It was decided that the ICES report would use administrative databases to estimate the number of patients with cancers and intractable seizures in Ontario who might benefit from PET.
Most interviewees believed that the ICES approach was reasonable, although one individual worried that the lack of clear policy recommendations might have diminished the potential influence of the report.
It is likely that policy-makers pay more attention to an HTA report if it actually suggests a course of action (e.g., “Ontario should introduce eight PET scanners during the next two years”) than if it simply provides a systematic review of the literature. The problem with the former approach is that the authors of an HTA rarely, if ever, are aware of the evidence about all the other technologies or programs that could be funded instead of the one they have evaluated. At the time the ICES PET HTA was being written, there was considerable concern about excessive waiting lists for radiotherapy and elective hip and knee arthroplasty, over-crowded emergency rooms, and lack of adequate home care in Ontario. Thus, it seemed inappropriate for the ICES authors to suggest that PET should be introduced, when they had not compared PET with other competing demands on scarce resources. In addition, the PET authors were not in a position to balance the “medical” aspects of PET with the possible impact of investing in PET technology on the economy of Ontario. The authors hoped that describing the number of patients who would be eligible for PET for various clinical indications would provide decision-makers with useful information about the amount of resources needed if PET were introduced for those indications.
Upon the completion of an HTA, the authors are faced with the choice of whether they should consider their job finished, or whether they should lobby policy-makers to act quickly and decisively on the study's findings. The ICES authors decided to submit their report to the CTF and post it on the ICES Web site but not to actively disseminate it.
Many of those initially interviewed were pessimistic about the report's likely impact—eight individuals indicated that they were unsure or did not think that the report would have any influence on policy. However, during the year after publication, individuals within the Ministry of Health have used the report to push for the implementation of PET in a limited context, accompanied by research and evaluation (7). Two decision-makers in the Ministry of Health were interviewed approximately a year after the publication of the report to determine whether the HTA had influenced Ministry policy. In general, they approved of ICES's hands-off approach to advocacy, although one thought that ICES might have used even weaker language in its report about the usefulness of PET.
Because simply publishing a report with no active dissemination often has little impact upon policy, some may view publication without any dissemination efforts as an abdication of the authors' responsibility to have their findings incorporated into decision making. This criticism is more commonly aimed at HTAs than systematic reviews, because HTAs are expected to influence policy. On the other hand, an HTA team that is too aggressive in pushing its findings may alienate decision-makers, which can be counterproductive. As well, it is not always clear who the true decision-makers are. Individual decision-makers change frequently, they can be hard to reach, and active dissemination is time consuming and expensive. Thus, the appropriate approach to dissemination varies markedly, depending on the circumstances. Despite the most passive approach to dissemination possible, the ICES report still appeared to have considerable influence (7). This experience illustrates the importance of a report reaching the right decision-makers at the right time.
Unlike systematic reviews, HTAs are not solely concerned with evaluation of the scientific evidence. By using an actual HTA as an example, this study has explored seven key differences between an HTA and a systematic review (Table 1). These relate to methodological standards, replication of previous studies, breadth versus depth, inclusion of content experts and policy-makers, performance of economic evaluations, making policy recommendations, and active dissemination.
HTAs cross the ideological divide between scientific investigation and political decision making, and authors of HTAs should recognize that policy-makers consider “impartial” investigators as political stakeholders too, namely stakeholders in the scientific community, which has its own set of values and priorities. Just as a court of law weighs the words of an “impartial” expert witness against the testimony of those more emotionally entangled with the details of the case, so too the verdict on the dissemination of a health technology ultimately encompasses scientific evidence, competing social and professional priorities, and the public's views.
The authors thank all those individuals who agreed to be interviewed for this study.
Table 1. Differences between Systematic Reviews and Health Technology Assessments (HTAs)