New technologies may enter the healthcare market with little evidence to justify their use. In addition, new evidence about such technologies may reach the public domain in rapid succession. This is especially true in the area of rapidly evolving technologies (RETs). This rapid change and rapid dissemination of new, and at times conflicting, data present numerous challenges for those responsible for health technology assessment (HTA).
The National Institute for Health and Clinical Excellence (NICE) is the independent organization responsible for producing national guidance on the use of selected new and established health technologies for the National Health Service (NHS) in England and Wales. The guidance issued about the use of a technology is based on an appraisal, which draws on several sources, including a technology assessment report (TAR) and input from health professionals, patient groups, and healthcare users (23;24). The purpose of the TAR is to independently assess the available clinical and cost evidence to inform the appraisal process. However, NICE guidance is often needed before RETs are integrated into clinical practice. This timing may mean that limited published data are available to inform the decision process and, therefore, that TAR teams are required to identify, assess, and include data from conference abstracts and presentations in their assessment reports.
Conference abstracts and presentations are used to inform the research community about planned or ongoing trials, as well as providing a forum for the rapid release of important new findings. New media technologies now mean that data released at such conferences are often broadcast to international audiences, providing an increased opportunity for the dissemination of information and encouraging early uptake of new technologies.
However, from the perspective of the review teams, conference abstracts and presentations are difficult to locate. They may be poorly indexed in standard databases typically searched during the systematic review process (e.g., MEDLINE, EMBASE). In addition, these databases rarely index journal supplements, in which studies available as conference abstracts often appear. Extended search strategies, including additional sources, are often required to identify these studies (25;26).
The limited methodological detail made available in conference abstracts and presentations poses a challenge to systematic reviewers. Data in abstracts or presentations may not be complete; such sources may report only interim analyses or short-term follow-up data. In addition, there is evidence that inconsistencies in results, as well as in the specification of primary outcome measures, may occur between conference abstracts/presentations and subsequent full reports (2;5;13;29;30). A recent NICE review of literature searching for clinical and cost-effectiveness studies used in health technology assessments suggests that further research is needed to better understand the effect of these changes on the conclusions of a review (26). The aims of this case study were to examine the consistency of reporting between conference abstracts and presentations and subsequent full publications, the ability to judge the methodological quality of trials from conference abstracts and presentations, and the effect of inclusion or exclusion of data from these sources on the pooled effect estimates in a meta-analysis.
METHODS
Selection of Case Studies
Two researchers (Y.D. and T.W.) assessed the eligibility of case studies resulting from an audit of published TARs (9) on a case-by-case basis. TARs were eligible if they had (i) informed the NICE appraisal process and been published as an HTA monograph by the end of October 2004, (ii) evaluated RETs, (iii) identified and included randomized controlled trial (RCT) data from conference abstracts, and (iv) included a meta-analysis where data from abstracts were included.
The term abstract in this context refers to initial, interim, or final reports of research studies presented at conferences, meetings, workshops, and symposiums, usually published in non-peer reviewed form in conference proceedings or journal supplements (or available after the conference through Internet-based sites). Full-text articles refer to reports of research studies published in full in a journal or journal supplement.
We identified the abstracts used and any subsequent publications by searching electronic databases for the first author (and trial investigators when necessary). The quality of the published reports was assessed against the criteria in Centre for Reviews and Dissemination Report 4 (16); the assessment was carried out independently by two researchers (Y.D. and S.D.).
We carried out sensitivity analyses of the key outcomes identified in the TAR to assess the impact of any data discrepancies by undertaking meta-analyses under the following three scenarios:
- Scenario 1: data included from all sources included in the original meta-analysis in the review (i.e., including both abstracts/presentations and full publications);
- Scenario 2: data included only from full publications available at the time the review was originally released (i.e., excluding abstracts/presentations); and
- Scenario 3: data included from all full papers published to date (i.e., excluding abstracts/presentations).
We aimed to compare the results of Scenarios 2 and 3 against those in the original review (Scenario 1).
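As an illustration only, the logic of these scenario analyses can be sketched in a few lines of Python. The study data, field names, and the pooled_or helper below are hypothetical and do not reproduce the review's actual data or software; the sketch merely shows how studies can be filtered by evidence source before fixed-effect pooling of odds ratios, as described under Data Analysis.

```python
import math

# Hypothetical study-level data: events (e) and totals (n) per arm, plus
# the source of the data. Names and numbers are illustrative only.
studies = [
    {"name": "Trial A", "e1": 10, "n1": 120, "e2": 18, "n2": 118, "source": "abstract"},
    {"name": "Trial B", "e1": 7,  "n1": 100, "e2": 15, "n2": 102, "source": "full"},
    {"name": "Trial C", "e1": 12, "n1": 150, "e2": 20, "n2": 149, "source": "full"},
]

def pooled_or(subset):
    """Fixed-effect (inverse-variance) pooling of log odds ratios."""
    num = den = 0.0
    for s in subset:
        # Add a 0.5 continuity correction to every cell to guard against zeros.
        a = s["e1"] + 0.5              # events, arm 1
        b = s["n1"] - s["e1"] + 0.5    # non-events, arm 1
        c = s["e2"] + 0.5              # events, arm 2
        d = s["n2"] - s["e2"] + 0.5    # non-events, arm 2
        log_or = math.log((a * d) / (b * c))
        weight = 1.0 / (1/a + 1/b + 1/c + 1/d)   # inverse of the variance
        num += weight * log_or
        den += weight
    theta = num / den
    se = math.sqrt(1.0 / den)
    # Return the pooled OR and its 95% confidence interval.
    return tuple(round(math.exp(x), 3)
                 for x in (theta, theta - 1.96 * se, theta + 1.96 * se))

# Scenario 1 pools all sources; Scenarios 2 and 3 pool full publications only
# (those available at review time, or those published to date, respectively).
print("Scenario 1 OR (95% CI):", pooled_or(studies))
print("Scenario 2 OR (95% CI):", pooled_or([s for s in studies if s["source"] == "full"]))
```

Scenario 3 follows the same pattern as Scenario 2, differing only in the set of full publications admitted to the pool.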
RESULTS
We identified thirteen TARs of RETs (3;4;6–8;10;11;15;18–21;28). Of these, only one TAR had identified and included RCT data from conference abstracts and presentations and carried out a meta-analysis that included data from these sources (11). This TAR was used as a case study.
Case Study
This review (11) was part of the systematic review of coronary artery stents published in September 2004 and was conducted to assess the clinical and cost-effectiveness of the use of drug-eluting stents (DES) compared with non-DES in patients with coronary artery disease.
Of the twelve included RCTs, only two were fully published in peer-reviewed journals at the time of the submission of the DES review (February 2003). Sources of information primarily included conference abstracts and presentations or reports from Internet-based conference sites. In total, thirty conference abstracts and twenty-three presentations were identified in the review.
We identified two further trials that had been published in full in peer-reviewed journals by the time the NICE guidance on the use of coronary artery stents was issued in October 2003. By the end of 2004, all but one trial had been published in full, and we identified four further conference presentations of three trials included in the review.
Quality Assessment
The ability to judge the methodological quality of studies was limited by the information available in the abstracts at the time of preparation of this review. In the original review, quality assessment was carried out for eleven studies using conference abstracts and presentations and data provided by pharmaceutical manufacturers. Only one trial was available as a published journal article, and another was published in full after the quality assessment was completed.
The overall quality of reporting in abstracts and presentations was generally poor, especially in abstracts, possibly because of limited space (see Table 1). In particular, none of the abstracts or presentations described the method of randomization or allocation concealment, and only a small number of abstracts (five of thirty) reported baseline characteristics and the comparability of trial arms. There was no mention of blinding in more than half of the abstracts and one quarter of the presentations.
Data Inconsistencies
Incomplete or inconsistent reporting of data was apparent in the electronic and printed abstract/presentation sources used. Most inconsistencies were between the conference slide presentations and the data reported in published full-text reports. Conference abstract data were inconsistent with the subsequently published full-text articles in nine studies reporting event rates, seven reporting mortality, seven reporting any myocardial infarction (MI), and three reporting binary stenosis. There were often unexplained discrepancies in the numbers of patients reported in different conference presentations.
Data Analysis
Meta-analyses were carried out for four outcomes: combined event rates (mainly mortality, MI, and repeat revascularization), mortality, any MI, and binary stenosis. Data were pooled using a fixed-effect model with odds ratios and 95% confidence intervals.
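For reference, one standard fixed-effect formulation pools the study-level log odds ratios with inverse-variance weights; it is shown here as a common instance (the original review may have used an alternative weighting scheme, such as Mantel-Haenszel):

$$\hat{\theta}_{\mathrm{FE}} = \frac{\sum_i w_i \,\hat{\theta}_i}{\sum_i w_i}, \qquad w_i = \frac{1}{\widehat{\operatorname{var}}(\hat{\theta}_i)}, \qquad \hat{\theta}_i = \log \mathrm{OR}_i,$$

with a 95% confidence interval of $\hat{\theta}_{\mathrm{FE}} \pm 1.96\big/\sqrt{\sum_i w_i}$, back-transformed by exponentiation to the odds-ratio scale.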
Event Rates
Scenario 2 (i.e., excluding data from abstracts/presentations from the original meta-analysis) differed from Scenario 1 (i.e., including data from all sources available at the time of the original review) only in the short term, where there was no evidence of any difference between treatment groups instead of the marginally beneficial (but not statistically significant) effect of non-DES over DES indicated in the review. In Scenario 3 (i.e., including data only from all full papers published to date), the short-term and 6-month results did not differ substantially from Scenario 1, but the 12-month results were no longer significant (Table 2).
Mortality
In both Scenarios 1 and 2, there was no evidence of a difference between treatment groups in the short term. The beneficial (but not significant) effect in favor of non-DES at 6 and 12 months indicated in the review was not supported by Scenario 2. In Scenario 3, the direction of effect at the three time points was the same and the significance of the results was similar to Scenario 1 (Table 3).
Myocardial Infarction
In Scenario 2, the 6-month effect could not be estimated because of a lack of studies reporting this outcome. The short-term estimate indicated no evidence of a difference between treatments, in contrast to the beneficial (but not significant) effect of non-DES indicated in the review. At 12 months, there was a marginal (but not significant) effect of treatment in the opposite direction to that indicated in the review. In Scenario 3, the direction of effect for all results was the same as that observed in the review, but the significance of the short-term and 12-month results changed compared with the review (Table 4).
Binary Stenosis
This outcome was reported only at 6–9 months in all data sources. Analyses of data across the three scenarios indicated a significant benefit favoring DES (Table 5).
DISCUSSION
The case study of DES is a good example of an HTA appraisal of a rapidly evolving technology. Stent technology was developing so quickly that, at the time the review was being prepared, publications were evolving rapidly and new data were being released at regular intervals at specialist meetings. At the time of the original submission of the review to NICE in February 2003, only two of the twelve included trials had been published in full. Two further fully published trials were identified by the time NICE guidance on the use of coronary stents was issued in October 2003, and by the end of 2004, all but one of the twelve trials had been published in full.
In this case study, insufficient reporting of the methodological details of the trials in conference abstracts and presentations severely hampered the ability to judge key aspects of the quality of the identified trials. The overall quality of reporting in these sources, particularly in printed conference abstracts, was generally poor. The view that it is difficult to judge trial quality from abstracts is supported by other studies (2;12;17).
Results from this case study demonstrate discrepancies in the data available between abstracts, online conference presentations, and subsequently published full-length articles. The reasons for these differences remain unclear; possibilities include typographic errors, changes in definitions across abstracts/presentations and full publications, and selective reporting. Other studies have also highlighted discrepancies between data from conference abstracts and subsequent publications (2;5;29;30). Chokkalingam and colleagues (5) found that reasons for discrepancies included misinterpretation in the abstract of the number of patients analyzed as the number randomized, and presentation of interim results in the abstract. Tooher and colleagues (29), who compared results from thirty-seven trials available as conference proceedings with their subsequent full publications, reported that results from fewer than half of the abstracts agreed with the full publications and that the direction of results differed between these sources in approximately one fifth of trials.
Sensitivity analyses indicate that the conclusions of the review would have changed in terms of direction of effect and statistical significance in one outcome if abstracts/presentations had been excluded at the time of the review. Using data solely from full papers published to date would not have altered the direction of effect of any of the results compared with those published in the original review, but the statistical significance of three results would have changed. These differences, as well as those observed in precision and I² statistics between the three scenarios, arose because including abstracts/presentations or more recently published trials increases the pooled sample size and available data. It is important to note that, as there was only one review eligible to be used as a case study in this research, findings from these analyses may be of limited generalizability.
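For context, the I² statistic referred to above is conventionally computed from Cochran's heterogeneity statistic $Q$ and the number of pooled studies $k$ as

$$I^2 = \max\!\left(0,\ \frac{Q - (k - 1)}{Q}\right) \times 100\%;$$

including or excluding abstract-only studies changes both $k$ and $Q$, so heterogeneity estimates can shift across the scenarios even when the pooled effect is relatively stable.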
Several studies have investigated the potential impact of including gray literature in systematic reviews. Hopewell and colleagues' Cochrane methodology review (14) examined the impact of gray literature in meta-analyses of RCTs of healthcare interventions (abstracts were the most common type of gray literature, appearing in nearly half of the included studies). They found that published trials typically reported an overall greater treatment effect than trials from the gray literature but, as in our case study, this difference was not usually statistically significant.
The inability to assess the quality of a trial included in a review may lead to uncertainty about the reliability and validity of the review's results and conclusions. One may argue that reviewers could acquire further information from investigators, but this can be a difficult task, and attempts to obtain information are not always successful (1;21;22;27).
Selective reporting (i.e., the selection of a subset of the original variables recorded for inclusion in the publication of trials) may lead to more substantial differences between sources, and different sources may report outcomes at different time points. For example, shorter-term results may appear in an abstract/presentation but not in the full paper. The definition of an outcome (e.g., the composition of event rates, cardiac versus all-cause mortality) may also vary across sources, leading to possible discrepancies between reported results. Conference presentations are the major forum for dissemination of new research results from pharmaceutical and device companies to healthcare providers. The majority of participants at such meetings are unlikely to investigate discrepancies in the data between the conference and the subsequent full research report publications.
CONCLUSIONS
The results of this study add to the body of evidence on the inconsistency between results reported at conferences and those subsequently available in published full-text articles. Data discrepancies identified across sources in TARs should be highlighted and their impact discussed in the review. Where abstract/presentation data are included, reviewers should identify and discuss the effect of including data from these sources by, for example, carrying out a sensitivity analysis with and without data from conference abstracts and presentations included in the analysis. The lack of study details reported in conference abstracts and presentations limits the ability of reviewers to confidently assess the methodological quality of trials available only as abstracts/presentations. Conference abstracts in particular tend to provide limited details of study methodology and reported outcomes.
CONTACT INFORMATION
Yenal Dundar, Dr (yenal@liv.ac.uk), Research Fellow, Liverpool Reviews and Implementation Group, Faculty of Medicine, The University of Liverpool, Ashton St. Sherrington Buildings, Liverpool L69 3GE, UK
Susanna Dodd, MSc (s.r.dodd@liv.ac.uk), Research Associate, Paula Williamson (prw@liv.ac.uk), Professor and Director, Centre for Medical Statistics and Health Evaluation, The University of Liverpool, Shelley Cottage, Brownlow Street, Liverpool L69 3GS, UK, Rumona Dickson, MSc (r.dickson@liv.ac.uk), Director, Tom Walley, MD (t.walley@liv.ac.uk) Professor of Clinical Pharmacology, Liverpool Reviews and Implementation Group, Faculty of Medicine, The University of Liverpool, Ashton St. Sherrington Buildings, Liverpool L69 3GE, UK
The evidence described in this article is based on research (Project reference: 04/05/01) commissioned by the (UK) National Health Service National Co-ordinating Centre for Health Technology Assessment programme (NCCHTA). The authors are pleased to acknowledge the support and the contributions of the colleagues involved in the larger health technology assessment project: J Critchley and A Haycox as well as experts who commented on drafts of the assessment report.