In reaction to several scandals, for example those involving metal-on-metal hip implants and breast implants (1), and to criticism expressed by numerous stakeholders, the European Parliament opened proceedings in 2012 on the revision of the current European Union (EU) regulation on medical devices (MD). One clearly demanded change in the European regulation process is the requirement for high-quality clinical trials on medium- and high-risk MD (1–4). As a consequence, appropriate outcome measures need to be chosen to assess the efficacy, effectiveness, and adverse events of MD and thereby improve patient safety.
Conducting well-designed clinical trials on MD requires special consideration of the scope and the relevant outcomes to be measured. The process of defining relevant outcome measures for clinical trials on MD is particularly complex, as there is a large variety of existing outcomes that are potentially relevant. As a result, the chosen outcomes vary greatly across clinical trials, which makes the assessment of clinical trials in evidence syntheses, such as health technology assessment (HTA) reports or systematic reviews (SR), very challenging. In addition, conducting a meta-analysis is often not feasible because the outcome measures collected in the trials are very heterogeneous (5;6).
The European Clinical Research Infrastructures Network (ECRIN) is a European Union (EU)-funded network aiming to support multinational clinical research and to provide information on clinical trial methodology, among others, for MD studies. HTA institutions have certain procedures to determine outcome measures that represent the efficacy, the effectiveness, as well as the safety of the MD in a specific indication. The aim of this project was to provide an overview of common procedures of HTA institutions in defining relevant outcome measures in MD trials to support researchers in the selection process of outcomes that reflect all important aspects related to the MD.
METHODS
A two-step literature search was carried out to identify relevant information on the process of defining outcome measures in an MD trial. In the first step, institutions involved in HTA were identified in 2012 by searching the member lists of the international networks International Network of Agencies for Health Technology Assessment (INAHTA) and European Network for Health Technology Assessment (EUnetHTA), and of the global scientific and professional society Health Technology Assessment International (HTAi). The Web pages of all identified institutions involved in HTA were systematically searched by two researchers for handbooks and manuals containing information on methods for the predefinition process of outcome measures for an HTA or SR.
To be considered for analysis, the methodological manual or handbook should describe the HTA approach which includes one or more of the following aspects: Information about the predefinition of the scope, involvement of external groups, literature search to collect information a priori to the HTA or SR, publication of the scope a priori to the HTA or SR, a description of the selection process and ranking of outcomes, importance of patient relevant outcomes, and dealing with surrogates in the assessment of MD. Moreover, the publication should be available in English or German. The publication date was not restricted.
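The eligibility rule described above can be read as a simple predicate: a manual qualifies if it is written in an accepted language and covers at least one of the listed aspects. The following is a minimal sketch of that reading, not the authors' actual screening tool; the class name, field names, and aspect identifiers are hypothetical labels introduced only for illustration.

```python
from dataclasses import dataclass, field

# Aspect labels paraphrase the criteria listed in the text; identifiers are hypothetical.
ASPECTS = {
    "scope_predefinition",
    "external_group_involvement",
    "a_priori_literature_search",
    "a_priori_scope_publication",
    "outcome_selection_and_ranking",
    "patient_relevant_outcomes",
    "surrogates_in_md_assessment",
}
LANGUAGES = {"English", "German"}


@dataclass
class Manual:
    title: str
    language: str
    aspects_covered: set = field(default_factory=set)


def is_eligible(manual: Manual) -> bool:
    """Eligible if in an accepted language and covering at least one aspect."""
    return manual.language in LANGUAGES and bool(manual.aspects_covered & ASPECTS)


# Hypothetical example record, not taken from the included manuals.
example = Manual("Hypothetical HTA handbook", "German", {"patient_relevant_outcomes"})
print(is_eligible(example))
```

Publication date is deliberately absent from the predicate, since the text states that it was not restricted.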
In total, the Web pages of 126 institutions were searched in November and December 2012. As some HTA institutions might use unpublished manuals that are not available on their Web page, all institutions were contacted by email in a second step in July 2013. After 1 month, a reminder was sent out. In the email request, we asked for the current methodological manual and posed questions on the process of predefining outcome measures, both in general and specifically with respect to MD-related research questions, when preparing an HTA or an SR. If the responses given by the institutions fulfilled the inclusion criteria, we contacted them again to obtain authorization for publication.
In July 2014, we updated the search for methodological manuals or guidelines. If several versions of the same manual were available, only the most current one was included. Two reviewers screened the identified publications for relevance according to the inclusion criteria independently. If discrepancies related to the inclusion were apparent, they were resolved by discussion until consensus was reached. Afterward, the relevant information was extracted to synthesize the data using standardized tables prepared a priori distinguishing the process of defining outcome measures in general and the process of defining outcome measures especially for MD trials. The data were extracted by one reviewer and checked by a second reviewer for quality assurance. Discrepancies were resolved by discussion until consensus was reached, or by means of involving an independent third reviewer.
RESULTS
The literature search resulted in eighty-seven potentially relevant manuals and an additional forty responses to the email request (Figure 1). After screening, sixty-five manuals and twenty-six responses were excluded because they did not fulfill the eligibility criteria. Overall, twenty-two manuals were identified as relevant for further analysis, and fourteen responses to the additional email request contained relevant information and were included for further analysis. The response rate to the email request was 31.7 percent. After matching the results and including only the latest versions of the duplicate manuals, a total of twenty-four manuals of nineteen HTA institutions were included as well as ten responses obtained with an authorization for publication. In summary, the overview contains information from twenty-six institutions worldwide. Table 1 gives an overview of the included institutions, reports, and email requests. The complete extraction table can be found in Supplementary Table 1. Responses to the email requests are cited with letters in alphabetical order.
* Responses to the email requests are cited in alphabetical order.
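The screening counts reported above can be checked with simple arithmetic. The sketch below uses only the numbers given in the text; the variable names are illustrative, and it assumes that the response rate was calculated against the 126 institutions whose Web pages were searched, which the text does not state explicitly.

```python
# Counts as reported in the Results text; variable names are illustrative only.
institutions_contacted = 126  # assumption: denominator of the response rate
manuals_found, manuals_excluded = 87, 65
responses_received, responses_excluded = 40, 26

manuals_relevant = manuals_found - manuals_excluded
responses_relevant = responses_received - responses_excluded
response_rate = round(100 * responses_received / institutions_contacted, 1)

print(manuals_relevant, responses_relevant, response_rate)  # 22 14 31.7
```

Under this assumption, the derived values agree with the twenty-two manuals, fourteen responses, and 31.7 percent response rate stated in the text.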
Process of Defining the Scope
Overall, four of the twenty-six institutions adopt the methodology provided by other agencies: three institutions refer to methods described in the Cochrane Handbook [(7), f, h]. Adelaide Health Technology Assessment (AHTA) (a) refers to the series of publications by the National Health and Medical Research Council (NHMRC), and the manual HTA 101 by Goodman (8) is mentioned by the Health Technology Assessment Section, Ministry of Health Malaysia (MaHTAS) (h). The a priori definition of the scope is described by twenty-three institutions [(7;9–29) a, b, d, e, g]. The same number of institutions include external experts, such as stakeholders, manufacturers, patient representatives, consumer organizations, or clinical experts (Table 2).
The step of conducting a literature search to specify the scope is described in ten handbooks and four email responses [(7;10;14;17;20–22;25;27;28) a, b, e, g]. Altogether, four HTA institutions publish the scope of the planned report online on their Web sites for comments before conducting the HTA/SR [(10;20;21;27) a].
Analysis of the Definition of Outcomes in General
Of all included institutions, 88.5 percent describe in detail the type of outcomes that should be considered [(7;10;11;13;14;17–21;23;25–31) a, c, d, j] (Table 3).
Altogether, 84.6 percent of the institutions agree that the main focus should be on patient-relevant outcomes that play an important role in a given disease [(7;9–18;20;21;24;26;27;29–31) a, b, d, e, j]. Examples reported include morbidity, mortality, recovery, and pain [(7;10;13;14;16;17;20;24;27;30;31) a, e, j]. Many institutions, namely sixteen of twenty-six (61.5 percent), report on the importance of patient-reported outcomes such as health-related quality of life [(7;10;11;13;14;16;17;20;21;24;26;30;31) a, e, j]. Regarding the patient perspective, some recommend considering patient satisfaction (7;24), patient preferences (7;10), compliance and acceptance (7), as well as subjective experience such as physical functioning (13).
Overall, 42.3 percent of the institutions mention that they consider all relevant risk and/or safety side effects (7;11;14;16–18;20;21;26;27;30;31). For example, mortality and morbidity that are directly related to the use of the intervention should be included in the safety assessment (7), and emphasis should be given to relevant adverse events when assessments of harm are carried out (20). Furthermore, some institutions mention considering outcomes that are unintended, long-term outcomes, or outcomes that occur in the follow-up of a trial or an observational study (13;27). Another aspect stated is that the selected outcomes should make it feasible to detect differences between the interventions (11;18). Furthermore, two institutions mention that consequences for the patient's family and caregivers are also important (14).
When rating outcomes, seven institutions make a distinction between primary and secondary outcomes [(12;14;17;20;31) j]. For instance, when considering safety outcomes, the Medical Services Advisory Committee (MSAC) (31) recommends first considering common safety outcomes that do not cause significant harm, then rare and/or severe outcomes on individual occurrences, and finally outcomes caused by misclassification or misdiagnosis when assessing diagnostic tests. Research questions in HTA on appropriate outcome measures related to diagnostic assessment are addressed by four institutions (12;18;19;28). They suggest the following diagnostic outcomes as important: overall accuracy of testing, impact on management, effects on patient outcomes/health outcomes, the safety for patients in case of misdiagnosis, and the determination of an appropriate threshold for a test.
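Because the percentages in this subsection all refer to the twenty-six included institutions, they can be converted back into institution counts. The sketch below illustrates this conversion; the dictionary keys are shorthand labels chosen for illustration, and the counts are derived from the reported percentages rather than taken from the extraction table.

```python
TOTAL_INSTITUTIONS = 26

# Percentages as reported in the text; keys are illustrative shorthand labels.
reported_percent = {
    "outcome types described in detail": 88.5,
    "focus on patient-relevant outcomes": 84.6,
    "patient-reported outcomes": 61.5,
    "risk and/or safety side effects": 42.3,
}


def implied_count(percent: float, total: int = TOTAL_INSTITUTIONS) -> int:
    """Nearest whole number of institutions corresponding to a percentage."""
    return round(percent * total / 100)


counts = {label: implied_count(p) for label, p in reported_percent.items()}
for label, n in counts.items():
    # Recompute the percentage from the count as a consistency check.
    print(f"{label}: {n}/{TOTAL_INSTITUTIONS} = {100 * n / TOTAL_INSTITUTIONS:.1f} percent")
```

Each derived count reproduces the reported percentage to one decimal place, and the value for patient-reported outcomes matches the sixteen of twenty-six institutions stated explicitly in the text.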
Analysis of the Use of Surrogates
Surrogates are mentioned by fourteen of the twenty-six institutions [(11–14;17;18;20;22;23;26–28;31) b, j]. Altogether, eleven institutions have standardized procedures for the assessment of surrogates [(11–13;17;18;20;22;23;26;31) b]. Of these, four institutions recommend taking surrogates into consideration only when it is foreseeable that no clinical trials with significant patient-relevant endpoints are available (11–13;22;31). When surrogates are taken into account, some institutions state that there should be a clear biological or medical rationale linking the patient-relevant outcome and the surrogate (17;18;31), and that this association should be presented (11;12;19;26). Furthermore, four institutions suggest paying attention to the validity and reliability of surrogates (12;13;18;20).
Analysis of the Definition of Outcomes Specific for Medical Devices
Regarding the definition of outcome measures specifically related to MD, information could be obtained from 26 percent of all obtained manuals and email responses. Of these, seven institutions describe that they assess MD, but that the procedure is nearly the same as for the assessment of other interventions (b, d, e, f, g, h, i). The Institute for Quality and Efficiency in Health Care (IQWiG) (h) points out that they do not assess MD separately, but rather the surrounding procedure (e.g., surgery) in which the MD is used.
Outcomes considered in particular for MD assessment are device failure, device breakage, device slippage, device migration, and screw loosening (a). The Australian Safety and Efficacy Register of New Interventional Procedures – Surgical (ASERNIP-S) (b) describes that all adverse events related to the device should be assessed, for example, implantable device infections or battery replacement, as well as device failure. The Canadian Agency for Drugs and Technologies in Health (CADTH) (12) focuses on diagnostic devices and recommends evaluating the impact of test sensitivity and specificity on follow-up care and health outcomes.
Furthermore, CADTH (12) calls for focusing not solely on the efficacy of the MD but on the entire episode of care surrounding the use of the MD. In one manual and one email response, examples could be found on the predefinition process of MD related outcome measures [(17) b]. ASERNIP-S (b) describes an example of a device assessment for renal nerve denervation which reduces blood pressure. As a primary outcome, the reduction in stroke or other similar patient relevant measures is mentioned as well as outcomes such as blood pressure readings.
In the handbook by Gesundheit Österreich GmbH, Bundesinstitut für Qualität im Gesundheitswesen (GÖG/BIQG) (17), two examples of the a priori definition of outcome measures for MD could be identified. The first example deals with the assessment of a blood pressure cuff, in which the outcomes accuracy (mmHg) and reliability of measurement are determined as relevant. The second example describes the assessment of chronic obstructive pulmonary disease (COPD) screening with spirometry. There, the outcomes focus on the process of COPD; for example, lung function, the rate of exacerbation, and mortality are mentioned as appropriate.
DISCUSSION
Overall, the results of this overview of common procedures of HTA institutions in defining relevant outcome measures show that the methodological approaches are largely comparable in the included manuals. These cover standardized procedures, for example, involving experts and assessing all relevant risk and safety side effects. Regarding the selection of MD outcomes, information could be ascertained from only nine institutions. Of these, some institutions provided only minimal information. This is possibly because the procedure in which the MD is used is assessed more frequently than the MD alone.
This leads to the conclusion that concrete formulations for the approach of defining appropriate outcome measures for MD do not exist so far and that the approaches are heterogeneous among the HTA institutions. This raises the question of whether a homogeneous approach should be implemented in the field of HTA on MD. The predefinition process of outcomes to assess MD is challenging due to the heterogeneity of the various MD. Moreover, the assessment of MD cannot actually be seen separately from the procedure, and it is not easy to ascertain a clear cause–effect relationship (32). Thus, the outcomes included in the analysis should reflect the whole procedure as a complex intervention and all the different kinds of settings in which the MD can be used.
Concerning this aspect, the IDEAL (Idea, Development, Exploration, Assessment, Long-term follow-up) Collaboration published recommendations on structuring the development process of a new surgical technique including MD (33). They call for considering the influence of several interacting components during the operation. Furthermore, many variations in surgical strategies are feasible, for example, minimally invasive or open approaches to insert the MD, and learning curves have a marked effect on the outcome.
In addition, the infrastructure, staffing, and local policies also influence the outcome. Hence, importance should be given to the definition of the intervention and the degree of standardization sought when evaluating the MD (33;34). Support for the definition process of relevant outcome measures is provided by several initiatives, for example the Grading of Recommendations Assessment, Development and Evaluation (GRADE) working group (35). They developed methods to grade the quality of evidence that can support the outcome definition and classify the outcomes into critical and noncritical outcomes.
Furthermore, the EU-funded Core Outcome Measures in Effectiveness Trials (COMET) Initiative, which establishes core outcome sets with a focus on minimal requirements for data collection in clinical trials, should be mentioned. Core outcome sets include outcome measures that are agreed upon and defined as necessary to report in clinical trials (5). COMET works to bring researchers together with relevant stakeholders, for example, patient representatives and industry representatives, to develop core outcome sets related to many procedures and diseases (6). The medical device outcome measure database provided by ECRIN (www.ecrin.org) offers information on the outcome measures used for a large number of MD, based on concrete HTA reports (36).
The use of developed core outcome measures in combination with the information provided by the ECRIN medical device outcome measure database is intended to give further support in the selection of relevant outcome measures (5). There are some other important EU projects that focus on MD. For example, the EU project “advancing and strengthening the methodological tools and practices relating to the application and implementation of Health Technology Assessment (ADVANCE-HTA)” is developing a more comprehensive taxonomic model for MD categorization that may in the future also affect the definition of the outcomes of interest as well as the research designs for MD trials (37). The EU project EUnetHTA Joint Action 2 is developing methodological guidelines, among others on the evaluation of therapeutic MD (38). Furthermore, a recent study on the methods and preliminary results of three EU projects focusing on the differences in the evaluation of MD was published (39). Challenges in MD assessments as well as further information on guideline development for the assessment of therapeutic MD are described there.
All information provided on the developed tools and in the guidelines for adequate MD assessment helps to increase the value and quality of research, as defined in the current Lancet REWARD statement (http://www.thelancet.com/campaigns/efficiency). As a consequence, the quality of trials as well as of HTA reports will increase, and their impact on making rational decisions in health care is therefore more likely to increase (32).
This review has some limitations. First, there is a language bias because we included only English and German manuals, even though most HTA institutions around the world do not use those languages as the common language of work. This was due to the lack of financial resources for professional translations. When we encountered a Web site that did not offer an English version, we tried to find methodological manuals, but despite much effort, sometimes we were unable to identify methodological manuals.
Second, the comparison of the manuals might be partially inconsistent because of the different terminology used. We therefore focused not only on specific terms when searching the manuals but also on full sections of the manuals in the extraction process. Afterward, we classified the statements according to the items we searched for. Third, the comparison of the manuals and the email responses is potentially unbalanced, as the level of detail of the information provided in the email responses is not as high as in the manuals.
In addition, it is possible that not all institutions involved in HTA were considered, as only the members of the international HTA networks INAHTA and EUnetHTA and of the society HTAi were included in the analysis. Networks like the Health Technology Assessment Network of the Americas (RedETSA) or the relatively new collaboration of HTA institutions in Asia called HTAsiaLink were not considered. Beyond that, several small institutions are possibly not covered in this analysis. Moreover, the member lists of the international HTA networks, which were searched in 2012 and updated in 2014, might have changed, and thus the member status as well as the status of the institutions might have changed.
For instance, DACEHTA and NBoH merged into the National Board of Health in the same year in which we performed our first search. In this case, the information from both institutions was extracted separately to increase transparency. In addition, the Danish Health Authority stopped producing HTA reports in 2012, but we decided to include the manual from DACEHTA/NBoH as it contains relevant information. One further potential source of bias is that the literature search focused on methodological manuals on the assessment and preparation of HTA and SR. Some institutions may have more detailed reports, for example on the use of surrogates, or publications in scientific journals related to specific types of outcome measures, which are not included in this analysis. Despite the mentioned limitations, this analysis presents the status quo and potential gaps in the field of outcome measures in MD trials and evidence syntheses. Building on this analysis, improvements in the procedures for defining appropriate outcome measures related to MD can be pursued.
CONCLUSION
This is the first detailed analysis of common procedures that HTA institutions perform in the context of defining relevant outcome measures for the assessment of MD. Concerning the definition of outcomes in general, we found that many institutions set the main focus on patient-relevant outcomes that play an important role in a given disease, such as morbidity, mortality, recovery, and pain. Furthermore, many institutions report on the importance of health-related quality of life.
Overall, our search revealed that standardized procedures for MD from the perspective of HTA institutions are not widespread. The a priori definition of relevant outcome measures for MD assessment is particularly complex, because there is a large variety of existing outcomes that are potentially relevant and because the MD often cannot be seen separately from the procedure. This leads to the question of whether a homogeneous approach should be implemented in the field of HTA on MD. It should be considered that the initial evidence is obtained by researchers conducting primary studies. They should be supported and encouraged in planning clinical trials on MD by using appropriate sources such as the IDEAL recommendations, core outcome measures databases, and the ECRIN outcome measure database. The databases and the presented overview of the procedures of HTA institutions are intended to support researchers in planning well-designed trials on MD as well as in improving the existing procedures in HTAs and SRs.
SUPPLEMENTARY MATERIAL
Supplementary Table 1: https://doi.org/10.1017/S0266462317000216
CONFLICTS OF INTEREST
The authors report grants from European Clinical Research Infrastructures Network (ECRIN) European Union during the conduct of the study. This project was realized within the European Clinical Research Infrastructures Network - Integrated Activity (ECRIN-IA), which is funded by the European Commission. Project reference: 284395, Funded under: FP7-INFRASTRUCTURES. The project did not influence the research procedure, for example, the collection of literature as well as the analysis and interpretation of the obtained information.