Traditionally, health technology assessment (HTA) has focused on effectiveness and cost-effectiveness, while other domains that are part of the original definition of HTA, such as ethical, legal, and social aspects, have been neglected. Within the INTEGRATE-HTA project, we advocate integrating sociocultural, ethical, legal, and organizational issues with effectiveness and cost-effectiveness (Reference Gerhardus1). Sociocultural, ethical, and legal aspects present within the context may interact with the intervention, which can make the intervention complex. If an HTA process fails to take this complexity into account, it could reach misleading conclusions (Reference Gerhardus1).
Another source of complexity, studied in work package 4 of the INTEGRATE-HTA project, is heterogeneity in patient characteristics and preferences for treatment outcomes. Depending on their characteristics, patients may respond differently to a specific treatment, in terms of both beneficial and adverse effects. Moreover, patients may differ in how they value particular treatment outcomes. For instance, in the treatment of patients with epilepsy, a specific drug may be known to lead, on average, to slightly superior control of seizures and improved mood, but also to weight gain. Even though this superiority may hold on average, seizure control will be better achieved in some patients using other drugs. Also, weight gain, if it occurs, may be less of a problem to some patients than to others.
This example illustrates the need for information about moderators and predictors of treatment effects as well as patient preferences for specific treatment outcomes. First, information about moderators or predictors can be used to guide the search for subgroup analyses. Furthermore, the effectiveness of an intervention under assessment could be enhanced by targeting the treatment to a group that is most likely to benefit and thus guide implementation of a new intervention. Information about patient preferences for treatment outcome may be used to guide the effectiveness evaluation so as to ensure that this preferred outcome becomes one of the most important outcomes in the evaluation. Furthermore, in the decision-making process following an HTA, one could give greater weight to results on outcomes that matter to patients most.
The objective of work package 4 of the INTEGRATE-HTA project was to develop tools to efficiently retrieve and critically appraise available evidence on (i) moderators and predictors of treatment effects as well as on (ii) patient preferences for treatment outcomes, using search filters and critical appraisal tools. This study takes the perspective of HTA researchers who wish to use the best available evidence to develop recommendations about how, and for whom, healthcare technologies may be optimally targeted. This study summarizes previously published guidance (Reference Van Hoorn, Tummers, Kievit and Van der Wilt2) and offers insight into how the tools can be used by HTA researchers or agencies.
MODERATORS AND PREDICTORS OF TREATMENT EFFECTS
Moderators are variables that influence the strength of a relation between two other variables, for instance a treatment and an effect (Reference Kraemer, Wilson, Fairburn and Agras3–Reference Baron and Kenny5). Age and gender are common moderators. The term “moderator” is similar to the epidemiological term “effect modifier” and the statistical term “interaction” (Reference MacCorquodale and Meehl6). In practice, moderation could mean that a treatment works for men but not equally well for women. Predictors are characteristics or variables that influence the outcome independent of the treatment: their effect would be the same if a different treatment were applied. For instance, gender may be associated with mortality; if the treatment has equal effects in men and women, gender is a predictor and not a moderator. Moderators can be investigated using subgroup analyses or regression analyses with interaction terms. Predictors can be investigated by looking at measures of association, as in regression analyses (Reference March and Curry7–Reference Viechtbauer9).
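The distinction between a predictor and a moderator can be made concrete with a small simulation. The sketch below is illustrative only: the data are simulated, the effect sizes are invented, and the function names are hypothetical. It assumes a binary treatment and a binary patient characteristic, in which case the interaction term of a saturated linear model equals the difference between the two subgroup treatment effects.

```python
import random
from statistics import mean


def simulate_trial(n, treat_effect, covariate_effect, interaction_effect, seed=0):
    """Simulate (treated, covariate, outcome) rows from a linear model.

    outcome = treat_effect * T + covariate_effect * X
              + interaction_effect * T * X + standard normal noise
    T and X are binary; all effect sizes are invented for illustration.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        treated = rng.randint(0, 1)
        covariate = rng.randint(0, 1)
        outcome = (treat_effect * treated
                   + covariate_effect * covariate
                   + interaction_effect * treated * covariate
                   + rng.gauss(0, 1))
        rows.append((treated, covariate, outcome))
    return rows


def subgroup_treatment_effect(rows, covariate_value):
    """Mean outcome difference (treated minus control) within one subgroup."""
    treated = [y for t, x, y in rows if t == 1 and x == covariate_value]
    control = [y for t, x, y in rows if t == 0 and x == covariate_value]
    return mean(treated) - mean(control)


def interaction_contrast(rows):
    """Difference between the two subgroup treatment effects.

    A value near zero suggests the covariate does not moderate the treatment.
    """
    return subgroup_treatment_effect(rows, 1) - subgroup_treatment_effect(rows, 0)


def covariate_effect_in_controls(rows):
    """Outcome difference by covariate among untreated rows (predictor effect)."""
    with_x = [y for t, x, y in rows if t == 0 and x == 1]
    without_x = [y for t, x, y in rows if t == 0 and x == 0]
    return mean(with_x) - mean(without_x)


if __name__ == "__main__":
    predictor_only = simulate_trial(20000, 1.0, 2.0, 0.0, seed=1)
    moderator = simulate_trial(20000, 1.0, 2.0, 1.5, seed=1)
    for label, rows in [("predictor only", predictor_only), ("moderator", moderator)]:
        print(label,
              "covariate effect:", round(covariate_effect_in_controls(rows), 2),
              "interaction:", round(interaction_contrast(rows), 2))
```

In the first scenario the characteristic shifts the outcome but leaves the treatment effect unchanged (a predictor); in the second, the treatment effect itself differs between subgroups (a moderator). A real analysis would also need a formal interaction test and confidence intervals, which this sketch omits.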
Search Filters
Efficiently identifying and using information from the literature concerning moderators or predictors of treatment effect requires an appropriate search strategy. Compared with hand searching the literature, a well-defined search strategy retrieves relevant articles more efficiently and, because it is transparent, allows results to be replicated. PubMed Clinical Query (PCQ) filters have been developed to find literature on prognosis, treatment, or clinical prediction guides. These filters produce results containing articles concerning diagnosis or disease stage, study design/methodology, clinical prediction (i.e., prognosis, independent of treatment), outcome measures (including patient-reported outcomes and quality of life), or treatment effects in general. A search filter specifically aimed at retrieving studies reporting on moderators of treatment outcome (i.e., variables that influence how well or how likely a patient responds to a treatment) could add value to existing filters.
Such a search filter was created by first collecting relevant articles on moderators and predictors of treatment effects. All articles published in the year 2011 in specific journals in the field of rheumatoid arthritis and general medicine were searched by hand, and any that reported moderators or predictors of treatment effects were selected. Subsequently, search terms identified from these papers were algorithmically combined to derive the optimal combination of search terms for finding articles on moderators and predictors of treatment effects. The applied methods followed accepted good practice in search filter creation (Reference Jenkins10;Reference White, Glanville, Lefebvre and Sheldon11). More details on the development of the search filters can be found in the Guidance for the assessment of treatment moderators and patients’ preferences (Reference Van Hoorn, Tummers, Kievit and Van der Wilt2).
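The text does not specify the algorithm used to combine search terms, so the following is only a sketch of one plausible approach: exhaustively enumerating OR-combinations of candidate terms and scoring each against a hand-searched gold standard. The term names (`term_a` and so on), the toy records, and the selection rule (maximize sensitivity subject to a minimum specificity, ties broken by specificity) are all assumptions made for illustration; they are not the published filter terms or the authors' procedure.

```python
from itertools import combinations


def retrieves(filter_terms, record_terms):
    """An OR-filter retrieves a record if any filter term occurs in it."""
    return any(term in record_terms for term in filter_terms)


def sensitivity(filter_terms, records):
    """Share of relevant records that the filter retrieves."""
    relevant = [terms for terms, is_relevant in records if is_relevant]
    return sum(retrieves(filter_terms, t) for t in relevant) / len(relevant)


def specificity(filter_terms, records):
    """Share of non-relevant records that the filter leaves out."""
    irrelevant = [terms for terms, is_relevant in records if not is_relevant]
    return sum(not retrieves(filter_terms, t) for t in irrelevant) / len(irrelevant)


def best_or_filter(candidate_terms, records, max_terms, min_specificity):
    """Return (terms, sensitivity, specificity) of the best OR-combination.

    "Best" here means highest sensitivity among combinations that reach
    min_specificity, with ties broken by higher specificity.
    """
    best = None
    for size in range(1, max_terms + 1):
        for combo in combinations(candidate_terms, size):
            sp = specificity(combo, records)
            if sp < min_specificity:
                continue
            se = sensitivity(combo, records)
            if best is None or (se, sp) > (best[1], best[2]):
                best = (combo, se, sp)
    return best


# Toy gold standard: (terms found in title/abstract, relevant per hand search).
# These records are invented and carry no empirical meaning.
TOY_RECORDS = [
    ({"term_a", "term_d"}, True),
    ({"term_b"}, True),
    ({"term_c", "term_d"}, True),
    ({"term_a"}, True),
    ({"term_d"}, False),
    ({"term_e"}, False),
    ({"term_f", "term_b"}, False),
    ({"term_g"}, False),
]

if __name__ == "__main__":
    candidates = ["term_a", "term_b", "term_c", "term_d"]
    print(best_or_filter(candidates, TOY_RECORDS, max_terms=2, min_specificity=0.75))
```

Exhaustive enumeration grows quickly with the number of candidate terms, so a real filter-development exercise would likely cap the combination size or use a heuristic search.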
This methodology resulted in four sets of search filters, which were tested on the same hand-searched set of papers. The results are listed in Table 1. Each set contains the top three search filters optimized for the respective performance measure. If the purpose of an HTA is to be exhaustive, the filter optimized for sensitivity will be most appropriate. However, it will probably return a relatively large proportion of papers of low relevance. Search filters with high accuracy, high specificity, or a low number of papers needed to screen (number needed to read, NNR) (Reference Bachmann, Coray, Estermann and Ter Riet12) will return fewer irrelevant papers at the expense of missing potentially important information. Clearly, the choice of which strategy to use depends on the goal of the systematic review, the number of usable retrieved papers, and the amount of time that the user is willing and able to invest. The chosen search filter needs to be combined with a disease-specific search filter relevant to the field of interest. We do not recommend combining the search filters with any limits on publication type, because moderators and predictors are reported in epidemiological studies, randomized controlled trials (RCTs), and meta-analyses (Reference March and Curry7–Reference Viechtbauer9).
Note. Combinations of search terms with the best sensitivity, best specificity, and lowest NNR for detecting articles reporting on moderators or predictors of treatment outcome.
a Keeping sensitivity ≥ 25%, specificity ≥ 75%, and accuracy ≥ 75%.
Se: sensitivity; Sp: specificity; Ac: accuracy; NNR: number needed to read; [tiab] = title/abstract, words, and numbers included in the title, collection title, abstract, and other abstract of a citation; [ti] = title, words, and numbers included in the title or collection title; mesh: Medical Subject Headings. Sh: subject heading
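The performance measures used to rank the filters can all be computed from the four cells of a retrieval table (relevant or not, retrieved or not). The sketch below shows these calculations, with NNR taken as the reciprocal of precision. The class and function names are hypothetical, the counts in the usage example are invented, and the default thresholds mirror the minimums stated in the table note. The last function shows how a methodological filter might be joined to a disease-specific query with a Boolean AND; the filter string itself is a placeholder, not one of the published filters.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FilterCounts:
    """Retrieval results of one search filter against a hand-searched set."""
    true_pos: int   # relevant and retrieved
    false_pos: int  # not relevant but retrieved
    false_neg: int  # relevant but missed
    true_neg: int   # not relevant and not retrieved

    @property
    def sensitivity(self):
        return self.true_pos / (self.true_pos + self.false_neg)

    @property
    def specificity(self):
        return self.true_neg / (self.true_neg + self.false_pos)

    @property
    def accuracy(self):
        total = self.true_pos + self.false_pos + self.false_neg + self.true_neg
        return (self.true_pos + self.true_neg) / total

    @property
    def number_needed_to_read(self):
        # Retrieved papers per relevant paper, i.e. 1 / precision.
        return (self.true_pos + self.false_pos) / self.true_pos


def meets_minimums(counts, min_se=0.25, min_sp=0.75, min_ac=0.75):
    """Check the minimum performance levels given in the table note."""
    return (counts.sensitivity >= min_se
            and counts.specificity >= min_sp
            and counts.accuracy >= min_ac)


def combine_with_topic(method_filter, topic_filter):
    """Join a methodological filter and a disease-specific query with AND."""
    return f"({method_filter}) AND ({topic_filter})"


if __name__ == "__main__":
    counts = FilterCounts(true_pos=20, false_pos=60, false_neg=5, true_neg=915)
    print(counts.sensitivity, counts.specificity, counts.accuracy,
          counts.number_needed_to_read, meets_minimums(counts))
    print(combine_with_topic("FILTER_FROM_TABLE_1", '"arthritis, rheumatoid"[mesh]'))
```

A filter tuned for sensitivity will typically show a higher NNR than one tuned for specificity, which is the trade-off described in the text.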
Critical Appraisal Tool
Once relevant literature has been selected, the next step is to assess it for quality. Two critical appraisal tools for prediction and moderation of treatment effect, based on RCTs, have previously been published (Reference Sun, Briel and Busse13;Reference Pincus, Miles and Froud14). However, relevant and valid moderator or predictor information is also found in studies other than RCTs. Also, such information may be derived from a body of evidence, as represented within a systematic review, rather than solely from single studies. Therefore, within work package 4 of the INTEGRATE-HTA project, we developed a more flexible critical appraisal tool suitable for all alternatives of moderator/predictor (or subgroup) analyses.
For the development of the tool, we conducted a preliminary literature review of aspects important for validity of moderators and predictors of treatment effect. This review was based on searches in PubMed and Google Scholar, citation chasing, author searches, related articles, and consultation with experts. As the aim was not to itemize every single aspect of validity, but simply to identify a diverse range of indicative factors, we did not aim for comprehensive coverage of the literature. Forty-nine appraisal criteria were identified in the literature. Subsequently, a Delphi procedure with three rounds was used, following the Research and Development (RAND) Appropriateness Method guidelines (Reference Fitch, Bernstein, Aguilar, Burnand and LaCalle15), to augment and then value a set of appraisal criteria retrieved from the literature. Fourteen experts from (bio)statistics, epidemiology, and other associated fields participated in the Delphi procedure.
Based on these results, a final selection of criteria was included in a test version of the appraisal checklist. Following internal testing and external feedback, the test version was amended to create a final version. For more details about the development of the tool and the results of the different Delphi rounds and testing phases, we refer to the Guidance for the assessment of treatment moderators and patients’ preferences (Reference Van Hoorn, Tummers, Kievit and Van der Wilt2). Please note that work on the checklist was completed after the guidance was published. We are currently working on an update of the guidance.
The final version of the CHecklist for the Appraisal of Moderators and Predictors (CHAMP) consists of seventeen questions (listed in Table 2), each completed as “yes,” “no,” “don't know,” or “not applicable.” These seventeen items cover the design (e.g., a priori plausibility), analysis (e.g., use of interaction tests), and results (e.g., complete reporting) of moderator and predictor analyses, together with the transferability of the results. The final version of CHAMP, including an in-depth explanation of the rationale behind each question, is presented in Supplementary File 1.
Note. Combination of terms with the best sensitivity, best specificity, and lowest NNR.
a Keeping sensitivity ≥ 25%, specificity ≥ 75%, and accuracy ≥ 75%.
Se, sensitivity; Sp, specificity; Ac, accuracy; NNR, number needed to read; [tiab], title/abstract, words, and numbers included in the title, collection title, abstract, and other abstract of a citation; [ti], title, words, and numbers included in the title or collection title; mesh, Medical Subject Headings; Sh, subject heading.
Using the checklist should help to arrive at a transparent and uniform overall judgement of the quality of a moderator or predictor analysis. CHAMP can be used to determine whether evidence is sufficient to warrant subgroup analyses in meta-analyses, to systematically value and describe evidence in systematic reviews, to design prediction models, and to facilitate individualized healthcare. CHAMP is designed to be used in conjunction with a quality tool such as the Cochrane risk of bias tool (Reference Higgins, Altman and Gøtzsche16), to judge the overall quality of the study.
PATIENT PREFERENCES FOR TREATMENT OUTCOMES
The value of a specific technology for a defined individual does not only depend on moderators and predictors but also on their personal preferences. The importance of incorporating patients’ preferences in medical decision making is increasingly recognized. The importance of patient preferences for treatment outcomes can be illustrated by the example of an HTA of the pediatric cochlear implant. Whereas the literature mainly reported outcomes for hearing and speech, the deaf community was at least equally interested in social and emotional development outcomes (Reference Reuzel, van der Wilt, ten Have and de Vries Robbe17).
In a further example, patients with chronic kidney disease differed regarding the weight they assigned to various hemodialysis-related outcomes when compared with the views of nephrologists and HTA authors (Reference Janssen, Scheibler and Gerhardus18). Both examples illustrate that interventions may be considered superior in aspects deemed important to medical professionals or decision makers but not to patients. The value of interventions should, therefore, also be established from the viewpoint of the target population, that is, the patients.
Patients’ preferences are usually described as a preference for one treatment or another. These preferences are difficult to generalize as they are highly context-dependent (Reference Jansen, Kievit, Nooij and Stiggelbout19). Therefore, it is more relevant to retrieve information on treatment outcomes that might explain such preferences, for example, risks of adverse events, or specific outcomes such as functional status. Searching for information on preferences for treatment outcomes in the medical literature, for instance using PubMed, can be time-consuming (Reference Ely, Osheroff, Chambliss, Ebell and Rosenbaum20;Reference Eiring, Landmark and Aas21) and may be problematic because patient preferences are elicited in many ways (Reference Eiring, Landmark and Aas21;Reference Opmeer, de Borgie, Mol and Bossuyt22). Heterogeneity in the methods used and in reporting styles makes it more difficult to retrieve relevant literature (Reference Janssen, Gerhardus, Schroer-Gunther and Scheibler23). Therefore, we developed a search filter, similar to PubMed's Clinical Queries, with high performance in retrieving scientific papers that report empirical evidence on patients’ preferences for treatment outcomes (Reference van Hoorn R and Booth24).
Search Filters
Development of search filters for patient preferences followed a process similar to that for the filters for finding moderators and predictors: (i) a comprehensive set of search terms and combinations of terms was constructed and (ii) the results of these combinations of terms were tested in a set of relevant papers. This methodology resulted in a set of search filters optimized for sensitivity, specificity, accuracy, or NNR, as shown in Table 3 (Reference van Hoorn R and Booth24).
Note that questions 10–12 are listed twice, as they are applicable both to individual studies and sets of studies covering the same moderator or predictor (body of evidence).
Testing revealed that papers on patient preferences in general and for treatment outcomes, specifically, are a needle in the medical literature haystack. Only 22 of all 8,238 hand-searched articles (0.27 percent) reported empirical evidence on patients’ preferences for treatment outcomes. We identified three possible reasons for this finding: (i) there is little research performed on this subject; (ii) the research is inadequately reported and cannot be retrieved at a title or abstract level; or (iii) the journals we had carefully preselected do not commonly publish papers on preferences. Based on this finding, we recommend starting with the sensitivity-optimized filters. When the initial set of retrieved literature seems unmanageably large, then a specificity-optimized filter can be used.
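The "needle in a haystack" figure has a direct consequence for screening workload, which a short calculation can make explicit. The sketch below uses only the two counts reported above (22 relevant papers among 8,238 hand-searched articles); the sensitivity and specificity values passed in the usage example are hypothetical inputs chosen for illustration, not measured filter performance, and the function names are assumptions.

```python
def prevalence(relevant, total):
    """Proportion of relevant papers in the hand-searched set."""
    return relevant / total


def expected_nnr(prev, sensitivity, specificity):
    """Expected number needed to read (1 / precision) at a given prevalence."""
    true_pos_rate = sensitivity * prev
    false_pos_rate = (1 - specificity) * (1 - prev)
    return (true_pos_rate + false_pos_rate) / true_pos_rate


if __name__ == "__main__":
    prev = prevalence(22, 8238)
    print(f"prevalence: {prev:.2%}")
    # No filter at all: every paper is "retrieved" (sensitivity 1, specificity 0).
    print("NNR without a filter:", round(expected_nnr(prev, 1.0, 0.0), 1))
    # Hypothetical filter: perfect sensitivity, 75% specificity.
    print("NNR with the hypothetical filter:", round(expected_nnr(prev, 1.0, 0.75), 1))
```

Because relevant papers are so rare, even a filter that excludes three quarters of the irrelevant literature still leaves many papers to screen per relevant hit, which is consistent with the advice to fall back on a specificity-optimized filter when the initial yield is unmanageable.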
Inevitably, the performance of the search filters presented in this study reflects the terminology used by researchers to publish their work on patients’ preferences for treatment outcomes in 2011. Any changes in terminology over time affect the performance of search filters. For this reason, ongoing update of the performance of these search filters on a periodic basis is warranted.
Critical Appraisal Checklist
The aim of the appraisal checklist is to determine (i) whether a study reporting on patient preferences for treatment outcomes has been executed rigorously and (ii) whether the findings are relevant to the research questions of the HTA. Given the diversity of methods to elicit preferences, quantitative as well as qualitative, it did not seem feasible to develop one generic tool including items relevant to all study designs. Therefore, we mapped the methods currently being used to elicit patient preferences for treatment outcomes and then searched for existing guidance or tools to appraise these methods.
To explore the methodologies most commonly used to elicit patient preferences for treatment outcomes, we analyzed the papers identified during development of the search strategy, consulted expert opinion, and conducted additional PubMed and Google Scholar searches. A separate search was performed for each method found, to identify appraisal criteria specific to that method. These searches combined method-related search terms with appraisal-related search terms, such as “appraisal” or “quality.” The searches identified various studies that detail quality criteria of potential value when appraising studies on patient preferences for treatment outcomes.
Despite the large variety of methods available to elicit patient preferences (Reference Janssen, Gerhardus, Schroer-Gunther and Scheibler23), we identified considerable overlap in how data are collected or interpreted across methods. Grouping of appraisal criteria was performed primarily on a conceptual, not a methodological, basis. Following creation of a test version, the tool was tested in a case study and revised following user feedback.
The final checklist consists of six questions (listed in Table 4), each completed as “yes,” “no,” “don't know,” or “not applicable.” The checklist, including an in-depth explanation of the rationale behind each question, is presented in Supplementary File 2. By answering the individual items, users should be able to identify relevant quality issues. The items in the checklist can be considered as a set of key quality indicators: the more of these criteria are met, the greater the likelihood that a study was adequately conducted. Appraising specific aspects, or determining the appropriateness of the method, will require in-depth knowledge of the specific methods used.
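One way to record and summarize checklist answers is sketched below. It assumes only the four answer categories named above; the question labels (`Q1` to `Q6`), the function name, and the summary fields are hypothetical. The "share met" figure is a descriptive tally of the kind implied by "the more of these criteria are met," not a validated quality score, and the checklist itself does not prescribe one.

```python
from collections import Counter

ALLOWED_ANSWERS = ("yes", "no", "don't know", "not applicable")


def summarize_checklist(answers):
    """Tally checklist answers and list the items that need attention.

    answers maps a question label to one of ALLOWED_ANSWERS.
    """
    for question, answer in answers.items():
        if answer not in ALLOWED_ANSWERS:
            raise ValueError(f"{question}: unexpected answer {answer!r}")
    counts = Counter(answers.values())
    applicable = len(answers) - counts["not applicable"]
    return {
        "counts": {answer: counts[answer] for answer in ALLOWED_ANSWERS},
        # Share of applicable criteria answered "yes"; None if none apply.
        "share_met": counts["yes"] / applicable if applicable else None,
        # Items answered "no" or "don't know" point to possible quality issues.
        "flagged": [q for q, a in answers.items() if a in ("no", "don't know")],
    }


if __name__ == "__main__":
    example = {
        "Q1": "yes", "Q2": "yes", "Q3": "no",
        "Q4": "don't know", "Q5": "not applicable", "Q6": "yes",
    }
    print(summarize_checklist(example))
```

Keeping the flagged items alongside the tally preserves the item-level detail that the checklist is meant to surface, rather than collapsing the appraisal into a single number.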
APPLICATION OF THE TOOLS WITHIN INTEGRATE-HTA
In terms of the INTEGRATE-HTA process (Reference Wahlster, Brereton and Burns25), information about moderators and predictors of treatment effects, as well as patient preferences for treatment outcomes, is best used as input for a logic model (Reference Rohwer, Booth and Pfadenhauer26). A logic model can be used to conceptualize the complexity of a technology by providing a graphical description of the system within which the technology operates, its elements, and any relationships within the system. On the one hand, information on preferences and potential moderators or predictors can be used to guide the effectiveness assessment. On the other hand, sociocultural, ethical, or legal issues may determine or moderate either the preferences for treatment outcomes or the treatment effects themselves. This information may be looped back into the INTEGRATE-HTA process model (Reference Wahlster, Brereton and Burns25) and into the logic model, making it a more comprehensive, iterative, and integrated process.
This process is illustrated by the HTA on reinforced models of palliative care, which served as a case study in the INTEGRATE-HTA project (Reference Brereton, Wahlster and Lysdahl27). The developed search filters and appraisal tools were tested within this case study. According to the evidence retrieved, many patients receiving palliative care expressed a strong preference to die at home, in their familiar surroundings, an outcome not to be neglected when comparing different modalities of palliative care. Furthermore, the logic model should incorporate factors that may affect the likelihood of patients dying in their own familiar surroundings, such as the presence of an informal caregiver.
In addition, we found papers reporting on factors that influence preferences for treatment outcomes. For example, in the North-American setting, Blacks and Hispanics are more likely to want to spend their last days in a hospital, and more likely to want life-prolonging drugs, when compared with Caucasians (Reference Barnato, Anthony, Skinner, Gallagher and Fisher28). This would mean that a home-based palliative care intervention does not per se fit the needs of patients from this ethnic background and will probably not result in the best outcomes for this group.
Incorporating such information within an HTA may lead to a more targeted indication for particular services, as opposed to a “one size fits all” approach. However, numerous, diverse factors could influence clinical decision making, especially when a complex intervention is involved. Each additional factor incorporated within the decision-making process adds to complexity and may incur additional costs. In particular, determining a genetic profile, specific biomarkers, or laboratory tests will add to the overall costs of the technology. Before implementation, decision makers will need to consider whether the extra costs of a more individualized approach outweigh the benefits. Nevertheless, using information on treatment moderation and patient preferences in clinical decision making holds the potential to improve the quality and efficiency of care and to save costs (Reference Isaacs and Ferraccioli29;Reference Jakka and Rossbach30).
The approach proposed by INTEGRATE-HTA, and in particular the results of work package 4, contrasts with traditional HTA, in which cost-effectiveness analysis plays a central role. If a cost-effectiveness analysis is done from a societal perspective, which is preferred by most guidelines for cost-effectiveness analysis, health states are valued from the perspective of the general population. From a societal perspective that accounts for fairness, societal values are indeed preferred, but from the individual perspective of the patient, all kinds of other outcomes could be preferred. Including patient preferences in HTA can increase public acceptance of health policy, increase transparency and legitimacy by involving stakeholders, and is, therefore, essential to good HTA practice (Reference Bridges and Jones31). Therefore, we highly recommend that information about treatment moderation and patient preferences be incorporated within an HTA to target populations that will benefit the most and to assess the value of outcomes that are prioritized by patients.
CONCLUSION
Work package 4 of the INTEGRATE-HTA project resulted in tools to retrieve and critically appraise literature on moderators and predictors of treatment effects and on patient preferences for treatment outcomes. Using the tools enables HTA researchers to retrieve information on subgroups for whom a specific treatment will produce more benefit. Incorporating this knowledge in the HTA process holds the promise of better targeting and, ultimately, enhancing the overall effectiveness and efficiency of healthcare technology. Finally, incorporating information on preferences for treatment outcomes will foster HTA that addresses outcomes that are important to patients.
SUPPLEMENTARY MATERIAL
Supplementary File 1: https://doi.org/10.1017/S0266462317000885
Supplementary File 2: https://doi.org/10.1017/S0266462317000885
CONFLICTS OF INTEREST
The authors have nothing to disclose.