Many countries face the prospect of rapid increases in expenditure related to Alzheimer's disease (AD). Governments are faced with the task of deciding which drugs and interventions should be funded. Health technology assessment agencies or other decision-making bodies are responsible for such decisions, which are based on reviews of clinical effectiveness or cost-effectiveness evidence through a process called health technology assessment (HTA). Assessing the effectiveness or cost-effectiveness of approved AD drugs has been difficult because AD drugs have historically promised only very small or no effects on functional improvement or on modifying disease progression (1;2). Although HTA processes vary by country, they have in common that evidence on effectiveness or cost-effectiveness is reviewed by a technical team and interpreted by a group of stakeholders, who represent different perspectives such as those of clinicians, drug companies, patient representatives, and researchers. Which outcomes and outcome measures influence final decisions is likely to depend on various criteria, including: whether they reflect meaningful changes in a person's life (which is important from the perspective of people living with the condition, their families, and carers); whether they are measurable in study designs (which is important from a developer and manufacturer perspective); and whether they are clinically and economically relevant (which is important from a payer perspective). Processes leading to decisions are complex and are likely to vary between countries. The aims of our study were to understand: (1) which outcomes and outcome measures are likely to be prioritized in HTAs for AD drugs in different countries, and (2) which processes influence these priorities.
This study complemented other work which sought information on outcome prioritization from the perspective of patients, carers, and practitioners (3;4).
Methods
Overall Approach and Selection of Countries
We employed two methods: a literature review and case studies. Findings of the literature review informed the design of the case studies. Both methods are explained in more detail below. Researchers with methodological expertise in systematic reviews (CT and AL) and qualitative research interviews (MN), as well as researchers specialized in medicine and neurology (CS), with knowledge of drug reimbursements and of HTA processes (CB and AG), and of dementia policies and economics (MK and RW), were involved in reviewing the methods throughout the research. In addition, the advisory group of the larger research program of which this study was a part, which consisted of members from HTA or regulatory agencies across the world, commented formally on initial findings. Three European countries were selected: England, Germany, and The Netherlands. The respective HTA agencies are the National Institute for Health and Care Excellence (NICE) in England, the Institute for Quality and Efficiency in Health Care (IQWiG) in Germany, and the Dutch Zorginstituut Nederland (ZIN). The choice was influenced by the size of the economy and the roles and responsibilities of HTA agencies, with the aim of capturing multiple perspectives: England and Germany represent two large economies in Europe, in which HTA agencies have taken on different roles and responsibilities. For example, although in England drugs need to be cost-effective in order to be publicly funded (5;6), in Germany decisions about whether drugs are funded and at what price are primarily based on their added therapeutic benefit (Reference Paris and Belloni7). The Netherlands, as a relatively small economy in Europe, has taken a middle-ground approach in this regard: the cost-effectiveness of drugs needs to be proven if their cost is above a certain threshold (8). An overview of the main features of the HTA agencies in the three countries is shown in Table 1.
Table 1. Features of HTA agencies in England, Germany, and The Netherlands
Data Collection and Analysis
First, we conducted a literature review of studies which analyzed how outcomes are prioritized during HTA processes in the three countries. For the purpose of the literature review, we pragmatically defined prioritized outcomes (and their measures) as those that informed the final decision about whether the drug gets funded, or about its price. We made this decision based on initial searches, which showed how the issue has been investigated in the literature of HTAs. Because we expected that there would be limited evidence from the AD field, we searched for studies across disease areas. Details on search strategies, review, and data extraction methods can be found in Supplement 1. Second, we gathered data on how outcomes and outcome measures had been prioritized in past HTAs of AD drugs. We conducted case studies based on information available on HTA Web sites, which documented the decision processes from the beginning to final recommendation. Here, we conceptualized “prioritization” as the process of deriving decisions about which outcomes and measures should inform the value of AD drugs. We therefore considered any evidence of how decisions were made including views and opinions expressed by stakeholders about the importance of certain outcomes and measures, and how they thought they should inform the decision about the value of AD drugs. Information was extracted on topics relevant to outcomes and outcome measures considered in the appraisal. The framework for case studies and the data extraction form can be found in Supplement 2. The analysis was a thematic one, in which we used a mix of inductive and deductive methods for deriving themes.
About the Data Sources
Our literature review identified a total of thirty-two studies for the three countries. Thirteen studies referred to England; fourteen to Germany (this included one study which also referred to England); and six to The Netherlands. Studies used the following types of methods: quantitative analysis using statistical methods (n = 10); qualitative or mixed methods (n = 16); literature reviews (n = 3); and opinion papers or editorials (n = 4). The main data sources were HTA reports and documentation of the decision processes from HTA agencies' Web sites and interviews. Details of studies can be found in Supplement 3.
The case studies referred to publicly available documentation of HTAs for AD drugs (cholinesterase inhibitors and memantine) carried out between 2006 and 2010 in each of the three countries. This included altogether six HTAs: England (n = 1; covering cholinesterase inhibitors and memantine together); Germany (n = 2; one for cholinesterase inhibitors and one for memantine); and The Netherlands (n = 3; memantine; donepezil; and rivastigmine for Parkinson's disease). What was documented varied widely between HTAs and countries but covered at a minimum:
• draft and final scope (including an agreed set of outcomes and outcome measures);
• draft and final appraisal of the reviewed evidence (including decisions or recommendations);
• consultation comments by stakeholders to draft scope and appraisal.
The list of documents that were identified as well as a list of stakeholders involved in the HTA processes can be found in Supplements 4 and 5; no documentation was available for stakeholder consultation in The Netherlands' HTAs.
Results
A range of evidence related to relevant outcomes and outcome measures was collated under eight themes. The purpose of the collation was to have distinguishable themes that reflected the different aspects covered in the case studies and the literature. The themes are related and to a certain degree overlapping. The findings for each will be described in turn, and we refer in brackets to the numbered data source, which can be found in Supplements 3 and 4.
Cost-Effectiveness
In England, decisions about whether to fund AD drugs were based on cost-effectiveness, which in turn was based on health-related quality of life (QoL; in the form of quality-adjusted life-years measured with the EQ-5D) and institutionalization (Supplement 4: 4.5). No other economic consequences (e.g., for hospital care) were included or discussed. Both health-related QoL and institutionalization were extrapolated, in additional analysis, from clinical scales for cognition and functioning (Supplement 4: 4.1; 4.2). In Germany and in The Netherlands, no additional economic analysis and no review of economic evidence were conducted, and there was no mention of cost-effectiveness in the scoping documents (Supplement 4: 4.12; 4.13; 4.17–4.19). This partly reflects the different approaches in the three countries toward including cost-effectiveness evidence in HTAs (Supplementary Table 1: 3.5): Germany does not include cost-effectiveness in its HTAs. In The Netherlands, the prices of the drugs were considered “too low” to justify the need for cost-effectiveness considerations; that is, as long as they had additional value and no adverse consequences they would be funded (personal communication with ZIN representative).
Quality of Life (QoL)
There were differences in the ways HTA agencies responded to the challenges of measuring QoL for people with AD. NICE allowed the prediction of QoL through economic modeling based on surrogate outcomes measured with clinical scales. This approach was in contrast to the one taken by IQWiG, which does not accept the use of QoL measures such as the EQ-5D and which has been consistently found to rarely accept QoL evidence (Supplementary Table 3: 3.14; 3.21; 3.23; 3.25). Methodological requirements (such as a minimum follow-up rate of 70 percent) frequently lead to the exclusion of evidence. Based on this and other methodological requirements not met by studies, IQWiG concluded that the evidence of an impact of AD drugs on QoL was insufficient (Supplementary Table 4: 4.14–4.16; 4.23–4.25). The resulting exclusion of QoL outcomes in the appraisal of AD drugs was criticized by some of the stakeholders (Supplementary Table 4: 4.12; 4.15; 4.16; 4.18; 4.19; 4.21). ZIN, although generally accepting and prioritizing QoL evidence, including when measured through the EQ-5D (8), did not review QoL evidence in its HTAs of AD drugs (Supplementary Table 4: 4.26–4.28). We were unable to find an explanation for this.
Outcomes Measured with Clinical Scales (O-CS)
A wide range of outcomes was measured with clinical scales. Table 2 presents an overview of the scales used in studies reviewed for the technology assessments. Not all scales, however, informed the advice or decisions about the value of AD drugs equally. In all three countries, O-CS such as cognition (measured, e.g., with the Alzheimer's Disease Assessment Scale-cognitive subscale; ADAS-cog) or functioning (measured, e.g., with Activities of Daily Living scales; ADL) had an important influence on final decisions (Supplement 4: 4.4; 4.15; 4.24). In the HTAs in England, there was less evidence of stakeholder discussion about their relevance to people with AD (Supplement 4: 4.7–4.10). The surrogate nature of those outcomes was made explicit in NICE's documentation (Supplement 4: 4.1–4.4). The debate about the relevance of O-CS for people with AD was strongest in German HTAs (Supplement 4: 4.20; 4.21). Although manufacturers argued for the importance of clinical outcomes, in particular cognition, as reliable indicators of QoL with good psychometric properties, some stakeholders doubted whether clinical scales measured something that was meaningful to individuals (Supplement 4: 4.20; 4.21). Both IQWiG and the Federal Joint Committee (Gemeinsamer Bundesausschuss; G-BA), the body that makes the final and legally binding decision about which drugs are funded, appeared to treat all O-CS as final health outcomes (Supplement 4: 4.12–4.14), which meant that they bypassed some of the stricter methodological requirements that would have applied if the O-CS had been treated as surrogate outcomes (Supplementary Table 3: 3.24; 3.27). In terms of specific measures, IQWiG did not accept the use of global assessment outcomes (measured, e.g., with the Clinician Interview-Based Impression of Change; CIBIC), which were seen as reflecting the clinician's perspective rather than the perspective of the person with AD (case studies). Instead, IQWiG expressed a preference for measures that evaluated personal goal attainment, such as the Goal Attainment Scale (Supplement 4: 4.14). ZIN pragmatically accepted those O-CS that had been accepted by the European Medicines Agency as validated outcome measures (Supplementary Table 3: 3.22; Supplement 4: 4.26–4.28). This excluded the Mini-Mental State Examination (MMSE) as a measure of cognition, in contrast to NICE, which accepted its use as a main outcome measure for modeling final QoL end points (Supplement 4: 4.3; 4.4; 4.26–4.28). ZIN noted that the wide range of outcome measures across different domains made the comparison of findings from studies difficult (Supplement 4: 4.26–4.28).
Table 2. Clinical scales used in studies identified in HTAs of AD drugs in England, Germany, and The Netherlands
During HTAs of AD drugs in all three countries, stakeholders raised concerns about how to interpret the identified (often very small) changes on clinical scales (Supplement 4: 4.7–4.11; 4.16; 4.20). Several stakeholders argued that there was a need for greater clarity, from the beginning, on cut-off points on various scales (Supplement 4: 4.7–4.11; 4.16; 4.20). In their view, these cut-off points should be based on evidence, reflect disease severities, and be relevant to people with dementia. NICE tried to address the challenge of low effect sizes by giving particular weight to multi-domain changes (i.e., a simultaneous change in different scales). IQWiG considered every single outcome separately and as a result came to more conservative conclusions about the value of drugs (i.e., it found greater uncertainty about their effectiveness), resulting in criticism from the drug manufacturers (Supplement 4: 4.16; 4.20). In The Netherlands, ZIN expected manufacturers to set out and justify relevant cut-offs before conducting studies (Supplement 4: 4.26–4.28). Responding to the uncertainty over clinical relevance and relevance to people with AD, ZIN decided to make the introduction of the drugs subject to start and stop criteria and delegated the application of those criteria to clinicians (Supplement 4: 4.26–4.28).
Adverse Effects
In England, benefit-harm considerations were not given much weight during HTAs, possibly because safety concerns had already been addressed as part of market authorization and aspects of adverse effects were thought to be captured in QoL outcomes (Supplementary Table 3: 3.11). Some stakeholders felt that the better tolerability of AD drugs, when compared to alternative treatments (such as antipsychotics), was undervalued in this approach (Supplement 4: 4.11). In contrast, adverse effects were regarded as important outcomes from the perspective of people living with AD in Germany and The Netherlands (Supplement 4: 4.14; 4.16; 4.26–4.28). Stakeholders of the HTAs carried out by IQWiG criticized the lack of long-term safety data on AD drugs and raised concerns about whether adverse effects had been underestimated (Supplement 4: 4.11). In The Netherlands, ZIN sometimes left benefit-harm decisions to clinicians, as it concluded that the evidence did not support general conclusions (Supplement 4: 4.26–4.28).
Outcomes Relevant to People with AD
In both England and Germany, stakeholders (mainly patient representatives but also researchers and commissioners) argued that many outcomes relevant to people with AD were not being picked up by the clinical scales (Supplement 4: 4.2; 4.8; 4.11; 4.21; 4.23; 4.25). They advocated for including more tangible outcomes (e.g., ability for someone to pick up the phone) as well as long-term outcomes (e.g., institutionalization). An ability to maintain aspects of personal identity was seen as another important outcome. Stakeholders highlighted an urgent need for appropriate outcome measures in early stages of AD (Supplement 4: 4.8; 4.11).
In both countries, stakeholders thought that this required more flexible approaches toward including evidence (Supplement 4: 4.8; 4.21; 4.23; 4.25). Although this need for different and more flexible processes was to a large extent shared by NICE, in Germany IQWiG and G-BA believed that such changes would contradict legislation and reduce the necessary methodological robustness (Supplement 4: 4.21; 4.23; 4.25).
In the Dutch HTAs for AD drugs, the challenges of considering outcomes that mattered to people with AD, their carers and families were not documented but had been—according to a ZIN representative—discussed at several stages of the process (personal communication).
Carers' Outcomes
In NICE's HTAs of AD drugs, carers' QoL was a primary outcome or end point (Supplementary Table 4: 4.1; 4.3), reflecting the priority given by NICE to this group. However, final decisions were based on an economic model that did not include carers' outcomes, an omission that was criticized by some stakeholders (Supplement 4: 4.7; 4.11). In Germany, carers' outcomes were not viewed as the responsibility of the healthcare system and were given lower priority relative to outcomes for people with AD (Supplement 4: 4.21; 4.25). Although some stakeholders argued for including carers' outcomes in their own right, there seemed to be an overall consensus that carers' outcomes were important mainly because of their impact on the person with AD (Supplement 4: 4.21; 4.25). In addition, IQWiG was skeptical about carer-reported outcomes for the person with dementia, which it argued reflected the needs of the carer rather than the needs of the person with dementia (Supplement 4: 4.21; 4.25). Dutch HTAs for AD drugs did not include carers' outcomes (Supplement 4: 4.26–4.28).
Institutionalization
Institutionalization of someone with AD was a stated outcome in HTA agencies' documentation and was discussed as important by stakeholders in Germany and England (Supplement 4: 4.1; 4.3; 4.12; 4.13; 4.17–4.19). Only English HTAs included institutionalization as an outcome in the economic modeling, although stakeholders discussed whether it was possible to predict this outcome accurately given the many other correlated factors, such as the carer's situation and the availability of care in the community (Supplement 4: 4.8). In German HTAs, institutionalization was viewed by some stakeholders as an outcome that was primarily important from an economic perspective (Supplement 4: 4.21; 4.25). Some stakeholders thought that “institutionalization” could not be measured separately from “time spent caring,” and that instead “hours of care provided” should be measured independently of whether they were provided by a professional or by an unpaid carer (Supplement 4: 4.21; 4.25). Similar to the discussion in English HTAs, stakeholders discussed the lack of evidence on these outcomes and the methodological challenges of including them. In The Netherlands, there was no recorded information on these outcomes.
Table 3 presents an overview of the findings. We applied categories indicating if an outcome or group of outcomes was prioritized or not prioritized. “Prioritized” outcomes were those that informed decisions and “Not prioritized” outcomes were those that did not inform decisions.
Table 3. Prioritization of outcomes and outcome measures in HTAs of AD drugs
Discussion
This study assessed the outcomes and outcome measures that dominated HTAs of AD drugs in three European countries, and the processes that influenced those priorities. This is, to our knowledge, the first study to examine how outcomes and measures for AD drugs are currently prioritized in technology assessments. It contributes to greater transparency about the reasons for, and challenges of, including certain outcomes when assessing the value of AD drugs. Overall, we identified some challenges in the process by which outcomes, outcome measures, and cut-off points were defined in technology assessments of AD drugs. These included a lack of early involvement of stakeholders in discussions of appropriate outcomes and outcome measures as well as of cut-off points for appropriate effect sizes. In addition, a narrow focus on evidence from certain types of studies, namely randomized controlled trials, led to a strong focus on outcomes measured with clinical scales to the potential exclusion of (long-term) outcomes relevant for people with AD.
Our study was exploratory in nature, and we chose to use two methods to address the gap in evidence about the role of outcomes and outcome measures in HTAs of AD drugs. We first reviewed studies that analyzed the influence of outcomes and measures on decisions about the value of drugs in HTAs. Although this provided useful knowledge about common decision-making patterns in HTAs (and the reasons for those), it provided only limited information about the process by which decisions were made about outcomes and measures, and the process by which they influenced decisions. Although we have no definite knowledge of the reason for this missing focus of studies, it is plausible that decisions about outcomes and measures are regarded as objective or neutral. It is also likely to reflect a wide acceptance of outcomes measured with clinical scales as patient-relevant. As a result, designers of studies and manufacturers have to make decisions about outcomes and measures without certainty about whether those will be accepted in HTAs. In the case of HTAs for AD drugs, this is likely to have contributed to the use of a wide range of measures. With the second method, the case studies, we therefore sought to address the gap in evidence about the process by which outcomes and measures influence decisions through in-depth analysis of reports produced for technology assessments. This kind of analysis allowed us to understand the nature of decision processes and stakeholder viewpoints. Although this study was exploratory in nature, we were able to shed new light on the important, currently under-investigated role of outcomes and outcome measures in influencing the value of AD drugs.
In terms of methodological robustness, the literature review, although pragmatic, applied systematic search strategies and involved detailed data extraction. Researchers with a high and diverse level of methodological and clinical expertise were involved in and contributed to the robustness of the research process. Approval of the research methods and interpretation of the findings was provided by experts in the field. In terms of limitations, for the case studies, we were reliant on publicly available information, which was limited, especially for the Dutch case studies. Furthermore, by focusing on HTAs with the most comprehensive information and those that were most comparable between countries, we might have missed some aspects of more recent updates of HTAs. Overall, our findings need to be interpreted in the context of a rapidly evolving field. Considerations that decision makers need to take into account today may very well change in the future, for example in light of new evidence and new technologies.
The findings from this study suggest that there are substantial challenges in including outcomes relevant to people with AD when assessing the value and cost-effectiveness of AD drugs. Those challenges are relevant not only to existing AD drugs but also to other types of treatment and interventions that seek to prevent or alter the progression of AD. Unless there is an agreed set of outcomes, outcome measures, and cut-offs that define a meaningful diversion from the path without intervention, it will be challenging to assess the value of a drug or an intervention (in particular in relation to other interventions). In the future, this is likely to be relevant to pricing or investment decisions for disease-modifying treatments, which may need to be offered at pre-dementia stages, and which would require measuring surrogate outcomes such as imaging or other biomarkers (Reference Foster, Hackett, White, Chenevert, Svarvar and Bain9). Without outcome measures that are acceptable to relevant stakeholders (including patients, carers, and the wider public) and agreed before HTAs are conducted, or preferably even before studies are developed, there is a risk of delays in the appropriate evaluation of, and access to, new treatments (Reference Ruof, Knoerzer, Dünne, Dintsios, Staab and Schwartz10). Clear methodological guidance on accepted outcome measures in fields such as prevention and diagnostics is therefore needed (Reference Versteegh, Knies and Brouwer11).
This includes the need to consider patient-relevant outcomes in HTAs in addition to clinical outcomes (Reference Kinter, Schmeding, Rudolph, DosReis and Bridges12). Although still at an early stage, innovative methods have been developed (and tested) that allow HTA agencies to consider patient preferences over different outcomes when developing methodological guidance (Reference Kinter, Schmeding, Rudolph, DosReis and Bridges12). Knowledge is also becoming increasingly available about how best to include patients' and carers' perspectives in HTAs (Reference Perfetto, Boutin, Reid, Gascho and Oehrlein13). Decisions about the value of drugs in HTAs in some countries (including England) have been shown to be substantially influenced by aspects of value not captured by clinical and economic evidence (Reference Nicod and Kanavos14). Although this reflects the inclusion of patient and stakeholder perspectives, it also raises questions about the transparency and consistency of decisions (Reference Nicod and Kanavos14). Therefore, including outcomes, measures, and cut-offs that are more patient-relevant (and agreeing on those in advance) is likely to contribute to more consistent decision making, as it reduces the need for additional considerations that in effect address the issue of evidence not being sufficiently relevant to what matters to patients, carers, and the wider public.
Furthermore, the challenges we identified suggest a need for collaborative approaches between multiple stakeholders to enable decisions on outcomes and measures to be made early in the process. Some of the required processes are already in place, to varying degrees in different countries, whereas others still need to be developed.
Such multi-stakeholder approaches should go hand in hand with including wider sets of evidence, often referred to as real-world evidence (Reference Makady, de Boer, Hillege, Klungel and Goettsch15). This requires an investment in data that can be used to demonstrate long-term impact on costs and outcomes (Reference Bradley, Akehurst, Ballard, Banerjee, Blennow and Bremner1). This might include data on the costs associated with different rates of disease progression so that cost savings linked to a delay in disease progression can be estimated. Findings from a study that modeled the likely cost-effectiveness of disease-modifying treatments (should they become available) showed that in England the benefit from deferring onset by 1 year would be substantial at about £28,000 (in 2012/13 prices) (Reference Anderson, Knapp, Wittenberg, Handels and Schott16). This highlights the importance of including such data in decision making. Unless the impacts on disease progression, QoL, need for care, and costs over time are considered, there is a risk that future AD drugs and interventions are not valued in line with patient, carer, and wider public interests.
Cost-Effectiveness
In England, decisions about whether to fund AD drugs were based on cost-effectiveness, which in turn was based on health-related quality of life (QoL; in the form of quality-adjusted life-years measured with the EQ-5D) and institutionalization (Supplement 4: 4.5). No other economic consequences (e.g., for hospital care) were included or discussed. In additional analyses, both health-related QoL and institutionalization were extrapolated from clinical scales for cognition and functioning (Supplement 4: 4.1; 4.2). In Germany and in The Netherlands, no additional economic analysis and no review of economic evidence were conducted, and there was no mention of cost-effectiveness in the scoping documents (Supplement 4: 4.12; 4.13; 4.17–4.19). This partly reflects the different approaches in the three countries toward including cost-effectiveness evidence in HTAs (Supplementary Table 3: 3.5): Germany does not include cost-effectiveness in its HTAs. In The Netherlands, the prices of the drugs were considered “too low” to justify the need for cost-effectiveness considerations; that is, as long as the drugs had additional value and no adverse consequences they would be funded (personal communication with ZIN representative).
Quality of Life (QoL)
There were differences in the ways HTA agencies responded to the challenges of measuring QoL for people with AD. NICE allowed the prediction of QoL in the form of economic modeling based on surrogate outcomes measured with clinical scales. This approach was in contrast to the one taken by IQWiG, which does not accept the use of QoL measures such as the EQ-5D and which has been consistently found to rarely accept QoL evidence (Supplementary Table 3: 3.14; 3.21; 3.23; 3.25). Methodological requirements (such as a minimum follow-up rate of 70 percent) frequently lead to the exclusion of evidence. Based on this and other methodological requirements not met by studies, IQWiG concluded that the evidence of an impact of AD drugs on QoL was insufficient (Supplementary Table 4: 4.14–4.16; 4.23–4.25). The resulting exclusion of QoL outcomes in the appraisal of AD drugs was criticized by some of the stakeholders (Supplementary Table 4: 4.12; 4.15; 4.16; 4.18; 4.19; 4.21). ZIN, although generally accepting and prioritizing QoL evidence, including when measured through the EQ-5D (8), did not review QoL evidence in its HTAs of AD drugs (Supplementary Table 4: 4.26–4.28). We were unable to find an explanation.
Outcomes Measured with Clinical Scales (O-CS)
A wide range of outcomes were measured with clinical scales. Table 2 presents an overview of the scales used in studies reviewed for the technology assessments. Not all scales, however, informed the advice or decisions about the value of AD drugs equally. In all three countries O-CS such as cognition (measured, e.g., with the Alzheimer's Disease Assessment Scale-cognitive subscale; ADAS-cog) or functioning (measured, e.g., with Activities of Daily Living scales; ADL) had an important influence on final decisions (Supplement 4: 4.4; 4.15; 4.24). In the HTAs in England there was less evidence of stakeholder discussion about their relevance to people with AD (Supplement 4: 4.7–4.10). The surrogate nature of those outcomes was made explicit in NICE's documentation (Supplement 4: 4.1–4.4). The debate about the relevance of O-CS for people with AD was strongest in German HTAs (Supplement 4: 4.20; 4.21). Although manufacturers argued the importance of clinical outcomes (in particular cognition) as reliable indicators of QoL with good psychometric properties, some stakeholders doubted whether clinical scales measured something that was meaningful to individuals (Supplement 4: 4.20; 4.21). Both IQWiG and the Federal Joint Committee (Gemeinsamer Bundesausschuss; G-BA), the body that makes the final and legally binding decision about which drugs are funded, appeared to treat all O-CS as final health outcomes (Supplement 4: 4.12–4.14), which meant that they bypassed some of their stricter methodological requirements that would have applied if the O-CS had been treated as surrogate outcomes (Supplementary Table 3: 3.24; 3.27). In terms of specific measures, IQWiG did not accept the use of global assessment outcomes (measured, e.g., with the Clinician Interview-Based Impression of Change; CIBIC), which were seen as reflecting the clinician's perspective rather than the perspective of the person with AD (case studies). Instead, IQWiG expressed a preference for measures which evaluated personal goal attainment such as the Goal Attainment Scale (Supplement 4: 4.14). ZIN pragmatically accepted those O-CS that had been accepted by the European Medicines Agency as validated outcome measures (Supplementary Table 3: 3.22; Supplement 4: 4.26–4.28). This excluded the Mini-Mental State Examination (MMSE) as a measure for cognition, in contrast to NICE, which accepted its use as a main outcome measure for modeling final QoL end points (Supplement 4: 4.3; 4.4; 4.26–4.28). ZIN noted that the wide range of outcome measures across different domains made the comparison of findings from studies difficult (Supplement 4: 4.26–4.28).
Table 2. Clinical scales used in studies identified in HTAs of AD drugs in England, Germany, and The Netherlands
ADL, activities of daily living; ADAS-cog, Alzheimer's Disease Assessment Scale-cognitive subscale; ADCS-ADL, Alzheimer's Disease Cooperative Study-Activities of Daily Living; ADCS-MCIADL, the mild cognitive impairment ADL scale; ACTS, allocation of caregiver time burden; BGP, Behavioral Rating Scale for Geriatric Patients; BEHAVE-AD, Behavioral Pathology in Alzheimer's Disease Rating Scale; BRSD, Behavioral Rating Scale for Dementias; BGP-C, Behavioral Rating Scale for Geriatric Patients-Cognitive Subscale; BrADL, Bristol Activities of Daily Living Scale; CDR, clinical dementia rating; CDR-SB, CDR sum of boxes; CIBIC-plus, Clinician's Interview-based Impression of Change-plus; CAS, Caregiver Activity Survey; CBS, Caregiving Burden Scale; CSS, Caregiver Stress Scale; CMCS, Caregiver-rated Modified Crichton Scale; DAD, Disability Assessment for Dementia; FAST, Functional Assessment Staging; GAS, Goal Attainment Scale; GDS, Global Deterioration Scale; GBS, Gottfries-Bråne-Steen Scale; IDDD, Interview for Deterioration in Daily Living Activities in Dementia subscale; IADL-plus, Instrumental Activities of Daily Living-plus; J-CGIC, Japanese-Clinical Global Impression of Change; MMSE, Mini-Mental State Examination; NOSGER, Nurses' Observation Scale for Geriatric Patients; NPI, Neuropsychiatric Inventory; NPI-D, Neuropsychiatric Inventory Caregiver Distress Scale; PDS, Progressive Deterioration Scale; PSMS-plus, Physical Self-Maintenance Scale-plus; SIB, Severe Impairment Battery.
During HTAs of AD drugs in all three countries, stakeholders raised concerns about how to interpret the identified (often very small) changes on clinical scales (Supplement 4: 4.7–4.11; 4.16; 4.20). Several stakeholders argued that there was a need for greater clarity, from the beginning, on cut-off points on various scales (Supplement 4: 4.7–4.11; 4.16; 4.20). In their view, these cut-off points should be based on evidence, reflect disease severities, and be relevant to people with dementia. NICE tried to address the challenge of low effect sizes by giving particular weight to multi-domain changes (i.e., a simultaneous change in different scales). IQWiG considered every single outcome separately and as a result came to more conservative conclusions about the value of drugs (i.e., it concluded that there was more uncertainty about their effectiveness), resulting in criticism from the drug manufacturers (Supplement 4: 4.16; 4.20). In The Netherlands, ZIN expected manufacturers to set out and justify relevant cut-offs before conducting studies (Supplement 4: 4.26–4.28). Responding to the uncertainty over clinical relevance and relevance to people with AD, ZIN decided to make the introduction of the drugs subject to start and stop criteria and delegated the application of those to clinicians (Supplement 4: 4.26–4.28).
Adverse Effects
In England, benefit-harm considerations were not given much weight during HTAs, possibly because safety concerns had already been addressed as part of marketing authorization and aspects of adverse effects were thought to be captured in QoL outcomes (Supplementary Table 3: 3.11). Some stakeholders felt that the better tolerability of AD drugs, when compared to alternative treatments (such as antipsychotics), was undervalued in this approach (Supplement 4: 4.11). In contrast, adverse effects were regarded as important outcomes from the perspective of people living with AD in Germany and The Netherlands (Supplement 4: 4.14; 4.16; 4.26–4.28). Stakeholders of the HTAs carried out by IQWiG criticized the lack of long-term safety data on AD drugs and raised concerns about whether adverse effects had been underestimated (Supplement 4: 4.11). In The Netherlands, ZIN sometimes left benefit-harm decisions to clinicians, as it concluded that the evidence did not support general conclusions (Supplement 4: 4.26–4.28).
Outcomes Relevant to People with AD
In both England and Germany, stakeholders (mainly patient representatives but also researchers and commissioners) argued that many outcomes relevant to people with AD were not being picked up by the clinical scales (Supplement 4: 4.2; 4.8; 4.11; 4.21; 4.23; 4.25). They advocated for including more tangible outcomes (e.g., someone's ability to pick up the phone) as well as long-term outcomes (e.g., institutionalization). An ability to maintain aspects of personal identity was seen as another important outcome. Stakeholders highlighted an urgent need for appropriate outcome measures in early stages of AD (Supplement 4: 4.8; 4.11).
In both countries, stakeholders thought that this required more flexible approaches toward including evidence (Supplement 4: 4.8; 4.21; 4.23; 4.25). Although NICE to a large extent shared the view that different and more flexible processes were needed, in Germany IQWiG and G-BA believed that such changes would contradict legislation and reduce the necessary methodological robustness (Supplement 4: 4.21; 4.23; 4.25).
In the Dutch HTAs for AD drugs, the challenges of considering outcomes that mattered to people with AD, their carers and families were not documented but had been—according to a ZIN representative—discussed at several stages of the process (personal communication).
Carers' Outcomes
In NICE's HTAs of AD drugs, carers' QoL was a primary outcome or end point (Supplementary Table 4: 4.1; 4.3), reflecting the priority given by NICE to this group. However, final decisions were based on an economic model, which did not include carers' outcomes, an omission which was criticized by some stakeholders (Supplement 4: 4.7; 4.11). In Germany, carers' outcomes were not viewed as the responsibility of the healthcare system and were given lower priority relative to outcomes for people with AD (Supplement 4: 4.21; 4.25). Although some stakeholders argued for including carers' outcomes in their own right, there seemed to be an overall consensus that carers' outcomes were important mainly because of their impact on the person with AD (Supplement 4: 4.21; 4.25). In addition, IQWiG was skeptical about carer-reported outcomes for the person with dementia, which it argued reflected the needs of the carer rather than the needs of the person with dementia (Supplement 4: 4.21; 4.25). Dutch HTAs for AD drugs did not include carers' outcomes (Supplement 4: 4.26–4.28).
Institutionalization
Institutionalization for someone with AD was a stated outcome in HTA agencies' documentation and discussed as important by stakeholders in Germany and England (Supplement 4: 4.1; 4.3; 4.12; 4.13; 4.17–4.19). Only the English HTAs included institutionalization as an outcome in the economic modeling, although stakeholders discussed whether it was possible to accurately predict this outcome because there were many other correlated factors such as the carer's situation and the availability of care in the community (Supplement 4: 4.8). In German HTAs institutionalization was viewed by some stakeholders as an outcome that was primarily important from an economic perspective (Supplement 4: 4.21; 4.25). Some stakeholders thought that “institutionalization” could not be measured separately from “time spent caring,” and that instead “hours of care provided” should be measured independently of whether they were provided by a professional or by an unpaid carer (Supplement 4: 4.21; 4.25). Similar to the discussion in English HTAs, stakeholders discussed the lack of evidence on these outcomes and the methodological challenges of including them. In The Netherlands there was no recorded information on these outcomes.
Table 3 presents an overview of the findings. We applied categories indicating if an outcome or group of outcomes was prioritized or not prioritized. “Prioritized” outcomes were those that informed decisions and “Not prioritized” outcomes were those that did not inform decisions.
Table 3. Prioritization of outcomes and outcome measures in HTAs of AD drugs
Discussion
This study assessed the outcomes and outcome measures that dominated HTAs of AD drugs in three European countries, and the processes that influenced those priorities. This is, to our knowledge, the first study that examines how outcomes and measures for AD drugs are currently prioritized in technology assessments. This study contributes to increased transparency about the reasons for and challenges of including certain outcomes when assessing the value of AD drugs. Overall, we identified some challenges in the process of how outcomes, outcome measures, and cut-off points were defined in technology assessments of AD drugs. These included a lack of early involvement of stakeholders in discussions of appropriate outcomes and outcome measures as well as of cut-off points for appropriate effect sizes. In addition, a narrow focus on evidence from certain types of studies, namely randomized controlled trials, led to a strong focus on outcomes measured with clinical scales to the potential exclusion of (long-term) outcomes relevant for people with AD.
Our study was exploratory in nature, and we chose to use two methods to address the gap in evidence about the role of outcomes and outcome measures in HTAs of AD drugs. We first reviewed studies that analyzed the influence of outcomes and measures on decisions about the value of drugs in HTAs. Although this provided useful knowledge about common decision-making patterns in HTAs (and reasons for those), it provided only limited information about the process by which decisions were made about outcomes and measures, and the process by which they influenced decisions. Although we have no affirmative knowledge of the reason for this missing focus of studies, it is plausible that decisions about outcomes and measures are regarded as objective or neutral. It is also likely to reflect a wide acceptance of outcomes measured with clinical scales as patient-relevant. As a result, designers of studies and manufacturers have to make decisions about outcomes and measures without certainty about whether those will be accepted in HTAs. In the case of HTAs for AD drugs, this is likely to have contributed to the use of a wide range of measures. With the second method, the case studies, we therefore sought to address the gap in evidence about the process by which outcomes and measures influence decisions through in-depth analysis of reports produced for technology assessments. This kind of analysis allowed us to understand the nature of decision processes and stakeholder viewpoints. Although this study was exploratory in nature, we were able to shed new light on the important, currently under-investigated role of outcomes and outcome measures in influencing the value of AD drugs.
In terms of methodological robustness, the literature review, although pragmatic, applied systematic search strategies and involved detailed data extraction. Researchers with a high and diverse level of methodological and clinical expertise were involved in and contributed to the robustness of the research process. Approval of the research methods and interpretation of the findings was provided by experts in the field. In terms of limitations, for the case studies, we were reliant on publicly available information, which was limited, especially for the Dutch case studies. Furthermore, by focusing on HTAs with the most comprehensive information and those that were most comparable between countries, we might have missed some aspects of more recent updates of HTAs. Overall, our findings need to be interpreted in the context of a rapidly evolving field. Considerations that decision makers need to take into account today may very well change in the future, for example in light of new evidence and new technologies.
The findings from this study suggest that there are substantial challenges in including outcomes relevant to people with AD when assessing the value and cost-effectiveness of AD drugs. Those challenges are relevant not only to existing AD drugs but also to other types of treatment and interventions, which seek to prevent or alter the progression of AD. Unless there is an agreed set of outcomes, outcome measures, and cut-offs that define a meaningful diversion from the path without intervention, it will be challenging to assess the value of a drug or an intervention (in particular in relation to other interventions). In the future, this is likely to be relevant to pricing or investment decisions for disease-modifying treatments, which may need to be offered at pre-dementia stages, and which would require measuring surrogate outcomes such as imaging or other biomarkers (9). Without outcome measures that are acceptable to relevant stakeholders (including patients, carers, and the wider public) and agreed before HTAs are conducted or preferably even before studies are being developed, there is a risk of delays in the appropriate evaluation of, and access to, new treatments (10). Clear methodological guidance on accepted outcome measures in fields such as prevention and diagnostics is therefore needed (11).
This includes the need to consider patient-relevant outcomes in HTAs in addition to clinical outcomes (12). Although in early stages, innovative methods have been developed (and tested) that allow HTA agencies to consider patient preferences over different outcomes when developing methodological guidance (12). Knowledge is also becoming increasingly available about how best to include patients' and carers' perspectives in HTAs (13). Decisions about the value of drugs in HTAs in some countries (including England) have been shown to be substantially influenced by aspects of value not captured by clinical and economic evidence (14). Although this is a reflection of including patient and stakeholder perspectives, it also raises questions about the transparency and consistency of decisions (14). Therefore, including outcomes, measures, and cut-offs that are more patient-relevant (and agreeing on those in advance) is likely to contribute to more consistent decision making, as it reduces the need for additional considerations that in effect address the issue of evidence not being sufficiently relevant to what matters to patients, carers, and the wider public.
Furthermore, the challenges we identified suggest a need for collaborative approaches between multiple stakeholders to enable decisions on outcomes and measures to be made early in the process. Some of the required processes are already in place, to varying degrees in different countries, whereas others still need to be developed.
Such multi-stakeholder approaches should go hand in hand with including wider sets of evidence, often referred to as real-world evidence (15). This requires an investment in data that can be used to demonstrate long-term impact on costs and outcomes (1). This might include data on the costs associated with different rates of disease progression so that cost savings linked to a delay in disease progression can be estimated. Findings from a study that modeled the likely cost-effectiveness of disease-modifying treatments (should they become available) showed that in England the benefit from deferring onset by 1 year would be substantial at about £28,000 (in 2012/13 prices) (16). This highlights the importance of including such data in decision making. Unless the impacts on disease progression, QoL, need for care, and costs over time are considered, there is a risk that future AD drugs and interventions are not valued in line with the interests of patients, carers, and the wider public.
Conclusions
This study investigated the role of outcomes and outcome measures in HTAs of AD drugs in three European countries. The findings highlight the strong priority placed on outcomes measured with clinical scales as well as the challenges of considering measures that capture changes in disease progression that are potentially relevant from the perspective of people living with the condition, their families, and carers. We conclude that there is an urgent need to reform HTA processes to appropriately assess the value of AD drugs.
Supplementary Material
The supplementary material for this article can be found at https://doi.org/10.1017/S0266462320000574.
Acknowledgments
We would like to thank members of the ROADMAP health technology assessment and regulatory bodies expert advisory group (EXAG) as well as the following individuals, who provided advice on the scope of the work and on the interpretation of findings: Dr Amr Makady from the Zorginstituut Nederland (ZIN), and Dr Joshua Pink and Dr Jacoline Bouvy from the National Institute for Health and Care Excellence (NICE). We performed the study as part of the Innovative Medicines Initiative/Horizon 2020 ROADMAP (Real world Outcomes across the Alzheimer's Disease spectrum for better care: Multi-modal data Access Platform) project.
Financial support
This study was part of the Real World outcomes across the AD spectrum for better care (ROADMAP) project. This project received funding from the Innovative Medicines Initiative 2 Joint Undertaking [grant agreement no 116020 (“ROADMAP”)]. This Joint Undertaking receives support from the European Union's Horizon 2020 research and innovation program and the European Federation of Pharmaceutical Industries and Associations (EFPIA).
Conflict of Interest
CS was lead (jointly with CB) of work package 2 of the ROADMAP project; CS is lead of the dementia outcomes work package for the UK MRC-funded Dementias Platform UK. CB is employed by F. Hoffmann-La Roche. AG is a partner of Quantify Research, providing consultancy services to pharmaceutical companies and other private and public organizations and institutions. AG's contribution to ROADMAP was on behalf of Roche Pharmaceuticals. AB, AL, CT, MK, MN, and RW have no conflicts of interest to declare.