Health technology assessment (HTA) is increasingly used to inform health policy decisions. It has become a powerful tool when linked to jurisdictional legislation that determines reimbursement and pricing policies. Because these decisions must be politically and legally defensible, they have hastened the need for recognized “best” practice in HTA (1;Reference Bronner7;Reference Drummond, Schwartz and Jonsson24;Reference Ferguson, Dubinsky and Kirsch30;Reference Francke and Hart31). Additionally, the globalization of “health” means the decisions taken and the fiduciary responsibility of local health systems have an increased global importance (Reference Cox14). As with the institution of science in general, the increasing importance of HTA strongly suggests the need to study it in a transparent and comparative way (Reference Kuhn52).
Despite increased activity worldwide, there is currently a lack of understanding of differences in the application of HTA, leading to questions about its quality, comparability, generalizability, applicability, and practical usefulness (Reference Chinitz13;Reference Cox14;27;Reference Freeman32). Additionally, a lack of recognized standards in quality assurance and the ongoing debate regarding the need for harmonized HTA methods and processes have recently been recognized (Reference Draborg and Andersen20;27). Despite increased harmonization activity, the need for recommended international standards has been communicated at the European level and by experts in the field (Reference Cox14;27;Reference Petherick, Villanueva, Dumville, Bryan and Dharmage73;84).
Descriptions of HTA methods and processes (Reference Brehaut and Juzwishin6;Reference Gagnon, Sanchez and Pons33;Reference Hailey39;Reference Hivon, Lehoux, Denis and Tailliez45;Reference Hutton, McGrath and Frybourg47;Reference Lafortune, Farand, Mondou, Sicotte and Battista53;Reference Lehoux and Williams-Jones55;Reference Oliver, Mossialos and Robinson64;Reference Philips, Bojke, Sculpher, Claxton and Golder74;Reference Wanke, Juzwishin, Thornley and Chan79) as well as international comparisons (e.g., Reference Banta, Gelband, Jonsson and Battista3;Reference Barbieri, Drummond and Willke4;Reference Chinitz13;Reference Dickson, Hurst and Jacobzone17;Reference Draborg and Andersen19;Reference Draborg, Gyrd-Hansen, Poulsen and Horder22;Reference Garcia-Altes, Ondategui-Parra and Neumann34;Reference Hjelmgren, Berggren and Andersson46;Reference Martelli, Torre and Ghionno57;Reference Oortwijn, Banta and Cranovsky65;Reference Perry, Gardner and Thamer70–Reference Perry and Thamer72;Reference Wild and Gibis83) already exist. The importance of institutional relationships with HTA has also been well recognized (Reference Chinitz13). However, we were not aware of any attempt to develop a systematic description and comparison of features across HTA agencies. Because a descriptive framework could be helpful to those studying and developing HTA programs, we sought to develop and apply a descriptive framework using selected examples of European HTA agencies that could be applied to any HTA organization internationally.
METHODS AND DATA
Selection of Sample Agencies
The choice of cases relied on conceptual, not representative, grounds (Reference Miles and Huberman59). We based our selection on the following characteristics: the included agencies should be leading institutions in industrialized European countries. They had to have an established history, operate nationally, and be mainly publicly financed. We wanted to identify at least two contrasting healthcare structures (social health versus national health insurance), within which potentially particular differences could be distinguished (centralized versus decentralized). Information had to be available in German or English, and contact persons and experts had to be accessible. From a larger list of possibilities, we chose the following agencies: DAHTA@DIMDI - German Agency for HTA at the German Institute for Medical Documentation and Information (Germany); HAS – French National Authority for Health (France); IQWiG – Institute for Quality and Efficiency in Health Care (Germany); NICE – National Institute for Health and Clinical Excellence (England, Wales, UK); and the SBU – Swedish Council on Technology Assessment in Health Care (Sweden).
At the time of our research, DAHTA@DIMDI, IQWiG, HAS, and NICE were in transition due to legislation amendments and organizational and financial factors (8–12;Reference Degos16;41;61;62; Goehlen, personal communication [24 January 2008]; Meyer, personal communication [10 April 2008]).
Data and Information Collection
Data were collected from a systematic literature review, handsearch, and survey. We used databases, Web sites, and staff of HTA organizations to source data. A literature search was performed (R.S.) without limits of time or study type in electronic medical and health-economic databases between October 4, 2007, and November 11, 2007, and last updated in spring 2008 (for details see Figure 1). Titles and abstracts were screened (R.S.), data extracted and encoded (R.S.), and the results checked by a second reviewer (U.S.). Differences were resolved by discussion between both authors. We used all search terms describing HTA in the title field and included HTA reports on methods. Information on HTA and HTA agencies not provided by electronic medical databases was searched using Internet search engines.

Figure 1. Flowchart of identification and inclusion of literature and information search. Notes: Date of systematic database searches: 4 October through 8 November 2007. EMBASE, Excerpta Medica Database; Econlit, Economic Literature Database; MEDLINE, Medical Literature Analysis and Retrieval System Online; SCI/SSCI, Science Citation Index Expanded/Social Sciences Citation Index; IJTAHC, International Journal of Technology Assessment in Health Care; CRD, Centre for Reviews and Dissemination; NCCHTA, National Coordinating Centre for Health Technology Assessment.
We used the selected articles to identify elementary features that described HTA and HTA agencies. Descriptive features were deemed relevant if they were discussed by experts in the field. Experts were defined as authors of respective publications or members of (inter)national working groups on HTA describing HTA (Reference Banta2;Reference Battista5;Reference Drummond, Manca and Sculpher23;Reference Drummond, Schwartz and Jonsson24;Reference Gerhardus and Dintsios35), or those experienced through collaboration in large projects such as those financed by the European Commission (Reference Cranovsky, Matillon and Banta15;27–29;Reference Hailey39;Reference Henshall, Oortwijn, Stevens, Granados and Banta44;48;Reference Liberati, Sheldon and Banta56;Reference Velasco-Garrido, Perleth and Drummond77;Reference Werko and Banta82). Being “elementary” was defined as disaggregated and qualitatively judged as important, necessary, typical, and constitutive for a meaningful description.
We then searched agency and other relevant Web sites for supplementary data and information if organizational specifics or methodological and procedural aspects of HTA agencies were missing or dated.
Finally, we contacted staff from each agency for particular information for any still-missing data. We used questionnaires describing missing information for each agency and then either sent them by email or used them to conduct telephone interviews. Data were extracted into an evidence table.
Framework Development
From the collected data, we identified elements and characteristics of “HTA” that were not context- or region-specific. We classified these elements using a two-level scheme. Main categories of elements were called “domains” and descriptors of these domains were called “items.” Each item was then classified according to a standardized value called an “indicator.” If we did not find any adequate information for an item, we entered “unknown.”
Data Extraction and Comparison
We used a cross-case comparison methodology, which involves drawing up a matrix of features that have been found to be present in the cases, and marking whether each feature is present or not in each case. This allows for determinations of difference or similarity between cases (Reference Weed80). To do this, we arranged domains, items, and indicators in a standardized descriptive table format. Information was encoded along the standardized indicators and entered in a descriptive table. At this nonaggregated level, we accumulated a complex qualitative dataset comprising 5 (number of agencies) times 90 (number of items) “data-points.” We used a simple quantitative algorithm to operationalize and assess the similarity between agencies regarding items and domains. For this purpose, for each item, we performed pairwise comparisons of all five agencies (i.e., 10 comparisons: agency 1 versus agency 2, agency 1 versus agency 3, . . . , agency 4 versus agency 5). We assigned 1 point for equal comparisons and 0 points otherwise. We defined item similarity as the percentage of actual points out of the maximum points (i.e., 10 points). For example, if all five agencies used a societal perspective for economic evaluation, then similarity was 100 percent. If three agencies used a societal perspective and two agencies the payer's perspective, then 3 points would be given for the three identical pairs with the societal perspective plus 1 point for the one pair with the payer's perspective, yielding 4 points and thus 4/10 = 40 percent similarity. Domain similarity across agencies was then calculated as the average of all item similarities of that domain.
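The pairwise scoring just described can be sketched in a few lines of code. This is an illustrative reconstruction, not the authors' actual tooling; the function names and the example item values are invented for demonstration:

```python
from itertools import combinations

def item_similarity(values):
    """Similarity for one item: the fraction of agency pairs whose
    coded indicator values are identical (5 agencies -> 10 pairs)."""
    pairs = list(combinations(values, 2))
    identical = sum(1 for a, b in pairs if a == b)
    return identical / len(pairs)

def domain_similarity(item_value_lists):
    """Domain similarity: the mean of the item similarities in that domain."""
    sims = [item_similarity(values) for values in item_value_lists]
    return sum(sims) / len(sims)

# Worked example from the text: three agencies use a societal perspective
# and two a payer's perspective -> 3 + 1 = 4 identical pairs out of 10.
perspective = ["societal", "societal", "societal", "payer", "payer"]
print(item_similarity(perspective))  # 0.4
```

The same scheme reproduces the scores listed in the note to Table 1 (e.g., a 4/1 split yields 6 of 10 points, a 3/2 split yields 4 of 10).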
We then compared and qualitatively interpreted the table entries across the agencies regarding similarities and differences. Finally, based on all preceding steps, we explored and interpreted similarities and differences across all agencies by taking into account the logic behind the conceptualization of our framework across three areas: the institutional setting; the legal context of the agency; and/or the country.
RESULTS
Search
As shown in Figure 1, a total of 11,662 citations were identified from the literature search. Additional material derived from references and other sources increased this to 12,036. After exclusion of material in languages other than English and German (n = 53), and removing duplicates (n = 233), 11,803 citations remained. After screening titles and abstracts, we excluded 10,335 publications, leaving 1,468 potentially relevant publications. Full texts of these publications were retrieved, the method and results sections examined for potential relevance. Those not excluded at this stage were screened in detail; ultimately 306 publications were included.
Framework
Our search for frameworks applicable to HTA revealed some generally suitable examples sourced from public health, (new) public management and other disciplines and science perspectives (Reference Easton26;Reference Hailey39;Reference Hansson40;Reference Hutton, McGrath and Frybourg47;Reference Jann, Wegrich, Schubert and Bandelow49;Reference Mintzberg60;Reference Reinermann75;Reference von Rosenstiel, Molt and Rüttinger78;Reference Weiss81). From these perspectives, we derived an initial crude generic concept including both static-structural and dynamic-procedural aspects covering the elements “input” (representing HTA organization, infrastructure, environment), “activity” (performing HTA, maintaining organization), “output” (reports, results, recommendations), “usage” (decision preparation, decision making, dissemination, implementation), and “impact” (change of different parameters).
From three previously identified frameworks (Reference Hailey39;Reference Jonsson, Banta, Henshall and Sampietro-Colom50;Reference Wanke, Juzwishin, Thornley and Chan79), we found ninety items that fell under the following domains: (a) organization, (b) scope, (c) processes, (d) methods, (e) dissemination, (f) decision, (g) implementation, and (h) impact. Using these domains, we constructed a structure- and sequence-based HTA framework that connects an outcome area (population), to a decision-making area (policy), and back to a production location (science). Figure 2 depicts the domains and general framework.

Figure 2. Proportion of similarity across Domains and Areas. Notes: To be read from the left to the right across three areas: The scope of an organization determines the processes and the methods used. The product of the organization is then disseminated reaching the policy area for decision making, decisions are intended to be implemented in society and to have an impact. Theoretically, a feedback loop back to the policy and the science area could be assumed.
Application and Interpretation of the Framework
The number of items per domain ranged from twenty in the domain “methods” to six in the domains “implementation” and “impact.” Four domains contained between eleven and twenty items: “methods,” “processes” (n = 11 items), “scope” (n = 13), and “organization” (n = 16), whereas the other four domains had between six and nine items. Table 1 shows the percentage similarity at both item and domain levels. Disaggregated information of all ninety items in eight domains is provided in detail in a supplementary list which is available from the authors on request.
Table 1. Percentages of Similarities in Domains

Note. For each item, we performed pairwise comparisons of all five agencies (10 comparisons). We assigned 1 point per pair that showed identical item characteristics and 0 points otherwise. Item similarity was defined as the percentage of actual points out of the 10 possible points. 5/5: all agencies identical regarding this item; similarity score = 10; 4/1: 4 agencies identical, one agency different from all others; similarity score = 6; 3/2: 3 agencies identical, another two agencies identical but different from the other three; similarity score = 4; 3/1/1: 3 agencies identical, each of the remaining two agencies different from these three and also different from each others; similarity score = 3; 2/2/1: 2 agencies identical, another 2 agencies identical, but different from the first two agencies, one agency different from all others; similarity score = 2; 2/1/1/1: 2 agencies identical, each of the remaining 3 agencies different from each other; similarity score = 1; 1/1/1/1/1: all agencies are different; similarity score = 0.
“Dissemination” was the domain with the highest similarity percentage (40 percent), whereas “implementation” showed the lowest similarity at 17 percent (see Table 1). From highest to lowest, the other domains rated as follows: scope (38 percent), organization (35 percent), methods (29 percent), process (26 percent), impact (23 percent), and decision (19 percent). We observed a higher similarity across HTA agencies in the “science” area, ranging from 26 percent to 40 percent, in contrast to the policy and population areas, which ranged between 17 percent and 23 percent (see Figure 2). When the framework sequence is taken into account, the analysis reveals that the lowest percentage similarity (“policy”) occurs downstream of the highest percentage similarity (“science”). Table 2 provides an explicit descriptive summary of the greatest similarities and differences.
Table 2. Similarities and Differences Across Agencies at Item Level

Note. Read from left to right. Letters and numbers A1 to H6 listed in columns ‘most similar’ and ‘least similar’ represent the items per domain A-H. DM, decision maker.
DISCUSSION
We developed a universal framework for comparing structural and procedural elements and characteristics of HTA organizations and applied it to five organizations from four countries. A table with a total of ninety items falling into eight domains (“organization,” “scope,” “processes,” “methods,” “dissemination,” “decision,” “implementation,” and “impact”) was constructed.
Our comparison of a sample of five HTA agencies (DAHTA@DIMDI, HAS, IQWiG, NICE, and SBU) revealed considerably more differences (60–83 percent) than similarities across agencies and countries at all levels of the framework structure. The magnitude of similarity, expressed as the percentage of identical characteristics in pairwise comparisons across agencies, was moderate and ranged between 17 and 40 percent across all domains. The greatest similarity was present in the domain “dissemination,” which addresses the distribution of HTA information. Three of eight domains, “decision,” “implementation,” and “impact,” did not show complete similarity for any of the items in their respective domains and scored below 25 percent.
Strengths and Limitations of the Study
The strength of our work is that we took a more systematic and rigorous approach to developing this framework than had previously been attempted. We also have included information current up to 2008 and have explored important domains in greater depth than previous attempts by delineating their constituent items.
Our work has several limitations. First, we focused on a small sample of national agencies and have ignored hospital-based, only for-profit or private HTA agencies, academic HTA units, or units not involved in decision making on a national level.
Second, despite using a systematic approach for information gathering and data collection, the qualitative process used can introduce biases leading to misclassification of characteristics from, for example, a lack of standardization of terms, the use of different types and sources of information, and misrepresented information in published or translated documents. Our framework also does not capture pragmatic factors such as resources and hidden politicized processes, social and cultural values, subjective, intuitive reasoning, implicit, nontransparent principles (Reference Dowie18), or psychological group effects from procedural contexts (Reference Dowie18) that may also play a role.
Finally, due to the small number of included institutions at this step we could not perform a comprehensive quantitative analysis to describe the frequencies of HTA features among agencies or, more interestingly, the correlations between different HTA features.
Comparison to Other Work
Our approach regarding the conceptualization of the framework and the choice of its elements is closest to those of Hailey, Jonsson et al., and Wanke et al. (Reference Hailey39;Reference Jonsson, Banta, Henshall and Sampietro-Colom50;Reference Wanke, Juzwishin, Thornley and Chan79). However, we believe this work is more up-to-date and explores the features of HTA agencies in greater depth.
Previous frameworks have been developed but have used fewer descriptive criteria (Reference Garcia-Altes, Ondategui-Parra and Neumann34;Reference Perry, Gardner and Thamer70) or have focused specifically on agency performance (Reference Hailey39) or function (Reference Lafortune, Farand, Mondou, Sicotte and Battista53) and have used different methods. Some studies have examined a wider sample of agencies (but with fewer criteria) (Reference Perry, Gardner and Thamer70), identified agencies after developing a framework (Reference Martelli, Torre and Ghionno57), or identified agencies specific to a single country (Reference Lehoux, Tailliez, Denis and Hivon54). One of the strengths of our study compared with some of these efforts is that agencies were identified a priori and criteria have been disaggregated to allow readers to judge whether criteria have been accurately represented.
Other previous studies have had a narrower focus than ours, with one study focused only on agencies with an explicit connection to pharmaceutical licensing, reimbursement, and pricing (86) and another examining aspects specific to the conduct of health economic evaluation across agencies. In contrast to our findings, Hjelmgren et al. (Reference Hjelmgren, Berggren and Andersson46) found disagreement in choice of perspective, resources, included costs, and “in methods of evaluating resources used.”
Our study is different from previous publications on key principles in HTA (Reference Drummond, Schwartz and Jonsson24;Reference Drummond, Schwartz and Jonsson25;Reference Liberati, Sheldon and Banta56) in that our work is descriptive and has no normative aspects. Also, as opposed to recent efforts exploring relationships between HTA-relevant issues and their implications, we did not rely on existing HTA reports in a quantitative manner (Reference Draborg and Andersen19–Reference Draborg, Gyrd-Hansen, Poulsen and Horder22;Reference Lehoux, Tailliez, Denis and Hivon54). We were, therefore, unable to show time trends and used HTA reports only to illustrate partial aspects like the reporting structure.
We believe these findings shed some light on the question of HTA harmonization and suggest this could be difficult. In particular, we identified differences of up to 83 percent per domain. However, agencies such as SBU, which has been in operation for almost 20 years, differ from the other agencies. This raises the question of to what extent harmonization or differentiation is needed, for which purpose, and who would profit. In this context, it is worth mentioning that very recently representatives of the EUnetHTA movement (Reference Kristensen51) emphasized that a higher similarity is to be expected regarding HTA methods, which could be further standardized. Similarly, Liberati et al. (Reference Liberati, Sheldon and Banta56) stated in the report on methodology of the subgroup of the EUR-ASSESS project that, “factors such as the particularities of decisions and the decision-making process, political factors and influences, and cultural variability mean that there can never be one process or method of HTA applicable to all circumstances.”
Like Hutton et al. (Reference Hutton, McGrath and Frybourg47), we attempted to better understand the potential use of HTA and identified nearly sufficient capacity at least in the largest agencies studied (NICE and HAS). However, we did not find evidence for economies of scale in dissemination, implementation, or impact.
Our findings support those of a previous 3-year OECD analysis that suggested “only limited evidence of the effectiveness of HTA in terms of its influence on decision making, on health technology use or on health outcomes” (66). In line with our analysis, a lack of linkage between HTA and policy making was found.
CONCLUSIONS, POLICY IMPLICATIONS, AND RECOMMENDATIONS
In conclusion, our study presents a detailed structured and contextual framework on HTA as a standardized template. Our template can be useful for purposes within the agency context, when comparing an agency with other agencies, and across the areas of science, policy, and population.
The application of our framework within a restricted HTA landscape of five HTA agencies in four countries demonstrates that there is great diversity in agency characteristics. The fact that considerably more differences than similarities exist when assessing only five agencies shows how difficult harmonization could be, although this must be confirmed when our database is extended with further agencies. Nevertheless, according to our systemic approach and given the genuinely multidisciplinary nature of “HTA,” we recommend an improved interdisciplinary dialogue between users and producers from different areas.
Our findings suggest some key factors for exploring harmonization, including contextual (i.e., framing) factors and exploring their relevance within a country-specific context. We also found the characteristics of some agencies were bound to country-specific organizational and procedural views, which could be explained by obligations to answer to primarily national demands.
Our findings also suggest the field of HTA needs to be better studied to better interpret and challenge differences across organizations. Efforts to further examine domains or items within the legal, policy, and healthcare system context should be encouraged.
CONTACT INFORMATION
Ruth Schwarzer, MA, MPH, ScD (ruth.schwarzer@umit.at), Senior Scientist, Institute of Public Health, Medical Decision Making and Health Technology Assessment, UMIT–University for Health Sciences, Medical Informatics and Technology, Eduard Wallnoefer Center 1, A-6060 Hall i.T., Austria
Uwe Siebert, MD, MPH, MSc, ScD (public-health@umit.at), Professor of Public Health (UMIT), Chair, Department of Public Health, Information Systems and Health Technology Assessment, UMIT–University for Health Sciences, Medical Informatics and Technology, Eduard Wallnoefer Center 1, Hall i.T., Austria, A-6060; Adjunct Professor of Health Policy and Management, Center for Health Decision Science, Department of Health Policy and Management, Harvard School of Public Health, 718 Huntington Avenue, Boston, Massachusetts 02115; Director of Cardiovascular Research Program, Institute for Technology Assessment and Department of Radiology, Massachusetts General Hospital, Harvard Medical School, 101 Merrimac Street, Boston, Massachusetts 02114