Published online by Cambridge University Press: 26 April 2005
Objectives: Many evaluations underestimate the utility associated with diagnostic interventions by failing to capture the nonclinical value of diagnostic information. This is a cause of bias in resource allocation decisions. A study was undertaken to investigate preferences for the assessment of cardiac risk, testing the suitability of conjoint analysis, a multiattribute preference elicitation method, in the field of clinical diagnosis.
Methods: Two conjoint analysis models focusing on selected characteristics of cardiac risk assessment in asymptomatic patients 40–50 years of age were applied to elicit preferences for cardiac risk assessment from samples of general practitioners and the general public in the United Kingdom and Italy. Both models were based on rankings of alternative scenarios, and the results were analyzed using multivariate analysis of variance and an ordered probit model.
Results: In both countries, members of the public attached at least three times more importance to prognostic value (relative to clinical value) than did general practitioners. Significantly different patterns were found in the two countries with regard to other characteristics of the assessment. Variation within samples was partly associated with personal characteristics.
Conclusions: Only a fraction of the value of cardiac risk assessment to individuals and physicians in this study was linked to health outcomes. The study confirmed the appropriateness and validity of conjoint analysis in the assessment of preferences for diagnostic interventions. A wider use of this technique might significantly strengthen the existing evidence-base for diagnostic interventions, leading to a more efficient use of health-care resources.
As a result of the increasing scarcity of health-care resources, primarily due to technological progress, health-care decision-makers face severe constraints in their choices about the adoption and use of medical interventions. An efficient allocation of scarce health-care resources can only be based on a sound and thorough assessment of the effectiveness and cost-effectiveness of competing medical interventions. Health technology assessment and, more recently, evidence-based medicine have attempted to provide decision-makers with information about the outcomes of medical interventions. However, the evidence base for diagnostic interventions remains very weak, mainly due to the limited availability of sound assessment methods and to the relatively modest interest shown by many research funding organizations in this important area of medicine.
The principal limitation of methods for assessing the outcomes of diagnostic tests is their failure to measure important aspects of the value of diagnostic information to patients and clinicians. Most effectiveness and cost-effectiveness evaluations are based on one single measure of health outcome (quality-adjusted life expectancy, in the best cases) often neglecting other outcomes, especially those of patients with negative or false results. This approach does not reflect what really matters to health-care decision-makers, patients, or society as a whole.
A willingness-to-pay study of ultrasound investigations in pregnancy, based on focus groups, revealed that 44 percent of the value attached by women to ultrasound investigations bore no relation to medical decisions that might have been taken on the basis of the diagnostic information provided (5). A later study of the value of diagnostic information provided by magnetic resonance imaging (MRI), in which sixty-eight patients with suspected multiple sclerosis were interviewed before and after an MRI scan, revealed that patients' expectations in terms of anxiety reduction were high before the test, and the actual reassurance effect remained important after the test. From the results of the study, the authors were able to infer a trade-off between health outcomes and reassurance (20). Unfortunately, no such studies appear to be available for other forms of diagnostic assessment, including clinical evaluation and in vitro diagnostic testing.
Conjoint analysis may provide the means for assessing the multiple dimensions of the outcome of diagnostic interventions, thus allowing a more meaningful comparison with other health-care interventions in the pursuit of an efficient allocation of resources. It is a technique aimed at supporting complex decision making involving multiattribute preferences, providing an alternative framework to multiattribute utility theory, which has been used in health care since the 1970s (12;17;37).
Conjoint analysis was initially developed in the 1960s (18). Since then, it has been applied widely in marketing and, less frequently, in economic analyses of public expenditure programs (e.g., in transport economics, see reference 40; and environmental economics, see references 19 and 35). The approach has attracted the interest and increasingly gained the acceptance of researchers and policy-makers in the health-care field, as indicated by Ryan (30) and Farrar and Ryan (13). Several studies are now available, exploring preferences for a wide range of health-care interventions (3;6;8;10;11;13–16;21–23;25;29;31–33;36;38;41). However, there is still scope for significant methodological improvements. Ratcliffe (24), and Slothuus Skjoldborg and Gyrd-Hansen (34) illustrate several potential problems in using conjoint analysis to elicit willingness-to-pay, but the conceptual and empirical issues involved in applying the conjoint analysis framework in the health-care domain extend beyond this.
There have been few attempts to apply conjoint analysis in the field of medical diagnosis. Probably the earliest example is a study assessing the relative importance attached by physicians to different items of clinical information in the diagnosis of pulmonary embolism (41). A study of preferences for the use of MRI in the diagnosis of knee injuries (8) assessed health and process outcomes using discrete choice to elicit preferences. The results were analyzed by means of a random effects probit model, following Propper (23). A discrete choice experiment was also used by Phillips et al. (22) in a conjoint analysis study of preferences for human immunodeficiency virus testing in San Francisco, California. A six-attribute model was used to assess the willingness to pay of test users, and the results emphasized the importance of diagnostic accuracy. A recent study by Bishop et al. (6) compared preferences of women and health professionals for Down's syndrome screening during pregnancy, highlighting differences in patterns of preferences for process outcomes relative to health outcomes.
The study aimed at investigating the value of information generated in the assessment of cardiac risk in primary care, focusing on both physicians potentially delivering the assessment, and members of the public potentially undergoing such assessment. The study was expected to determine the trade-off between test characteristics that have a direct influence on health outcomes and those that do not.
The study was designed to test the use of conjoint analysis in the context of complex diagnostic interventions, devising suitable questionnaires, interview methods, and statistical tools for this purpose. It entailed a comparison between two European countries, the United Kingdom and Italy. This was expected to provide indications on the cross-country validity of conjoint analysis models and to provide preliminary indications about the role played by cultural factors in determining preferences for the use of diagnostic interventions.
The study focused on the assessment of chronic cardiac risk in asymptomatic subjects in primary care, following increasing pressure toward the use of effective measures for the prevention of cardiac events in subjects at risk (e.g., statins) and a growing debate on the possible use of novel approaches to the assessment of cardiac risk. For instance, the use of markers of cardiac inflammation (such as C-reactive protein, C-rp) has been advocated. C-rp has long been used as a marker of acute cardiac events, but with the development of high-sensitivity assays, it may be possible to use C-rp to identify patients at risk before any symptoms arise. Ridker et al. (27) re-examined blood samples collected during the Physicians' Health Study (1989) and concluded that C-rp is a good predictor among seemingly healthy people, those with acute coronary symptoms, and those with chronic stable angina. Later studies seem to support this conclusion, both in relation to cardiovascular risk (2;28) and risk of stroke (9). Evidence is becoming available from randomized trials that the use of statins may help to reduce C-rp levels (1), and a projection based on a decision modeling exercise indicates that this finding may produce a life expectancy gain of over 6 months in a 58-year-old man or woman (7). The evidence seems to point to a clear rationale for screening and primary prevention (4;26), of which cardiac risk assessment in primary care would necessarily represent the first step.
The following factors also made cardiac risk assessment a particularly appropriate intervention for this study: there are no major ethical or social concerns regarding its use; there is evidence of its effectiveness, as well as of the effectiveness of treatments that may follow; groups of individuals and physicians who may be using the intervention can be easily identified and accessed; there is scope for a wider use of the intervention.
The setting selected for the physician study is that of a routine health check for a male patient between 40 and 50 years of age. In the component focusing on the general public, the scenario involved an active choice on the part of the person as to whether they should request a health check. Here, the focus was on the subjects' perception of the value of several characteristics and how attractive such characteristics would make the health check.
The UK samples consisted of 29 general practitioners and 49 members of the public. The Italian samples consisted of 15 general practitioners (medici di medicina generale) and 26 members of the public. These were all unstratified convenience samples. Members of the public were all 40- to 50-year-old men. This selection was intended to reduce heterogeneity and to focus on a group of individuals young enough to be concerned about assessing long-term health risks and, at the same time, at sufficiently high risk to justify such concerns.
General practitioners were recruited through primary care research networks and direct contacts, predominantly in London (UK) and Cassino (IT). A pilot of the physician study was undertaken on general practitioners recruited at a primary care conference in Bournemouth (UK). Members of the public were recruited among librarians working in public libraries and technical and administrative university staff in London (UK) and Rome and Cassino (IT). Additionally, administrative staff from a private company based near London were recruited through their occupational health service.
The physician model comprised three attributes (perception of resource commitment, prognostic value, expected risk reduction after a preventive intervention) with 2, 3, and 3 levels, respectively. The general public model comprised four attributes (modality of the assessment, preventive interventions, accuracy [prognostic value], expected risk reduction after preventive intervention) with 3, 2, 3, and 3 levels. The attributes were identified through the use of focus groups and test interviews.
Using a fractional factorial design, two series of nine scenarios, or permutations of levels on the model attributes, were devised as a basis for a ranking exercise that subjects were required to complete during their interviews. Ranking is the most commonly used elicitation method in conjoint analysis, as it has the best reliability (33) and is most comprehensible and least time consuming for interviewees, although it has been argued that the discrete choice approach is the most consistent with economic principles (23;30).
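How nine scenarios can cover a four-attribute model may be easier to see with a small sketch. The code below builds the standard nine-run orthogonal array for three-level attributes and folds one column down to two levels to match a 3 x 2 x 3 x 3 layout. The specific array, the integer level coding, and the folding rule are illustrative assumptions; the scenarios actually used in the study are not reported here.

```python
from itertools import combinations, product


def l9_orthogonal_array():
    """Standard L9 array: 9 runs, 4 columns, each with levels 0, 1, 2."""
    return [(a, b, (a + b) % 3, (a + 2 * b) % 3)
            for a, b in product(range(3), repeat=2)]


def public_model_design():
    """Nine scenarios for a 3 x 2 x 3 x 3 model (hypothetical level coding).

    The second column is folded to two levels by mapping level 2 onto
    level 0, one common way to fit a two-level attribute into the array.
    """
    return [(modality, 0 if prevention == 2 else prevention, accuracy, reduction)
            for modality, prevention, accuracy, reduction in l9_orthogonal_array()]


def is_pairwise_balanced(runs):
    """True if, for every pair of columns, each observed level pair is equally frequent."""
    for i, j in combinations(range(len(runs[0])), 2):
        counts = {}
        for run in runs:
            key = (run[i], run[j])
            counts[key] = counts.get(key, 0) + 1
        if len(set(counts.values())) != 1:
            return False
    return True


if __name__ == "__main__":
    for scenario in public_model_design():
        print(scenario)
```

The unfolded array is balanced for every pair of columns, which is what allows main effects to be estimated from nine scenarios instead of the 54 in the full factorial. After folding, the two-level attribute appears with proportional rather than equal frequencies.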
All subjects (general practitioners [GPs] and public) were interviewed personally. Interviews had a core component, requiring subjects to rank nine scenarios representing alternative forms of chronic cardiac risk assessment. Subjects were guided through the ranking exercise by means of a series of slides. In addition, they were asked to answer several questions about how they actually conduct the assessment of chronic cardiac risk (physician study) and about their experience with cardiac risk assessment (public).
Interview results were analyzed using multivariate analysis of variance, based on ordinary least squares regressions (SPSS 10.00). Summary scores for each sample were obtained by calculating means of individual coefficients. A (nonlinear) ordered probit model (Stata 8) was used to investigate in further detail the effect of personal characteristics on preferences, and to test interactions between the latter and characteristics of cardiac risk assessment. A formal specification of the analysis of variance models is as follows:
In the physician model, β0 is a constant term; u1 is the utility component (part-worth) associated with different degrees of resource commitment for the assessment (accounted for through a series of dummy variables in the analysis); β1 and β2 are the coefficients associated, respectively, with the prognostic value (x1) and the expected risk reduction after a preventive intervention (x2).
In the general public model, β0 is a constant term; u1 is the part-worth associated with different assessment modalities; u2 is the part-worth associated with different types of preventive interventions; β1 and β2 are the coefficients associated, respectively, with the prognostic value (x1) and the expected risk reduction after a preventive intervention (x2).
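The definitions above imply an additive main-effects structure, which can be sketched as follows. This is a reconstruction from the definitions alone, not the authors' notation: the dependent variable $y_s$ (a preference score derived from the rank of scenario $s$), the error term $\varepsilon_s$, and the level indices $r(s)$, $m(s)$, and $p(s)$ are assumptions introduced for illustration.

```latex
\begin{align}
% Physician model: resource commitment enters as a part-worth
y_s &= \beta_0 + u_{1,r(s)} + \beta_1 x_{1s} + \beta_2 x_{2s} + \varepsilon_s \\
% General public model: assessment modality and type of preventive
% intervention enter as part-worths
y_s &= \beta_0 + u_{1,m(s)} + u_{2,p(s)} + \beta_1 x_{1s} + \beta_2 x_{2s} + \varepsilon_s
\end{align}
```

In both forms the part-worths would be estimated through dummy variables with one level as the reference, while $x_1$ and $x_2$ enter linearly.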
The characteristics of the samples are illustrated in Tables 1 and 2. Most GPs in the two countries considered cholesterol testing accurate or very accurate for the assessment of cardiac risk. Use of the test appeared unrelated to perceived accuracy but was related to previous participation in trials. C-rp testing for the assessment of cardiac risk in a primary prevention context was used only by one (UK) GP in the study. Seven GPs in the UK sample and all Italian GPs rated the accuracy of this test. Most of them seemed to believe that the test is, at best, only broadly indicative of the patient's risk.
In the general public sample, a significant correlation was found between receiving a health check or lifestyle advice and having a cholesterol test, although approximately a third of those who claimed not to have had any of the former did have a cholesterol test. Five general practitioners and eleven members of the public expressed counterintuitive preferences for certain characteristics of the risk assessment (reversals), possibly reflecting actual preference patterns or an erroneous interpretation of the questions.
Relative preferences for different characteristics of the risk assessment, based on multivariate analysis of variance, are reported in Table 3. General practitioners in the two countries showed very similar preferences with regard to the prognostic value and the clinical value of the risk assessment. British GPs attached higher preference to a less resource-intensive form of assessment, involving only a consultation and possible follow-up, compared with options involving the use of blood tests. The opposite was true for Italian GPs, with the same strength of preference, approximately equivalent to the value attached to a 2 percent risk reduction achieved through preventive interventions.
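Expressing a part-worth as an equivalent risk reduction amounts to dividing it by the coefficient on risk reduction. The sketch below illustrates this for a single hypothetical respondent: it fits an additive model by ordinary least squares and converts the blood-test part-worth into risk-reduction units. The nine scenarios, the attribute levels, and the preference scores are synthetic and chosen only so that the arithmetic is easy to follow; none of them come from the study.

```python
from itertools import product


def solve_linear_system(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def ols(x_rows, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y."""
    k = len(x_rows[0])
    xtx = [[sum(r[i] * r[j] for r in x_rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(x_rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Hypothetical attribute levels (assumptions, not the study's levels).
MARGIN_OF_ERROR = [5, 10, 15]   # prognostic value, in +/- percent
RISK_REDUCTION = [1, 3, 5]      # clinical value, in percent

# Nine synthetic scenarios: intercept, blood-test dummy, margin, reduction.
design = []
for a, b in product(range(3), repeat=2):
    blood_test = 0 if (a + b) % 3 == 0 else 1
    design.append([1.0, float(blood_test), float(MARGIN_OF_ERROR[a]), float(RISK_REDUCTION[b])])

# Noise-free synthetic preference scores (for instance, a reversed rank).
scores = [10.0 - 1.0 * row[1] - 0.3 * row[2] + 0.5 * row[3] for row in design]

coefficients = ols(design, scores)
part_worth_blood_test = coefficients[1]
beta_risk_reduction = coefficients[3]

# Strength of preference expressed in percent of risk reduction.
equivalent_risk_reduction = abs(part_worth_blood_test) / beta_risk_reduction

if __name__ == "__main__":
    print([round(c, 3) for c in coefficients])
    print(round(equivalent_risk_reduction, 3))
```

Averaging such individual coefficients across respondents would give sample-level summary scores of the kind described in the methods.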
The trade-off (or marginal rate of substitution) between prognostic and clinical value was calculated on the basis of the ordered probit model as the ratio between the coefficients for the two characteristics of the risk assessment. The resulting value (−0.3309; after adjusting for age) indicates that GPs consider a reduction of ±3 percent of the margin of error in the estimation of cardiac risk to provide a utility approximately equivalent to a 1 percent risk reduction obtained through a preventive intervention. Possible differences between the two countries were tested using an interaction term, which was not significant.
Most of the personal characteristics considered in the analysis were not associated with GP preferences. However, age and experience (the latter in particular) had a complex interaction with preferences for the prognostic and the clinical value of cardiac risk assessment. The interaction pattern was similar in the two samples but more evident in the UK sample, perhaps due to its larger size (Figure 1). Individual GPs' valuations of the trade-off between prognostic and clinical value clustered in a relatively narrow range, with a small number of younger and less-experienced outliers who attached a much greater relative importance to prognostic value. When these were excluded from the analysis, the remaining subjects showed a trend toward an increasing relative importance of prognostic value versus clinical value with increasing experience. This was mainly related to a changing perception of the clinical value of the risk assessment, whereas the perception of the prognostic value was relatively stable. The ordered probit model confirmed that the utility associated with clinical value decreases as experience increases (p<.001). Experience also had a significant interaction with the utility associated with the resources committed in the assessment. More-experienced GPs attached a smaller disutility to the use of a blood test as part of the assessment (p=.002).
Relative preferences for different characteristics of the risk assessment, based on multivariate analysis of variance, are reported in Table 4. Members of the public in the two countries expressed similar preferences for the prognostic and the clinical value of cardiac risk assessment. Individuals in the United Kingdom attached a greater utility to an assessment modality involving only a visit (and possible follow-up) compared with one also involving a blood test, whereas individuals in Italy seemed to prefer the latter. The strength of preference was equivalent to the value of a 4 percent risk reduction after a preventive intervention, in both countries. Individuals in the United Kingdom attached a much greater utility to a form of assessment leading solely to lifestyle changes compared with one leading to a drug therapy (preference equivalent to a 7 percent risk reduction after a preventive intervention). The opposite was true in Italy (equivalent to a 1 percent risk reduction).
The trade-off between prognostic and clinical value was calculated as previously described. The resulting value (−1.0976, after adjusting for prior health check experience) indicates that individuals consider a reduction of ±1 percent of the margin of error in the estimation of cardiac risk to provide a utility approximately equivalent to a 1 percent risk reduction obtained through a preventive intervention. Possible differences between the two countries were tested using an interaction term, which was not significant.
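The arithmetic behind these interpretations can be checked with a few lines of code. The sketch below takes the two reported ratios (−0.3309 for GPs, −1.0976 for members of the public) and converts each into the margin-of-error reduction that is worth a 1 percent risk reduction. Reading the ratio as the prognostic coefficient divided by the clinical coefficient is an assumption inferred from the interpretations given in the text; the function name is hypothetical.

```python
def margin_equivalent(ratio):
    """Margin-of-error reduction (in percent) worth a 1 percent risk reduction.

    Assumes `ratio` is the prognostic-value coefficient divided by the
    clinical-value coefficient, as the reported interpretations suggest.
    """
    return 1.0 / abs(ratio)


GP_RATIO = -0.3309       # reported, adjusted for age
PUBLIC_RATIO = -1.0976   # reported, adjusted for prior health check experience

gp_margin = margin_equivalent(GP_RATIO)
public_margin = margin_equivalent(PUBLIC_RATIO)
relative_weight = abs(PUBLIC_RATIO) / abs(GP_RATIO)

if __name__ == "__main__":
    print(round(gp_margin, 2))        # 3.02
    print(round(public_margin, 2))    # 0.91
    print(round(relative_weight, 2))  # 3.32
```

Under this reading, GPs trade roughly 3 percent of margin of error for 1 percent of risk reduction, members of the public trade roughly 1 for 1, and the public's relative weight on prognostic value is a little over three times that of GPs.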
The preferences expressed by members of the public were affected by personal characteristics in different ways. In the United Kingdom, subjects who had received lifestyle advice from their GPs attached a relatively smaller utility to the prognostic value (or accuracy) of the risk assessment (p=.023). The same type of association was observed for subjects who had undergone a health check, but this value was not statistically significant. Finally, subjects who had had a cholesterol test attached a greater utility to the inclusion of a blood test in the risk assessment (p=.001). In Italy, subjects who had received lifestyle advice from their GPs attached a greater utility to both the prognostic value (p=.014) and the clinical value (p=.001) of the risk assessment and attached a relatively smaller utility to the inclusion of a blood test in the assessment (p<.001). Subjects who had had a cholesterol test in Italy attached a greater utility to the prognostic value (or accuracy) of the assessment (p=.003).
This study sheds light on several important aspects of the measurement of preferences for diagnostic interventions, focusing on the assessment of cardiac risk in primary care. A key aim of the investigation was to test whether conjoint analysis is an appropriate technique for measuring such preferences. The results obtained in the two countries, particularly a small number of reversals, ease of administration, comprehensive coverage of attributes, and consistent patterns of choice, seem to indicate that conjoint analysis has a significant potential as a method for eliciting preferences for diagnostic interventions. To give further strength to this conclusion, it should be noted that, when conjoint analysis is applied on larger samples, each subject may be required to compare a smaller number of scenarios, possibly using a discrete (pair-wise) choice method instead of the ranking exercise used in this study. This strategy may increase substantially the validity and reliability of the values elicited through conjoint analysis. The following findings warrant further discussion.
Consistent trade-offs between prognostic and clinical values were found across the two countries, but the size of such trade-offs differs between GPs and members of the public. In particular, trade-offs shown by members of the public were at least three times greater than those shown by general practitioners, meaning that the emphasis that members of the public place on the accurate prediction of their cardiac risk is much greater than that placed by GPs. Whether resources should be invested in developing methods for increasing the prognostic value of cardiac risk assessment (e.g., high-sensitivity C-rp testing) remains to some extent an open question. The preferences elicited from the samples analyzed in this study indicate that, in terms of prognostic value, the utility gain achievable with the use of C-rp testing compared with cholesterol testing alone is significantly smaller than the gain achieved by using cholesterol testing compared with a purely clinical assessment. Moreover, compliance with drug treatments is an important determinant of the outcomes of alternative forms of assessment and of the gains that can be achieved with more accurate forms of assessment. This latter aspect should be studied carefully in a context-specific manner.
Clear differences emerged between the two countries with regard to preferences for the use of blood tests and drug therapies. Whereas general practitioners and members of the public in the United Kingdom expressed a certain degree of aversion to both (other things being equal, that is, for any given level of prognostic and clinical value of the assessment), their counterparts in Italy expressed a preference for assessments involving a blood test and possibly leading to the prescription of a drug therapy. Whether the opposite findings in the two countries reflect a truly different psychological attitude toward the use of blood tests and drug therapies or a different degree of comprehension of the conjoint analysis exercise by the subjects interviewed is difficult to determine with the data gathered within this study. Further investigation of this aspect is required.
An important finding of this study is that several personal characteristics of the subjects interviewed seem to have an influence on the subjects' perception of the value of certain attributes of cardiac risk assessment. Several issues require further analysis on a larger sample of subjects. The first is the complex effect of general practitioners' experience on the trade-off between prognostic and clinical value. The pattern described in the results section may be interpreted in at least two different ways. It may be argued that less-experienced GPs have more extreme values, which tend to converge toward an average as they build up experience; or it may be argued that a certain number of GPs of the more recent generation (or who have completed their studies more recently) have a greater appreciation of the prognostic value of risk assessment, compared with its clinical value. In any case, it is necessary to assess the statistical significance of the patterns identified in this study on a larger sample.
A further finding that requires more investigation is the contradictory effect of lifestyle advice received by members of the public on their perception of the prognostic value of the risk assessment (the former is associated with decreased utility in the United Kingdom and increased utility in Italy). In general, health-care experiences related to the assessment of cardiac risk (e.g., having a health check, receiving lifestyle advice, undergoing a cholesterol test) seem to be strongly associated with utilities perceived by individuals in relation to characteristics of the risk assessment. It may be appropriate to seek a more precise definition of such health-care experiences than those used in this study to identify exactly what factors affect people's perceptions, and the direction of the causal link.
Conjoint analysis appears to have significant potential for widespread use in the assessment of utilities for health services, including diagnostic interventions. However, several methodological limitations need to be noted, such as the limited comparability of the utility values derived from conjoint analysis across different health-care interventions, the assumption of additive independence between the utility attributes, and the difficulties involved in assessing the validity of elicitation methods. These limitations may sometimes reduce the potential of this technique in supporting resource allocation decisions.
Both physicians and the general public in this study attached significant utility to aspects of cardiac risk assessment other than its clinical value, and particularly to prognostic accuracy. This finding strengthens the case, made by previous studies in other diagnostic areas, for the use of broader forms of evaluation of the impact of diagnostic and screening tests, extending well beyond their impact on clinical outcomes, normally the only dimension taken into consideration. Otherwise, the value of investments in diagnosis may be substantially underestimated, penalizing such investments in effectiveness or cost-effectiveness comparisons across broad ranges of health-care services. This study provides evidence that conjoint analysis is a valid and useful instrument for undertaking a multidimensional assessment of the value of diagnostic interventions.
In the specific area of cardiac risk assessment, members of the general public attached a substantially greater importance to prognostic accuracy than did general practitioners, indicating an area of potentially unmet need. On the assumption that the perceived accuracy of the assessment has a bearing on individual compliance with preventive interventions, increasing the prognostic value of cardiac risk assessment would lead to substantial improvements in health outcomes, not merely through better clinical decisions. This finding is of particular importance given the increased policy emphasis on prevention in long-term health system development, as seen in the recent report commissioned by the Treasury in England on Securing Good Health for the Whole Population (39), which highlighted a critical need for stronger evidence on the impact of preventive strategies.
Franco Sassi, PhD (f.sassi@lse.ac.uk), Lecturer, Department of Social Policy, The London School of Economics and Political Science; David McDaid, MSc (d.mcdaid@lse.ac.uk), Research Fellow, LSE Health and Social Care, The London School of Economics and Political Science, Houghton Street, London WC2A 2AE, UK
Walter Ricciardi, MD, MPH, MSc, FPH (wricciardi@rm.unicatt.it), Professor and Director, Institute of Hygiene, Catholic University of the Sacred Heart, Largo Francesco Vito 1, Rome 00168, Italy
The study was supported by a grant from the European Diagnostic Manufacturers Association (EDMA). The sponsors were given an opportunity to comment on the results of the study but the authors bear sole responsibility for the contents of this paper. The authors also acknowledge the contributions of Elena de Hoghton, who helped in the design of the study and in the development of the interview tools; Julien De Naurois, Emily De Naurois, and Damian Pitman, who conducted interviews in the United Kingdom; and Giuseppe La Torre and Marco Oradei, who conducted interviews in Italy.