In industrialized countries, the number of multiple pregnancies is increasing (Reference Blondel, Kogan and Alexander4). Preterm birth (<37 weeks gestation) tends to be more prevalent in twin pregnancies than in singleton pregnancies (Reference Blondel, Kogan and Alexander4). It is estimated that twins are over five times more likely than singletons to be born preterm and over eight times more likely to be born at <33 weeks gestation (Reference Papiernik, Keith, Oleszczuk and Cervantes17). That preterm birth contributes to long-term neurocognitive deficits, pulmonary dysfunction, and ophthalmologic disorders is well established in the literature (Reference Hintz, Kendrick and Vohr11; Reference Repka21; Reference van Baar, van Wassenaer and Briet24). This, in turn, is associated with sizable healthcare costs from severe morbidity and neurological handicaps among preterm infants, not to mention the personal suffering of the families affected (Reference Petrou, Mehta and Hockley18; Reference Petrou19). For example, Petrou et al. (Reference Petrou, Mehta and Hockley18) found that the duration of hospital admissions over the first 5 years of life for infants born at <28 and at 28 to 31 gestational weeks was eighty-five and sixteen times that for term infants, respectively. Cost differences persisted throughout the first 5 years of life of the child; the adjusted mean cost difference was estimated at over £14,000 for infants born at <28 weeks compared with term infants, and almost £12,000 for infants born at 28 to 31 weeks compared with term infants (£ sterling, 1998 prices). The conditions associated with preterm birth also impose a substantial long-term financial burden on special education and other services.
Given the clinical and economic burden of preterm birth, it is surprising that, to date, there are few effective measures to prevent the condition and that interventions are mainly aimed at reducing neonatal morbidity and mortality once preterm birth has occurred. Although research exists on the use of progesterone for the prevention of preterm birth in selected high-risk singleton pregnancies (Reference Dodd, Flenady, Cincotta and Crowther8), there remain unresolved issues in its use in twin pregnancies (Reference Rouse, Caritis and Peaceman22), and progesterone appears to be ineffective in higher order multiple pregnancies (Reference Caritis, Rouse and Peaceman5). To our knowledge, there are no published economic evaluations of progesterone for the prevention of preterm birth in twin pregnancies.
In this study, we report what is, to our knowledge, the first economic evaluation of vaginal progesterone gel for the prevention of spontaneous preterm birth in twin pregnancies, based on the STudy of Progesterone for the Prevention of Preterm Birth In Twins (STOPPIT) trial (Reference Norman, Mackenzie and Owen15).
METHODS
STOPPIT Trial
The aim of the STOPPIT trial (Reference Norman, Mackenzie and Owen15) was to determine whether vaginal progesterone gel reduces the rate of spontaneous preterm delivery in women with twin pregnancies. In brief, 500 women before 20 weeks gestation with a twin pregnancy were enrolled into a randomized controlled trial (RCT) from specialized antenatal clinics at nine UK National Health Service (NHS) hospitals between December 2004 and April 2008. Women were randomized to either daily progesterone gel (90 mg) (Crinone®) or placebo gel; both were administered vaginally at home, starting at 24 weeks gestation. The primary clinical outcome was delivery or fetal death before 34 weeks gestation. Ethical approval was obtained from the West Glasgow Ethics Committee.
Cost-effectiveness Analysis
We performed a cost-effectiveness analysis that examined the incremental costs and incremental effectiveness of vaginal progesterone gel compared with placebo. Incremental costs (ΔC) were measured from the healthcare payer perspective (hospital costs only). Incremental effectiveness (ΔE) was estimated as the number of deliveries or fetal deaths before 34 weeks gestation prevented (hereafter summarized as the number of preterm births prevented). The primary outcome of the economic evaluation was the net benefit (NB) statistic, NB = λ·ΔE − ΔC, where λ represents decision-makers’ willingness-to-pay threshold for a preterm birth prevented (Reference Stinnett and Mullahy23). The time horizon for costs and effects extended from randomization to hospital discharge of both mother and infant.
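The net benefit statistic defined above can be sketched in a few lines of Python. This is a minimal illustration only: the function name `net_benefit` and the input values are hypothetical and are not trial results.

```python
def net_benefit(wtp, delta_effect, delta_cost):
    """Incremental net benefit: NB = wtp * delta_effect - delta_cost.

    wtp          -- willingness to pay per preterm birth prevented (lambda)
    delta_effect -- incremental preterm births prevented per woman (delta E)
    delta_cost   -- incremental cost per woman (delta C)
    """
    return wtp * delta_effect - delta_cost


# Hypothetical inputs for illustration only (NOT trial results):
# 0.02 additional preterm births prevented per woman, at an extra cost of 500.
example_nb = net_benefit(wtp=30_000, delta_effect=0.02, delta_cost=500.0)

# A positive NB means the intervention is worth adopting at this threshold;
# a negative NB means the comparator is preferred.
print(f"Illustrative net benefit: {example_nb:.2f}")
```

A positive value indicates that the monetized health gain exceeds the additional cost at the chosen threshold; when λ = 0 the statistic reduces to −ΔC.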
Measurement of Resource Use
Data were collected on all resources consumed in the hospital setting by each woman and infant during the period between randomization and hospital discharge. The use of community-based services in the weeks between randomization and birth was not recorded in the STOPPIT trial and was therefore not available for costing.
The data collection included duration of antenatal ward stays, labor ward stay, postnatal ward stay, neonatal care admissions, and maternal intensive care admissions. The trial data collection forms recorded the intensity of maternal and neonatal care, with the latter based on standard levels of care. The trial data collection forms also captured the method of delivery (spontaneous vaginal delivery, vaginal breech, forceps or ventouse, or caesarean section) and ambulance transfers.
The resources used for different modes of delivery were not captured through the trial data collection forms. We, therefore, conducted face-to-face interviews with clinicians and midwives, in conjunction with examination of clinical protocols to gauge the specific resources consumed and variations among centers with regard to staffing ratios, equipment used, drugs administered, and any estimates of disposable items used. This was imperative due to the high risk of caesarean section and instrumental delivery associated with twin pregnancies and enabled potential deviations from normal clinical care to be captured. Resource inputs were dependent upon the number of occupied bed-days in each inpatient ward, as occupancy rates affected the level of support that medical and nursing staff could provide to individual patients, which needed to be reflected in our cost estimates. The average cost per episode of care was therefore adjusted to reflect these rates, which were obtained from local hospital finance departments included in the trial.
Valuation of Resource Use
Unit costs for each resource item were obtained from a variety of sources. All unit costs followed guidelines on costing healthcare services as part of economic evaluation (Reference Drummond, O'Brien, Stoddart and Torrance9). An average per diem cost for nonintensive forms of maternity care was calculated by sending detailed questionnaires to each trial-participating hospital, requesting cost data for the main resource categories of staff, drugs, disposables, equipment and overheads, and then apportioning these to different categories of care using a “top-down” methodology (Reference Mugford, Hutton and Fox-Rushby13). Salary configurations were derived from national datasets as this source provided the most inclusive data (Reference Curtis6). An average per diem cost for each level of neonatal care, as well as intensive care for the mother, and costs of ambulance transfers were derived from the national Department of Health schedule of reference costs (7). Reference costs are collected nationally for the vast majority of hospitals in England and Wales and are presented as average figures. They are calculated on a full absorption costing basis so that costs for each level of care include staff salaries, equipment, consumables and capital overheads. The cost of the progesterone gel was provided by Serono. Unit costs were combined with resource volumes to obtain a net cost per mother and infant during the trial period. This was then averaged across all trial participants, irrespective of the treatment center in which they were recruited, representing a common approach for multicenter economic evaluations (Reference Raikou, Briggs, Gray and McGuire20). All costs were expressed in UK pounds sterling valued at 2008 prices.
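The costing step described above (unit costs multiplied by resource volumes, then averaged across participants) can be sketched as follows. The unit costs, resource categories, and participant records below are hypothetical placeholders; the trial's actual unit costs are not reproduced here.

```python
# Hypothetical unit costs in GBP per unit of resource (assumed values).
UNIT_COSTS = {
    "antenatal_ward_day": 300.0,
    "postnatal_ward_day": 250.0,
    "neonatal_intensive_care_day": 1000.0,
}


def episode_cost(volumes, unit_costs):
    """Cost for one mother-and-infant episode: sum of unit cost x volume."""
    return sum(unit_costs[item] * quantity for item, quantity in volumes.items())


# Hypothetical resource volumes for two participants (assumed values).
participants = [
    {"antenatal_ward_day": 2, "postnatal_ward_day": 3, "neonatal_intensive_care_day": 0},
    {"antenatal_ward_day": 5, "postnatal_ward_day": 4, "neonatal_intensive_care_day": 10},
]

costs = [episode_cost(p, UNIT_COSTS) for p in participants]

# Average across all participants, irrespective of recruiting center.
mean_cost = sum(costs) / len(costs)
print(costs, mean_cost)
```

Applying one common set of unit costs to every participant is what makes the result independent of the recruiting center; a center-specific analysis would instead pass a different `unit_costs` mapping for each center.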
Data Analysis
The data analysis was conducted on the basis of intention to treat. All results are reported as mean values with standard deviations and as mean differences in costs and effects with 95 percent confidence intervals (CIs) where applicable. We tested for differences in resource use and costs between the comparator groups using the independent-samples t-test procedure. Differences in costs and effects were considered significant if two-tailed p values were .05 or less. As the data for costs were skewed, we used nonparametric bootstrap estimation to derive 95 percent CIs for mean cost differences between the comparator groups (Reference Barber and Thompson1). Using a large number of simulations, and based on sampling with replacement from the original data, the bootstrap method estimates the sampling distribution of a statistic (Reference Barber and Thompson1). Each of these confidence intervals was calculated using 1,000 bias-corrected bootstrap replications. The joint uncertainty surrounding the costs and effects was represented graphically on a cost-effectiveness plane (Reference Black3). The probability that progesterone prophylaxis in twin pregnancies is cost-effective at alternative willingness to pay thresholds for the primary unit of health outcome was represented using the cost-effectiveness acceptability curve (CEAC) (Reference Fenwick, Claxton and Sculpher10).
We conducted sensitivity analyses (Reference Drummond, O'Brien, Stoddart and Torrance9) on key variables to ascertain the impact of certain assumptions on estimates of cost-effectiveness. Sensitivity analyses involved repeating the analysis while varying the assumptions. The variables chosen for the sensitivity analyses are explained in the results section.
We also conducted a value of information analysis (Reference Fenwick, Claxton and Sculpher10) to determine the value of collecting more information to inform decision makers as to the cost-effectiveness of vaginal progesterone gel to prevent preterm birth in twin pregnancies. The expected value of perfect information (EVPI) per woman was estimated by subtracting the total net benefit for the option we would choose based on current information from the maximum net benefit we would obtain with perfect information (the average of the maximum net benefit for each bootstrap replicate). All analyses were performed with a microcomputer running Excel version 2003 and STATA 9.0.
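The EVPI calculation described above can be sketched for a two-option decision. The sketch assumes that each bootstrap replicate yields an incremental net benefit (INB) of progesterone relative to placebo, so that placebo's net benefit is normalized to zero; the function name `evpi_per_woman`, the replicate values, and the population size are hypothetical.

```python
def evpi_per_woman(inb_replicates):
    """EVPI for a two-option decision from incremental net benefit replicates.

    Each replicate is NB(intervention) - NB(comparator); the comparator's
    net benefit is therefore zero in every replicate.
    """
    n = len(inb_replicates)
    # With perfect information we would pick the better option per replicate.
    expected_with_perfect_info = sum(max(0.0, inb) for inb in inb_replicates) / n
    # With current information we pick the option with the higher mean.
    expected_with_current_info = max(0.0, sum(inb_replicates) / n)
    return expected_with_perfect_info - expected_with_current_info


# Hypothetical replicates for illustration only (NOT trial results).
replicates = [-4000.0, -3000.0, 1000.0, -2000.0]
evpi = evpi_per_woman(replicates)

# Population EVPI scales the per-woman value by the number of women affected
# by the decision (assumed figure, for illustration).
hypothetical_population = 1000
population_evpi = evpi * hypothetical_population
print(evpi, population_evpi)
```

EVPI is zero when every replicate favors the same option, because perfect information would then never change the decision.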
RESULTS
A total of 500 women were randomized in the trial: 250 to the active group and 250 to the placebo group (Reference Norman, Mackenzie and Owen15). The groups were similar in terms of mean maternal age, mean gestational age at delivery, and number of previous pregnancies (Reference Norman, Mackenzie and Owen15). Three mothers in each of the progesterone and the placebo groups were lost to follow-up (due to withdrawal of consent or not being traced); thus data from 494 mothers were available (Reference Norman, Mackenzie and Owen15). The proportions of women delivering before 34 weeks in the progesterone and placebo groups were 24.7 percent (61/247) and 19.4 percent (48/247), respectively, [odds ratio (OR) 1.36, 95 percent CI 0.89–2.09; p = .16], meaning that, for this population of women, vaginal progesterone gel did not reduce the incidence of preterm delivery (Reference Norman, Mackenzie and Owen15).
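The reported odds ratio can be approximately reproduced from the counts in the paragraph above. The sketch below uses the unadjusted odds ratio with a Woolf (log-scale) confidence interval; this is an assumption for illustration, as the method used in the trial report is not stated here.

```python
import math


def odds_ratio_ci(events_a, n_a, events_b, n_b, z=1.96):
    """Unadjusted odds ratio (group a vs group b) with a Woolf log-scale CI."""
    a, b = events_a, n_a - events_a  # events / non-events in group a
    c, d = events_b, n_b - events_b  # events / non-events in group b
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Counts from the text: 61/247 (progesterone) and 48/247 (placebo)
# delivered before 34 weeks.
or_est, or_lower, or_upper = odds_ratio_ci(61, 247, 48, 247)
print(f"OR {or_est:.2f} (95% CI {or_lower:.2f} to {or_upper:.2f})")
```

Under these assumptions the result agrees with the published estimate to two decimal places, and the interval spans 1, consistent with the nonsignificant p value.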
Health Service Resource Use
Women allocated to the progesterone group spent, on average, fewer days on the antenatal, labor, and postnatal wards than women allocated to the placebo group (Table 1), but these differences were not statistically significant. The number of neonatal days was greater, on average, for infants whose mothers received progesterone, but this difference was not statistically significant (p = .65). Although there were lower rates of both caesarean and instrumental delivery in the progesterone group, these differences also did not reach statistical significance.
a Source: Primary research.
b Source: Department of Health Reference Costs.
SD, standard deviation.
Health Service Costs
There were no statistically significant differences between the two groups for any cost category or for total costs (Table 2). Mean health service costs for the period between randomization and discharge of mother and infant were £28,031.33 in the progesterone group and £25,972.07 in the placebo group, generating a mean cost difference of £2,059.25 (bootstrap mean cost difference £2,334.01; 95 percent CI: −£5,023.01, £9,142.52) that was not statistically significant (p = .33). The cost of neonatal stays in special care, high dependency care, or intensive care was the major cost component in the trial, largely because of the high staff and equipment costs associated with the care of preterm infants. Indeed, overall cost differences between the two groups can be largely explained by the additional care received by infants in the progesterone group during the neonatal period (bootstrap mean cost difference of £3,016.21; 95 percent CI: −£1,212.30, £7,244.30; p = .37), due to these infants’ greater length of stay. Conversely, more infants in the placebo arm were transferred to other units within the hospital or transferred by ambulance to other specialist centers. The cost of maternal intensive care was also higher among women who received progesterone rather than placebo, although the difference was not statistically significant. The costs of antenatal ward stays, labor ward stays, postnatal care, and complicated deliveries (caesarean sections or instrumental deliveries) were similar between the two arms of the trial, with no statistically significant differences.
a The p values were calculated using Student's t-test.
b Nonparametric bootstrap estimation using 1,000 replications, bias corrected.
c Includes operative or instrumental delivery (forceps or ventouse).
SD, standard deviation; CI, confidence interval.
Sensitivity Analyses
We performed sensitivity analyses to determine the impact that uncertainty surrounding individual parameter values might have. In the first sensitivity analysis, we varied hospital ward costs: we used costs specific to two major hospitals in the trial and compared the cost-effectiveness findings with the original estimate, which had used average standard unit costs across all centers. The mean cost per woman was lower in both hospitals for the progesterone and placebo groups than in our original analysis (between £23,460 and £24,561 in the placebo group and between £26,300 and £27,374 in the progesterone group), so vaginal progesterone gel remained more costly than placebo and the overall conclusions were unchanged. In the second sensitivity analysis, the duration of neonatal hospitalization in the placebo arm was set at the duration in the progesterone arm of the trial, as this represented the greatest cost difference between the two arms of the trial, despite limited clinical explanation for this difference. This reduced the cost margin between the two arms of the trial but was insufficient to alter the overall finding that vaginal progesterone gel is more costly overall.
Cost-effectiveness Plane
The uncertainty around the estimates of incremental costs and incremental effects is displayed in the cost-effectiveness plane (Figure 1). Although the majority of bootstrapped samples of incremental cost and effect pairs are located in the northwest quadrant (where progesterone is less effective and more costly than placebo), samples fall in all four quadrants, which creates a problem when interpreting a standard incremental cost-effectiveness ratio (ICER; ΔC/ΔE). That is, a negative ICER might represent improved outcomes and lower costs as a result of using progesterone, or worse outcomes and higher costs, and the ratio alone cannot distinguish between the two. This means that a meaningful ordering of the bootstrapped samples, which is required to make the confidence interval surrounding the ICER interpretable, is very difficult.
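The interpretive problem with negative ICERs can be shown with two hypothetical points, one in the southeast quadrant and one in the northwest quadrant. The values below are invented for illustration and are not bootstrap samples from the trial.

```python
def icer(delta_cost, delta_effect):
    """Incremental cost-effectiveness ratio: delta C / delta E."""
    return delta_cost / delta_effect


def net_benefit(wtp, delta_effect, delta_cost):
    """Net benefit: wtp * delta E - delta C."""
    return wtp * delta_effect - delta_cost


# Hypothetical southeast point: more effective and cheaper (dominant).
se_effect, se_cost = 0.05, -1000.0
# Hypothetical northwest point: less effective and more costly (dominated).
nw_effect, nw_cost = -0.05, 1000.0

icer_se = icer(se_cost, se_effect)
icer_nw = icer(nw_cost, nw_effect)

# The two ratios are identical although the points have opposite meanings.
print(icer_se, icer_nw)

# Net benefit at a hypothetical threshold separates them by sign.
nb_se = net_benefit(30_000, se_effect, se_cost)
nb_nw = net_benefit(30_000, nw_effect, nw_cost)
print(nb_se, nb_nw)
```

Because the ratio discards the signs of its numerator and denominator, replicates from opposite quadrants cannot be ordered on the ICER scale, whereas net benefit places them on a single monetary scale.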
CEAC and Net Benefit Statistic
Under these circumstances, the CEAC and the net benefit statistic are the appropriate approaches to representing uncertainty surrounding the joint distribution of cost and effects (Reference Fenwick, Claxton and Sculpher10).
Re-examining Figure 1, the northeast quadrant, with positive costs and positive effects, and the southwest quadrant, with negative costs and negative effects, involve trade-offs. These two quadrants represent situations where vaginal progesterone gel may be cost-effective compared with placebo, depending upon whether the ICER is above or below a threshold λ. The CEAC summarizes this uncertainty; it is constructed by plotting the proportion of the cost and effect pairs that are cost-effective for a range of monetary values of λ. Points in the northwest quadrant are never considered cost-effective, whereas points in the southeast quadrant are always considered cost-effective. As the slope of the threshold line through the origin (λ) is increased, points in the northeast and southwest quadrants may or may not be considered cost-effective depending upon the threshold. We can deduce from our data that there is a 20 percent chance that vaginal progesterone gel is cost-effective in preventing preterm birth if society is willing to pay £30,000 per preterm birth prevented.
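The construction of a CEAC from bootstrapped cost and effect pairs can be sketched as follows. The four pairs (one per quadrant) and the thresholds are hypothetical and chosen only to show how the proportion changes with λ; they are not the trial's bootstrap samples.

```python
def ceac(pairs, thresholds):
    """Proportion of (delta_effect, delta_cost) pairs with positive net benefit.

    Returns a list of (threshold, probability cost-effective) points.
    """
    curve = []
    for wtp in thresholds:
        n_cost_effective = sum(
            1 for delta_effect, delta_cost in pairs
            if wtp * delta_effect - delta_cost > 0
        )
        curve.append((wtp, n_cost_effective / len(pairs)))
    return curve


# Hypothetical pairs, one per quadrant (NOT trial data).
pairs = [
    (-0.05, 2000.0),  # northwest: less effective, more costly
    (0.02, -500.0),   # southeast: more effective, less costly
    (0.04, 400.0),    # northeast: more effective, more costly
    (-0.03, -600.0),  # southwest: less effective, less costly
]

curve = ceac(pairs, thresholds=[0, 15_000, 30_000])
for wtp, probability in curve:
    print(f"lambda = {wtp}: P(cost-effective) = {probability:.2f}")
```

In this toy example the northwest pair never counts and the southeast pair always counts, while the northeast and southwest pairs switch status as λ changes, which is why a CEAC need not be monotonic.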
The net benefit statistic confirms these findings. Assuming that the willingness to pay threshold is £30,000 per preterm birth prevented, this generates a mean net benefit to the health services attributable to progesterone of −£3,637 (95 percent CI: −£3,853, −£3,420), meaning that there is a net loss to the health services in monetary terms. We found that at any given level of willingness to pay for preventing a preterm birth, there is a mean net loss as a result of using vaginal progesterone gel. This ranged from −£2,059 when λ = £0.00, to −£4,688 when λ = £50,000.
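Because the net benefit statistic is linear in λ, the two endpoint values reported above imply a single incremental effect, which can be checked against the value reported at £30,000. The sketch below is a consistency check under that linearity assumption, using the rounded figures quoted in the text; it is not a reanalysis of the trial data.

```python
# Rounded values quoted in the text (GBP).
nb_at_0 = -2059.0        # net benefit when lambda = 0, i.e. minus delta C
nb_at_50k = -4688.0      # net benefit when lambda = 50,000
reported_nb_at_30k = -3637.0

# NB = lambda * delta_E - delta_C, so the slope in lambda is delta_E.
delta_cost = -nb_at_0
implied_delta_effect = (nb_at_50k - nb_at_0) / 50_000

# Predicted net benefit at the 30,000 threshold from the implied values.
predicted_nb_at_30k = 30_000 * implied_delta_effect - delta_cost

# Crude difference in proportions prevented, from the trial counts
# (48/247 placebo vs 61/247 progesterone delivering before 34 weeks).
crude_delta_effect = 48 / 247 - 61 / 247

print(f"Implied delta E: {implied_delta_effect:.4f}")
print(f"Crude delta E:   {crude_delta_effect:.4f}")
print(f"Predicted NB at 30,000: {predicted_nb_at_30k:.0f}")
```

The implied incremental effect is negative (about five more preterm births per 100 women treated), so raising λ can only make the net loss larger, in line with the trend described in the text.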
Expected Value of Perfect Information
The EVPI analysis did not change our findings: placebo currently has a higher expected value, and consistently using placebo would create a higher expected benefit. The expected value of perfect information was estimated at almost £100 per woman at a willingness-to-pay threshold of £30,000 per preterm birth prevented (Figure 2). This translates into a nationwide EVPI of £1,033,400 based on the estimated number of twin pregnancies per annum in England and Wales (16).
DISCUSSION
Our study suggests that the probability of vaginal progesterone gel being cost-effective is low in women with twin pregnancies. To our knowledge, no previous study has reported health economic results in this setting, although future trials are intended to incorporate economic evaluation methodology (Reference Lim, Bloemenkamp and Boer12). Our cost-effectiveness estimates remained relatively robust following the sensitivity analyses that accounted for the uncertainty surrounding the values of parameters incorporated into the economic evaluation. Our EVPI analysis addressed the value of future research in this area and found that there is little value in conducting further economic research. Although we do not know decision-makers’ willingness to pay for the prevention of preterm birth, the EVPI would need to be considerably higher than the cost of future research in this area to justify that research on economic grounds.
Our study has the following strengths. We used advanced methods of economic evaluation, including methods for handling uncertainty and valuing future research in this area. In addition, the economic evaluation was based on a large RCT in which selection bias was minimized through the random allocation of women to the study groups. A total of nine centers participated in the trial, making the findings fairly generalizable to other obstetric practices in the United Kingdom.
There are three possible limitations of our study that should be borne in mind by readers. First, although our primary outcome measure has clinical relevance when evaluating the benefits of vaginal progesterone gel, decision makers such as the National Institute for Health and Clinical Excellence (NICE) in England and Wales usually recommend preference-based measures, such as quality-adjusted life-years (QALYs), for economic evaluation purposes (14). We considered that the QALY metric would not be relevant for our study given the methodological problems associated with utility measurement in the perinatal context. However, stated preference methods, such as discrete choice experiments and contingent valuation, could, in principle, be used to inform decision-makers’ willingness to pay values for the prevention of preterm birth. We would expect willingness to pay values to be high given the significant contribution of preterm birth to neonatal mortality and morbidity. Future research would benefit from concentrating on eliciting societal valuations for the benefits of preventing preterm birth.
A second limitation of our study relates to the perspective and time horizon. Although we included hospital costs, we did not include costs outside of this setting, such as general practitioner and community visits, and we did not cost resources incurred after discharge from hospital. Although healthcare costs generally associated with preterm birth may be significant over the longer term, collecting these data would have been beyond the scope of the STOPPIT trial, and we would not expect any significantly different results from our analysis here, given the nonsignificant clinical findings from the STOPPIT trial.
Finally, our approach of using standard unit costs for key resource inputs may be considered to be a further limitation, given that these costs may be an inadequate reflection of costs specific to the treatment centers in which women and infants were recruited. An alternative, albeit less common, approach would have been to combine center-specific unit cost data with all resource volume data for each patient, thereby calculating a treatment cost per patient, before averaging across patients (Reference Raikou, Briggs, Gray and McGuire20). Using this approach could lead to a more accurate analysis because individual centers may respond to relative changes in the unit costs of inputs so as to operate at a least-cost input combination or at technical efficiency. As a consequence, the calculations based on the use of standard unit costs, as opposed to center-specific unit costs, may have systematically overestimated costs. We addressed this issue, in part, by conducting a parallel analysis, as part of our sensitivity analysis, which used resource unit costs specific to two hospitals in the trial, but there was no change in the conclusion of the study that progesterone is more costly overall.
POLICY IMPLICATIONS
This study collected highly detailed resource use data among women randomized to receive either progesterone or placebo for prophylactic treatment of preterm birth in twin pregnancies. Findings from the STOPPIT trial reveal that progesterone is neither clinically effective nor cost-effective, with placebo consistently maintaining a higher net benefit than progesterone for the prevention of preterm birth in this sample of women. Other preventive treatments are urgently required to prevent preterm birth in twin pregnancies. Current evidence suggests that promising candidates in singleton pregnancies [progesterone (Reference Dodd, Flenady, Cincotta and Crowther8) and cervical cerclage (Reference Berghella, Odibo and To2)] are ineffective in twin pregnancies. Trials of future therapies should be powered for relevant outcomes in women with multiple pregnancies; economic evaluation is likely to be required for clinically effective therapies before introduction into routine practice.
CONTACT INFORMATION
Oya Eddama, PhD (oya.eddama@npeu.ox.ac.uk), Health Economist, National Perinatal Epidemiology Unit, University of Oxford, Old Road Campus, Headington, Oxford OX3 7LF, United Kingdom
Stavros Petrou, PhD (stavros.petrou@npeu.ox.ac.uk), Health Economist, National Perinatal Epidemiology Unit, University of Oxford, Old Road Campus, Headington, Oxford OX3 7LF, United Kingdom
Dean Regier, PhD (dean.regier@npeu.ox.ac.uk), Health Economist, National Perinatal Epidemiology Unit, University of Oxford, Old Road Campus, Headington, Oxford OX3 8PF, United Kingdom
John Norrie, BSc, MSc (j.norrie@stats.gla.ac.uk), Professor, Robertson Centre for Biostatistics, Glasgow University, University Avenue, Glasgow G12 8QQ, Scotland
Graeme MacLennan, MSc (g.maclennan@abdn.ac.uk), Health Services Research Unit, Health Services Research Building, University of Aberdeen, Foresterhill, Aberdeen AB25 2ZD, Scotland
Fiona Mackenzie, MB ChB (fiona.mackenzie2@ggc.scot.nhs.uk), Consultant Obstetrician, Princess Royal Maternity, Level 4, 16 Alexandra Parade, Glasgow G31 2ER, Scotland
Jane E. Norman, Prof (jane.norman@ed.ac.uk), Professor of Maternal and Fetal Health, CRB, University of Edinburgh, Queens Medical Research Institute, Edinburgh EH16 4TJ, Scotland