Published online by Cambridge University Press: 26 April 2005
Objectives: This study aimed to validate the accuracy of data retrieved in a prospective multicenter trial, the purpose of which was an economic evaluation of two techniques of surgery for colon cancer.
Methods: Within the Swedish contribution of the COLOR trial (Colon Cancer Open or Laparoscopic Resection), an economic evaluation of open versus laparoscopic surgical techniques was conducted. Data were collected by case record forms (CRF), patient diaries, and telephone surveys every 2 weeks. The study period was 12 weeks, and the perspective was societal. Data from the first consecutive forty patients to complete the health economic study protocol were validated. Retrieved data were compared with data from medical records and data from local social security offices for agreement.
Results: Statistically significant differences were found for duration of anesthesia, length of surgery, number of outpatient consultations by doctors and district nurses, complication rate, and the use of central venous lines. No significant differences were observed concerning length of hospital stay, disposable instruments cost, and time off work, all of which heavily influence total costs.
Conclusions: The present method of data collection regarding resources used in this setting seems to produce accurate data for economic evaluation; for complication rates, however, the method did not retrieve accurate data.
In the evaluation of new medical interventions or routines of treatment, prospective randomized trials are regarded as the gold standard for evidence-based decisions on treatment of choice. For decades, this strategy has been an established practice in the introduction of new pharmaceutical treatments but less established in the introduction of new surgical techniques. With the rapid introduction of laparoscopic technique into clinical practice, prospective randomized studies have been requested with greater frequency.
Since the introduction of laparoscopic surgery at the end of the 1980s, several prospective randomized studies have been performed comparing outcomes of conventional and laparoscopic approaches in gastrointestinal, gynecological, and thoracic surgery. Studies have demonstrated that laparoscopic surgery in general is associated with longer operations and the use of more nonreusable instruments, but there have been expectations that total costs would be reduced due to shorter hospital stay, faster recovery, and shorter sick leave for laparoscopically operated patients relative to those operated with the open technique (3;30;39). Among health-care providers, there has been fear that the new technique would increase costs. Thus, cost-effectiveness or cost-utility analyses were sought as a complement to medical evaluations. Indeed, several studies have evaluated economic aspects of different applications of laparoscopic surgery. Laparoscopic colorectal surgery has previously been subject to economic evaluation (5;12;17;18;27;32;33;37). These results, however, are conflicting or difficult to interpret, partly because of appreciable differences in methodology and perspectives in analyses (21;28).
Designing and conducting economic evaluations alongside clinical trials offer difficulties that have been elucidated in several studies (6;7;11;14;19;25;26;34). One specific problem, which will be addressed in this study, is the issue of accuracy and validity of collected data, an issue often neglected in guidelines for health economic evaluations.
In more recent years, accuracy of self-reported health-care utilization has been evaluated in selected populations (2;35;36) and patient groups (4;24;29;31). Accuracy of self-reported sick-leave as a function of recollection time period was investigated among employees in a Dutch pharmaceutical company (38). Validations of self-reported data come from a variety of cultural backgrounds, age groups, severity of disease, recall periods, and modes of data collection and the results cannot be generalized. The importance of validation of self-reported data internally in prospective studies has been put forward by some authors (10;26).
Some data, obviously, cannot be reported by patients, and these elements usually concern the in-hospital period. Other data, such as length of stay and duration of surgery, can usually be extracted from the clinical case record forms (CRFs). However, items such as the use of costly equipment, radiological investigations, and time spent in the intensive care unit cannot be found in clinical CRFs; therefore, special CRFs for economic evaluations may have to be constructed (26). In pharmacoeconomic trials, commonly conducted alongside phase II and III trials, the validity of these data is to be checked by the study monitor by source document verification (26). In economic evaluations of surgical procedures, though, there seems to be very little documentation of the validity of data reported on CRFs; in fact, the authors found only one reference on this issue (22). Given this background, we found it valuable to validate the accuracy of data collected in this trial.
The aim of this study was to perform a validation of the methods of cost data collection used in the Swedish subset of patients included in the COLOR trial (Colon Cancer Open or Laparoscopic Resection; 1;16).
The objective of the COLOR trial is to evaluate conventional open surgery and laparoscopic surgery for potentially curable colon cancer. The primary end point is cancer-free 3-year survival. Morbidity, length of stay, and blood loss are some secondary end points within the main study protocol. The number of inclusions was based on the primary end point and a 7 percent difference in survival, with α = 0.05 and β = 0.20, power 80 percent (16). Because of differences in social security systems, traditions, and so on, between countries, optional end points to study from nation to nation were health economy and health-related quality of life. The COLOR trial stopped inclusion at 1,258 patients from centers in seven countries by February 28, 2003; approximately 40 percent of these patients come from Sweden.
The economic evaluation was “piggy-backed” onto the Swedish subset of patients included in the COLOR trial. The perspective of the study was societal, that is, inclusion of direct and indirect costs. The selected analytical technique was cost-minimization, as the clinical outcome in the two groups was assumed to be equal at 12 weeks postoperatively. Direct and indirect costs were collected during a 12-week period, starting on the day of admission for surgery. The results of the economic evaluation have been published elsewhere (18). The study included 210 patients, and of the studies mentioned previously, this is the only randomized study and the only study using a societal perspective. In short, the result was that laparoscopic surgery, relative to open surgery, was associated with a significantly higher mean cost to the health-care system (2243€; 95 percent confidence interval [CI], 426–4060€; p = 0.018), but there was no significant difference in total cost to society (18).
CRFs were designed for the study after identification of all possible events associated with a significant cost during hospital stay and the 12-week period of follow-up, taking into account the possibility of complications, readmissions, and reoperations after primary surgery. CRFs were completed by the staff at the participating departments of surgery and sent to the study coordinator immediately after completion of the study period. At discharge from hospital, the patients were asked to record in a special diary all contacts with the health-care services, days off work, and so on. These data were collected by telephone surveys, conducted by a research nurse every 2 weeks throughout the study. The same person performed all telephone surveys. All data from the CRFs and telephone surveys were entered into a single database (Microsoft Access™). The study was approved by the local ethics committee of the Karolinska Institute (KI 97-169) and by all other local ethics committees concerned.
The health economic study was initiated nationwide in August 1999, with the intention to include 300 patients. This validation of the study's data accuracy included the first forty consecutive patients who completed the protocol. Variables selected for validation were to fulfill these criteria: variables associated with a significant cost, variables possible to validate, and variables assumed to occur frequently (Table 1).
For the validation, complete medical records were requested for each patient from the participating hospitals for the 12-week study period. To facilitate retrieval of the patients' medical records for the study period, each patient was queried about what general practitioner and district nurse they had been consulting. Information about retirement and sick leave was retrieved from local social security offices (Table 1).
Information from the database (i.e., not information directly from the completed CRFs), which included possible errors at data transfer, was entered into a new data sheet by a research nurse. The corresponding data were retrieved from the medical and social security records and entered into a separate data sheet by one research fellow (M.J.) blinded to the information in the database.
Validation of the use of disposable instruments merits special consideration. In the CRFs, these instruments were to be specified by, for example, manufacturer, model, and number of extra disposable loading units for staplers. In Sweden, no registration of used instruments is routinely undertaken in medical records or elsewhere. For validation, the medical records were reviewed thoroughly. Only occasionally were details about the manufacturer and model of the instruments noted in the surgical reports. In all other instances, conclusions about the number and type of instruments, as well as the number of extra disposable loading units for staplers, were drawn from the surgical reports. This information was entered into the validation data sheet and then compared with the corresponding data from the CRFs. Where the surgical report lacked specifications of the instruments used, the validation data sheet was complemented with information from the CRF data sheet. For instance, a typical surgical report stated that a 45-mm endoscopic stapler was used and reloaded once. For the same patient, the CRF stated that one Ethicon stapler TSW45 was used, but no extra loading units were noted. In this case, the validation data sheet was filled out with one Ethicon TSW45 plus one reloading unit.
To make further comparisons possible, information from the original database and the validation data sheet was subsequently transferred to monetary units using the Swedish price lists of January 2001 from the companies Autosuture™ Company, Tyco Healthcare Nordic, Stockholm Sweden, and Ethicon Endo Surgery, Stockholm, Sweden.
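The reconciliation and pricing steps described above can be illustrated with a minimal sketch. It assumes one stapler entry per patient; the class `StaplerEntry`, the functions `complement` and `cost`, and the prices in `HYPOTHETICAL_PRICES` are illustrative names and placeholder values, not taken from the study or from any actual price list.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class StaplerEntry:
    """One stapler record; `model` is None when the source does not name it."""
    model: Optional[str]
    units: int
    reloads: int


def complement(report: StaplerEntry, crf: StaplerEntry) -> StaplerEntry:
    """Build the validation entry: counts come from the surgical report,
    and the model name is borrowed from the CRF only if the report omits it."""
    return StaplerEntry(
        model=report.model or crf.model,
        units=report.units,
        reloads=report.reloads,
    )


def cost(entry: StaplerEntry, prices: Dict[str, Tuple[float, float]]) -> float:
    """Convert an entry to monetary units using (unit price, reload price)."""
    unit_price, reload_price = prices[entry.model]
    return entry.units * unit_price + entry.reloads * reload_price


# Placeholder prices for illustration only (assumed, not from the 2001 lists).
HYPOTHETICAL_PRICES = {"TSW45": (3000.0, 1200.0)}

if __name__ == "__main__":
    # The example from the text: the report mentions a 45-mm stapler reloaded
    # once; the CRF names one Ethicon TSW45 and records no extra loading unit.
    report = StaplerEntry(model=None, units=1, reloads=1)
    crf = StaplerEntry(model="TSW45", units=1, reloads=0)
    validated = complement(report, crf)
    print(validated)
    print(cost(crf, HYPOTHETICAL_PRICES), cost(validated, HYPOTHETICAL_PRICES))
```

Under these assumed prices, the CRF entry and the validation entry differ by exactly the price of one reloading unit, which is the kind of discrepancy the monetary comparison is meant to expose.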
Time off work was validated for all patients, including those who were retired. This was done to make the tests sensitive to patients who gave incorrect information on this topic.
Data from the two sets of data sheets were entered into a statistical software program for further analysis (JMP™, SAS Institute, Inc., Cary, NC). Because no validated variable showed a normal distribution, nonparametric statistical tests were applied to the continuous variables: the Wilcoxon signed-rank test as a test of systematic error and Spearman's correlation coefficient. Results are shown as mean (95 percent CI) and median value. Statistically significant differences between collected data and data found at validation were assumed when p < 0.05. For nominal variables, agreement tests (McNemar's test of symmetry and Kappa statistics) were used.
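As an illustration of the tests named above, the following is a minimal pure-Python sketch, not the JMP procedures used in the study. It assumes paired observations per patient (collected value versus validated value), uses an exact enumeration for the Wilcoxon signed-rank test that is only feasible for small samples, and uses an exact binomial form of McNemar's test for a 2×2 table. All function names and all data in the demonstration are hypothetical.

```python
from itertools import product
from math import comb, sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho, computed as the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / sqrt(var_x * var_y)


def wilcoxon_signed_rank(x, y):
    """Exact two-sided signed-rank test; zero differences are dropped.
    Returns (sum of positive ranks, p-value)."""
    diffs = [a - b for a, b in zip(x, y) if a != b]
    ranks = average_ranks([abs(d) for d in diffs])
    w_plus = sum(r for d, r in zip(diffs, ranks) if d > 0)
    centre = sum(ranks) / 2
    observed = abs(w_plus - centre)
    extreme = 0
    # Enumerate every assignment of signs to the ranks (2**n cases).
    for signs in product((0, 1), repeat=len(ranks)):
        w = sum(r for s, r in zip(signs, ranks) if s)
        if abs(w - centre) >= observed - 1e-12:
            extreme += 1
    return w_plus, extreme / 2 ** len(ranks)


def cohen_kappa(a, b, c, d):
    """Kappa for a 2x2 table: a = yes/yes, b = CRF yes and record no,
    c = CRF no and record yes, d = no/no."""
    n = a + b + c + d
    p_obs = (a + d) / n
    p_exp = ((a + b) * (a + c) + (c + d) * (b + d)) / n ** 2
    return (p_obs - p_exp) / (1 - p_exp)


def mcnemar_exact(b, c):
    """Exact two-sided McNemar test based on the discordant pairs b and c."""
    n = b + c
    if n == 0:
        return 1.0
    tail = sum(comb(n, k) for k in range(min(b, c) + 1)) / 2 ** n
    return min(1.0, 2 * tail)


if __name__ == "__main__":
    # Hypothetical durations of surgery in minutes (not study data).
    crf = [120, 150, 95, 180, 135, 160, 110, 145]
    record = [125, 150, 100, 185, 140, 158, 118, 150]
    print(wilcoxon_signed_rank(crf, record), spearman_rho(crf, record))
    # Hypothetical 2x2 counts for a nominal variable (not study data).
    print(cohen_kappa(10, 1, 7, 22), mcnemar_exact(1, 7))
```

The sketch also shows why the two tests for continuous variables complement each other: a high Spearman coefficient indicates that the two sources order patients similarly, while the signed-rank test can still detect a systematic shift between them.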
Patients from eight Swedish centers were included in this material, with each center contributing from one to ten patients. There were twenty men and twenty women, with a mean age of 73 (range, 49–87 years) and 69 (range, 46–89 years) years, respectively. Twelve patients were still employed and twenty-eight were retired. Of the forty patients, twenty-one had undergone laparoscopic and nineteen conventional open surgery. When a variable could not be found in the database or at validation, the patient was excluded from the analysis of that particular variable.
When analyzing continuous variables from the CRFs, the only significant differences noted were duration of surgery, duration of anesthesia, and number of outpatient visits (Table 2). For these variables, the value retrieved from medical records was significantly higher but the differences were numerically small. Analysis of the nominal variables indicated that accordance was good, if we exclude from the analysis the use of central venous lines and the occurrence of complications (Table 3). Complications detected in medical records but not found in the CRFs included one case of deep vein thrombosis, two cases of postoperative fever given antibiotic treatment, one case of postoperative confusion, one case of anastomotic insufficiency, one case of urinary retention, two cases of prolonged postoperative paralytic ileus, and one case of postoperative bleeding.
Analysis of continuous variables collected during the telephone survey revealed no significant differences, except for number of consultations with a district nurse (Table 4). Two events—number of outpatient nurse consultations and number of home consultations by the district nurse—occurred so infrequently, four times each at validation, that these variables were not subjected to further analysis. The number of outpatient doctor consultations gave somewhat different results when retrieved through telephone surveys versus when collected from CRFs.
Data on costs of disposable instruments are presented as a total and separately for the open and laparoscopic subgroups, because instrument use was more extensive in the laparoscopic group, theoretically leaving more room for mistakes when filling out the CRF. For all other items, separate analyses of the laparoscopic and open subgroups showed no significant differences from the results presented for the whole group.
Before new surgical procedures are introduced into generalized use, health economic evaluations should be performed to serve as part of the information on which to base decisions. Although guidelines are available about how to perform such trials (8;9;15;20;23), there are no standardized methods to collect data for economic analyses, whether from the in-hospital period or from the postoperative follow-up. One reason is that circumstances differ between study settings. As long as data quality in randomized multicenter trials is not validated, one could argue that the foundations of evidence-based medicine are somewhat uncertain. When data for economic evaluation are collected in different settings and with different methodologies for a specific occasion, it is important to validate these methods to ensure high-quality data. The present study validates one method of data collection for an economic evaluation within a large prospective randomized multicenter study. The results presented here reflect the validity of data only in this specific trial and setting; nevertheless, the principle of validating data quality in health economic trials should be more widely adopted, so that standards for good data quality can be set.
In all validation studies, the question arises as to the gold standard, that is, what is recognized as the most valid method? In this study, we regarded medical records as the gold standard for economic evaluation of laparoscopic treatment of colon cancer, as did Katz et al. (22). In the Swedish health-care system, there are no other records readily available that yield reliable information of the resources used. Yet, how accurate are medical records concerning these topics? To our knowledge, no investigation has looked into this issue in clinical surgery; even so, one can expect appreciable variations between hospitals in different nations as well as between hospitals in the same country. Local culture, tradition, and legal requirements are important factors in determining what information is included and what information is excluded from medical records.
As mentioned above, validation of the use of disposable instruments, which was associated with a substantial cost, in this setting was flawed by uncertainty and possibly some bias, even though figures showed good correspondence. At validation of data concerning time off work, information from local social insurance offices must be held as the gold standard in the Swedish social security system, where it is linked to the payout of sick leave benefits and includes all Swedish citizens.
Although significant discrepancies on some points were noted, these generally concerned variables of relatively low economic impact. However, the large discrepancy on postoperative complications merits special attention. The answer “yes” in the CRFs led to a request to fill out an extended questionnaire concerning specific resources used because of the complication. No definition of complication was given in the CRFs, which probably accounted for some of the discrepancy between CRFs and medical records. Furthermore, reluctance or negligence among medical professionals to report complications could have contributed to the observed discrepancy. It should be noted that the medicolegal system in Sweden does not allow for litigation against individual medical personnel, which could be taken as a further indication that medical records are relatively trustworthy.
We conclude that our method of data collection for the purpose of economic evaluation in the present setting (a potentially complex series of events in a surgical multicenter trial in Sweden during a 12-week period) produced data with approximate accuracy, with the exception of the reporting of complications. Because complications often carry substantial costs, special attention should be paid to the methods used to collect data on complications when deciding on the protocol in health economic studies.
Our experience demonstrates the importance of validation of methods of data collection in health economic research. We encourage other investigators in the field to validate their methods and, thereby, contribute to setting standards that will facilitate further research.
Martin Janson, MD (martin.janson@karolinska.se), Department of Surgery, Karolinska Institutet, K 53 Karolinska University Hospital at Huddinge, S-141 86 Stockholm, Sweden
Per Carlsson, PhD (Per.Carlsson@ihs.liu.se), Center for Medical Technology Assessment, Linköping University, 581 83 Linköping, Sweden; Director, The National Center for Priority Setting in Health Care, Östergötland County Council, 581 91, Linköping, Sweden
Eva Haglind, MD, PhD (eva.haglind@vgregion.se), Professor, Department of Surgical Sciences, The Sahlgrenska Academy, Göteborg University; Senior Medical Advisor, Sahlgrenska University Hospital, Bruna Stråket 11, SE-413 45 Göteborg, Sweden
Bo Anderberg, MD, PhD (bo.anderberg@cfss.ki.se), Professor, Department of Surgery, Karolinska Institutet and Karolinska University Hospital at Huddinge, 141 86 Stockholm, Sweden
This work was supported by grants from the Swedish Cancer Foundation (grant no. 4287-B01-03XCC), the County Council of Stockholm, Assar Gabrielsson's Foundation for Clinical Research, Jubileumskliniken Research Foundation, Sahlgrenska University Hospital, and the Swedish Society of Medicine. The authors are grateful to Mrs. Gunilla Walldin for expert assistance.
Table 1. Variables Chosen for Validation
Table 2. Validation of the Continuous Variables from the CRFs Completed by the Operating Clinics
Table 3. Validation of the Nominal Variables from the CRFs Completed by the Operating Clinics
Table 4. Validation of the Continuous Variables from the Telephone Surveys