Cost-effectiveness analysis, or economic evaluation, is a type of study that compares the costs and outcomes of at least two healthcare alternatives; hence, it can provide data to inform decision making under budget constraints (Reference Drummond, Sculpher, Torrance, O'Brien and Stoddart1). Against a backdrop of scarce resources, the public healthcare system should allocate resources to those technologies with demonstrated effectiveness and cost-effectiveness.
Conducting economic evaluations alongside randomized clinical trials (RCTs) is common because it enables resource-use data to be collected at the same time as the effectiveness data. Because economic evaluation depends on the quality of the medical evidence, clinical trials are considered a natural vehicle for economic analysis (Reference Drummond, Sculpher, Torrance, O'Brien and Stoddart1), although this approach also has some limitations: there can be competing interests between clinical and economic study objectives, which can affect subject and site selection, duration of follow-up, or the choice of study comparators (Reference O'Sullivan, Thompson and Drummond2).
Systematic reviews and meta-analyses of RCTs and other studies are needed to evaluate the effectiveness and cost-effectiveness of health technologies (Reference Drummond, Sculpher, Torrance, O'Brien and Stoddart1;3;Reference Armijo-Olivo, Cummings and Fuentes4). Assessing the methodological quality of the studies included in such reviews is essential for the correct synthesis and interpretation of results, because only good-quality evidence should be considered for decision making. Methodological quality can be assessed with different tools covering a wide range of items, but the main aim should always be to assess the internal validity (i.e., risk of bias) and external validity of a given study (3;Reference Armijo-Olivo, Cummings and Fuentes4).
Although the reporting of methodology seems to improve over time (Reference Falagas, Grigori and Ioannidou5;Reference Dechartres, Charles, Hopewell, Ravaud and Altman6), the methodological quality of a study can be underestimated if the report does not mention key methodological features that appear in the protocol (Reference Dechartres, Charles, Hopewell, Ravaud and Altman6;Reference Mhaskar, Djulbegovic, Magazin, Soares and Kumar7). Similarly, the report of an economic evaluation can fail to describe some important methodological features of the RCT that provided the effectiveness data.
The objective of this article is to discuss issues related to the methodological quality assessment of economic evaluations performed alongside clinical trials, to highlight the need to assess studies rather than papers. To achieve this, we present the methodological quality assessment of the economic evaluations alongside clinical trials of physical therapy for knee osteoarthritis that were included in a previous systematic review (Reference García Pérez, Arvelo Martín and Guerra Marrero8).
METHODS
A systematic review of cost-effectiveness studies of physical therapy interventions for patients with knee osteoarthritis, commissioned by the Spanish Ministry of Health, was used as the basic material for the discussion (Reference García Pérez, Arvelo Martín and Guerra Marrero8). The methods were those established for systematic reviews (3). Briefly, they were as follows. The following electronic databases were searched: MEDLINE and MEDLINE In-Process, EMBASE, CINAHL, Cochrane Central Register of Controlled Trials, Cochrane Database of Systematic Reviews, CRD (DARE, HTA, NHS-EED), and the Physiotherapy Evidence Database (PEDro). An example of the search strategy can be seen in the Supplementary Materials, which include Figure 1, Tables 1–3, and the Methods of the Systematic Review. Studies were selected independently by two economists; compliance with the inclusion criteria was verified by physiotherapists. Studies that fulfilled the following criteria were selected:
Table 1. Characteristics of Studies Included in the Systematic Review

Note. ADAPT, Arthritis, Diet, and Activity Promotion Trial; CCA, cost-consequence analysis; CEA, cost-effectiveness analysis; CMA, cost-minimization analysis; CUA, cost-utility analysis; DMC, direct medical costs; DNMC, direct non-medical costs; FAST, Fitness Arthritis and Seniors Trial; N, number of recruited or randomized patients; NHS, National Health Service; QALY, quality-adjusted life-years; TKR, total knee replacement; WOMAC, Western Ontario and McMaster Universities Arthritis Index.
Table 2. Methodological Quality of Studies Included in the Systematic Review as Clinical Trials According to PEDro Scale

Note. 0, No; 1, Yes. One point is awarded only when the criterion is clearly satisfied. Criterion 1, related to external validity, does not contribute to the total score. The scores were obtained from the PEDro database, except for the paper by Lord et al. (1999), which is not included in the database and was therefore assessed by the authors of this review.
HTA, health technology assessment; PEDro, Physiotherapy Evidence Database.
Table 3. Methodological Quality of Studies Included in the Systematic Review as Economic Evaluations According to Drummond et al. Criteria

Note. NA, not applicable; 0, not achieved or not reported; 1, partially achieved; 2, fully achieved.
HTA, health technology assessment.
Types of intervention: all types of interventions based on physical therapy to relieve pain and/or maintain or regain mobility of the osteoarthritic knee. Studies assessing educational interventions were also included when the physiotherapy component was an important part of the intervention. The comparator could be a control group, a different physical therapy, or any other surgical or pharmacologic intervention.
Types of study: economic evaluations developed alongside clinical trials or based on economic models. All types of economic evaluation were included: cost-utility analysis (CUA), where the outcomes are quality-adjusted life-years (QALYs); cost-effectiveness analysis (CEA), where the outcomes are expressed in natural units; cost-minimization analysis (CMA), where there are no differences in effectiveness between arms so that identifying the less costly option suffices; and cost-consequence analysis (CCA), where several outcomes are used.
Types of participant: men and women with knee osteoarthritis. Studies in which patients had undergone knee arthroplasty before study entry were excluded. Studies that included patients with osteoarthritis of the knee or hip but did not report the results separately, and studies in which patients with osteoarthritis made up less than 80 percent of the sample, were also excluded.
Types of outcome: costs, effectiveness results, and incremental cost-effectiveness ratios (ICERs) reported separately. Effectiveness measures were included when they were considered valid, such as QALYs or the Western Ontario and McMaster Universities Arthritis Index (WOMAC), or when they are accepted in the field of assessment of physical therapy interventions.
The methodological quality of the included studies was assessed independently by two reviewers according to accepted criteria for economic evaluations (Reference Drummond, Sculpher, Torrance, O'Brien and Stoddart1) and for clinical trials (9), based on publicly available data.
Because all the economic evaluations included in the review were performed alongside RCTs, the PEDro scale for the assessment of clinical trials was used (9). The total PEDro score is reliable and acceptable for use in systematic reviews of physical therapy RCTs (Reference Maher, Sherrington, Herbert, Moseley and Elkins10). The PEDro scale comprises eleven items, although only ten are scored: random allocation, concealed allocation, similarity at baseline, subject blinding, therapist blinding, assessor blinding, >85 percent follow-up for at least one key outcome, intention-to-treat analysis, between-group statistical comparison for at least one key outcome, and point and variability measures for at least one key outcome. Items are scored as either present (1 point) or absent (0 points), and a score out of 10 is obtained by summation. Maher et al. reported an inter-rater reliability generalized kappa statistic of between 0.40 and 0.75 (Reference Maher, Sherrington, Herbert, Moseley and Elkins10). For this review, scores from the PEDro Web site were used when available; otherwise, the reviewers assessed the quality of the clinical trial by following the instructions for the PEDro scale (9).
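As a purely illustrative formulation of how the summary score is obtained (our notation, not part of the official PEDro documentation), let s_i denote the score (0 or 1) on item i of the scale; the total is then

\[ \text{PEDro total score} = \sum_{i=2}^{11} s_i , \]

where item 1 (specification of eligibility criteria) is excluded because it relates to external validity, so a trial satisfying, for example, eight of the ten scored criteria receives 8/10.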
The 10-question checklist by Drummond et al. is a widely used tool to assess the quality of economic evaluations (Reference Drummond, Sculpher, Torrance, O'Brien and Stoddart1;3). This tool includes items related to the description of the compared alternatives; the effectiveness of the technologies; the identification, measurement, and valuation of the costs and consequences of each alternative; the discounting of costs and consequences in the long term; and the incremental and sensitivity analyses.
We compared the PEDro scores obtained from different documents when more than one document reported the same clinical trial. In addition, we compared the results of the Drummond et al. criteria assessments for those economic evaluations reported in several documents.
RESULTS
The systematic review, conducted in February 2015, included six economic evaluation studies with published results reported in seven documents (Reference Lord, Victor, Littlejohns, Ross and Axford11–Reference Sevick, Miller, Loeser, Williamson and Messier17). All economic evaluations were performed alongside RCTs (Table 1). We searched for the main papers reporting the design and results of the RCTs, and three papers were retrieved to complete the data (Reference Ettinger, Burns and Messier18–Reference McCarthy, Mills and Pullen20). The results of one economic evaluation were reported in two documents, a health technology assessment (HTA) report (Reference McCarthy, Mills and Pullen14) and a paper (Reference Richardson, Hawkins and McCarthy15); the clinical results were reported in an earlier paper (Reference McCarthy, Mills and Pullen20). The FAST study was reported in two documents, the economic evaluation (Reference Sevick, Bradham and Muender12) and the RCT (Reference Ettinger, Burns and Messier18); the ADAPT study was likewise reported in two documents, the economic evaluation (Reference Sevick, Miller, Loeser, Williamson and Messier17) and the RCT publication (Reference Messier, Loeser and Miller19).
Four studies evaluated different exercise programs for patients with knee osteoarthritis, delivered in different settings and by different professionals (Reference Lord, Victor, Littlejohns, Ross and Axford11;Reference Sevick, Bradham and Muender12;Reference McCarthy, Mills and Pullen14;Reference Richardson, Hawkins and McCarthy15;Reference Sevick, Miller, Loeser, Williamson and Messier17). The other two studies compared exercise programs before arthroplasty with no exercise before arthroplasty (Reference Beaupre, Lier, Davies and Johnston13;Reference Mitchell, Walker and Walters16). Some of the interventions were cost-effective according to the authors (Reference Sevick, Bradham and Muender12;Reference Richardson, Hawkins and McCarthy15;Reference Sevick, Miller, Loeser, Williamson and Messier17), while three RCTs did not find differences in the primary outcome (Reference Lord, Victor, Littlejohns, Ross and Axford11;Reference Beaupre, Lier, Davies and Johnston13;Reference Mitchell, Walker and Walters16). The studies were too heterogeneous to be easily synthesized, and some suffered from limitations that prevented drawing robust conclusions (Reference Lord, Victor, Littlejohns, Ross and Axford11;Reference Beaupre, Lier, Davies and Johnston13;Reference Mitchell, Walker and Walters16). According to one good-quality study, albeit not free of uncertainties, the combination of supervised exercise classes in a clinical center and exercises at home is cost-effective compared with home exercise alone from an NHS perspective in the United Kingdom (Reference McCarthy, Mills and Pullen14;Reference Richardson, Hawkins and McCarthy15).
The study with the best methodological quality scores was the ADAPT study in patients with knee osteoarthritis and overweight or obesity (Reference Sevick, Miller, Loeser, Williamson and Messier17). This study showed that the combination of diet and physical exercise sessions supervised in a clinical center was cost-effective from the perspective of a third-party payer in the United States, compared with diet and exercise as separate interventions or with patient education (Reference Sevick, Miller, Loeser, Williamson and Messier17). Further information on the systematic review and the results of the included studies is reported in the Supplementary Materials or can be requested from the authors (Reference García Pérez, Arvelo Martín and Guerra Marrero8).
Given that all the studies included in the systematic review were economic evaluations alongside RCTs, their methodological quality was assessed both as clinical trials (Table 2) and as economic evaluations (Table 3). For the assessment of the methodological quality of the RCTs as such, we used the scores included in the PEDro database when available; it was only necessary to assess the quality of one study not found in the PEDro database (Reference Lord, Victor, Littlejohns, Ross and Axford11). Three studies obtained different scores depending on the document used to assess their quality. The PEDro score for the FAST study was 2 when based on the economic evaluation paper (Reference Sevick, Bradham and Muender12) but 6 when based on the RCT paper (Reference Ettinger, Burns and Messier18). We found a similar pattern for the ADAPT study, where the score was 3 or 8 depending on the document analyzed, the economic evaluation (Reference Sevick, Miller, Loeser, Williamson and Messier17) or the RCT (Reference Messier, Loeser and Miller19). In a third study, the score was even higher when the document used for the assessment was an HTA report (Reference McCarthy, Mills and Pullen14) rather than the economic evaluation (Reference Richardson, Hawkins and McCarthy15) or the RCT paper (Reference McCarthy, Mills and Pullen20) (see Table 2).
Although the PEDro scale has a maximum of 10 points, the included studies could score at most 8 points, given that it was not possible to blind either subjects or therapists. Only one study achieved a score of 8 points (Reference Messier, Loeser and Miller19). The outcome assessors were blinded in three of the studies (Reference Beaupre, Lier, Davies and Johnston13;Reference McCarthy, Mills and Pullen14;Reference Messier, Loeser and Miller19;Reference McCarthy, Mills and Pullen20). All studies reported selection criteria except one (Reference Lord, Victor, Littlejohns, Ross and Axford11). All of them randomly assigned subjects to groups, although two did not conceal the allocation or did not provide information on this issue (Reference Lord, Victor, Littlejohns, Ross and Axford11;Reference Sevick, Bradham and Muender12). All studies attained sufficient follow-up (>85 percent) except the two that evaluated physiotherapy before arthroplasty (Reference Beaupre, Lier, Davies and Johnston13;Reference Mitchell, Walker and Walters16). The remaining items were positively assessed in all studies.
The assessment of the quality of the studies as economic evaluations is set out in Table 3. Two studies did not demonstrate differences in the effectiveness of the procedures and, therefore, expressly carried out a cost-minimization analysis (Reference Lord, Victor, Littlejohns, Ross and Axford11;Reference Beaupre, Lier, Davies and Johnston13). Most studies did not perform an incremental analysis of costs and effects, either because they did not find differences in effectiveness or because they aimed to perform a cost-consequence analysis, that is, there was no main outcome measure. Only the studies by Sevick et al. calculated ICERs, but in both cases the comparison between some of the alternatives was overlooked (Reference Sevick, Bradham and Muender12;Reference Sevick, Miller, Loeser, Williamson and Messier17). The costs included were identified in most of the studies. Only the studies by Beaupre et al. and Mitchell et al. provided insufficient detail on the costs comprising the reported aggregates (Reference Beaupre, Lier, Davies and Johnston13;Reference Mitchell, Walker and Walters16).
Discounting was not necessary in any study because of the short time horizons. Nevertheless, Sevick et al. (2009) discounted the costs (5 percent) (Reference Sevick, Miller, Loeser, Williamson and Messier17). Pain and physical capacity were the most commonly used outcome measures. All studies except one (Reference Sevick, Bradham and Muender12) used the WOMAC questionnaire. Only one study used the EQ-5D, a generic HRQOL questionnaire that enables utilities and QALYs to be obtained for a cost-utility analysis (CUA) (Reference Richardson, Hawkins and McCarthy15). Three studies performed notable sensitivity analyses to examine the uncertainty of their results (Reference Lord, Victor, Littlejohns, Ross and Axford11;Reference Richardson, Hawkins and McCarthy15;Reference Sevick, Miller, Loeser, Williamson and Messier17). The study with cost-effectiveness results reported in two documents obtained similar assessments irrespective of the document used for the quality assessment (Reference McCarthy, Mills and Pullen14;Reference Richardson, Hawkins and McCarthy15). The assessment differed for only one item: the compared alternatives were detailed in the HTA report (Reference McCarthy, Mills and Pullen14), whereas they were not described in the paper reporting only the economic evaluation (Reference Richardson, Hawkins and McCarthy15).
Eight ongoing clinical trials met the inclusion criteria of our review (see Supplementary Materials). When we requested information by email, the investigators of the completed studies confirmed their intention to publish results in the medium term. The methodological quality of the ongoing studies was also assessed, although it has to be interpreted with caution because the methods described in protocols and clinical trial registries may have undergone changes not yet published at the time of data collection for this review. The eight studies seem to include appropriate patients and to perform randomization, although only four mention concealment of allocation or intention-to-treat analysis. The methods of the economic evaluation are better described in protocol papers than in clinical trial registries, where the main intention is the description of the effectiveness study. Nevertheless, seven of the eight studies mention the intention to perform an incremental analysis.
DISCUSSION
This study analyzes and discusses the difficulties of assessing the methodological quality of economic evaluations performed alongside clinical trials. A systematic review on the cost-effectiveness of physical therapy for knee osteoarthritis (Reference García Pérez, Arvelo Martín and Guerra Marrero8) was used as an example to illustrate how incomplete reporting can affect the quality assessment of an economic study.
Osteoarthritis is a chronic disease that is most common in the hip and knee joints. It is a major cause of pain, disability, loss of health-related quality of life (HRQOL), and social isolation in older people (Reference Mahon, Bourne and Rorabeck21–Reference Quintana, Escobar and Arostegui23). For the care and management of knee osteoarthritis, guidelines recommend, among nonsurgical and nonpharmacologic interventions, education, advice and access to information, and aerobic exercise and strengthening, in addition to weight loss for overweight or obese individuals (Reference Zhang, Moskowitz and Nuki24;25). While the Osteoarthritis Research Society International (OARSI) makes recommendations based on effectiveness and safety criteria (Reference Zhang, Moskowitz and Nuki24), the National Institute for Health and Care Excellence (NICE) in the United Kingdom also includes cost-effectiveness criteria (25).
Our systematic review on the cost-effectiveness of physical therapy interventions for patients with osteoarthritis of the knee included six economic evaluations alongside RCTs (Reference Lord, Victor, Littlejohns, Ross and Axford11–Reference Sevick, Miller, Loeser, Williamson and Messier17). The interventions included in these studies were heterogeneous. While some studies showed the cost-effectiveness of their interventions (Reference Sevick, Bradham and Muender12;Reference Richardson, Hawkins and McCarthy15;Reference Sevick, Miller, Loeser, Williamson and Messier17), others suffered from limitations (Reference Lord, Victor, Littlejohns, Ross and Axford11;Reference Beaupre, Lier, Davies and Johnston13;Reference Mitchell, Walker and Walters16) and did not find differences in effectiveness (Reference Lord, Victor, Littlejohns, Ross and Axford11;Reference Beaupre, Lier, Davies and Johnston13;Reference Mitchell, Walker and Walters16) or total costs (Reference Beaupre, Lier, Davies and Johnston13;Reference Mitchell, Walker and Walters16), owing among other reasons to a lack of statistical power. Eight ongoing studies planned to carry out a cost-effectiveness analysis of different programs, some of them quite innovative, for example, a Tai Chi mind-body exercise (NCT01258985) or spa therapy using mud packs (NCT01538043). It will be interesting to see whether these therapies prove to be effective and cost-effective.
Given the results of this systematic review, the main conclusion is that further research with acceptable designs is needed on the cost-effectiveness of physiotherapy for knee osteoarthritis compared with other physiotherapy techniques, medication, and surgery. The development of economic evaluations should adhere to established principles of high-quality research (Reference Drummond, Sculpher, Torrance, O'Brien and Stoddart1;Reference Hiligsmann, Cooper and Arden26;Reference Sabharwal, Carter, Darzi, Reilly and Gupte27). There are specific guidelines with recommendations on conducting economic evaluations alongside clinical trials (Reference Ramsey, Willke and Glick28) and economic evaluations in the field of rheumatology and osteoarthritis (Reference Hiligsmann, Cooper and Arden26;Reference Drummond, Maetzel, Gabriel and March29–Reference Hiligsmann, Cooper and Guillemin31). Some of these recommendations aim to make studies more transparent and comparable (Reference Hiligsmann, Cooper and Arden26;Reference Drummond, Maetzel, Gabriel and March29;Reference Gabriel, Drummond and Maetzel30), for example, by encouraging the enumeration of costs and the reporting of results in a three-step approach that separates the cost of the intervention, all healthcare costs, and societal costs (Reference Hiligsmann, Cooper and Arden26).
None of the studies included in this review reported having followed any of these guidelines or recommendations to conduct their cost-effectiveness studies. However, some cited methodological papers to support their investigations (Reference Lord, Victor, Littlejohns, Ross and Axford11;Reference McCarthy, Mills and Pullen14), while other authors cited key textbooks to justify some methodological decisions (Reference Beaupre, Lier, Davies and Johnston13;Reference Mitchell, Walker and Walters16;Reference Sevick, Miller, Loeser, Williamson and Messier17). For instance, Beaupre et al. (Reference Beaupre, Lier, Davies and Johnston13) used the textbook by Drummond et al. (Reference Drummond, Sculpher, Torrance, O'Brien and Stoddart1) to justify their cost-minimization analysis.
Apart from mortality, other measures, such as pain, functional disability, and the patient's global assessment, are recommended in osteoarthritis trials. Disease-specific tools such as the WOMAC are convenient, although the use of QALYs has been endorsed by several experts (Reference Drummond, Sculpher, Torrance, O'Brien and Stoddart1;Reference Hiligsmann, Cooper and Arden26;Reference Ramsey, Willke and Glick28;Reference Hiligsmann, Cooper and Guillemin31). Generic quality-of-life questionnaires such as the EQ-5D, SF-6D, or HUI can be used to estimate QALYs. Finally, costs and effectiveness should be combined through the estimation of the ICER, which should be interpreted according to local guidelines and cost-effectiveness thresholds.
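As a purely illustrative worked example with hypothetical figures (not drawn from any of the studies discussed here), QALYs weight the time spent in each health state by its utility, and the ICER relates incremental costs to incremental effects:

\[ \text{QALYs} = \sum_j u_j \, t_j , \qquad \text{ICER} = \frac{C_{\text{intervention}} - C_{\text{comparator}}}{E_{\text{intervention}} - E_{\text{comparator}}} . \]

For instance, if an exercise program cost an additional €1,000 per patient and yielded an additional 0.05 QALYs, the ICER would be €20,000 per QALY gained, to be judged against the local cost-effectiveness threshold (the intervention being considered cost-effective when the ICER falls below that threshold).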
Despite all these guidelines, some authors consider that the recommendations are not heeded (Reference Hiligsmann, Cooper and Arden26). More recently, a reference case for economic studies in osteoarthritis has been published (Reference Hiligsmann, Cooper and Guillemin31). This study presents recommendations for defining standard optimal care as the comparator. Clearly, the number of potential interventions waiting to be evaluated could be boundless, but all of them, not only medicines and surgery but also interventions that help patients adopt beneficial lifestyles, such as those assessed in this review, deserve to be evaluated; however, some studies have to be prioritized. It is especially necessary to assess the effectiveness and cost-effectiveness of those physiotherapy interventions already implemented (Reference Hiligsmann, Cooper and Arden26) and to avoid the widespread implementation of new programs without demonstrated effectiveness and cost-effectiveness.
The reporting of scientific studies in general, and of economic evaluations in particular, has been the subject of discussion. The most recent guideline is the ISPOR Consolidated Health Economic Evaluation Reporting Standards (CHEERS), which provides a checklist of recommendations to optimize the reporting of health economic evaluations (Reference Husereau, Drummond and Petrou32). However, the first attempt to guide the reporting of economic evaluations dates from 1996 (Reference Drummond and Jefferson33), the same year in which the first CONSORT Statement was published (Reference Begg, Cho and Eastwood34). The reported methodological quality of randomized controlled trials of physiotherapy interventions has improved over time according to a review of 10,025 trials from the PEDro database in 2008 (Reference Moseley, Herbert, Maher, Sherrington and Elkins35). That study identified improvements in eight of eleven items; only blinding of therapists and subjects did not improve from 1960 to 2008. The authors concluded that there is room for improvement despite the slow increase in the quality of reporting of physiotherapy trials (Reference Moseley, Herbert, Maher, Sherrington and Elkins35).
In 2009, the proportion of trials meeting the random allocation and between-group comparison items on the PEDro scale was near 90 percent, whereas the items met by the lowest proportion of trials were blinding of subjects and therapists, concealment of allocation, and intention-to-treat analysis (Reference Sherrington, Moseley, Herbert, Elkins and Maher36). We found similar limitations in the studies included in our review. Moreover, while we did not find the protocols of the studies with results included in our review, we were able to find the published protocols or clinical trial registry entries of eight ongoing RCTs with planned cost-effectiveness analyses. We interpret this as a sign of improvement over time in the reporting of trials.
A recent review of economic evaluations for the management of patients with hip fractures found that several methodological domains performed poorly: the use of an appropriate time horizon, incremental analysis of costs and outcomes, discounting, sensitivity analysis, declaration of conflicts of interest, and discussion of ethical considerations; moreover, most studies did not adopt a societal perspective (Reference Sabharwal, Carter, Darzi, Reilly and Gupte27). The economic evaluations included in our review were of varied quality, and none was free of drawbacks. Recently, Pinto et al. investigated the cost-effectiveness of nonpharmacologic, nonsurgical interventions for the treatment of hip and/or knee osteoarthritis, with a search up to October 2010 (Reference Pinto, Robertson, Hansen and Abbott37). Our review is more specific (we excluded studies of patients with pain but without demonstrated arthritis), is updated to February 2015, and discusses the methodological quality of the published studies in more depth. Both reviews share concerns about possible bias and the heterogeneity of methodologies; more high-quality economic evaluations are required to further inform practice (Reference Pinto, Robertson, Hansen and Abbott37).
The results of this review merit discussion of the methodological quality assessment of studies, to highlight the need to assess studies rather than papers. A limitation of methodological quality assessment is that, rather than assessing a study, we assess the information given to us in a paper, that is, the quality of reporting. When we assessed the quality of trials with the PEDro scale, we found that the same clinical trial obtained different scores depending on the paper used for its assessment. To assess studies such as RCTs correctly, we had to review the papers in which the clinical trial was explained, because papers that mainly report the economic evaluation dedicate less space to the trial design, procedures, patients, and clinical outcomes in order to give more space to the costing methods and economic outcomes.
This is the case for the article reporting the economic evaluation of the ADAPT study, which obtained a score of 3 on the PEDro scale (Reference Sevick, Miller, Loeser, Williamson and Messier17). However, the original paper, which provides greater detail on the ADAPT trial, attains a score of 8 out of 10 on the same scale (Reference Messier, Loeser and Miller19). Moreover, it can be concluded that an HTA report provides more information on the design and clinical effectiveness than a paper reporting the RCT or the economic evaluation, probably because of space restrictions. A good example of this is the set of scores for the three documents related to the study by McCarthy et al. and Richardson et al. (Reference McCarthy, Mills and Pullen14;Reference Richardson, Hawkins and McCarthy15;Reference McCarthy, Mills and Pullen20). If we limited the assessment of the trial to the information reported in the economic evaluation (Reference Richardson, Hawkins and McCarthy15), where the score is 3, instead of the RCT paper (Reference McCarthy, Mills and Pullen20) or the HTA report (Reference McCarthy, Mills and Pullen14), where the scores are 6 and 7, respectively, we would underrate the trial. In general, the difference in PEDro scores between the RCT paper or HTA report and the economic evaluation paper was 4 to 5 points (Table 2).
Moreover, assessment of the quality of an economic evaluation as such should be possible from the economic evaluation paper itself, without having to resort to the papers in which the clinical trial is reported. However, there remain cases in which the procedures and the characteristics of the patients are poorly explained in a paper whose main objective is to report the economic evaluation. In the cases analyzed here, most papers reporting economic evaluations were sufficiently informative, so the overall assessment of the economic evaluations did not vary, even though reading the trial papers was informative. As an exception, the paper by Richardson et al. is an extreme case in which the interventions are not reported at all (Reference Richardson, Hawkins and McCarthy15); the authors refer throughout to other papers for further information (Reference McCarthy, Mills and Pullen14;Reference McCarthy, Mills and Pullen20). The same paper reports only the results of the HRQOL questionnaire EQ-5D, as this is the only result of interest for the purposes of the economic evaluation, and does not mention the results for the other variables analyzed in the trial (Reference Richardson, Hawkins and McCarthy15). Therefore, to ascertain the characteristics of the interventions and the trial results, we would have to resort to the studies published earlier.
This review has some limitations. First, the systematic review included a limited number of economic evaluations; a review with more included studies would have allowed a richer analysis. Second, the analysis was conditioned by the tools selected to assess methodological quality. Economists and physiotherapists are familiar with the Drummond et al. criteria and the PEDro scale, respectively; therefore, these tools seemed the most appropriate to reach the target audience of this study. Other tools could highlight other interesting features. Finally, our ultimate aim was to illustrate and discuss the issues related to the methodological quality assessment of economic evaluations performed alongside clinical trials on the basis of limited data. This is a topic for further discussion, as we researchers, not only physiotherapists, must progress and improve how we communicate our work.
Generally, an RCT paper reports more clinical data than the paper in which the economic evaluation is reported. As mentioned on the PEDro Web site, “the PEDro scale should not be used as a measure of the ‘validity’ of a study's conclusions”; other considerations must also be taken into account (9). Although in our review the overall assessment of the economic evaluations was affected by the available information in only one case (Reference McCarthy, Mills and Pullen14;Reference Richardson, Hawkins and McCarthy15), this review shows how a study can be assessed differently according to the source of information. Consequently, not only should researchers follow the guidelines to appropriately perform and report economic evaluations, but other researchers, such as authors of systematic reviews, should also take the time to review all the published information related to the economic evaluations included in their reviews, where possible, to do justice to the studies' quality.
SUPPLEMENTARY MATERIAL
Supplementary Figure 1: https://doi.org/10.1017/S0266462317000757
Supplementary Table 1: https://doi.org/10.1017/S0266462317000757
Supplementary Table 2: https://doi.org/10.1017/S0266462317000757
Supplementary Table 3: https://doi.org/10.1017/S0266462317000757
Supplementary Methods of the Systematic Review: https://doi.org/10.1017/S0266462317000757
CONFLICTS OF INTEREST
The authors have nothing to disclose.