Prostate cancer (PCa) is the fourth most common cancer type overall and the second most common cancer among men. In the year 2012, an estimated 1.1 million men worldwide were diagnosed with PCa, accounting for 15 percent of all cancers diagnosed in men. Most cases (around 70 percent) are diagnosed in more developed regions of the world, and the burden of PCa in both economic and clinical terms is remarkable (1).
Health-related quality of life (HRQoL) is an important patient-related outcome for studying the success of treatment. Information on HRQoL is also vital for making informed choices about treatment for PCa, especially because the increased use of prostate-specific antigen (PSA) screening means that more patients now learn of their prostate cancer at an early stage, when multiple treatment options are available.
HRQoL instruments can be divided into two categories: generic and disease-specific instruments. Disease-specific instruments are used for studying the most important effects of a given disease on patients’ HRQoL. They are useful for providing insights into patients’ symptoms and functionality, and are well suited for clinical decision making. However, they are not suitable for comparing different health care interventions across different disease entities. Good examples of disease-specific instruments in the PCa setting are the 27-item Functional Assessment of Cancer Therapy (FACT-G) scale (Reference Esper, Mo and Chodak2), the 12-item prostate cancer-specific scale (FACT-P) (Reference Cella, Tulsky and Gray3), the International Prostate Symptoms Score (IPSS) (Reference Barry, Fowler and O'Leary4), the UCLA Prostate Cancer Index (UCLA-PCI) (Reference Litwin, Hays, Fink, Ganz, Leake and Brook5) and the Expanded Prostate Cancer Index Composite (EPIC) (Reference Wei, Dunn, Litwin, Sandler and Sanda6).
The advantage of generic instruments is that they can be used across different patient groups with diverse underlying diseases or disabilities. The generic instruments can be classified into two groups: profile and single index score instruments. The profile instruments measure, depending on the instrument, a broad range of physical and emotional dimensions. The Short Form 36 (SF-36) (Reference Brazier, Roberts and Deverill7), for instance, uses dimensions such as vitality, role emotional, and social function. The single index instruments provide a single index score, usually between 0 and 1, although other scales also exist.
Utilities can be elicited by direct or indirect valuation methods. The direct valuation methods include such approaches as the Standard Gamble (SG) (Reference Gafni8), the Time-Trade-Off (TTO) (Reference Dolan, Gudex, Kind and Williams9), the Rating Scale (RS) (Reference Bleichrodt and Johannesson10), and the Visual Analogue Scale (VAS) (Reference Gudex, Dolan, Kind and Williams11). In indirect valuation methods, such as the 15D (Reference Sintonen12), the EuroQol (EQ-5D) (Reference Rabin and de Charro13), the Health Utilities Index, Mark II/Mark III (HUI) (Reference Torrance, Feeny and Furlong14), the Quality of Well-Being scale (QWB) (Reference Kaplan, Ganiats, Sieber and Anderson15), the Rosser-Kind Index (Reference Rosser and Kind16), the Short Form 6D (SF-6D) (Reference Brazier, Roberts and Deverill7), and the Assessment of Quality-of-Life (AQoL) (Reference Hawthorne and Richardson17), utilities are elicited by a questionnaire in which a person chooses, from a set of predefined health states, the most suitable one for his own perceived health state. The weights of the different health states are derived from the general population to represent the values of the community regarding the appreciation of different health states.
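As an illustration of how two of the direct valuation methods turn a respondent's answer into a utility, the following minimal sketch uses the standard textbook formulations for health states considered better than death. The function names and the example figures are hypothetical and are not taken from any of the reviewed studies.

```python
def tto_utility(years_full_health: float, years_in_state: float) -> float:
    """Time-Trade-Off: the respondent is indifferent between living
    `years_in_state` years in the health state being valued and living
    `years_full_health` years in full health. Utility is the ratio."""
    if years_in_state <= 0:
        raise ValueError("years_in_state must be positive")
    if not 0 <= years_full_health <= years_in_state:
        raise ValueError("years_full_health must lie between 0 and years_in_state")
    return years_full_health / years_in_state


def sg_utility(indifference_probability: float) -> float:
    """Standard Gamble: the respondent is indifferent between the health
    state for certain and a gamble giving full health with probability p
    (and immediate death with probability 1 - p). Utility equals p."""
    if not 0.0 <= indifference_probability <= 1.0:
        raise ValueError("probability must lie between 0 and 1")
    return indifference_probability


# Hypothetical respondent: trades 10 years in the state for 8 years in full health
example_tto = tto_utility(years_full_health=8, years_in_state=10)
# Hypothetical respondent: accepts the gamble at an 85 percent chance of full health
example_sg = sg_utility(0.85)
```

Both functions return a value on the 0 to 1 scale, which is what allows such results to be used as quality weights.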
From a health economics perspective, any decision regarding resource allocation should be based on maximizing welfare for society. In real life, uncertainty is always present and optimal conditions are never achieved. Health economic analysis, or more precisely cost-effectiveness or cost-utility analysis of various interventions, takes into consideration both the quality and the costs of organizing treatment. Currently, the most commonly used framework for comparing different interventions from a quality perspective in a health economic assessment involves the use of quality-adjusted life-years (QALYs), which combine both the quality and the length of life in certain stages of a disease.
The aim of this systematic review was to assess PCa studies in which HRQoL was collected from patients by using generic, validated instruments in such a way that the results could be used to estimate QALYs and, consequently, were directly usable for health economic evaluations. The review was done to provide consolidated data for this purpose. Another objective of this study was to describe which instruments were most commonly used and to qualitatively describe the nature of these studies (country scope, follow-up period, study population, etc.).
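To make the combination of quality and length of life concrete, the sketch below computes QALYs in two common ways: as the sum of utility multiplied by time spent in each health state, and as the area under a utility curve interpolated linearly between repeated measurements. The function names and all input values are hypothetical, and discounting is deliberately left out.

```python
from typing import Sequence, Tuple


def qalys_from_states(states: Sequence[Tuple[float, float]]) -> float:
    """Sum of utility x duration (in years) over consecutive health states.
    Each element of `states` is a (utility, years) pair."""
    return sum(utility * years for utility, years in states)


def qalys_area_under_curve(times: Sequence[float],
                           utilities: Sequence[float]) -> float:
    """Trapezoidal area under the utility curve, assuming utility changes
    linearly between measurement times (given in years, ascending)."""
    if len(times) != len(utilities) or len(times) < 2:
        raise ValueError("need at least two matching time/utility points")
    total = 0.0
    for i in range(1, len(times)):
        interval = times[i] - times[i - 1]
        if interval < 0:
            raise ValueError("times must be in ascending order")
        total += interval * (utilities[i] + utilities[i - 1]) / 2
    return total


# Hypothetical patient: 2 years at utility 0.8, then 1 year at utility 0.6
state_based = qalys_from_states([(0.8, 2.0), (0.6, 1.0)])
# Hypothetical follow-up: utility measured at baseline, 1 year, and 2 years
curve_based = qalys_area_under_curve([0.0, 1.0, 2.0], [0.8, 0.7, 0.6])
```

The state-based form suits cross-sectional utilities attached to disease stages, whereas the area-under-the-curve form suits longitudinal studies with repeated measurements.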
METHODS
Literature Search
Computerized literature searches were performed without language restrictions using prostate cancer and quality-of-life as key words according to Medical Subject Heading (MeSH) terminology. Systematic literature searches were conducted on March 16th 2013 for the years 2002–13 and on June 18th 2015 for the years 2013–15 in the Medline, Cochrane Library, PsycINFO, and CINAHL databases. The most recent publications that had not yet been indexed were searched for manually among the PubMed in Process references. The searches were restricted to meta-analyses, systematic reviews, randomized controlled trials, and observational studies. Systematic reviews and meta-analyses were included in the searches so that we could manually verify from them that all relevant studies had been captured by our literature search. Congress abstracts were not included. The results in Medline were filtered with the filters developed by SIGN (Scottish Intercollegiate Guidelines Network). In addition, bibliographies of potential articles that, for example, included HRQoL/utility data as inputs of cost-effectiveness analyses were reviewed manually by the authors.
The detailed search strategy and results are available in Table 1 and Supplementary Tables 1–4.
Abbreviations: EQ-5D, EuroQol; HUI, Health Utilities Index; QWB, Quality of Well-Being; SF-6D, Short Form 6D; AQoL-8D, Assessment of Quality of Life; TTO, Time-Trade-Off; SG, Standard Gamble; VAS, Visual Analogue Scale; PCa, prostate cancer; BrCa, breast cancer; CRC, colorectal cancer.
Inclusion Criteria
Initial screening of the articles that were identified was based on their abstracts, which were reviewed independently by at least two of the authors (S.T. and S.B. read all the abstracts, and K.T. and R.R. read a part of them), and the selection of relevant articles was agreed upon in discussion between the reviewers. When an abstract did not give sufficiently precise information about the study, or this information was not available at all, the full article was obtained for further review.
Full-text articles obtained for closer evaluation were read independently by at least two of the authors (S.T. and S.B. read all the full-texts, and K.T. and R.R. a selection of them). Included studies were randomized controlled studies or observational studies, in which (i) HRQoL data were collected from prostate cancer patients, (ii) the results were reported as single index utility scores, and (iii) validated HRQoL instruments were used (either direct valuation using TTO, SG, VAS, or RS or indirect valuation using 15D, EQ-5D, SF-6D, HUI, AQoL, QWB, or Rosser-Kind).
RESULTS
Articles
The literature search identified a total of 2,171 references, of which 190 were duplicates and were therefore eliminated. Based on the screening of abstracts, 237 studies were obtained for full-text assessment. Most of the studies were published in English, but the final selection of publications included two non-English language articles (one in Japanese (Reference Namiki, Ishidoya and Saito18) and one in Spanish (Reference Fernández-Arjona, de la Cruz, Delgado, Malet and Portillo19)). Articles that reported only a planned study outline but that had no HRQoL data collected or reported were also excluded.
After the review of the full-text articles, thirty-three studies were judged to fulfill the inclusion criteria and were thus included in the qualitative synthesis of this systematic review (Figure 1).
Country Scope of the Studies
Eleven articles (33 percent) came from the United States (Reference Smith, Krygiel, Nease, Sumner and Catalona20–Reference Freytag, Stricker and Lu30), but there were also six multinational studies, and U.S. patients were included in four of these (Reference Saad, Gleason and Murray31–Reference Loriot, Miller and Sternberg34). Three (9 percent) of the articles came from Canada (Reference Krahn, Ritvo and Irvine35–Reference Cameron, Springer, Fox-Wasylyshyn and El-Masri37), and in addition all four of these multinational studies also included Canadian patients (Reference Saad, Gleason and Murray31–Reference Loriot, Miller and Sternberg34). Four studies (12 percent) came from the United Kingdom (Reference Pearcy, Wandron, O'Boyle and MacDonagh38–Reference Diels, Hamberg and Ford41), three (9 percent) from Finland (Reference Booth, Rissanen and Tammela42–Reference Färkkilä, Torvinen and Roine44), three from Japan (Reference Namiki, Ishidoya and Saito18;Reference Loriot, Miller and Sternberg34;Reference Shimizu, Fujino and Ito45), and one study each (3 percent each) from Norway (Reference Ruland, Andersen and Jeneson46), Spain (Reference Fernández-Arjona, de la Cruz, Delgado, Malet and Portillo19), Turkey (Reference Soyupek, Soyupek, Perk and Ozorak47), the Netherlands (Reference Korfage, Essink-Bot and Borsboom48), and Lithuania (Reference Mickevičienė, Vanagas, Jievaltas and Ulys49). The six multinational studies (Reference Saad, Gleason and Murray31–Reference Loriot, Miller and Sternberg34;Reference Skaltsa, Longworth, Ivanescu, Phung and Holmstrom40;Reference Diels, Hamberg and Ford41) included, in addition to the U.S. and Canadian patients mentioned above, patients from Australia, Argentina, France, Brazil, Germany, United Kingdom, New Zealand, Italy, Chile, Switzerland, Austria, Belgium, Peru, Sweden, Russia, Israel, and Uruguay (Table 1 and Supplementary Table 1).
HRQoL instruments
Of the thirty-three articles, twenty-four (73 percent) used an indirect valuation method and sixteen (48 percent) a direct valuation method (some of the studies included instruments using both approaches). The most commonly used instrument was the EQ-5D, which was used in twenty-one (64 percent) studies (Reference Namiki, Ishidoya and Saito18;Reference Fernández-Arjona, de la Cruz, Delgado, Malet and Portillo19;Reference Reed, Radeva, Glendenning, Saad and Schulman22;Reference Wu, Cooperberg, Sadetsky and Carroll27;Reference Freytag, Stricker and Lu30–Reference Loriot, Miller and Sternberg34;Reference Krahn, Bremner and Tomlinson36;Reference Cameron, Springer, Fox-Wasylyshyn and El-Masri37;Reference Glazener, Boachie and Buckley39–Reference Shimizu, Fujino and Ito45;Reference Korfage, Essink-Bot and Borsboom48;Reference Wang and Eriksson50). The VAS was also common, being used in ten (30 percent) studies (Reference Fernández-Arjona, de la Cruz, Delgado, Malet and Portillo19;Reference Meghani, Lee, Hanlon and Bruner28;Reference Pickard, Lin and Knight29;Reference Weinfurt, Li and Castel32;Reference Cameron, Springer, Fox-Wasylyshyn and El-Masri37;Reference Pearcy, Wandron, O'Boyle and MacDonagh38;Reference Torvinen, Färkkilä and Sintonen43;Reference Färkkilä, Torvinen and Roine44;Reference Korfage, Essink-Bot and Borsboom48;Reference Mickevičienė, Vanagas, Jievaltas and Ulys49). The EQ-5D and the VAS were used all over the world, which was not the case for the TTO. The TTO was used in six (18 percent) studies, all of which originated from the United States (Reference Smith, Krygiel, Nease, Sumner and Catalona20;Reference Knight, Siston and Chmiel21;Reference Volk, Cantor and Cass23;Reference Elstein, Chapman and Knight25;Reference Sommers, Beard and D'Amico26;Reference Meghani, Lee, Hanlon and Bruner28) (Table 2).
Note: Some of the studies utilized multiple instruments.
The 15D was used in five studies (15 percent). These were the three studies that were carried out in Finland (Reference Booth, Rissanen and Tammela42–Reference Färkkilä, Torvinen and Roine44), the one in Norway (Reference Ruland, Andersen and Jeneson46) and the one in Turkey (Reference Soyupek, Soyupek, Perk and Ozorak47). The Health Utilities Index (HUI) and the Quality of Well-Being scale (QWB) were used in two of the Canadian studies (Reference Krahn, Ritvo and Irvine35;Reference Krahn, Bremner and Tomlinson36). SG was used in two U.S. studies (Reference Smith, Krygiel, Nease, Sumner and Catalona20;Reference Stewart, Lenert, Bhatnagar and Kaplan24), and SF-6D was used in one study conducted in Finland (Reference Booth, Rissanen and Tammela42). There were no studies that reported HRQoL being measured by the AQoL, Rosser-Kind, or RS instruments (Table 2).
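The percentages reported for the individual instruments can be reproduced from the study counts given in the text, as in the short sketch below. It assumes that each share is the count divided by the thirty-three included studies and rounded to the nearest whole percent. Only instruments with an explicitly stated count are listed, and the variable names are illustrative.

```python
TOTAL_STUDIES = 33  # studies included in the qualitative synthesis

# Number of studies using each instrument, as stated in the text
instrument_counts = {
    "EQ-5D": 21,
    "VAS": 10,
    "TTO": 6,
    "15D": 5,
    "SG": 2,
    "SF-6D": 1,
}


def share_percent(count: int, total: int = TOTAL_STUDIES) -> int:
    """Share of studies using an instrument, rounded to a whole percent."""
    return round(100 * count / total)


shares = {name: share_percent(n) for name, n in instrument_counts.items()}

# Counts sum to more than the number of studies because some studies
# used several instruments.
total_instrument_uses = sum(instrument_counts.values())

for name, pct in shares.items():
    print(f"{name}: {instrument_counts[name]} studies ({pct} percent)")
```

Because some studies used more than one instrument, the shares are not expected to add up to 100 percent.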
Utility
Only articles that reported results were included, and thus utility values or QALYs were available in all the studies although mean/median values were missing from one article (Reference Wang and Eriksson50). However, it was not within the scope of this study to pool mean utilities from the studies, or to perform a meta-analysis, because both the instruments used and the study populations varied greatly from study to study. For localized and early stage disease, the HRQoL scores varied from 0.63 to 0.91 (Reference Smith, Krygiel, Nease, Sumner and Catalona20;Reference Knight, Siston and Chmiel21;Reference Elstein, Chapman and Knight25;Reference Sommers, Beard and D'Amico26;Reference Krahn, Bremner and Tomlinson36;Reference Glazener, Boachie and Buckley39;Reference Shimizu, Fujino and Ito45;Reference Korfage, Essink-Bot and Borsboom48). The impact of radical prostatectomy on HRQoL was studied in five of the articles (Reference Smith, Krygiel, Nease, Sumner and Catalona20;Reference Krahn, Ritvo and Irvine35;Reference Glazener, Boachie and Buckley39;Reference Korfage, Essink-Bot and Borsboom48;Reference Wang and Eriksson50), and the HRQoL scores after surgery varied between 0.68 and 0.91. For advanced or metastatic stage disease the HRQoL scores varied between 0.50 and 0.87 (Reference Namiki, Ishidoya and Saito18;Reference Fernández-Arjona, de la Cruz, Delgado, Malet and Portillo19;Reference Reed, Radeva, Glendenning, Saad and Schulman22;Reference Volk, Cantor and Cass23;Reference Sommers, Beard and D'Amico26;Reference Wu, Cooperberg, Sadetsky and Carroll27;Reference Saad, Gleason and Murray31;Reference Sullivan, Mulani, Fishman and Sleep33;Reference Loriot, Miller and Sternberg34;Reference Skaltsa, Longworth, Ivanescu, Phung and Holmstrom40;Reference Diels, Hamberg and Ford41;Reference Torvinen, Färkkilä and Sintonen43;Reference Färkkilä, Torvinen and Roine44). 
The variation in HRQoL scores between the various disease stages is most probably also a consequence of variation in the HRQoL instruments used and in the study methods.
Follow-up Period
Of the reviewed studies, seventeen (52 percent) were longitudinal (Reference Namiki, Ishidoya and Saito18;Reference Knight, Siston and Chmiel21;Reference Reed, Radeva, Glendenning, Saad and Schulman22;Reference Elstein, Chapman and Knight25;Reference Wu, Cooperberg, Sadetsky and Carroll27;Reference Freytag, Stricker and Lu30–Reference Loriot, Miller and Sternberg34;Reference Krahn, Bremner and Tomlinson36;Reference Cameron, Springer, Fox-Wasylyshyn and El-Masri37;Reference Glazener, Boachie and Buckley39;Reference Skaltsa, Longworth, Ivanescu, Phung and Holmstrom40;Reference Booth, Rissanen and Tammela42;Reference Ruland, Andersen and Jeneson46;Reference Korfage, Essink-Bot and Borsboom48) and sixteen (48 percent) cross-sectional (Reference Fernández-Arjona, de la Cruz, Delgado, Malet and Portillo19;Reference Smith, Krygiel, Nease, Sumner and Catalona20;Reference Volk, Cantor and Cass23;Reference Stewart, Lenert, Bhatnagar and Kaplan24;Reference Sommers, Beard and D'Amico26;Reference Meghani, Lee, Hanlon and Bruner28;Reference Pickard, Lin and Knight29;Reference Krahn, Ritvo and Irvine35;Reference Pearcy, Wandron, O'Boyle and MacDonagh38;Reference Diels, Hamberg and Ford41;Reference Torvinen, Färkkilä and Sintonen43;Reference Färkkilä, Torvinen and Roine44;Reference Shimizu, Fujino and Ito45;Reference Soyupek, Soyupek, Perk and Ozorak47;Reference Mickevičienė, Vanagas, Jievaltas and Ulys49;Reference Wang and Eriksson50). 
In most of the longitudinal follow-up studies (n = 15; 88 percent), an indirect instrument was used (Reference Namiki, Ishidoya and Saito18;Reference Reed, Radeva, Glendenning, Saad and Schulman22;Reference Wu, Cooperberg, Sadetsky and Carroll27;Reference Freytag, Stricker and Lu30–Reference Loriot, Miller and Sternberg34;Reference Krahn, Bremner and Tomlinson36;Reference Cameron, Springer, Fox-Wasylyshyn and El-Masri37;Reference Glazener, Boachie and Buckley39;Reference Skaltsa, Longworth, Ivanescu, Phung and Holmstrom40;Reference Booth, Rissanen and Tammela42;Reference Ruland, Andersen and Jeneson46;Reference Korfage, Essink-Bot and Borsboom48). In contrast, most of the cross-sectional studies used a direct instrument to measure HRQoL (n = 11; 69 percent) (Reference Fernández-Arjona, de la Cruz, Delgado, Malet and Portillo19;Reference Smith, Krygiel, Nease, Sumner and Catalona20;Reference Volk, Cantor and Cass23;Reference Stewart, Lenert, Bhatnagar and Kaplan24;Reference Sommers, Beard and D'Amico26;Reference Meghani, Lee, Hanlon and Bruner28;Reference Pickard, Lin and Knight29;Reference Pearcy, Wandron, O'Boyle and MacDonagh38;Reference Torvinen, Färkkilä and Sintonen43;Reference Färkkilä, Torvinen and Roine44;Reference Mickevičienė, Vanagas, Jievaltas and Ulys49). 
Of the longitudinal follow-up studies, nine studies had a follow-up period equal to or less than 1 year (Reference Namiki, Ishidoya and Saito18;Reference Knight, Siston and Chmiel21;Reference Elstein, Chapman and Knight25;Reference Wu, Cooperberg, Sadetsky and Carroll27;Reference Sullivan, Mulani, Fishman and Sleep33;Reference Krahn, Bremner and Tomlinson36;Reference Cameron, Springer, Fox-Wasylyshyn and El-Masri37;Reference Glazener, Boachie and Buckley39;Reference Ruland, Andersen and Jeneson46), seven had a follow-up period of more than 1 year (Reference Freytag, Stricker and Lu30–Reference Loriot, Miller and Sternberg34;Reference Skaltsa, Longworth, Ivanescu, Phung and Holmstrom40;Reference Booth, Rissanen and Tammela42;Reference Korfage, Essink-Bot and Borsboom48), and in one study the reporting concerning the follow-up period was not clear (Reference Reed, Radeva, Glendenning, Saad and Schulman22). The longest follow-up period was 13 years (Reference Booth, Rissanen and Tammela42).
Study Populations
Most of the studies included exclusively PCa patients and only two studies included patients with other cancer types (breast cancer, colorectal cancer) as well (Reference Färkkilä, Torvinen and Roine44;Reference Ruland, Andersen and Jeneson46). All disease stages were well represented, from early/localized disease (n = 9) (Reference Smith, Krygiel, Nease, Sumner and Catalona20;Reference Knight, Siston and Chmiel21;Reference Elstein, Chapman and Knight25;Reference Sommers, Beard and D'Amico26;Reference Krahn, Bremner and Tomlinson36;Reference Glazener, Boachie and Buckley39;Reference Torvinen, Färkkilä and Sintonen43;Reference Shimizu, Fujino and Ito45;Reference Korfage, Essink-Bot and Borsboom48) to advanced/metastatic disease (n = 15) (Reference Namiki, Ishidoya and Saito18;Reference Fernández-Arjona, de la Cruz, Delgado, Malet and Portillo19;Reference Reed, Radeva, Glendenning, Saad and Schulman22;Reference Volk, Cantor and Cass23;Reference Wu, Cooperberg, Sadetsky and Carroll27;Reference Saad, Gleason and Murray31–Reference Loriot, Miller and Sternberg34;Reference Krahn, Bremner and Tomlinson36;Reference Skaltsa, Longworth, Ivanescu, Phung and Holmstrom40;Reference Diels, Hamberg and Ford41;Reference Torvinen, Färkkilä and Sintonen43;Reference Färkkilä, Torvinen and Roine44;Reference Soyupek, Soyupek, Perk and Ozorak47). The disease stage was mixed in fourteen of the papers and could not be exclusively categorized into either of the above-mentioned groups (Reference Volk, Cantor and Cass23;Reference Stewart, Lenert, Bhatnagar and Kaplan24;Reference Meghani, Lee, Hanlon and Bruner28–Reference Freytag, Stricker and Lu30;Reference Krahn, Ritvo and Irvine35–Reference Pearcy, Wandron, O'Boyle and MacDonagh38;Reference Torvinen, Färkkilä and Sintonen43;Reference Shimizu, Fujino and Ito45;Reference Ruland, Andersen and Jeneson46;Reference Mickevičienė, Vanagas, Jievaltas and Ulys49;Reference Wang and Eriksson50). 
Populations that had been identified by PSA screening were found in two studies (Reference Volk, Cantor and Cass23;Reference Booth, Rissanen and Tammela42). Three studies included patients after radical prostatectomy (Reference Smith, Krygiel, Nease, Sumner and Catalona20;Reference Glazener, Boachie and Buckley39;Reference Wang and Eriksson50). The number of subjects in each study ranged from 20 to 5,516, with a total of 16,327 subjects in all the studies combined.
Missing Data
Fewer than half of the studies (n = 13; 39 percent) reported information concerning missing data (Reference Reed, Radeva, Glendenning, Saad and Schulman22;Reference Elstein, Chapman and Knight25;Reference Wu, Cooperberg, Sadetsky and Carroll27;Reference Weinfurt, Li and Castel32;Reference Loriot, Miller and Sternberg34–Reference Krahn, Bremner and Tomlinson36;Reference Glazener, Boachie and Buckley39;Reference Booth, Rissanen and Tammela42–Reference Färkkilä, Torvinen and Roine44;Reference Ruland, Andersen and Jeneson46;Reference Korfage, Essink-Bot and Borsboom48). Five of these reported having used instrument-specific procedures to replace the missing data (Reference Krahn, Bremner and Tomlinson36;Reference Glazener, Boachie and Buckley39;Reference Booth, Rissanen and Tammela42–Reference Ruland, Andersen and Jeneson46). Eleven studies reported information about the response rates of individual subjects, and these response rates varied between 37 percent and 95 percent (Reference Reed, Radeva, Glendenning, Saad and Schulman22;Reference Sommers, Beard and D'Amico26;Reference Loriot, Miller and Sternberg34;Reference Glazener, Boachie and Buckley39;Reference Diels, Hamberg and Ford41–Reference Färkkilä, Torvinen and Roine44;Reference Korfage, Essink-Bot and Borsboom48–Reference Wang and Eriksson50).
Study Setting and Publication
Most of the studies (n = 21) were done in the setting of clinical practice or were observational by nature (Reference Fernández-Arjona, de la Cruz, Delgado, Malet and Portillo19–Reference Knight, Siston and Chmiel21;Reference Volk, Cantor and Cass23–Reference Wu, Cooperberg, Sadetsky and Carroll27;Reference Pickard, Lin and Knight29;Reference Sullivan, Mulani, Fishman and Sleep33;Reference Krahn, Ritvo and Irvine35–Reference Pearcy, Wandron, O'Boyle and MacDonagh38;Reference Diels, Hamberg and Ford41;Reference Torvinen, Färkkilä and Sintonen43–Reference Shimizu, Fujino and Ito45;Reference Soyupek, Soyupek, Perk and Ozorak47;Reference Mickevičienė, Vanagas, Jievaltas and Ulys49;Reference Wang and Eriksson50). A clinical trial setting was found in approximately one-third of the studies (n = 12) (Reference Namiki, Ishidoya and Saito18;Reference Reed, Radeva, Glendenning, Saad and Schulman22;Reference Meghani, Lee, Hanlon and Bruner28;Reference Freytag, Stricker and Lu30–Reference Weinfurt, Li and Castel32;Reference Loriot, Miller and Sternberg34;Reference Glazener, Boachie and Buckley39;Reference Skaltsa, Longworth, Ivanescu, Phung and Holmstrom40;Reference Booth, Rissanen and Tammela42;Reference Ruland, Andersen and Jeneson46;Reference Korfage, Essink-Bot and Borsboom48). Collecting HRQoL data in real-life clinical practice thus seems to be the most common study design, which is in line with expectations. Clinical trials (especially those with a randomized controlled design) usually reflect an ideal setting and, consequently, do not provide information about real-life situations; such information is needed to evaluate the cost-effectiveness of treatment in everyday clinical practice.
The vast majority of studies (n = 30) elicited the patient's current health state, and only four studies (Reference Stewart, Lenert, Bhatnagar and Kaplan24–Reference Sommers, Beard and D'Amico26;Reference Meghani, Lee, Hanlon and Bruner28) elicited preferences for hypothetical health states predefined by investigators.
Twenty-one (64 percent) of the studies were published in a clinical journal (Reference Namiki, Ishidoya and Saito18–Reference Volk, Cantor and Cass23;Reference Elstein, Chapman and Knight25;Reference Sommers, Beard and D'Amico26;Reference Freytag, Stricker and Lu30–Reference Weinfurt, Li and Castel32;Reference Loriot, Miller and Sternberg34;Reference Cameron, Springer, Fox-Wasylyshyn and El-Masri37;Reference Pearcy, Wandron, O'Boyle and MacDonagh38;Reference Booth, Rissanen and Tammela42;Reference Torvinen, Färkkilä and Sintonen43;Reference Ruland, Andersen and Jeneson46–Reference Wang and Eriksson50) and twelve (36 percent) in a journal dedicated to health economics, assessment of healthcare technologies, healthcare administration, or decision making (Reference Stewart, Lenert, Bhatnagar and Kaplan24;Reference Wu, Cooperberg, Sadetsky and Carroll27–Reference Pickard, Lin and Knight29;Reference Sullivan, Mulani, Fishman and Sleep33;Reference Krahn, Ritvo and Irvine35;Reference Krahn, Bremner and Tomlinson36;Reference Glazener, Boachie and Buckley39–Reference Diels, Hamberg and Ford41;Reference Färkkilä, Torvinen and Roine44;Reference Shimizu, Fujino and Ito45).
DISCUSSION
Strengths and Weaknesses of this Review
The strengths of this review are that we included all HRQoL instruments that produce single index utility scores and that we covered all disease stages of PCa. Although there are other systematic reviews related to this subject, many of them have focused on disease-specific measurements of QoL, or focused on certain disease stages only, or are outdated (Reference Eton and Lepore51–Reference Ware and Sherbourne59). We hope that this systematic review can introduce the reader to the topic and provide utility data for future economic evaluations.
There are certain limitations in our work. First, we chose to focus only on generic HRQoL instruments, and therefore a broad spectrum of studies using disease-specific instruments was excluded. However, there is a recent systematic review that focuses on the psychometric properties of the twenty most often used HRQoL instruments in prostate cancer (Reference Hamoen, De Rooij, Witjes, Barentsz and Rovers58). These instruments do not translate into utilities/QALYs, but they do provide important information, and may be more sensitive in assessing some of the HRQoL impacts. Also, we did not include the Short Form 36 (SF-36/RAND-36) instrument, which is a generic HRQoL measurement that yields an eight-scale profile of health (Reference Brazier, Roberts and Deverill7). The SF-36 instrument does not provide a single utility score directly, and there are limitations in its use for economic evaluations because the scoring system is not based on preferences (Reference Ware and Sherbourne59).
Instead, studies that used the SF-6D instrument, an HRQoL instrument derived from the SF-36 that uses preferences from the general public, were included in this review. Another limitation of this study is its inability to draw quantitative conclusions from utilities/QALYs due to the large variation in study settings, populations, and methods. A meta-analysis of the collected utilities would be interesting, but it was not within the scope of this study. It would also require a broader sample of studies, for example by covering a longer time period. In addition, we did not include congress abstracts; therefore, some information may have been missed from this review.
Evidence Base for Generic HRQoL
Of the 2,171 abstracts only thirty-three (1.5 percent) articles fulfilled the inclusion criteria. We were surprised that only a limited number of studies were really based on actual measurements of patients’ generic HRQoL in PCa, considering that PCa is one of the most common cancers and that the aging of the population will result in an increasing prevalence of the disease and, consequently, a major burden for health care systems and society. Although we did not record the number of disease-specific HRQoL instruments, the review of all HRQoL-related references found in the original search made it evident that there is more research on the symptoms of the disease evaluated by using disease-specific instruments. One of the main reasons for assessing HRQoL and QALYs is their usability in the health economic analysis needed for sound resource allocation decisions. Moreover, they provide insights into patients’ well-being. Lack of published evidence on HRQoL in different stages of the disease and on the number of QALYs gained by different interventions may jeopardize a reliable health economic assessment.
Disease Stage and Effects for Different Domains of HRQoL
The HRQoL in PCa encompasses both disease-specific and general aspects. The disease and its treatments certainly affect both of these aspects, but the effects might be very different depending on the stage of disease, type of treatment, etc. In localized disease, disease-specific domains like urinary, sexual, and bowel function are the most profoundly affected domains, whereas, with some exceptions, general HRQoL usually remains mostly unaffected (Reference Torvinen, Färkkilä and Sintonen43;Reference Eton and Lepore51). Some of the studies took into consideration the HRQoL impact of PSA screening and the impact of early diagnosis. There were findings both in support of and against early diagnosis in terms of HRQoL impact. In a broad population-based trial, a slight HRQoL advantage in favor of screening among men with a diagnosis of PCa was seen during the follow-up period of 13 years and was strongest among men with early stage disease (Reference Booth, Rissanen and Tammela42).
In another study, the substantial disutility of asymptomatic disease was thought to reflect the anxiety caused by the uncertainty of not knowing whether the cancer would spread, rather than the current actual state of health (Reference Stewart, Lenert, Bhatnagar and Kaplan24). Krahn et al. (Reference Krahn, Ritvo and Irvine35) concluded that, although sexual, urinary, and bowel dysfunction are common in prostate cancer, their impact on overall health status may have been overestimated if utility scores have been derived from hypothetical scenarios or from individuals without the disease, and this weakens the major argument against PSA screening and the aggressive treatment of early prostate cancer. Longitudinal follow-up studies on HRQoL are needed to draw the most accurate conclusions on the HRQoL impact of the side effects of the treatments in localized and early PCa (Reference Korfage, Essink-Bot and Borsboom48).
In the advanced or metastatic disease stage, many of the reviewed articles focused on the HRQoL effects of skeletal-related events (SREs). Significant impacts on HRQoL were related to SREs (Reference Reed, Radeva, Glendenning, Saad and Schulman22;Reference Weinfurt, Li and Castel32;Reference Sullivan, Mulani, Fishman and Sleep33), although in the study by Saad et al. (Reference Saad, Gleason and Murray31) changes in HRQoL due to SREs were not statistically significant. One might argue that this may be due to the insensitivity of the generic EQ-5D instrument that was used, but Weinfurt et al. (Reference Weinfurt, Li and Castel32) and Sullivan et al. (Reference Sullivan, Mulani, Fishman and Sleep33) did find a significant HRQoL impact related to SREs using the same instrument. Pain is a frequent symptom associated with SREs, and many HRQoL studies therefore incorporate disease-specific instruments such as the Brief Pain Inventory or the EORTC QLQ-C30, which includes a pain domain. In a recent publication, the HRQoL impacts of SREs were measured by the TTO method, and the study showed significant disutility due to SREs (Reference Matza, Chung and Van Brunt60). Only one of the studies covering advanced/metastatic stage disease specifically focused on the HRQoL impact of palliative care (Reference Färkkilä, Torvinen and Roine44).
Whose Evaluation?
Regarding direct valuation instruments, the overall limitation of the approach is that utility theory suggests that utility assessment should be done in the general population, who pay for health care (Reference Torrence, Thomas and Sackett61;Reference Gold, Patric, Torrence, Gold, Siegel, Russell and Weinstein62). It has also been suggested that population-based preferences should be used in economic analyses. However, another view, supported by many clinicians and researchers, is that patients who have experienced a specific health condition are the best evaluators of its HRQoL impact (Reference Gold, Patric, Torrence, Gold, Siegel, Russell and Weinstein62). In indirect valuation methods, where weights from population-based preferences are used, this issue does not arise in the same way.
Few of the articles examined how well the HRQoL results reported by PCa patients and by care-givers/significant others are correlated (proxy approach). One of the reviewed articles concluded that utility scores derived from the patients’ own health were higher than community-derived utilities (Reference Krahn, Ritvo and Irvine35). Stewart et al. (Reference Stewart, Lenert, Bhatnagar and Kaplan24) found that men who had experienced impotence or urinary incontinence rated these conditions somewhat better than men who had not experienced these symptoms. In addition, Pearcy et al. (Reference Pearcy, Wandron, O'Boyle and MacDonagh38) found that patients’ estimates of their HRQoL were higher than the estimates of their spouses or clinicians. These findings support the view that adaptation to a current health condition leads patients to report higher utilities than the general population.
In contrast, Volk et al. (Reference Volk, Cantor and Cass23) found that patients estimated lower utilities for the same health states than did their wives, and Elstein et al. (Reference Elstein, Chapman and Knight25) found that utilities estimated by patients were lower than those estimated by their clinicians. Methodological issues, such as the HRQoL instrument used in each study, probably have an impact on the conclusions reached on this matter. The common approach is that the patients themselves assess their own current state of health, which was also supported by the findings of this review. Only four studies included spouses or clinicians/caregivers who estimated the health state specific to prostate cancer (Reference Volk, Cantor and Cass23;Reference Elstein, Chapman and Knight25;Reference Pickard, Lin and Knight29;Reference Pearcy, Wandron, O'Boyle and MacDonagh38).
Instrument Recommendation for Health Economic Evaluations?
In health economic evaluation, the focus is often on relating health outcomes to the average cost needed to produce them. Variance in these factors can be taken into account in these analyses. The approach taken by Meghani et al. (Reference Meghani, Lee, Hanlon and Bruner28) was to understand heterogeneity among patients in terms of how they value HRQoL versus survival. Although policy making and cost-utility analyses need to be based on how patients behave on average, variation among patients should also be understood (Reference Meghani, Lee, Hanlon and Bruner28). In addition, this aspect is important from the perspective of patients and clinicians, as it helps them to choose the most suitable treatment modality.
Multiple HRQoL instruments have been developed during the past three decades, and some of them can be used directly to estimate QALYs, but thus far none of them has emerged as a preferred option or as a gold standard. Of all the instruments included in this review, the EQ-5D was the most commonly used. Although the EQ-5D has known limitations, it is still usable and is also very easy for patients to complete (Reference Torvinen, Färkkilä and Sintonen43;Reference Färkkilä, Torvinen and Roine44). However, more HRQoL research is needed in the area of PCa, and it is important to acknowledge that HRQoL instruments differ in their empirical, theoretical, and technical characteristics; therefore, special attention needs to be paid to choosing an instrument. It was not within the scope of this review to recommend which instrument to choose. In practice, a researcher doing a health economic evaluation is often obliged to choose from existing evidence, without any possibility at that stage of influencing the choice of instrument.
In two of the papers, HRQoL/QALYs had been used to estimate the cost-effectiveness of treatment. It was outside the scope of this study to assess how many cost-effectiveness analyses had used the HRQoL data from these studies, but this would be of interest for further analysis.
CONCLUSIONS
HRQoL assessment in PCa is an evolving field, but the literature is scarce, especially for single index measures that can be used directly for QALY estimation. Given that PCa is one of the most common solid tumors, it is important to focus on the treatment options and on their unique effects on the quantity and quality of life, while not forgetting the evaluation of the cost-effectiveness of these options.
SUPPLEMENTARY MATERIAL
Supplementary Tables 1–4 http://dx.doi.org/10.1017/S0266462316000118
ACKNOWLEDGMENTS
All authors participated in the study design, data collection, and drafting of the manuscript. ST and SB had joint responsibility for writing the manuscript and share primary authorship.
CONFLICTS OF INTEREST
ST is an employee of Teva Pharmaceuticals Europe BV. SB is an employee of Amgen Finland. HS is the developer of the 15D and has received royalties from electronic versions of the 15D. KT has received honoraria for speaking at national meetings from GlaxoSmithKline, Astellas and Abbvie. The authors report no other conflicts of interest in this work.