Introduction
The aging population is growing worldwide (Li et al. Reference Li, Han and Zhang2019). This phenomenon has been expedited by the development of advanced technologies and breakthroughs in medicine that have prolonged the lifespans of individuals, and it represents an important issue because older adults are anticipated to grapple with protracted periods of managing chronic and age-associated disorders (Woods et al. Reference Woods, Palmarini and Corner2022; World Health Organization 2020). Although healthcare systems strive to support older adults in living long and healthy lives, older adults will inevitably face frailty at some point and require end-of-life care. Nevertheless, a considerable number of older adults nearing the end of life are unable to articulate their preferences because they are incapacitated at the critical decision-making juncture. In these cases, family members frequently assume responsibility for decision-making on behalf of their ailing loved ones, which results in heightened distress and interpersonal conflicts as they try to select the best options for the individual (Institute of Medicine 2014). Consequently, many older adults who did not formally express their end-of-life decisions in advance experience a prolonged death while relying on life-sustaining treatments in hospitals, which is associated with high medical costs at the end of life and diminishes the quality of both the end of life and death (The Lancet Respiratory Medicine 2016).
Therefore, it is imperative for older individuals to proactively plan for end-of-life care before they lose the ability to communicate their preferences. The Institute of Medicine (2014) has recommended that individuals express their end-of-life care preferences while they are still in good health and cognitively sound. Since the late 20th century, researchers and politicians in Western countries have developed specific programs, interventions, or policies to facilitate end-of-life communication among older adults (Park et al. Reference Park, Jo and Park2021; Thomas et al. Reference Thomas, Lobo and Detering2018). These are collectively described as advance care planning (ACP).
In the initial phase, ACP was conceptualized as the documentation of advance directives based on provided information (Carr and Luth Reference Carr and Luth2017). While it had until recently been used without a commonly shared definition, ACP has undergone multifaceted developments over time. The Institute of Medicine (2014) attempted to comprehensively define ACP based on its experience in America over the prior decade. Scholars in Europe (Rietjens et al. Reference Rietjens, Sudore and Connolly2017) and North America (Sudore et al. Reference Sudore, Lum and You2017b) used Delphi consensus processes in attempts to comprehensively define ACP. Those efforts uniformly delineated ACP as a process aimed at identifying one’s personal values, goals, and preferences regarding future medical treatments. This process involves engaging in discussions with both family members and healthcare providers, maintaining records of one’s end-of-life care preferences, and appointing a proxy decision-maker. Finally, those defining works were linked to identifying ACP outcomes (McMahan et al. Reference McMahan, Tellez and Sudore2021; Sudore et al. Reference Sudore, Heyland and Lum2018). Based on the definition of ACP, the outcomes reported in the previous studies were categorized along 4 domains, including process, action, quality of care, and health status/utilization (McMahan et al. Reference McMahan, Tellez and Sudore2021; Sudore et al. Reference Sudore, Heyland and Lum2018). ACP interventions primarily target a patient’s behavior changes like attitude, decision-making, and preferences, potentially leading to direct outcomes in the process or action domain. Outcomes related to the quality of care (QOC) may be associated with patients’ evaluation of ACP as care itself. Finally, health status outcomes might reflect the indirect, secondary effects of ACP interventions on a patient’s mental well-being or life expectancy.
The growing elderly population underscores the importance of effective ACP interventions for older adults. It is necessary to have a deep understanding of older adults’ experiences with ACP as a process to accurately evaluate these interventions. Patient-reported outcome measures (PROMs) offer invaluable insights into the impact that ACP interventions have on this population. While numerous studies have used various PROMs to assess ACP intervention effectiveness, there is a crucial gap regarding the appropriate selection and evaluation of these measures for the specific context of older adults. Prior research may not have adequately considered factors such as the reliability and validity of these PROMs when applied to the elderly population. To address this gap, it is necessary to conduct a systematic review of PROMs specifically focused on ACP for older adults. Such a review could leverage established tools like the COnsensus-based Standards for the selection of health Measurement INstruments (COSMIN) to identify high-quality PROMs suitable for both research and clinical settings (Prinsen et al. Reference Prinsen, Mokkink and Bouter2018). Therefore, the purpose of the current review was to provide an overview of the PROMs used to measure the effects of ACP among older adults and to evaluate the psychometric properties of these PROMs using the COSMIN methodology.
Methods
Design
A systematic review of the measurement properties of ACP instruments in older adults was conducted following the COSMIN methodology for systematic reviews of PROMs (Mokkink et al. Reference Mokkink, Prinsen and Patrick2018; Prinsen et al. Reference Prinsen, Mokkink and Bouter2018).
Search strategy
The search strategy for this study comprised 2 main steps aimed at efficiently managing the extensive body of literature on ACP programs and assessing various PROMs.
Step 1: Identification of studies via umbrella reviews (up to 2018)
To efficiently manage the extensive body of research on ACP interventions and the various PROMs associated with these interventions, we employed a 2-step search strategy that began with umbrella reviews. Given the large volume of existing studies on the effectiveness of ACP interventions, we initially prioritized umbrella reviews, which synthesize findings from multiple systematic reviews and meta-analyses, to provide a broad and comprehensive spectrum of evidence (Fusar-Poli and Radua Reference Fusar-Poli and Radua2018). Specifically, we used our previously published umbrella review (Park et al. Reference Park, Jo and Park2021), which synthesized reviews on ACP interventions for older adults in community-based settings published up to 2019, as a foundational reference point. To ensure a current and comprehensive capture of evidence, we expanded our search to include additional umbrella reviews published up to 2022 that addressed similar topics. We conducted searches across several databases, including PubMed, Cochrane Library, and CINAHL. The search strategy used keywords such as [“Advance Care Planning” OR “ACP”] AND [“Umbrella Review”] AND [“Older Adults” OR “Elderly”], covering studies published up to February 2022. During the initial screening, titles and abstracts were reviewed against inclusion criteria that required reviews to be systematic reviews specifically addressing ACP interventions, to assess the quality of included studies using a standardized tool, and to be published in English or Korean in peer-reviewed journals. A full-text assessment then evaluated the comprehensiveness of the included reviews, their methodological rigor, and their alignment with our study objectives, particularly concerning PROMs in ACP programs. Based on these criteria, 3 umbrella reviews were selected: Jimenez et al. (Reference Jimenez, Tan and Virk2018), Park et al. (Reference Park, Jo and Park2021), and Wendrich-van Dael et al. (Reference Wendrich-van Dael, Bunn and Lynch2020), as they collectively provided a broad overview of the effectiveness of ACP interventions and the variety of PROMs used. From the selected umbrella reviews, we included only studies with original data collection and analysis. This approach streamlined the identification and synthesis of relevant studies, allowing us to compile a focused list of research and measures used to evaluate the outcomes of ACP interventions in the older adult population (Fig. 1).
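The keyword combination quoted for Step 1 follows the common pattern of OR within a concept and AND across concepts. The following is a minimal sketch of assembling such a search string; the helper names `or_group` and `build_query` are illustrative assumptions, and the sketch does not model the field tags or syntax of any particular database interface.

```python
def or_group(terms):
    """Join the synonyms of one concept with OR, quoting each phrase."""
    return "(" + " OR ".join(f'"{t}"' for t in terms) + ")"


def build_query(concepts):
    """Combine one OR-group per concept with AND."""
    return " AND ".join(or_group(terms) for terms in concepts)


# Keyword groups as quoted in the Step 1 search description.
concepts = [
    ["Advance Care Planning", "ACP"],
    ["Umbrella Review"],
    ["Older Adults", "Elderly"],
]

query = build_query(concepts)
print(query)
```

Keeping each concept as its own list makes it straightforward to add synonyms or controlled-vocabulary terms to one concept without touching the others.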
Step 2: Identification of studies via databases (2018–2022)
We conducted a literature search using electronic databases including PubMed, EMBASE, Cochrane Library, CINAHL, and PsycINFO. These databases were searched for studies published from 1 January 2018 to 28 February 2022. We set the starting point as 2018 because the umbrella reviews selected in the first step included studies until 2018. Initially, we established keywords based on prior studies. With the assistance of a professional librarian, we searched each database using the keywords and the Medical Subject Headings (MeSH) terms related to these keywords. The keywords and MeSH terms were combined with the “OR” and “AND” Boolean operators (Supplementary 1). To capture additional relevant studies, we also reviewed the reference lists of all the included studies.
Inclusion and exclusion criteria
The inclusion criteria were as follows: (1) studies that targeted older adults (65 years or older); (2) studies using any type of measurement tool that measures patient-reported ACP program outcomes; and (3) studies published in English or Korean. In our study, ACP was defined following Sudore et al. (Reference Sudore, Heyland and Barnes2017a), who stated that “ACP is a process that supports adults at any age or stage of health in understanding and sharing their personal values, life goals, and preferences regarding future medical care.” Based on this definition, we considered ACP programs to be interventions focused on individualized support and preparation for communication and in-the-moment decision-making. Programs that involve discussions between healthcare professionals or researchers and individuals regarding personal goals of care, values, etc., as well as the provision of information or education related to end-of-life treatment or medical decision-making, were considered to be ACP programs. However, palliative care provided to terminally ill patients in hospital or at home for pain management and comfort enhancement was not regarded as part of ACP programs. The exclusion criteria were as follows: (1) nonexperimental studies; (2) studies that only provide outcomes of family caregivers or surrogates; (3) studies that include dementia patients; (4) studies for which the full text is unavailable; and (5) studies published in non-peer-reviewed journals.
Study screening and selection
We imported the articles (citations, abstracts, and full texts) identified from umbrella reviews and relevant databases into EndNote X9, a bibliographic manager. After removing duplicates, 4 researchers individually conducted an initial review of the titles and abstracts of the remaining articles for their relevance to the review topic. These 4 researchers were divided into 2 groups: two of them reviewed articles that were identified through the umbrella review search, while the other 2 reviewed articles that had been identified from databases. Studies that potentially or completely met the inclusion criteria were kept, and their full texts were reviewed to decide whether to keep the record in the review. In cases of confusion or the need for exclusion, comments were noted separately. Upon conclusion of the initial review, researchers from each group reviewed the articles that had been screened by the other group and that they had not previously examined. Discrepancies regarding article eligibility were resolved through discussion among researchers to achieve consensus. Subsequently, all researchers reviewed the final set of included studies to ensure agreement.
Data extraction
Four researchers individually extracted information using 3 standardized information forms. The following information was extracted from each included study: (1) characteristics of included studies including author, year, type of ACP intervention, patient-reported outcomes category, name of PROMs, and presence of intervention effects (Supplementary 2); (2) PROMs characteristics including name, target population (size, age, and setting), number of items, subscales, response options, mode of administration (self-reported/researcher-reported), and original language (Supplementary 3); and (3) psychometric properties of PROMs including content validity, structural validity, internal consistency, cross-cultural validity, reliability, measurement error, criterion validity, hypotheses testing for construct validity, and responsiveness (Supplementary 3; Table 1). The ACP intervention outcome category in Supplementary 2 was based on the standardized outcome framework of ACP by McMahan et al. (Reference McMahan, Tellez and Sudore2021): process, action, QOC, and health status.
Notes: ADAS = Advance Directive Attitude Survey; CANHELP = CANadian Healthcare EvaLuation Project questionnaire; CSQ-8 = Client Satisfaction Questionnaire 8; DCS = Decisional Conflict Scale; EORTC IN-PATSAT32 = European Organization for Research and Treatment of Cancer IN-PATient SATisfaction with care questionnaire 32; FACIT-TS-G = Functional Assessment of Chronic Illness Therapy Treatment Satisfaction General; IA = Included Article; LSPQ = Life-Support Preferences Questionnaire; ODCNF = Openness to Discuss Cancer in the Nuclear Family scale; OA = Original Article; P = Process; PA = Process and Action; QOC = quality of care; SHARED = patient experience of SHARED decision making questionnaire. Measurement properties were rated Y (Yes) if conducted, Yes(+) if the information provided was appropriate, and N (No) if not conducted or if information was not provided.
* The lists (number) of original and included articles are provided in Supplementary 2.
To ensure accurate understanding of each included measure, the original development articles were identified using the reference information provided. Any discrepancies arising during information extraction were then discussed among researchers to reach consensus. Finally, all researchers reviewed the final extracted information from the included studies to confirm agreement.
Appraisal of psychometric properties
Psychometric validation data specific to older people who received an ACP program were searched for each measure by reviewing all studies citing the original development article. The psychometric properties of each included measure (i.e., a tool measuring patient-reported ACP program outcomes) were assessed using the COSMIN checklist (Prinsen et al. 2018). Each measure was evaluated against the following COSMIN criteria: content validity, structural validity, internal consistency, cross-cultural validity, reliability, measurement error, criterion validity, hypotheses testing for construct validity, and responsiveness. The quality of each measure was established according to criteria developed by Prinsen et al. (Reference Prinsen, Mokkink and Bouter2018). However, the research team modified the rating system for measurement properties. Under this modified system, a measurement property was rated “Yes” if it had been evaluated, “Yes(+)” if the information provided was appropriate, and “No” if it had not been evaluated or no information was available. For example, internal consistency was rated “Yes(+)” when Cronbach’s alpha was equal to or greater than 0.70, and criterion validity was rated “Yes(+)” when the correlation with the gold standard was equal to or greater than 0.70 (Mokkink et al. Reference Mokkink, Prinsen and Patrick2018; Prinsen et al. Reference Prinsen, Mokkink and Bouter2018). Four members of the review team independently rated all measurement properties, and 1 researcher confirmed the ratings. Any discrepancies were resolved through team discussion until consensus was reached.
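To make the internal consistency example concrete, the following minimal sketch computes Cronbach’s alpha from item-level scores and maps it onto the three-level rating described above. The function names and the toy scores are illustrative assumptions, not part of the review’s procedure; in the review itself, alpha values were taken from published validation studies rather than recomputed.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for a list of items, each a list of scores
    from the same respondents in the same order."""
    k = len(items)
    if k < 2:
        raise ValueError("at least two items are required")
    n = len(items[0])
    if any(len(item) != n for item in items):
        raise ValueError("all items need the same number of respondents")
    # Total score per respondent across all items.
    totals = [sum(item[i] for item in items) for i in range(n)]
    sum_item_var = sum(variance(item) for item in items)
    return k / (k - 1) * (1 - sum_item_var / variance(totals))


def rate_internal_consistency(alpha, threshold=0.70):
    """Map an alpha value onto the modified rating:
    'No' if not reported, 'Yes(+)' if at or above the threshold, else 'Yes'."""
    if alpha is None:
        return "No"
    return "Yes(+)" if alpha >= threshold else "Yes"


# Toy data (hypothetical): two perfectly consistent items, three respondents.
toy_items = [[1, 2, 3], [1, 2, 3]]
alpha = cronbach_alpha(toy_items)
print(rate_internal_consistency(alpha))
```

The same three-level mapping applies to criterion validity, with the correlation against the gold standard in place of alpha.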
Results
Literature search
In total, 19,503 articles (95 from the umbrella reviews and 19,408 from the databases) were initially identified. After removing duplicates, completing preliminary screening, and reviewing full texts, we ultimately included 74 articles (31 from the umbrella reviews and 43 from the databases). The study search and selection process is presented in Figure 1.
Study characteristics and ACP intervention outcomes
Supplementary 2 presents the description of the included ACP intervention studies. The years of publication ranged from 2000 to 2022. The ACP interventions in the 74 studies were categorized into 3 types: 23 were information-based, aiming to influence patients’ thoughts through the provision of ACP information resources; 20 were conversation-based, providing opportunities to discuss the patient’s goals of care and values; and the remaining 31 used a combination of approaches, including education and support for advance directives. In total, 202 PROM outcomes were identified from the 74 ACP intervention studies. Approximately 80% (n = 165) of the outcomes indicated that the ACP intervention had positive or negative effects, while the remaining 20% did not provide specific information about the outcomes. Among the tools measuring the 202 outcomes, 55 (27.2%) were developed by the study researchers themselves, either previously or during the course of their studies, and were specifically tailored to their research. The remaining 147 (72.8%) tools had previously been developed by other researchers.
The 202 identified outcomes were categorized into 4 domains: process, action, QOC, and health status (McMahan et al. Reference McMahan, Tellez and Sudore2021; Sudore et al. Reference Sudore, Heyland and Lum2018). Outcomes in the process category specify how the effect of an ACP intervention occurs. In our study, the included process outcome variables were ACP behavior change, self-efficacy, attitude, knowledge (including cardiopulmonary resuscitation [CPR] and advance directive [AD]), autonomy, and readiness (intention). ACP-specific action outcomes indicate an individual’s completion of specific components of ACP, such as discussing or documenting goals of care or treatment preferences. Some outcomes were measured as a combination of whether actions were taken and ACP behavior such as attitudes and self-efficacy. Therefore, we added the category of “process/action” to the existing 4 domains. Outcomes in the QOC category address the quality of ACP, such as perceived satisfaction with ACP, communication, and decision-making, and congruence between patients and surrogates. Meanwhile, outcomes in the health status category concern the impact of ACP on health outcomes, such as mental status (anxiety, depression, and stress) or quality of life. Among the 202 outcomes of ACP interventions, those corresponding to the QOC category were the largest group (n = 63), followed by process (n = 56), health status (n = 49), action (n = 18), and process and action (n = 16). The outcomes in the categories of QOC, process, and health status accounted for more than 80% of the identified outcomes of ACP interventions.
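The domain shares reported above can be reproduced directly from the counts. The short sketch below does this arithmetic; the variable names are illustrative, and the counts are those given in the text.

```python
# Outcome counts per domain, as reported in the text.
domain_counts = {
    "quality of care": 63,
    "process": 56,
    "health status": 49,
    "action": 18,
    "process and action": 16,
}

total = sum(domain_counts.values())

# Percentage share of each domain, rounded to one decimal place.
shares = {d: round(100 * n / total, 1) for d, n in domain_counts.items()}

# Combined share of the three largest domains.
top_three = sorted(domain_counts.values(), reverse=True)[:3]
top_three_share = round(100 * sum(top_three) / total, 1)

print(total, shares, top_three_share)
```

The counts sum to the 202 identified outcomes, and the three largest domains together account for about 83% of them, consistent with the "more than 80%" statement.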
Characteristics of PROMs
To assess the properties of the PROMs of ACP interventions, a selection process was conducted among the 202 tools measuring outcomes. For the selection, tools measuring outcomes classified under the health status domain were excluded (n = 49). Based on the definition of ACP (Sudore et al. Reference Sudore, Lum and You2017b), as a process supporting and facilitating decision-making that reflects an individual’s values and treatment preferences, variables related to the health status category were deemed to be unlikely to be directly influenced by ACP intervention. Consequently, they may not be the most suitable outcome indicators for evaluating the specific effects of ACP interventions. Moreover, tools were excluded if their names were provided but their measurement properties were not reported, or if it was not possible to find references regarding the tools (n = 29). After removing duplicate tools (n = 38), 86 measures were ultimately selected for psychometric appraisal of PROMs in this study.
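The selection described above is a sequence of subtractions from the 202 outcome tools. A minimal sketch of that bookkeeping follows; the helper `apply_exclusions` is a hypothetical name introduced only for illustration, and the counts are those reported in the text.

```python
def apply_exclusions(start, exclusions):
    """Subtract each exclusion count in order and record the running total."""
    remaining = start
    trail = []
    for reason, n in exclusions:
        if n > remaining:
            raise ValueError(f"cannot exclude {n} from {remaining} ({reason})")
        remaining -= n
        trail.append((reason, remaining))
    return remaining, trail


exclusions = [
    ("health status domain", 49),
    ("measurement properties not reported or reference not found", 29),
    ("duplicate tools", 38),
]

remaining, trail = apply_exclusions(202, exclusions)
for reason, left in trail:
    print(f"after excluding {reason}: {left}")
```

Recording the running total after each step makes it easy to check a flow diagram against the narrative counts.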
The characteristics of the 86 PROMs are presented in Supplementary 3. Approximately 71% of the PROMs (61 out of 86) measured the process and action categories either individually or in combination. We analyzed PROMs that had been used in studies either targeting older people or in which more than half of the participants were older adults. The tools were developed between 1979 and 2021. This review identified 15 tools that were used repeatedly across the included articles, with usage frequencies ranging from 2 to 14 times. The original languages of the 86 PROMs were English (n = 61), Taiwanese (n = 12), Japanese (n = 3), Korean (n = 3), Chinese (n = 2), and Dutch (n = 1). Four PROMs were developed not only in English but also in French (Heyland et al. Reference Heyland, Jiang and Day2013; Légaré et al. Reference Légaré, Kearing and Clay2010) or Chinese (Hinderer and Lee Reference Hinderer and Lee2019; Lee et al. Reference Lee, Hinderer and Friedmann2015). Most of the PROMs were self-administered assessment tools (n = 67 out of 86). The selected PROMs were used with older people in diverse settings, including inpatients and outpatients from hospitals and clinics, as well as people attending community centers. The mean age of participants across studies ranged from 39 to 87.5 years. The number of items per PROM ranged from 1 to 82. Response scales included options such as multiple choice, yes/no, true/false, agree/disagree, numeric rating scales, and Likert scales (4, 5, 6, and 11 points). Moreover, a variety of response scales were used to indicate preferences, such as who the respondent wishes the decision-maker to be, the extent of medical treatment desired, the amount of desired information, and the level of preparedness for ACP communication.
Appraisal of PROMs psychometric properties
The 9 psychometric properties (content validity, structural validity, internal consistency, cross-cultural validity, reliability, measurement error, criterion validity, hypotheses testing for construct validity, and responsiveness) of the selected 86 PROMs were assessed according to the COSMIN criteria. Supplementary 3 presents an overview of the psychometric properties of the 86 PROMs. Approximately 78% (n = 67) of the 86 PROMs did not meet half or more of the 9 COSMIN criteria for measurement properties; 14 measures had no psychometric validation data documented. Therefore, we selected 29 PROMs that met 4 or more of the 9 COSMIN criteria, and their properties are specified in Table 1. The most commonly addressed COSMIN criteria were internal consistency (n = 29 measures), hypotheses testing for construct validity (n = 29 measures), and content validity (n = 27 measures). By contrast, there was limited evidence regarding measurement error, cross-cultural validity, and reliability. None of the 29 tools presented evidence of measurement error. Data on cross-cultural validity were unavailable for 22 tools, and data on reliability were absent for 15 tools. Among the 29 PROMs, the Decisional Conflict Scale (DCS) was originally developed in English (O’Connor Reference O’Connor1995) and had translated versions in languages such as French (Ferron Parayre et al. Reference Ferron Parayre, Labrecque and Rousseau2014; Légaré et al. Reference Légaré, Kearing and Clay2010) and Korean (Kim et al. Reference Kim, Kim and Hong2017). The Advance Directive Attitude Survey (ADAS) was also originally developed in English (Nolan and Bruder Reference Nolan and Bruder1997) and had translated versions in Spanish (Nolan and Bruder Reference Nolan and Bruder1997), Chinese (Hinderer and Lee Reference Hinderer and Lee2019), and Korean (Joung Reference Joung2012). The DCS had evidence of cross-cultural validity for the English and French versions, whereas there was no such evidence for the ADAS. Evidence for cross-cultural validity was present for 7 of the 29 PROMs (Bekker et al. Reference Bekker, Légaré and Walker2012; Engelberg et al. Reference Engelberg, Downey and Curtis2006; Ferron Parayre et al. Reference Ferron Parayre, Labrecque and Rousseau2014; Fried et al. Reference Fried, Redding and Robbins2010; Heyland et al. Reference Heyland, Cook and Rocker2010, Reference Heyland, Jiang and Day2013; Larsen et al. Reference Larsen, Attkisson and Hargreaves1979; Légaré et al. Reference Légaré, Kearing and Clay2010; O’Connor Reference O’Connor1995). None had evidence for all 9 psychometric properties, while 11 PROMs had evidence for 7 of the 9 properties. These 11 PROMs comprised the ACP Engagement Survey (versions with 4, 9, 15, 34, 55, and 82 items) (Sudore et al. Reference Sudore, Heyland and Barnes2017a), the Japanese version of the ACP Engagement Survey (with 15 items) (Okada et al. Reference Okada, Takenouchi and Okuhara2021), the Canadian Health Care Evaluation Project Questionnaire (CANHELP) (Heyland et al. Reference Heyland, Cook and Rocker2010, Reference Heyland, Jiang and Day2013), the Openness to Discuss Cancer in the Nuclear Family scale (ODCNF) (Mesters et al. Reference Mesters, Van Den Borne and McCormick1997), the quality of communication (QOC) questionnaire (Engelberg et al. Reference Engelberg, Downey and Curtis2006), and the SURE test (Légaré et al. Reference Légaré, Kearing and Clay2010).
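The selection rule used in this appraisal (4 or more of the 9 properties with evidence) can be sketched as a simple tally. In the sketch below, the instrument names and their ratings are hypothetical placeholders, and treating any rating other than “No” as evidence is an assumption about how the counts were formed, not something the review states explicitly.

```python
PROPERTIES = [
    "content validity",
    "structural validity",
    "internal consistency",
    "cross-cultural validity",
    "reliability",
    "measurement error",
    "criterion validity",
    "hypotheses testing for construct validity",
    "responsiveness",
]


def count_evidence(ratings):
    """Number of properties rated anything other than 'No' (assumption)."""
    return sum(1 for p in PROPERTIES if ratings.get(p, "No") != "No")


def select_measures(all_ratings, minimum=4):
    """Names of measures with evidence for at least `minimum` properties."""
    return [name for name, r in all_ratings.items() if count_evidence(r) >= minimum]


# Hypothetical instruments, for illustration only.
prom_a = {p: "Yes(+)" for p in PROPERTIES}
prom_a["measurement error"] = "No"
prom_a["cross-cultural validity"] = "No"
prom_b = {"internal consistency": "Yes(+)", "content validity": "Yes"}

example = {"PROM A": prom_a, "PROM B": prom_b}
selected = select_measures(example)
coverage_a = round(100 * count_evidence(prom_a) / len(PROPERTIES))
print(selected, coverage_a)
```

Under this tally, an instrument with evidence for 7 of the 9 properties has a coverage of about 78%, and one with 6 of 9 has about 67%.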
Discussion
This review has provided an overview of the PROMs used to assess the impact of ACP interventions on older adults. While the categorization of the identified PROMs into 4 domains aligned with previous research (McMahan et al. Reference McMahan, Tellez and Sudore2021; Sudore et al. Reference Sudore, Heyland and Lum2018), a wide range of instruments were employed, thus suggesting a lack of consensus on the most suitable PROM for this area and the existence of potential challenges in integrating ACP outcomes. Several factors may contribute to this heterogeneity. First, although studies on ACP have been emerging since the 1990s, a consistent definition of the concept only recently gained consensus among scholars (Rietjens et al. Reference Rietjens, Sudore and Connolly2017; Sudore et al. Reference Sudore, Lum and You2017b). Second, some studies did not use ACP-specific measures and employed generic PROMs instead. Notably, most health status/utilization outcomes relied on generic PROMs, which led the present review to exclude them from further analyses of psychometric properties in order to focus on ACP-specific instruments.
This review also assessed the psychometric properties of instruments used to measure the outcomes of ACP interventions. None of the ACP-specific PROMs fully met the COSMIN quality standards for measurement development. No measurement error was reported for any PROM, and evidence for several other key properties was limited. While the COSMIN checklist could not be strictly applied to all included PROMs, fewer than half met at least 50% of its criteria. This suggests that many PROMs used in ACP research might not be optimal, which could potentially lead to inaccurate conclusions about intervention effectiveness due to insufficient data quality. Therefore, when evaluating the effectiveness of ACP interventions in future research, it is imperative to first assess whether the psychometric properties of the PROMs are adequate for measuring such outcomes.
However, it must be acknowledged that the repeated use of certain PROMs across diverse populations, timeframes, and studies can contribute to accumulating evidence for their validity, even in the absence of complete formal psychometric evaluations. This ongoing application in various contexts over time strengthens the validity argument.
Eighteen PROMs, specifically those focusing on the process domain or both the process and action domains, scored 4 or more points out of 9 on the COSMIN checklist. Among these, the ACP Engagement Survey (Sudore et al. Reference Sudore, Heyland and Barnes2017a) achieved higher COSMIN scores than the others. Notably, the ACP Engagement Survey takes a broader perspective by not only capturing the conceptualization of ACP as a process but also examining participants’ action changes following interventions. Furthermore, translations and adaptations into various languages including Chinese (Wei et al. Reference Wei, Hsu and Wu2022), Dutch (van der Smissen et al. Reference van der Smissen, van der Heide and Sudore2021), Japanese (Okada et al. Reference Okada, Takenouchi and Okuhara2021), and Spanish (Sudore et al. Reference Sudore, Heyland and Barnes2017a) highlight its potential for diverse applications across populations. The review’s identification of the 4-item version as frequently used further underscores its potential value for concise and efficient data collection in research and clinical settings.
Among the PROMs evaluated in this review for QOC assessment, 11 scored 4 or more points out of 9 on the COSMIN checklist. Interestingly, 8 of these PROMs were generic satisfaction measures. Four of the PROMs demonstrated higher utilization frequencies: the QOC questionnaire (Engelberg et al. Reference Engelberg, Downey and Curtis2006), the DCS (O’Connor Reference O’Connor1995) including the SURE test (Légaré et al. Reference Légaré, Kearing and Clay2010), and the Life-Support Preferences Questionnaire (LSPQ) (Coppola et al. Reference Coppola, Bookwala and Ditto1999). The QOC questionnaire assesses patient satisfaction with communication regarding end-of-life care with clinicians and demonstrated evidence for 78% (7 points) of the COSMIN criteria. Both the DCS and the SURE test assess patient satisfaction with their own value-based decision-making, with the SURE test being a shorter version of the DCS. While the DCS was initially developed as a generic measure, its frequent use in ACP research reflects its applicability in evaluating patient decision-making based on personal values (O’Connor Reference O’Connor1993; 2010), i.e., the key outcome of ACP. Both measures demonstrated evidence for 67% (6 points) of the COSMIN criteria in this review. This review included the Korean version of the DCS (Kim et al. Reference Kim, Kim and Hong2017), and several language versions of the DCS have been developed (O’Connor Reference O’Connor1993; 2010). The LSPQ (Coppola et al. Reference Coppola, Bookwala and Ditto1999), which is specifically designed to measure the outcome of ACP interventions, is frequently used to assess the congruence of treatment preferences between patients and their surrogates (Chan et al. Reference Chan, Ng and Chan2018; Ditto et al. Reference Ditto, Danks and Smucker2001; Jo et al. Reference Jo, Park and Park2021; Ke et al. Reference Ke, Hu and Chen2021; Song et al. Reference Song, Donovan and Piraino2010, Reference Song, Ward and Happ2009). However, it exhibited little evidence of meeting the COSMIN criteria. PROMs employing hypothetical vignettes, such as the LSPQ, may not fully align with the COSMIN criteria for psychometric properties like structural validity, cross-cultural validity, measurement error, or criterion validity.
Of the 86 PROMs included in this review, approximately 51% demonstrated limited or no evidence for meeting COSMIN criteria (i.e., scoring less than 4 points). These included all PROMs within the action domain, which were also not used consistently across studies. Notably, the PROMs in this domain primarily serve as tools that patients can use to express preferences or make end-of-life care decisions. In ACP research, action outcomes are typically assessed through the presence of documented forms (e.g., goals of care, advance directives, physician orders for life-sustaining treatment) in patient charts, rather than relying on PROMs (Jimenez et al. Reference Jimenez, Tan and Virk2018; Park et al. Reference Park, Jo and Park2021). This suggests that the PROMs reviewed within the action domain may be perceived more as documents than as measurement instruments, which could potentially explain the scarcity of validation studies confirming their psychometric properties. Consequently, to evaluate the action outcomes of ACP interventions, utilizing existing standardized advance directive forms specific to each region might be more effective than developing new measurement tools.
This review offers valuable insights into selecting appropriate PROMs for assessing ACP outcomes in older adults based on their psychometric properties. The review also identifies key considerations for researchers and clinicians to ensure effective data collection in this population. First, the self-reported nature of PROMs, while advantageous in capturing patient perspectives, may pose challenges for older adults. The finding that 22% of included studies (n = 19 out of 86) involved researcher assistance, often in the form of interviews, suggests that self-reporting may be difficult for some older adults. The frequent use of shortened versions of PROMs further supports this notion. These challenges likely stem from broader functional decline, which can manifest as fatigue or difficulty reading and comprehending lengthy questionnaires. Therefore, researchers and clinicians could consider using shortened versions of PROMs and providing assistance during data collection to mitigate these difficulties and enhance participation rates among older adults. Second, this review underscores the need for further validation of most included PROMs in the context of assessing ACP intervention outcomes in older adults. This highlights the importance of using rigorous psychometric evaluation methods to ensure the reliability and validity of collected data. Researchers are encouraged to adhere to established standards such as the COSMIN checklist (Mokkink et al. Reference Mokkink, Terwee and Patrick2010) when selecting PROMs for use in ACP research, which will ultimately lead to more robust and meaningful data for evaluating intervention effectiveness in this population.
This review has certain limitations that could have affected its findings. First, the review’s reliance on references provided by included articles may have excluded relevant validation studies, thus leading to a potential underestimation of the psychometric properties of some measures. Second, the focus on the elderly population was compromised by the use of average age across studies, which means that some studies may have included participants outside the target demographic. This raises concerns about the suitability of some measures as specific tools for older adults. Despite these limitations, this review provides valuable insights into PROMs for assessing ACP outcomes in older adults. Notably, it identifies several adequately validated PROMs based on the COSMIN checklist, including the ACP Engagement Survey, QOC, and DCS. However, the review also highlights the need for further validation studies of many previously used PROMs that specifically consider the unique conditions faced by older adult participants.
Supplementary material
The supplementary material for this article can be found at https://doi.org/10.1017/S1478951524002062.
Acknowledgments
We would like to acknowledge NaJin Kim, the professional librarian at the Medical Library of the Catholic University of Korea, for her expert support in literature searching, which was essential for the successful data extraction in this review. We also wish to acknowledge the financial support of the Catholic Medical Center Research Foundation made in the program year of 2022.
Competing interests
The authors declared no competing interests.
Ethical approval
This study was granted exemption by the Institutional Review Board of the Catholic University of Korea (No: MC22ZISI0083).