Introduction
To live independently at home, individuals need to be able to complete everyday instrumental activities of daily living (IADL), including preparing meals, managing money, and taking medications. Due to the aging of the population, the number of individuals unable to live independently in their homes because of cognitive impairments is rising rapidly. Consequently, neuropsychologists are increasingly being asked to answer questions regarding the effects of cognitive deficits on everyday functioning (Marcotte, Scott, Kamat, & Heaton, 2010). Currently, however, there is no “gold standard” for measuring everyday functional abilities. In addition, the approach of using neuropsychological tests originally designed for other purposes (e.g., documentation of brain injury severity, localization of brain pathology) to predict everyday functioning has been questioned (Goldstein, 1996).
Self-report, informant-report, and performance-based measures have commonly been used in the literature as a proxy for real-world functioning. Each method has distinct advantages and disadvantages. For example, while self-report and informant-report questionnaires are easy to administer and may give a reasonably accurate representation of real-world performance given the opportunity for multiple observations, they are subject to reporter bias (Bertrand & Willis, 1999; Dassel & Schmitt, 2008; Richardson, Nadler, & Malloy, 1995). Similarly, while performance-based measures are objective, quantifiable, repeatable, and norm-referenced, the role of the environment is typically subtracted and performance on such measures can fluctuate with motivation, cognition, and behavior (Marson & Hebert, 2006; Myers, Holliday, Harvey, & Hutchinson, 1993; Zimmerman & Magaziner, 1994). Two types of performance-based tests have been developed. Paper and pencil tasks assess everyday cognition or everyday problem-solving by presenting cognitively challenging real-world problems (e.g., Everyday Problems Test [EPT]; Willis & Marsiske, 1993; Everyday Cognition Battery; Allaire & Marsiske, 1999, 2002), while behavioral simulation tasks require individuals to complete everyday tasks within a laboratory environment (e.g., write a check, look up a phone number; Revised Observed Tasks of Daily Living [OTDL-R], Diehl et al., 2005; Texas Functional Living Scale, Cullum et al., 2001).
Currently, the question of which method for assessing functional performance most closely approximates actual behavior in the real world remains debatable (Farias, Harrell, Neumann, & Houtz, 2003).
It has been argued that direct observation of individuals in the everyday environment will provide the most valid determination of everyday functional status (Marcotte et al., 2010), as it would allow for observation of behaviors with increased subtlety and across extended periods of time. New technologies, which use sensors and machine learning algorithms to identify and track activities of daily living continuously within the home environment, are being developed (Hayes et al., 2008; Rashidi, Cook, Holder, & Schmitter-Edgecombe, 2011). In this study, we report on behavioral observation data collected while healthy older adults completed activities of daily living within a campus apartment. The apartment had a kitchen, dining room, and living room on the downstairs level. The behaviors that were observed included eight highly scripted complex everyday activities (e.g., locate and fill a medication dispenser; prepare a cup of noodle soup) that were each performed once by the participant. While the on-campus apartment presented a unique opportunity to collect direct behavioral observation data within an everyday environment, participants were not tested in their own homes nor were repeated measurements of the same everyday activities collected.
One goal of the present study was to examine the relationships between several methods for assessing functional status in a healthy older adult population, including direct behavioral observation, self-report, and performance-based measures. Prior research suggests that self-report measures are generally accurate indicators of daily functioning for older adults who demonstrate insight into their functional abilities (Alexander et al., 2000; Farias, Mungas, & Jagust, 2005). The performance-based tasks included a measure of everyday problem-solving (i.e., EPT) and a behavioral simulation measure (i.e., OTDL-R). Only a few studies have directly compared more than one method for assessing functional status. These studies have generally found that different methods for assessing functional status do not correlate highly with each other and further can provide different estimates of an individual's ability to perform everyday activities (e.g., Burton et al., 2009; Jefferson et al., 2008; Loewenstein et al., 2001). Given the significant implications that poor functional status can have for independent living, it is important to better understand the relationship amongst these varied methods used to assess functional status and, ultimately, to understand the ability of such measures to serve as a surrogate for real-world, everyday functioning.
A second goal of the study was to examine which cognitive correlates, as measured by neuropsychological tests, are most predictive of quality of everyday activity completion across the different methods used. Understanding the cognitive determinants of performance on these proxy measures of everyday functioning may improve our knowledge of what aspects of everyday performance these diverse measures of functional status are capturing. Prior research suggests that there are many cognitive factors that may contribute to functional impairment in older adult populations, including global cognitive functioning (Farias et al., 2003; Inzarti & Basile, 2003), memory (Farias et al., 2006; McCue, Rogers, & Goldstein, 1990), processing speed (Tuokko, Morris, & Ebert, 2005), visuoperceptual abilities (Glosser et al., 2002; Jefferson, Barakat, Giovannetti, Paul, & Glosser, 2006), and executive functioning (Bell-McGinty, Podell, Franzen, Baird, & Williams, 2002; Cahn-Weiner, Malloy, Boyle, Marran, & Salloway, 2000; Lewis & Miller, 2007). Measures of executive functioning have, however, emerged across studies as the strongest and most consistent specific cognitive predictor of everyday functional status (Cahn-Weiner et al., 2007; Marcotte et al., 2010).
Numerous studies have also provided evidence for non-cognitive risk factors of functional impairment in older adult populations, including advanced age, lower education, depressed mood, and comorbid health conditions (e.g., knee replacement). For example, depression has been found to be a significant predictor of functional disability in some studies (Cho et al., 1998; Guccione et al., 1994) but not others (Galanos, Fillenbaum, Cohen, & Burchett, 1994). Therefore, when examining the relationship between cognition and functional status, it is important to understand how much variance in functional status can be attributed to cognition independent of major non-cognitive covariates.
In this study, healthy community-dwelling older adults completed self-report and performance-based measures of everyday functioning in a laboratory environment. Observational assessment of participants' ability to complete everyday activities was conducted in a university campus apartment. Given prior research findings (e.g., Jefferson et al., 2008; Rueda & Schmitter-Edgecombe, 2011), we did not expect strong correlations between the self-report and performance-based measures. We were especially interested in how performance on the direct observation measure would relate to both the self-report and performance-based measures, which are often used as proxy measures for everyday functional status. Based on prior research (e.g., Cahn-Weiner et al., 2000), we also hypothesized that, after controlling for demographic variables, the neuropsychological measure of executive functioning would emerge as the strongest predictor of quality of everyday activity completion in cognitively healthy older adults.
Method
Participants and Procedure
Participants were 88 community-dwelling healthy older adults. To assess whether limitations in functional status are more common in individuals age 75+ (Lafortune & Balestat, 2007) rather than also being apparent in the young-old, we initially differentiated the healthy older adults into three age groups: middle-aged (N = 22; age 50–59 years; 18 F, 4 M), young-old (N = 44; age 60–74 years; 29 F, 15 M), and old-old (N = 22; age 75–86 years; 16 F, 6 M). Exclusionary criteria included history of head trauma with permanent brain lesion, history of cerebrovascular accidents, current or recent (past year) psychoactive substance abuse, known medical, neurological, or psychiatric causes of cognitive dysfunction (e.g., schizophrenia, epilepsy), and self- or knowledgeable informant report of significant memory complaints or changes in cognitive ability across the past months to years. Initial screening for cognitively healthy older adult participants was conducted over the phone and included: (a) a medical interview to rule out exclusion criteria, (b) the Telephone Interview for Cognitive Status (TICS) to exclude participants who scored below 27 (equivalent of an MMSE of 24) on a measure of global cognitive functioning (Brandt & Folstein, 2003), and (c) the Clinical Dementia Rating instrument (CDR) to rule out cognitive impairment suggestive of questionable dementia (i.e., CDR > 0; Hughes, Berg, Danzinger, Coben, & Martin, 1982; Morris, 1993). Healthy older adult participants were recruited through community advertisements, community health and wellness fairs, and presentations to community groups consisting of large numbers of seniors.
Participants who met initial screening criteria completed a laboratory battery of standardized and experimental neuropsychological tests (approximately 3 hr). To rule out participants who might be experiencing mild cognitive impairment, participants were psychometrically determined to be cognitively healthy if they performed within 1.5 standard deviations of age-corrected and education-corrected (when available) standardized test norms on the cognitive tests in Table 1. These participants then completed a variety of complex activities of daily living within an apartment located on the Washington State University campus (e.g., sweep the floor, water plants). This evaluation took approximately 3 hours. Participants were given a report documenting their performance on the neuropsychological tests, as well as pre-paid parking passes, as compensation for their time. Participants who traveled to the laboratory from outside Whitman or Latah County were also provided a $50 voucher for travel reimbursement. This protocol was reviewed and approved by the Institutional Review Board at WSU.
Table 1 Demographic data and mean summary data for middle-aged, young-old, and old-old groups

Note. Unless otherwise indicated, mean scores are raw scores. Norm sources for the cognitive tests are in parentheses following the test. GDS = Geriatric Depression Scale; TICS = Telephone Interview for Cognitive Status (Brandt & Folstein, 2003); SDMT = Symbol Digit Modalities Test (Smith, 1991); MAS = Memory Assessment Scale (Williams, 1991); BNT = Boston Naming Test (Ivnik, Malec, Smith, Tangalos, & Petersen, 1996); WAIS-III L-N Seq. = Letter–Number Sequencing subtest of the Wechsler Adult Intelligence Scale—Third Edition (Wechsler, 1997); D-KEFS = Delis-Kaplan Executive Function System (Delis, Kaplan, & Kramer, 2001).
aSignificant difference compared with middle-aged group, p < .05.
bSignificant difference compared with young-old group, p < .05.
Measures
Non-cognitive risk factors
Non-cognitive risk factors included participants’ age, level of education, and number of depressive symptoms. The total number of symptoms of depression reported on the Geriatric Depression Scale – Short Form (Yesavage et al., 1983) was used as the measure of depressive symptomatology.
Cognitive test variables
The cognitive predictor variables were derived from standardized neuropsychological tests administered during the laboratory assessment and represent the following cognitive constructs: global cognitive status, processing speed, memory, visuoperceptual abilities, and executive functioning. A brief description of each task can be found in Appendix A; below we indicate the score from each task that was used as the cognitive predictor variable.
Telephone Interview for Cognitive Status (TICS; Brandt & Folstein, 2003)
The total score from this mental status exam was used as the measure of global cognitive functioning.
Symbol Digit Modalities Test (SDMT; Smith, 1991)
To reduce the impact that health-related variables, such as arthritis, might have on the data, the total score from the oral version of the SDMT was used as the measure of processing speed.
Memory Assessment Scale (MAS) List Learning subtest (Williams, 1991)
The number of words (out of 12) recalled at the long delay was used as the measure of memory.
Clox 1&2 (Royall, Cordes, & Polk, 1998)
Total score on the copy subtask of the clock (Clox 2) was used as the measure of visuoperceptual abilities.
Delis-Kaplan Executive Function System (D-KEFS) Letter Fluency subtest (Delis, Kaplan, & Kramer, 2001)
The total number of correct words produced for the letters F, A, and S was used as the measure of executive functioning.
Functional status measures
The domains of IADLs that were assessed across the self-report, performance-based, and direct observation measures were largely consistent (e.g., medication management, meal preparation). For the purposes of this study, these four different types of measures, which have been used in the literature as proxies for real-world functioning, are labeled collectively as functional status measures.
Self-report IADL
Participants completed the self-report Lawton-Brody IADL scale (Lawton, Moss, Fulcomer, & Kleban, 1982), indexing nine IADL domains: using the phone, traveling, shopping, preparing meals, doing housework, handyman work, laundry, medication management, and financial management. Participants rated their skill level (capacity) for each IADL domain using a Likert scale, ranging from 1 (completely unable to do) to 3 (can complete without help). The items were summed to provide a total score.
The Revised Observed Test of Daily Living
The OTDL-R (Diehl et al., 2005) is a performance-based measure of everyday competency. Participants are presented with problem scenarios and real-life materials (e.g., medicine bottles) and asked to perform the necessary steps to find the correct answer to questions presented on cards (e.g., look at three medicine bottle labels and indicate how many days a specific refill will last). The OTDL-R includes a total of nine IADL tasks (28 items, maximum score = 28), three each in the domains of medication use, telephone use, and financial management.
The Everyday Problems Test
The EPT (Willis & Marsiske, 1993) is a paper and pencil performance-based measure of everyday cognitive competence. Participants are required to solve problems of daily living using printed real-life stimuli drawn from seven domains of functioning (e.g., examine a chart of taxi rates and choose, from four responses, how much they would have to pay to travel one mile in a suburban area). For the purposes of this study, four domains were assessed (i.e., shopping, transportation, household, meal preparation), as the remaining three domains were covered by the OTDL-R. The items were summed to obtain a total score (maximum score = 24), with each correct item receiving a score of 1.
Direct observation of everyday activity completion
Participants completed eight activities of daily living within a campus apartment (e.g., filling a 7-day medication dispenser with three types of medications). See Appendix A for a more detailed description of each of the activities. Before beginning each activity, experimenters provided brief verbal instructions and asked that all materials be returned to their original positions after task completion. Participants then carried out each activity using the materials provided to them within the apartment. As the participant completed the activities, two experimenters observed the participant and coded the actions based on the sequence and accuracy of the steps completed. The experimenters also recorded extraneous participant actions (e.g., searching for items in wrong locations). Each activity was coded for six different types of errors: critical omissions, critical substitutions, non-critical omissions, non-critical substitutions, irrelevant actions and inefficient actions. Table 2 provides detailed code assignment information for each error as well as the scoring rubric used to derive an overall score for each activity and scoring reliability. The overall score for each of the eight activities was summed to derive the direct observation score.
Table 2 Coding schema used to derive the direct observation score and scoring reliability

Results
Demographic and Neuropsychological Data
Table 1 shows the demographic and neuropsychological testing data by age group. One-way analyses of variance (ANOVAs) followed by post hoc contrasts using Tukey's HSD (honestly significant difference) revealed that the old-old performed more poorly than the young-old on a test measuring working memory/executive abilities, F(2,85) = 3.73; p < .05, and more poorly than both the middle-aged and young-old on the written, F(2,85) = 11.30, p < .001, and oral, F(2,85) = 14.84, p < .001, trials of a test measuring attention and speeded processing (see Table 1). The age groups did not differ significantly in level of education, self-report of depressive symptoms, or their performances on the remainder of neuropsychological measures displayed in Table 1.
Functional Status Measures
One-way ANOVAs followed by post hoc contrasts were then used to examine whether the mean scores for the middle-aged, young-old, and old-old differed across the four measures of functional status (see Footnote 1). Because there were no gender differences in performance on the functional status measures, t's < 1.10, p's > .30, sex was not included as a variable in the analyses. Table 3 displays the means and standard deviations for each functional status measure by age group. Significant group differences were found for all functional status measures, F's > 3.70, p's < .05. As seen in Table 3, the old-old group showed poorer performance than the middle-aged group on all functional status measures, including self-report IADL (d = .92; see Footnote 2), OTDL-R (d = 1.31), EPT (d = .74), and direct observation (d = 1.72). The old-old group also differed from the young-old group on the self-report IADL (d = .86), OTDL-R (d = .76), and direct observation (d = .84) measures. In addition, the direct observation score of the young-old group was poorer than that of the middle-aged group (d = .80). These data indicate that all four proxy measures of functional status revealed greater functional limitations in the old-old group.
Table 3 Mean summary data for middle-aged, young-old, and old-old groups on the functional status measures

Note. Mean scores are raw scores. IADL = Instrumental Activities of Daily Living; EPT = Everyday Problems Test; OTDL-R = Revised Observed Test of Daily Living. †n = 21; ‡n = 18; §n = 39; ‡‡n = 43.
aSignificant difference compared with middle-aged group, p < .05.
bSignificant difference compared with young-old group, p < .05.
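The effect sizes (d) reported for the group contrasts are standardized mean differences. As a minimal sketch, assuming the conventional pooled-standard-deviation form of Cohen's d (the text here does not state which variant was used), the computation for two groups could look like the following. The function name and the example scores are hypothetical and are not study data.

```python
import math
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Standardized mean difference using the pooled sample SD.

    Assumption: pooled-SD variant of Cohen's d; the article may have
    used a different variant.
    """
    n_a, n_b = len(group_a), len(group_b)
    # Pool the two sample variances, weighting by degrees of freedom.
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var)


# Hypothetical scores for two small groups (illustration only).
example_d = cohens_d([2, 4, 6], [1, 3, 5])
```

With these made-up scores the group means differ by 1 point and the pooled SD is 2, so the sketch yields d = 0.5. The sign of d depends only on which group is passed first.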
Intercorrelations between the Measures of Functional Status
Table 4 shows the intercorrelations amongst the four measures of functional status with age partialled out. No significant correlations were found between the self-report IADL and two performance-based measures (i.e., OTDL-R and EPT). The correlation between the OTDL-R and EPT just failed to reach statistical significance (r = .21; p = .066). The direct observation measure correlated with both the self-report IADL (r = −.31) and EPT (r = −.49), indicating that those with poorer direct observation scores tended to self-report more everyday difficulties and performed more poorly on the EPT.
Table 4 Correlation matrix for the four functional status measures (after controlling for age)

Note. IADL = Instrumental Activities of Daily Living; EPT = Everyday Problems Test; OTDL-R = Revised Observed Test of Daily Living.
*p < .005; ** p < .001.
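The correlations in Table 4 have age partialled out. One common way to obtain a first-order partial correlation is from the three pairwise Pearson correlations; the sketch below assumes that approach (the article does not describe its exact computation), and all names and data are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy / math.sqrt(s_xx * s_yy)


def partial_r(x, y, covariate):
    """Correlation between x and y with one covariate (e.g., age) partialled out."""
    r_xy = pearson_r(x, y)
    r_xz = pearson_r(x, covariate)
    r_yz = pearson_r(y, covariate)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Hypothetical data: two measures that are related only through "age".
age = [1, 2, 3, 4]
measure_a = [2, 1, 2, 5]
measure_b = [0, 5, 0, 5]
raw_r = pearson_r(measure_a, measure_b)
adjusted_r = partial_r(measure_a, measure_b, age)
```

In this constructed example the two measures correlate positively in the raw data, but the association vanishes (up to floating-point error) once the shared covariate is removed, which is the situation that partialling out age is meant to guard against.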
Regression Analyses: Cognitive Determinants
Hierarchical regression analyses were completed to identify cognitive determinants of everyday functional status after controlling for non-cognitive risk factors. To reduce the number of predictor variables included in the regression analyses, we first examined the correlations between the four measures of functional status and the non-cognitive risk factors. All functional outcome measures correlated significantly with age but not with education or symptoms of depression (see Table 5). Consequently, only age was entered as a non-cognitive risk factor in the first block of the hierarchical regression. In block 2, we entered the cognitive predictors derived from the neuropsychological testing data. The cognitive predictors were chosen to represent different domains of cognitive abilities that have been found to be predictive of functional status in prior studies: TICS total score (global cognitive functioning), MAS delayed total recalled (memory), SDMT oral total correct (processing speed), Clox 2: copy total score (visuoperceptual abilities), and D-KEFS letter fluency (executive functioning). The cognitive predictors were entered simultaneously in the second step to determine whether they held any unique predictive value. There was no significant multicollinearity amongst the five cognitive predictor variables, as the Variance Inflation Factors for each variable were less than 1.6. Correlations amongst the predictor and criterion variables can be found in Table 5.
Table 5 Correlations between the functional status measures and Non-Cognitive Risk Factors and Neuropsychological Testing Data.

Note. Total correct raw score was used for all neuropsychological measures. IADL = Instrumental Activities of Daily Living; EPT = Everyday Problems Test; OTDL-R = Revised Observed Test of Daily Living; GDS = Geriatric Depression Scale; TICS = Telephone Interview for Cognitive Status; SDMT = Symbol Digit Modalities Test; MAS = Memory Assessment Scale; D-KEFS = Delis-Kaplan Executive Function System.
*p < .05; **p < .005.
The cognitive predictors (variance accounted for represented by ΔR²) were found to account for significant variance over and above that accounted for by age for only the performance-based OTDL-R task [ΔR² = .17; ΔF(5,75) = 4.41; p = .001; total R² = .42]. For the OTDL-R, the SDMT-oral was a unique cognitive predictor, B = .31, t = 2.83, p = .006. The neuropsychological measures did not account for significant variance over and above age for the self-report IADL [ΔR² = .08; ΔF(5,76) = 1.36; p = .25; total R² = .13], EPT [ΔR² = .06; ΔF(5,68) = 1.06; p = .39; total R² = .18], or direct observation score [ΔR² = .07; ΔF(5,77) = 1.60; p = .17; total R² = .35]. In addition, there were no significant unique cognitive predictors for the self-report IADL, t's < 1.78, p's > .08, EPT, t's < 1.55, p's > .12, or direct observation, t's < 1.60, p's > .10, measures. These findings suggest that while the cognitive variables explained a unique amount of the variance in the performance-based OTDL-R measure, the cognitive predictors did not explain unique variance above that of age for the self-report IADL, EPT, and direct observation measures (see Footnote 3).
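The F-change statistic for a block of predictors added in a hierarchical regression follows directly from the increment in R², the total R² of the full model, the number of predictors added, and the residual degrees of freedom. The sketch below uses the standard F-change formula; the function name is hypothetical, and the inputs are the rounded OTDL-R values reported in the text, so the result only approximates the reported statistic.

```python
def f_change(r2_full, delta_r2, k_added, df_resid):
    """F statistic for the R-squared increment of a block of predictors.

    r2_full  : R-squared of the model containing all predictors
    delta_r2 : increase in R-squared when the block was added
    k_added  : number of predictors in the added block
    df_resid : residual degrees of freedom of the full model
    """
    return (delta_r2 / k_added) / ((1.0 - r2_full) / df_resid)


# OTDL-R block of five cognitive predictors, using the rounded values
# reported in the text: delta R-squared = .17, total R-squared = .42, df = (5, 75).
otdl_f = f_change(r2_full=0.42, delta_r2=0.17, k_added=5, df_resid=75)
```

Plugging in the rounded values gives an F-change of roughly 4.40, close to the reported 4.41; the small gap is what one would expect from rounding R² to two decimals before the calculation.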
Regression Analyses: Functional Status Proxy Measures
Given that the cognitive determinants did not predict performance on the direct observation measure, we were interested in whether the self-report and performance-based measures, which are considered proxy measures for everyday functioning, would contribute unique variance in predicting the direct observation score. Age was again entered in the first step of the hierarchical regression, followed by the functional status measures (self-report IADL, OTDL-R, EPT) in the second step. The results of this regression analysis revealed that the proxy measures of everyday functioning accounted for significant variance over and above that accounted for by age [ΔR² = .21; ΔF(3,72) = 9.84; p < .001], with the proxy measures and age together accounting for 47% of the variance in the direct observation measure (R² = .47). In addition, both the EPT, B = −.37, t = −4.07, p < .001, and self-report IADL, B = −.25, t = −2.94, p = .004, were unique predictors. If we make the assumption that the direct observation measure is the better proxy for everyday functional performance, the data suggest that the everyday problem-solving measure and the self-report IADL measure are better at predicting everyday functioning in this group of cognitively healthy older adults than the cognitive predictors and the laboratory behavioral simulation measure (i.e., the OTDL-R).
In further support, when we looked at the ability of the self-report IADL, EPT, and direct observation scores to predict OTDL-R performance over and above age, we found that no significant additional variance was accounted for by the functional status measures [ΔR² = .03; ΔF(3,72) = 1.15; p = .34; total R² = .30]. This finding contrasts sharply with the 17% additional variance accounted for by the cognitive variables when predicting OTDL-R performance. This suggests that the OTDL-R may be associated more closely with the abilities tapped by other cognitive tests than with those tapped by the functional status measures. Regression analyses for both the EPT [ΔR² = .19; ΔF(3,72) = 6.51; p = .001; total R² = .30] and the self-report IADL [ΔR² = .12; ΔF(3,72) = 3.07; p = .03; total R² = .16] showed that the functional status measures accounted for significant variance over and above that of age. In both analyses, the direct observation measure was found to be a unique predictor (EPT, B = −.57; t = −4.62; p < .001; self-report IADL, B = −.51; t = −4.07; p < .001). In addition, the amount of variance accounted for by age and the functional status measures was greater than that accounted for by age and the cognitive predictors for both the self-report IADL (16% vs. 13%) and EPT (30% vs. 18%; see Footnote 4).
Discussion
Neuropsychologists routinely use data collected from neuropsychological tests to predict real-world functioning. Evaluation of the ability of traditional neuropsychological tests to predict functional status has, however, been hampered by lack of a “gold standard” measure for the assessment of everyday functional abilities. We compared four types of measures that have been used in the literature as a proxy for real-world functioning, including self-report, performance-based measures and direct observational data in a real-world setting. We also sought to identify the cognitive correlates of these functional status measures after controlling for non-cognitive risk factors.
All four proxy measures showed sensitivity to the healthy aging process. In addition, all four measures differentiated the old-old and middle-aged groups, and all but the EPT differentiated the old-old and young-old groups. Only the direct observation measure differentiated the performances of the middle-aged and young-old groups. These findings are consistent with prior research (Lafortune & Balestat, 2007) in suggesting that individuals age 75+ are at greater risk for limitations in everyday functioning. Of note, 47% of the older adults endorsed full independence on all self-report IADL domains assessed, limiting the sensitivity of this measure when used with a cognitively healthy aging population.
When the intercorrelations amongst the functional status measures were compared (controlling for age), the correlation between the two performance-based measures just failed to reach significance. This may partially reflect the fact that the two performance-based measures assessed slightly different subsets of IADL domains and represent different ways of assessing functional status. Although both measures make use of stimuli that individuals are exposed to in everyday life, the OTDL-R is a behavioral simulation measure while the EPT primarily taps into everyday problem-solving abilities. The correlational data further revealed that while the self-report IADL and EPT measures did not correlate with each other, both measures correlated with the direct observation measure. Lack of correlation between self-report and performance-based measures has been found in prior studies when these two measures have been compared (e.g., Kempen, Steverink, Ormel, & Deeg, 1996; Reuben, Valle, Hays, & Siu, 1995). This has led to the question of whether self/informant reports or performance-based measures are a better representation of an individual's actual real-world behaviors. If the direct observation measure is considered to be the superior measure of everyday functional status, this may suggest that the self-report IADL and EPT are measuring different aspects of everyday performance in this cognitively healthy aging population. For example, while the self-report IADL might be tapping into knowledge gained from multiple experiences completing everyday activities and other physical health variables, the EPT might be tapping into the ability to use and apply everyday problem-solving skills.
The possibility that these two measures are evaluating different aspects of everyday functioning was further supported by the results of the regression analysis, which showed that both the self-report IADL and EPT contributed unique variance to the prediction of participants’ direct observation scores.
For the self-report IADL, EPT and direct observation measures, after controlling for age, the cognitive variables did not explain significant amounts of additional variance. The data did, however, show that cognitive predictors accounted for an additional 17% of the variance in OTDL-R performance after controlling for age, with processing speed being the only unique cognitive predictor of performance on the OTDL-R. Although the reason for this is not intuitively clear, as the OTDL-R is not a timed measure, psychomotor speed has emerged as a significant predictor of everyday functioning in prior studies (e.g., Tuokko et al., Reference Tuokko, Morris and Ebert2005). Contrary to our hypotheses, performance on the D-KEFS letter fluency was not a unique predictor of performance on any of the functional status measures (see Footnote 5). While many prior studies have emphasized a link between executive functioning and IADL performance in older adult populations (e.g., Bell-McGinty et al., Reference Bell-McGinty, Podell, Franzen, Baird and Williams2002; Cahn-Weiner, Boyle, & Malloy, Reference Cahn-Weiner, Boyle and Malloy2002), the literature investigating cognitive predictors of functional status has yielded mixed results (Jefferson et al., Reference Jefferson, Barakat, Giovannetti, Paul and Glosser2006). In addition, the variance in functional status that can be attributed to cognitive correlates has varied widely across prior studies (e.g., 0% to 80%; Royall et al., Reference Royall, Lauterbach, Kaufer, Malloy, Coburn and Black2007). The non-cognitive correlates controlled for and the neuropsychological variables used as cognitive predictors have differed widely across prior studies, which may account for some of the varied findings. The current study highlights that variability in findings can also be attributed to the choice of functional status measure.
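The "additional variance after controlling for age" reported above is an incremental R² from a hierarchical regression: age is entered at step 1, the cognitive predictors at step 2, and the change in R² is attributed to the second block. The sketch below illustrates that computation in plain Python under stated assumptions. The helper names `ols_r_squared` and `incremental_r_squared`, the variable names, and every data value are hypothetical and are not taken from the study, which does not specify its software.

```python
def ols_r_squared(predictors, y):
    """R^2 of an ordinary least squares fit with an intercept.

    predictors: list of columns (each a list the same length as y).
    Solves the normal equations (X'X) b = X'y by Gauss-Jordan elimination.
    """
    n = len(y)
    rows = [[1.0] + [col[i] for col in predictors] for i in range(n)]
    k = len(rows[0])
    # Augmented matrix [X'X | X'y]
    aug = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
           + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(k)]
    for c in range(k):
        # Partial pivoting for numerical stability
        pivot = max(range(c, k), key=lambda idx: abs(aug[idx][c]))
        aug[c], aug[pivot] = aug[pivot], aug[c]
        for idx in range(k):
            if idx != c:
                factor = aug[idx][c] / aug[c][c]
                aug[idx] = [a - factor * b for a, b in zip(aug[idx], aug[c])]
    beta = [aug[i][k] / aug[i][i] for i in range(k)]
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    mean_y = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


def incremental_r_squared(step1, step2, y):
    """Return (R^2 for step 1, change in R^2 when step 2 is added)."""
    r2_step1 = ols_r_squared(step1, y)
    r2_full = ols_r_squared(step1 + step2, y)
    return r2_step1, r2_full - r2_step1


# Hypothetical values for ten participants (NOT study data).
age = [52, 58, 63, 67, 71, 74, 78, 82, 85, 88]
processing_speed = [61, 58, 60, 52, 49, 51, 44, 40, 42, 35]
letter_fluency = [44, 39, 47, 41, 38, 45, 36, 40, 33, 37]
functional_score = [27, 26, 27, 25, 24, 25, 22, 21, 22, 19]

r2_age, delta_r2 = incremental_r_squared(
    [age], [processing_speed, letter_fluency], functional_score)
print("R^2, age only:", round(r2_age, 3))
print("Delta R^2, cognitive block:", round(delta_r2, 3))
```

Because the second block is evaluated only after age has been entered, a small change in R² indicates that the cognitive predictors add little beyond what age already explains, which is the pattern described for three of the four functional status measures.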
That the cognitive variables did not account for additional variance above age for the self-report IADL, EPT, and direct observation measures suggests that performance on the OTDL-R in this cognitively healthy aging population may be more closely tied to cognitive correlates than performance on the other functional status measures. This is further supported by the finding that, while the cognitive predictors accounted for an additional 17% of the variance in OTDL-R performance, the functional status measures explained a non-significant 3% of the variance.
Of interest was the finding that both the self-report IADL and the EPT accounted for unique variance in the direct observation measure, despite the fact that none of the cognitive variables predicted direct observation performance when controlling for age. Of further note, while age and cognitive variables accounted for 35% of the variance in the direct observation score, age and other functional status measures accounted for 47% of the variance. These findings are consistent with other studies that have shown poor links between neuropsychological testing scores and naturalistic action performance (e.g., Giovannetti, Libon, Buxbaum, & Schwartz, Reference Giovannetti, Libon, Buxbaum and Schwartz2002). These findings further suggest that, in the cognitively healthy aging population, proxy functional status measures such as a self-report IADL or the EPT may provide more useful information regarding the quality of everyday activity completion than neuropsychological testing data.
It has been suggested that cognitive and functional statuses reflect two different dimensions of performance, with individuals varying in both cognitive reserve and functional reserve (Loewenstein & Acevedo, Reference Loewenstein and Acevedo2010). The data indicate that other variables need to be taken into account when assessing or predicting functional status, as they may influence the relationship between the capacity to complete a task and task implementation. For example, task context (e.g., presence of distractor items), social network, social skills (e.g., inappropriate behaviors), and use of compensatory strategies may significantly influence performance in the real-world environment and warrant further investigation.
Regarding limitations, our sample of cognitively healthy older adults was predominantly Caucasian, highly educated and reported low rates of depressive symptomatology. This contrasts with other clinical and community-based samples in the literature and limits the generalizability of our results to other samples of older adults, including those with cognitive or significant health problems. Findings from the regression analyses were also limited by study sample size, the limited battery of neuropsychological tests administered, and the specific neuropsychological measures chosen as the predictor variables to represent the cognitive constructs. In addition, while the study examined the relationship between traditional proxy measures of functional status, the functional domains assessed by each measure did not show complete overlap, and this should be considered when interpreting the study findings. Furthermore, although participants in this study were in good physical condition, future studies should more carefully control for non-cognitive physical limitations that might influence everyday functioning and determine whether participants had successfully engaged in the tested activities at some point in their lives. Future studies may also want to include both self-report and informant-based evaluation of functional abilities, as lack of insight into functional performance has been found in some studies with normal aging populations (e.g., Suchy, Kraybill & Franchow, Reference Suchy, Kraybill and Franchow2011). In addition, because variance in neuropsychological and functional status measures tends to be more constricted in cognitively healthy samples, these findings cannot be generalized to neurologic populations, and future research with such populations is needed.
In summary, the study findings suggest that one must be cautious in making predictions about the quality of everyday activity completion in cognitively healthy older adults from specific cognitive functions. Our data suggest that within this cognitively healthy aging population, cognition may not be as strongly associated with everyday functioning as one might expect, or our neuropsychological tests may not be sensitive enough to show their true relationship. The data did, however, suggest that a self-report IADL and the performance-based EPT, both used as proxy measures for real-world functioning, may be useful measures for assessing everyday functional status in cognitively healthy older adults. Developing reliable and valid measures of functional status that can quickly and efficiently be given in the office and that accurately reflect everyday functioning continues to be an important goal for future research.
Acknowledgments
This study was partially supported by grants from the Life Science Discovery Fund of Washington State; NIBIB (Grant R01 EB009675); and NSF (Grant DGE-0900781) to D.J.C. and M.S.E. No conflicts of interest exist. We thank Chad Sanders, Alyssa Hulbert and Jennifer Walker for their assistance in coordinating data collection. We also thank members of the Aging and Dementia laboratory for their help in collecting and scoring the data.
Appendix A. Description of neuropsychological tests

Appendix B. Description of activities and steps for accurate task completion
