INTRODUCTION
According to the projections of the United Nations (2019), the number of individuals aged 65 or over is expected to more than double, reaching 1.5 billion by 2050. The share of persons above 80 years is expected to grow even faster. Within this context, people from low- and middle-income countries will account for most of the growth over the coming decades. These population changes will have profound social and economic implications worldwide. For instance, this demographic transition will be accompanied by an increased burden of neuropsychiatric disorders such as dementia. Today, around 50 million people live with dementia, and this figure is projected to triple by 2050 (Prince et al., 2015). This steep increase will again be more pronounced among low- and middle-income countries, where only limited epidemiological research on both pathological and non-pathological aging is available so far (World Health Organization, 2017).
Neuropsychological assessment is a basic tool to describe the cognitive performance of individuals and to identify both age-related cognitive changes and cognitive decline that are not a normal part of aging. Furthermore, it helps to evaluate improvements or further deterioration in cognitive performance and to predict cognitive trajectories. As the world’s population continues to grow, so will the demand for neuropsychological assessment. As a result, the development of appropriate neuropsychological normative data is essential for descriptive accuracy and diagnosis. Of particular relevance are tests of verbal recall and verbal fluency, which are typically administered in protocols aimed at identifying subjects at higher risk of cognitive decline (Nyberg & Pudas, 2019). In particular, poor performance on tasks measuring episodic memory has been consistently suggested to be a common manifestation in individuals with Alzheimer’s Disease (AD), even at the very early stages (Backman, Jones, Berger, Jonsson Laukka, & Small, 2005; Weissberger et al., 2017). On the other hand, verbal fluency tests assess a diverse combination of cognitive domains (i.e. semantic memory, executive function, language, and processing speed) and present high discriminant validity for different types of neurodegenerative diseases, including AD, vascular dementia, and frontotemporal dementia (Jones, Laukka, & Bäckman, 2006; Van den Berg, Jiskoot, Grosveld, van Swieten, & Papma, 2017).
Previous research has provided neuropsychological norms for various settings and age-groups (Alenius et al., 2019; Beeri et al., 2006; Cavaco et al., 2013; Contador et al., 2016; Costa et al., 2014; Guruje et al., 1995; Hänninen et al., 2010; Kosmidis, Vlahou, Panagiotaki, & Kiosseoglou, 2004; Machado et al., 2009; Mejía-Arango, Wong, & Michaels-Obregón, 2015; Mitrushina, Boone, & Razani, 2005; Olabarrieta-Landa et al., 2015; Sosa et al., 2009; Vogel, Stokholm, & Jorgensen, 2019; Yang et al., 2012). For example, Sosa et al. (2009) reported normative values by sex, age, and education for four components of the 10/66 Dementia Research Group cognitive test battery for 13,649 individuals aged 65+ years in five Latin American countries, China, and India. Similarly, Olabarrieta-Landa et al. (2015) generated norms for several verbal fluency tests across 11 Latin American countries with country-specific adjustments for gender, age, and education in 3977 adults from 18 to 95 years of age. Although many widely used cognitive measures have been validated and standardized for evaluation of individuals across the life span, norms from some groups of individuals with diverse socioeconomic and cultural characteristics, especially among those countries where population aging will be concentrated, are scarce or non-existent. Moreover, the majority of earlier research used convenience and/or small samples (Beeri et al., 2006; Cavaco et al., 2013; Costa et al., 2014; Guruje et al., 1995; Hänninen et al., 2010; Kosmidis et al., 2004; Machado et al., 2009; Mejía-Arango et al., 2015; Olabarrieta-Landa et al., 2015; Sosa et al., 2009; Vogel et al., 2019; Yang et al., 2012) rather than nationally or community-based representative samples (Abbott et al., 2019; Alenius et al., 2019; Contador et al., 2016; Kenny et al., 2013). Numerous studies have shown the sociodemographic influence on neuropsychological instruments based on methodological approaches that rely on artificial categorization. However, normative values derived from regression modeling provide more precise adjustment to the subject’s cognitive performance and can assist in early identification of cognitive impairment in middle- and older-aged adults (Crawford, Garthwaite, Denham, & Chelune, 2012; Guardia-Olmos, Pero-Cebollero, Rivera, & Arango-Lasprilla, 2015). In order to overcome the above-mentioned limitations, the present study aims to generate country-specific norms for two tests of episodic memory and a verbal fluency task among middle- and older-aged adults using nationally representative data from nine countries.
METHODS
The Survey
Data from the Collaborative Research on Aging in Europe (COURAGE in Europe) [http://www.courageineurope.eu/] and the WHO Study on Global AGEing and Adult Health (SAGE) [https://www.who.int/healthinfo/sage/en/] studies were analyzed. The COURAGE in Europe project was undertaken between 2011 and 2012 in Finland, Poland, and Spain, while the SAGE study was conducted in China, Ghana, India, Mexico, Russia, and South Africa between 2007 and 2010. The six SAGE countries were considered low- and middle-income at the time of the survey, whereas the countries included in the COURAGE in Europe project were high-income countries based on the World Bank classification (World Bank, 2011).
Potential respondents were selected by a stratified, multistage, clustered area probability design to generate nationally representative samples. Distinct strata were used according to country characteristics, including geographical areas (e.g. regions, provinces, communities, states, districts), population size, residential area (i.e. urban/rural), and race (Garin et al., 2016). A list of household occupants was created for the final sampling units, from which one participant was randomly selected following age quotas (18–49 and 50+ years). In the SAGE study, however, individuals from 50+ households were all invited to participate (Kowal et al., 2012). Sampling weights were then generated to account for the population distribution obtained from the National Institute of Statistics or the United Nations Statistical Division for COURAGE in Europe and SAGE, respectively.
Overall, the sample comprised 53,269 non-institutionalized adults aged 18+ years. Individual response rates were as follows: 51% (Mexico), 53% (Finland), 67% (Poland), 68% (India), 70% (Spain), 77% (South Africa), 80% (Ghana), 83% (Russia), and 93% (China). Both surveys followed an equivalent protocol to collect comparable, reliable, and valid data on health, lifestyle habits, and well-being outcomes. Standardized physical examinations and a neuropsychological test battery assessment were also performed. The protocol was translated from English into the local languages according to the WHO guidelines for translation and adaptation of instruments (World Health Organization, 2019). Face-to-face structured interviews were completed by trained personnel using a computer-assisted personal interview (CAPI) or a paper and pencil format. Quality control procedures were carried out during the fieldwork. If an individual was not able to undertake the survey because of severe (cognitive or physical) limitations, a shorter version of the questionnaire was administered to a proxy. Ethical approvals were obtained from the WHO Ethical Review Committee and the appropriate local ethics research review boards. All participants provided written informed consent.
The present study was focused on participants aged 50 years or older. Individuals who participated in the survey via a proxy respondent, whose information on cognition was lacking, were further excluded, yielding a final analytical sample of 42,116 individuals.
Measures
Cognitive performance
Cognitive function was measured by means of three performance tests assessing different domains, all taken from the CERAD (Consortium to Establish a Registry for Alzheimer’s Disease) neuropsychological battery (Morris, Heyman, & Mohs, 1989). The tests from the CERAD consisted of a word list memory (three trials of immediate recall), a word list recall (delayed recall), and a verbal fluency task. Word list memory assessed learning ability and episodic memory. In the COURAGE study, respondents were presented with 10 cards with unrelated common nouns to read aloud and remember. In the SAGE project, the examiner read out the list of 10 words. Word list memory and word list recall measured the ability to recall the nouns given in the word list immediately and after presentation of a distractor task, respectively. Scores were the total number of words correctly recalled across the three learning trials (range 0–30) and after a brief delay (range 0–10). Higher values correspond to better performance. Individuals with missing values in any of the three trials of immediate recall were removed (less than 0.4%). The animal naming task assessed verbal fluency and production, semantic memory, and executive function. Participants were encouraged to name as many different animals as possible within a 60-s period. Valid answers included different breeds as well as gender- and generation-specific names. Repetitions and proper names (e.g. Lassie) were not credited. If the individual gave no answers within 15 s, the interviewer provided prompts or repeated basic instructions. Similarly, participants who stopped before the assigned time elapsed were encouraged to continue. Scores were the total number of valid answers. Higher scores indicate better performance.
Statistical Analyses
Descriptive analyses of the basic sociodemographic and clinical characteristics (i.e. sex, age, education, marital status, residential area, and medical conditions) were performed for each country. In order to obtain the expected score for each cognitive test with country-specific adjustments for sex, age, education, and residential area, multiple linear regression analyses were carried out. Regression-based methods play a useful role in neuropsychological research and clinical practice and produce methodologically robust outputs (Crawford et al., 2012). This approach is suitable for the comparison of an individual’s obtained score to those predicted from large-scale and demographically corrected normative data.
The assumptions of the linear regressions were first statistically and visually tested. Models were built for each country independently using the raw scores of each cognitive test. Sex (0 = women; 1 = men), age (in years), education (in years), and residential area (0 = urban; 1 = rural) were included as the predictor variables. Results from all multiple linear regression models are presented as intercept, unstandardized regression coefficients (B) with their corresponding 95% confidence intervals (CIs), square root of the mean square residual (root MSE), and adjusted R². Complete-case analysis was performed. The level of statistical significance was set at p < .05. Sample weighting and the complex study design were considered in these analyses to adjust for the population structure of the respective countries. Data analyses were performed using Stata 15.1 (Stata Corp LP, College Station, Texas).
To calculate demographically adjusted z scores and percentiles from linear regression models, the following formula was then computed:
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20210126130149480-0888:S1355617720000582:S1355617720000582_eqnu1.png?pub-status=live)
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20210126130149480-0888:S1355617720000582:S1355617720000582_eqnu2.png?pub-status=live)
where:
- Y i = participant’s real score
- Y´ i = participant’s expected score
- Root MSE = square root of the mean square residual (i.e. standard deviation of the error term)
- β0 = intercept
- βsex = regression coefficient for woman
- Sex i = participant’s real sex
- βage = regression coefficient for age
- Age i = participant’s real age
- βeducation = regression coefficient for years of education
- Education i = participant’s real years of education
- βresidential area = regression coefficient for urban area
- Residential area i = participant’s real residential area
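Written out from the symbol definitions in this list, the two expressions presumably take the following form. This is a reconstruction under that assumption (a standard linear predictor and a residual divided by the root MSE), not a transcription of the original equation images.

```latex
z_i = \frac{Y_i - Y'_i}{\text{Root MSE}}

Y'_i = \beta_0
     + \beta_{\text{sex}} \cdot \text{Sex}_i
     + \beta_{\text{age}} \cdot \text{Age}_i
     + \beta_{\text{education}} \cdot \text{Education}_i
     + \beta_{\text{residential area}} \cdot \text{Residential area}_i
```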
Z scores can be positive or negative numbers. A positive Z score can be interpreted as a better cognitive performance than expected. On the contrary, a negative Z score would indicate that the individual showed worse cognitive performance than could be expected taking into account the subject’s sex, age, years of education, and residential area. More precisely, Z scores ≤ −1.0 usually denote “borderline cognitive impairment,” Z scores ≤ −1.5 would indicate (non-clinical) “mild cognitive impairment,” while a Z score ≤ −2.0 generally represents “significant cognitive impairment” (Abbott et al., 2019; Schinka et al., 2010). A standard method for creating point estimates of the percentile ranks was then used in keeping with Crawford, Garthwaite, and Slick (2009). Inspired by previous initiatives (Cavaco et al., 2013; Crawford et al., 2009; Larouche et al., 2016), we have developed a user-friendly LibreOffice Calc® file for clinicians and researchers to facilitate the interpretation of the subject’s cognitive performance. This open source software can be downloaded for free at https://www.libreoffice.org/. The user enters the individual’s main characteristics and the real score for the specific cognitive test. The program then provides the expected score for this individual, his/her Z score, and its corresponding point estimate of the percentile rank. This file is available in the Supplementary Material.
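The interpretation rules above can be sketched in a few lines of Python. This is a minimal illustration, not the supplementary spreadsheet: the names `NormModel`, `expected_score`, `z_score`, and `classify` are hypothetical, the predictor coding (0 = women, 1 = men; 0 = urban, 1 = rural) is taken from the Statistical Analyses section, the label "within the expected range" is an assumed placeholder for Z scores above −1.0, and the coefficients in `demo` are invented for arithmetic convenience and do not come from Tables 2–4.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NormModel:
    """Coefficients of one country- and test-specific regression model."""
    intercept: float
    b_sex: float        # coefficient for sex (0 = women, 1 = men)
    b_age: float        # coefficient per year of age
    b_education: float  # coefficient per year of education
    b_area: float       # coefficient for residential area (0 = urban, 1 = rural)
    root_mse: float     # standard deviation of the error term


def expected_score(m, sex, age, education, area):
    """Score predicted by the linear model for the given characteristics."""
    return (m.intercept + m.b_sex * sex + m.b_age * age
            + m.b_education * education + m.b_area * area)


def z_score(m, observed, sex, age, education, area):
    """Observed minus expected score, in units of the root MSE."""
    return (observed - expected_score(m, sex, age, education, area)) / m.root_mse


def classify(z):
    """Map a Z score onto the cut-offs quoted in the text."""
    if z <= -2.0:
        return "significant cognitive impairment"
    if z <= -1.5:
        return "mild cognitive impairment"
    if z <= -1.0:
        return "borderline cognitive impairment"
    return "within the expected range"  # assumed label, not from the source


# Hypothetical coefficients, chosen only to make the arithmetic easy to follow.
demo = NormModel(intercept=22.0, b_sex=1.0, b_age=-0.125,
                 b_education=0.5, b_area=-1.0, root_mse=5.0)
z = z_score(demo, observed=14, sex=1, age=64, education=10, area=1)
print(z, classify(z))
```

The cut-offs are checked from the most severe downwards so that each Z score receives only the strongest label that applies.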
RESULTS
Sample Characteristics
Among our analytical sample of 42,116 individuals, the mean age was 62.8 years (SD = 9.5), and there were more women than men (52.5% vs. 47.5%). Table 1 provides a summary of the sociodemographic and clinical characteristics of the study sample by country. The level of education showed significant variability across countries, with the highest percentage of participants with completed secondary or tertiary education found in Finland (82.7%), Poland (75.0%), and Russia (92.5%). The percentage of respondents married or in a relationship was higher in China (85.1%) and India (76.9%). Likewise, the Chinese, Indian, and Ghanaian samples showed the largest proportion of individuals living in rural areas. Overall, a substantial proportion of participants had hypertension (>60%, except for India), while Mexico and Spain had the highest prevalence of diabetes and depression. The prevalence of stroke ranged from 3.7% to 6.5%.
Table 1. Baseline demographic and clinical characteristics of the study sample by country
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20210126130149480-0888:S1355617720000582:S1355617720000582_tab1.png?pub-status=live)
Values are percentages for each category unless otherwise indicated. Some percentages are based on an incomplete sample because of missing data [less than 1%, except for level of education in Mexico (4%) and South Africa (15%)].
Abbreviation: SD = Standard deviation.
a Level of education followed the International Standard Classification of Education (ISCED-11). No/basic education was defined as illiterate, no formal education received but can read and write, or incomplete primary schooling.
b An urban area includes towns, cities, and metropolitan areas that have been legally declared as being urban. A rural area includes commercial farms, small settlements, rural villages, and other areas which are further away from towns and cities.
c Medical conditions were ascertained through the use of combined criteria (except for diabetes) including self-reported diagnosis, lifetime symptoms, and blood pressure measurement. For a detailed description please refer to Garin et al. (2016).
Table 2. Multiple linear regression models for immediate recall
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20210126130149480-0888:S1355617720000582:S1355617720000582_tab2.png?pub-status=live)
*p < 0.05, **p < 0.01, ***p < 0.001.
Abbreviations: CI = Confidence interval; Root MSE = Square root of the mean square residual.
Some analyses are based on an incomplete sample because of missing data in the cognitive test (less than 5% except for Finland 7.4%; Mexico 5.7%).
Table 3. Multiple linear regression models for delayed recall
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20210126130149480-0888:S1355617720000582:S1355617720000582_tab3.png?pub-status=live)
*p < 0.05, **p < 0.01, ***p < 0.001.
Abbreviations: CI = Confidence interval; Root MSE = Square root of the mean square residual.
Some analyses are based on an incomplete sample because of missing data in the cognitive test (less than 5% except for Finland 7.4%; Mexico 5.7%).
Table 4. Multiple linear regression models for verbal fluency
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20210126130149480-0888:S1355617720000582:S1355617720000582_tab4.png?pub-status=live)
*p < 0.05, **p < 0.01, ***p < 0.001.
Abbreviations: CI = Confidence interval; Root MSE = Square root of the mean square residual.
Some analyses are based on an incomplete sample because of missing data in the cognitive test (less than 5% except for Finland 7.0%; Mexico 7.4%; Russia 11.3%; South Africa 5.7%).
Sociodemographic Effects on Cognitive Performance
Tables 2–4 show the results from the linear regression analyses. Assumptions for regression analyses were fulfilled for each model. The effects of sex, age, years of education, and residential area were explored for each cognitive test across all sites independently.
The influence of sex and residential area on the prediction of cognitive performance was not uniform. For example, regression analysis showed significant differences between men and women from Poland in all cognitive tasks. On the contrary, cognitive performance was not associated with sex in any of the cognitive measures among the individuals living in Mexico or South Africa. Increasing age was significantly and consistently associated with lower cognitive scores across all sites and cognitive measures (B coefficients from −0.02 to −0.22). Similarly, more years of education were significantly associated with higher cognitive scores in all tasks and countries. Overall, these sociodemographic variables accounted for between 0.03 (South Africa for the verbal fluency task) and 0.30 (Finland and Russia for the immediate recall task) of the variance in cognitive performance.
Normative Data
The previous tables can be used to adjust cognitive scores and generate normative data by country, sex, age, education, and residential area of the individual. As an illustration, assume that you have evaluated a Spanish man of 65 years living in a rural area who completed 9 years of formal education and generated 15 animal names in the verbal fluency task. According to the aforementioned equation, the individual’s expected score would be
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20210126130149480-0888:S1355617720000582:S1355617720000582_eqnu3.png?pub-status=live)
The adjusted predicted score for this subject is 18.59. Next, to calculate the individual’s Z score, the predicted score is subtracted from the observed score and then divided by the root MSE, namely (15 − 18.59)/5.97. This Z score of −0.60 equates to approximately the 27th percentile, representing a cognitive performance in the low average range for his specific reference group, albeit clinically normal. In other words, around 27% of Spanish men aged 65 years with 9 years of formal education and living in a rural area would name ≤15 words on this semantic fluency task. Please refer to the Supplementary Material in order to generate demographically adjusted norms using the LibreOffice Calc® file without the need for manual calculation.
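The arithmetic of this worked example can be checked with a short Python sketch. It assumes that the percentile is read from the standard normal cumulative distribution, which reproduces the figure quoted in the text; it is not necessarily the exact procedure implemented in the supplementary spreadsheet. The function names are hypothetical.

```python
import math


def z_from_scores(observed, expected, root_mse):
    """Observed minus expected score, divided by the root MSE."""
    return (observed - expected) / root_mse


def normal_percentile(z):
    """Percentage of a standard normal distribution at or below z."""
    return 50.0 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Values taken from the worked example: 15 animals named,
# expected score 18.59, root MSE 5.97.
z = z_from_scores(observed=15, expected=18.59, root_mse=5.97)
pct = normal_percentile(z)
print(f"z = {z:.2f}, percentile = {pct:.0f}")  # z = -0.60, percentile = 27
```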
DISCUSSION
Normative values for various widely used neuropsychological instruments were presented. While cognitive norms are increasingly available across the entire adult age range, respondents are commonly recruited following a purposive sampling method. To the best of our knowledge, this is the first and largest study to date to report normative data among nationally representative samples of low-, middle-, and high-income countries. The study design and procedure together with the regression-based approach extend the validity of our findings.
Although our objective was not to compare mean scores of cognitive tests across countries, but to report normative data for each test and country separately, we observed that Ghana, India, China, and South Africa consistently showed lower scores across all cognitive tasks.
Numerous studies have examined the effect of sociodemographic variables on cognitive performance. Our findings are in line with previous research reporting that both age and education have a strong impact on cognitive results, albeit with effects of different magnitude (Alenius et al., 2019; Beeri et al., 2006; Brody, Kramarow, Taylor, & McGuire, 2019; Contador et al., 2016; Ganguli et al., 2010; Hankee et al., 2016; Melikyan et al., 2019; Olabarrieta-Landa et al., 2015; Sosa et al., 2009; Van Der Elst, Van Boxtel, Van Breukelen, & Jolles, 2006). It is possible that education contributes to cognitive performance by altering how individuals process cognitive tasks (Stern et al., 2018) and by further slowing cognitive decline (Hankee et al., 2016). In contrast, Seblova, Berggren, and Lovden (2020), who conducted a systematic review and meta-analysis of observational longitudinal cohort studies, recently reported that length of formal education was not associated with changes in cognitive performance. As suggested by Luck et al. (2018), it is also plausible that the effect of education on cognition is explained by reverse causality. That is, higher cognitive abilities may help individuals to attain higher levels of formal education. However, this contrasts with the findings of a meta-analysis providing evidence for causal effects of education on a wide variety of cognitive abilities across the life span (Ritchie & Tucker-Drob, 2018). Educational attainment may also be related to engagement in healthy lifestyles and health care utilization that would ultimately affect cognitive performance.
Beyond the impact of age and education on cognitive function, it is also plausible to explain differences in test scores by cohort effects. Previous studies have reported that younger cohorts outperformed earlier generations in multiple cognitive domains (Dickinson & Hiscock, 2011; Karlsson, Thorvaldsson, Skoog, Gudmundsson, & Johansson, 2015), with measures of both fluid and crystallized intelligence being vulnerable to these effects (Cornelis et al., 2019; Pietschnig, Voracek, & Formann, 2010). The exposure to analogous social and environmental circumstances for those being born at a specific place and moment in time will presumably influence cognitive abilities, while the progressive improvement of education with the passing years is thought to play a key role in determining cognitive performance (Schaie, Willis, & Pennak, 2005). In view of the above, normative data become less valid over the years, as the use of outdated norms may overestimate an individual’s cognitive performance when compared with an older cohort. In this regard, our work, in addition to using nationally representative samples, has the advantage of providing normative values from recent data. It has to be mentioned, though, that observed changes in cognition may be partially explained by cohort effects given the cross-sectional nature of our study.
There is a vast literature emphasizing the importance of including sex to produce appropriate norms for neuropsychological tests. However, the effect of sex in this and other studies seems controversial (Luck et al., 2018; Melikyan et al., 2019; Olabarrieta-Landa et al., 2015; Sosa et al., 2009; Yang et al., 2012). Our findings showed a significant advantage for women in measures of episodic memory in some countries, whereas men performed better at the animal naming task. Finally, the residential area appeared to have a weaker influence on cognitive performance. Aspects of socioeconomic and sociocultural status could possibly influence cognitive scores (Contador et al., 2016). Furthermore, differences in the study population (e.g. age range, country of origin), study setting (e.g. clinically based vs. population-based), and measurement approaches (e.g. mean/standard deviations, T scores, percentiles, Z scores) may explain the observed heterogeneity.
The applicability of population-based norms within specific subpopulations is gaining greater attention because of ongoing globalization. Previous studies have suggested that neuropsychological batteries may be biased against, for example, ethnic minority groups, as these tests may not reflect the same underlying construct as in the population for which the norms were developed. In this regard, measurement invariance is important to determine whether, under different conditions, assessment procedures yield equivalence of a latent cognitive structure (Barnes et al., 2016; Wicherts, 2016). Even though we have used nationally representative samples to generate cognitive norms for each country, measurement invariance has not been tested in the present study. Therefore, we caution readers regarding the appropriateness of unitary norms for specific subgroups or country comparisons.
Tests of verbal recall and fluency are almost universally used and are typically included in standard protocols of cognitive tests. They can be administered in both clinical and research settings, and they may be an appropriate option for some individuals (e.g. those with sensory limitations) who might otherwise not be suitable for neuropsychological assessment. These tasks are quick and easy to administer and have value beyond their usefulness as reference measures of cognitive testing, being sensitive to many types of brain disorders (Backman et al., 2005; Jones et al., 2006; Van den Berg et al., 2017; Weissberger et al., 2017). It is important to ensure reliability and validity of cognitive testing while taking into account the influence of socioeconomic and cultural factors. This increased sensitivity would not only facilitate early detection but could also help to improve the effectiveness of interventions.
Strengths and Limitations
The strengths of the present study include the use of nationally representative samples from culturally and economically very diverse sites. Moreover, this study used a regression-based approach to obtain neuropsychological test norms, instead of previous common methods (e.g., standardized Z scores or percentiles based on means and SDs derived from arbitrary categories), which constitutes a useful contribution to the applied clinical research field (Crawford et al., 2012). Furthermore, all interviewers completed a training course before the survey to learn how to administer the performance tests in an accurate, clear, and comparable way. Some limitations should also be taken into account when interpreting our study results. First, despite our best attempt to exclude individuals with cognitive limitations, it is possible that some respondents presenting cognitive impairment or even mild dementia may have been included in our analyses. However, we made use of family-reported dementia diagnosis, and interviewers were trained to identify individuals with severe cognitive impairment who were unable to undertake the interview. Second, some of our regression analyses are based on an incomplete sample because of missing data (a missing rate of 5% or less in most cases). Statistical methods to handle missing data were not applied because of the nature of the missing data, as the majority of cases arose from individuals who were unable to complete the tasks. Indeed, individuals who did not respond to one or more of the cognitive tests were more likely to be older, have fewer years of education, and present higher disability (data not shown). Third, as can be seen from the heterogeneous R² values, there are likely other variables that account for cognitive performance and that were not considered when generating the normative data.
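The regression-based approach to norms mentioned above can be illustrated with a minimal sketch. It assumes an ordinary least squares model with normally distributed residuals, which may differ from the models actually fitted in this study. The data are synthetic, and the coefficients, variable names, and helper functions (`fit_norms`, `normed_z`) are hypothetical, chosen only for illustration.

```python
import numpy as np


def fit_norms(X, y):
    """Fit y ~ 1 + X by ordinary least squares.

    Returns the coefficient vector (intercept first) and the
    residual standard deviation of the fitted model.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    design = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ beta
    dof = len(y) - design.shape[1]  # observations minus fitted parameters
    resid_sd = np.sqrt(residuals @ residuals / dof)
    return beta, resid_sd


def normed_z(beta, resid_sd, predictors, observed):
    """Demographically adjusted z score: (observed - predicted) / residual SD."""
    predicted = beta[0] + np.dot(beta[1:], predictors)
    return (observed - predicted) / resid_sd


# Synthetic normative sample (illustrative values only, not study data).
rng = np.random.default_rng(0)
n = 500
age = rng.uniform(50, 90, n)
education = rng.integers(0, 17, n).astype(float)
female = rng.integers(0, 2, n).astype(float)
score = 20 - 0.15 * age + 0.4 * education + 0.5 * female + rng.normal(0, 3, n)

beta, resid_sd = fit_norms(np.column_stack([age, education, female]), score)

# Adjusted z score for a hypothetical 75-year-old woman with 6 years of
# education who obtained a raw score of 8.
z = normed_z(beta, resid_sd, [75.0, 6.0, 1.0], observed=8.0)
print(f"adjusted z score: {z:.2f}")
```

In contrast to norms built from means and SDs within arbitrary age or education bands, the expected score here varies continuously with each predictor, and an individual's performance is judged by how far the observed score falls from that expectation in residual SD units.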
CONCLUSIONS
Neuropsychological assessment constitutes a key component of cognitive testing, diagnosis, and disease progression monitoring. To assist in the accurate identification and follow-up of (the risk of) cognitive dysfunction, demographically adjusted norms are necessary. Our study provided sex-, age-, education-, and residential area-specific regression-based norms that were obtained from one of the largest normative studies worldwide on verbal recall and fluency tests to date. Both age and education were associated with test performance, while the effect of sex and residential area on cognitive function was not homogeneous across countries or cognitive tasks. Findings derived from this study will be especially useful for clinicians and researchers based in countries where normative data are limited.
ACKNOWLEDGMENTS
This work was supported by the US National Institute on Aging Interagency Agreements (OGHA 04034785, YA1323–08-CN-0020, Y1-AG-1005-01, and research grant R01-AG034479); the European Community’s Seventh Framework Programme (M.L., grant agreement 223071 – COURAGE in Europe); the Instituto de Salud Carlos III (J.M.H., FIS research grants PS09/00295, PI12/01490, and PI16/01073; J.L.A.M., FIS research grants PS09/01845, PI13/00059, and PI16/00218); the Spanish Ministry of Economy and Competitiveness ACI Promociona (J.M.H., ACI2009–1010); and Centro de Investigación Biomédica en Red de Salud Mental (CIBERSAM).
E.L.’s work is supported by the Sara Borrell postdoctoral programme (CD18/00099) from the Instituto de Salud Carlos III (Spain) and co-funded by the European Union (ERDF/ESF, “Investing in your future”). The authors greatly appreciate the generous contribution of all the participants, which made this work possible. The authors also acknowledge the principal investigators from the SAGE study: P. Arokiasamy (India), R. Biritwum (Ghana), W. Fan (China), R. López Ridaura (Mexico), T. Maximova (Russia), and N. Phaswanamafuya (South Africa).
CONFLICT OF INTEREST
The authors have nothing to disclose.
SUPPLEMENTARY MATERIAL
To view supplementary material for this article, please visit https://doi.org/10.1017/S1355617720000582