Published online by Cambridge University Press: 01 March 2004
The ability to measure neuropsychological outcomes in a comparable manner in different cultural groups is important if studies conducted in geographically diverse regions are to advance knowledge of disease effects and moderating influences. The purpose of this study was to evaluate the application of neuropsychological test procedures developed for use in North America and Europe to children in a rural region of Kenya. Our specific aim was to determine if these methods could be adapted to a non-Western culture in a manner that would preserve test reliability and validity. Procedural modifications yielded reliable tests that were sensitive to both the sequelae of cerebral malaria and to children's social and school backgrounds. Results suggest that adaptations of existing tests can be made in such a way as to preserve their utility in measuring the cross-cultural sequelae of childhood neurological diseases. (JINS, 2004, 10, 246–260.)
Although children growing up in sub-Saharan Africa are exposed to a number of potentially debilitating diseases, we are only beginning to learn about the cognitive sequelae of these conditions. Outcomes of cerebral malaria are of particular concern, as this disease is the most common acute encephalopathy in Africa, if not the world. Recent studies of African populations have begun to document the adverse effects of cerebral malaria on cognitive function (Boivin, 2002; Carter et al., 2003; Dougbartey et al., 1998; Holding et al., 1999; Muntendam et al., 1996). A broader research base has been constrained by the lack of methodologies for evaluating the effects of health, nutritional, and educational risks. The present paper details the approach taken by our research group in developing appropriate assessment tools for the investigation of the effects of severe malaria. Our primary aim in describing this methodology is to help guide future studies of disease outcomes across geographic and language contexts.
One consideration in choosing assessment methods is the pathophysiology of the disease process under investigation. Knowledge of disease pathology may provide clues as to the cognitive abilities most likely to be affected. The signs and symptoms associated with the development of neurological sequelae include coma, seizures, and hypoglycemia (Bondi, 1992; Brewster et al., 1990; Looareesuwan, 1992; Marsh, 1995; Marsh et al., 1996; Molyneux et al., 1989). The findings of electroencephalograms (EEGs) taken during the course of the disease indicate that seizure activity may be localized to the posterior parietotemporal region (Crawley et al., 1996), suggesting possible impairments of visuospatial processing, memory, attention and executive functions (Gaddes & Edgell, 1993; Luria, 1973; Zigmond et al., 1999). However, computerized tomography (CT) scans of survivors with gross neurological sequelae reveal a marked variability in the location of damage (Newton et al., 1994). In summary, the literature on the pathophysiology of severe malarial disease available during the planning of our study suggested diffuse central nervous system involvement. Other influences on outcome also make a specific pattern of deficits unlikely. Children with cerebral malaria are exposed to a host of risk factors, including the lack of formal schooling, nutritional deficiencies, helminth infection, and chronic diseases such as sickle cell anemia. A test battery that taps a broad range of cognitive functions may thus be best suited to assessment of outcomes in these children.
A second consideration in the development of test materials is the specific sociocultural setting to which the measures will be applied (Super & Harkness, 1986). We work in Kenya, in an area where the population is predominantly rural and where the Mijikenda is the majority ethnic/linguistic group. Most families depend upon subsistence farming, with extended families living in homesteads. Many families are polygamous, and children's caregivers are often persons other than their biological mothers. Fewer than 70% of children receive a school-based education, and about one-third of these fail to complete primary school (Holding & Katana, 1997). Fewer than one-half of women are literate and children spend much of their time with their peers; hence children's informal education is also limited. For these reasons, many of the children in our population are unfamiliar with test demands, and they are disinclined to interact with a strange examiner or engage with unfamiliar test materials. These characteristics are likely to lead to lower performance levels and may compromise the sensitivity of outcome assessments (Dasen, 1988; Greenfield, 1997; Wober, 1975).
A further concern is potential test bias. According to Reynolds (1983), sources of bias include unfamiliar test materials, insensitive or linguistically naïve examiners, lack of regional standardization or of evidence that tests are measuring the same constructs across cultural contexts, and lack of criterion-related and predictive validity. The appropriateness of a test for use in a culture for which it was not originally developed and the assumption of lack of cultural bias need to be evaluated on a context-by-context basis (Kline, 1993).
The primary aim of this paper was to review the methods we employed in developing a test battery to examine the sequelae of cerebral malaria in a rural coastal region of Kenya. We believe psychologists working with populations for which tests have not been standardized face challenges similar to those encountered by our research team. Our procedures may thus be instructive to others working in these settings. More specific aims were to make adaptations to test materials appropriate to our population setting and to examine the reliability and validity of these adaptations. Test sensitivity was investigated in relation to both biological and social risk factors. We also considered the potential differential effects of disease in different social and educational contexts. External validity was explored by assessing associations of neuropsychological performance with measures of arithmetic ability and parent ratings of child behavior.
In the initial development of our test battery, we focused on the Kaufman Assessment Battery for Children (K-ABC, Kaufman & Kaufman, 1983) as a measure of cognitive processing skills. The K-ABC has been used widely in assessing the consequences of neurological disorders and has been applied in diverse cultural contexts (Boivin et al., 1996; Giordani et al., 1996; Moon, 1989; Taylor et al., 2000; Weisglas-Kuperus et al., 1994; Wolke & Meyer, 1999). This test battery has a number of features that recommend its use in a rural African population. It provides a structured approach to assessment while incorporating teaching items to increase familiarity with the test materials. Additionally, most K-ABC subtests make limited demands on verbalization, a significant advantage in working with children who are reticent to speak with adults. One of the specific challenges in designing the test battery was the need for measures that would be appropriate for children who speak varying dialects, with the subsequent requirement that the assessment be carried out by examiners with expertise in these local languages.
The test battery was used to assess the school-age sequelae of severe malarial disease in Kilifi District, a rural region on the coast of Kenya where malaria is endemic (Holding et al., 1999). Our focus was on producing materials appropriate for 6-year-old children. This age is a key time in the children's lives in relation to both disease factors and socio-cultural influences. The highest incidence of severe malarial disease is before 6 years, with 1 in every 15 children in our region having been affected by this age. In about 25% of these cases the disease is accompanied by serious neurological complications. Official school entry is also at 6 years of age. Even for the children who do not attend school, this age is the point in development when children more regularly interact with persons outside of their families.
The test battery was developed in three phases. Each phase had a different purpose and involved a different sample of children. Phase 1 investigated the applicability of test content and procedures. Phase 2 was conducted to make final test selections and item modifications and to establish test reliability. In Phase 3, we investigated sensitivity and external validity by examining relationships of performance with biological and social risks and with measures of academic ability and behavior. Permission to assess the children was received both verbally and in writing from parents, the Kilifi District Education office, and, where appropriate, head teachers in schools.
Children in Phase 1 were drawn from eight nursery classes across the district. Schools were targeted to provide us with direct access to a pool of children. The schools differed in their catchment population, enabling us to sample different linguistic subgroups from within the majority Mijikenda population. The children, who were 5–7 years of age at the time of recruitment, were selected by asking teachers to nominate children of varying abilities from their classes. A pool of over 100 children was used, with different children completing different tasks.
Test content and format were varied across children in an effort to make useful modifications in procedures and materials. Subtests were examined item by item to identify floor and ceiling effects, establish which drawings were recognizable, and evaluate the clarity of instructions.
To provide a representative sample of the target population, we recruited a community sample of 56 children ranging in age from 5 years, 7 months to 6 years, 11 months (M age 6 years, 4 months). An initial list of households with children of the appropriate age was drawn using random sampling from a population census database held at our study site. Two children were recruited at each location. Only one of the families we approached declined participation. The reason given for refusal was that the head of the household was absent and thus could not give his consent. The sample consisted of 34 girls and 22 boys. Only 40% of the sample attended school (30% nursery school and 10% elementary school). Paternal occupation as described by family members included manual labor (85%), crop selling (4%), skilled employment (4%), and unknown (4%). Comparison of our sample characteristics to the census database of our community showed that our sample characteristics were representative of the community as a whole.
Eight K-ABC subtests were modified to assess cognitive processing skills, including: Face Recognition, Gestalt Closure, Magic Window, Word Order, Triangles, Number Recall, Hand Movements, and Matrix Analogies. Several additional tests were also administered to extend the breadth of our battery, including a modified version of K-ABC Arithmetic, Picture Vocabulary (vocabulary knowledge), Pegboard (fine motor co-ordination), Visual Search (visual attention), Pragmatic Errors (pragmatic language), and the Child Behaviour Questionnaire for Parents (CBQFP, behavior problems). Appendix A provides a description of the test battery and summarizes the modifications needed to adapt the procedures to our population.
All assessments were carried out in Mijikenda, with slight variations in vocabulary employed depending upon the linguistic subgroup membership of the child. Special needs teachers who were fluent in the Mijikenda languages assessed the children in or near the children's homes. The examiners held a diploma in special education and assessment. Further training was provided by the first author and included both direct observation and video-taped feedback. Assessments were administered in two sessions, each lasting no more than two hours. Subtests were administered in a set order. Following analysis of test performance, further modifications as described in Appendix A were made to some subtests to extend the range of items or to improve reliability. A third visit took place 6–10 weeks later to assess test–retest reliability. We readministered only half of the battery to a given child during this session, with half of the children repeating tests from the first session and the other half repeating tests from the second session. Test–retest reliabilities were computed for the modified tests on groups of approximately 20 six-year-old children who were recruited through local primary schools. Test–retest intervals for these samples were approximately 2–3 weeks.
The sample for Phase 3 included 174 children who were part of an investigation of severe malaria with impaired consciousness (Holding et al., 1999). Eighty-seven of the children met criteria for severe malaria, as defined by a hospital admission for treatment of P. falciparum parasites during which the child obtained a Blantyre coma score of 4 or less. Efforts to recruit these children were highly successful as only one of 88 families contacted refused participation. The mean age of the 87 participants at the time of hospital admission was 26 months (SD = 7, range = 10–42 months). The minimum interval between severe malarial disease and assessment was 42 months.
In consultation with clinical staff at the Kenya Medical Research Institute-Wellcome Trust Research Laboratories, Kilifi (K. Marsh & N. Peshu, personal communication, April, 1997), these 87 children were divided into medium- and high-risk malaria groups. The clinical variables considered in assigning children to these groups included laboratory findings (hemoglobin, parasite count, and blood glucose), observational measures of length and depth of coma, number of seizures, respiratory distress, and a biphasic pattern to the illness as defined by relapse into coma after having regained consciousness. An abnormal finding on these variables was based on clinical standards as described in a previous report (Holding et al., 1999). Based on the presence of at least five abnormal findings or gross neurological sequelae at discharge (e.g., sensory impairments or inability to walk or sit unaided), 17 children were assigned to the high-risk group. Eight of the children met both criteria, seven had more than five abnormal findings only, and two with fewer abnormalities had gross neurological sequelae at discharge. The 70 children who did not meet these criteria were assigned to the medium-risk group.
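The group-assignment rule described above can be sketched in a few lines of Python. This is only an illustration of the stated criteria, not the procedure actually used by the clinical staff: the `ClinicalRecord` type, its field names, and the example records are hypothetical, and the threshold of five follows the "at least five abnormal findings" wording of the criterion.

```python
from dataclasses import dataclass

# Assumption: threshold taken from the "at least five abnormal findings" wording.
ABNORMAL_FINDINGS_THRESHOLD = 5


@dataclass
class ClinicalRecord:
    """Hypothetical summary of one admission for severe malaria."""
    abnormal_findings: int             # number of clinical variables rated abnormal
    gross_sequelae_at_discharge: bool  # e.g., sensory impairment, unable to sit unaided


def assign_risk_group(record: ClinicalRecord) -> str:
    """Return 'high' if either criterion is met, otherwise 'medium'."""
    many_findings = record.abnormal_findings >= ABNORMAL_FINDINGS_THRESHOLD
    if many_findings or record.gross_sequelae_at_discharge:
        return "high"
    return "medium"


# Hypothetical records, one for each pattern described in the text.
examples = [
    ClinicalRecord(6, True),   # meets both criteria
    ClinicalRecord(5, False),  # abnormal findings only
    ClinicalRecord(2, True),   # gross sequelae only
    ClinicalRecord(2, False),  # neither criterion
]
groups = [assign_risk_group(r) for r in examples]
print(groups)
```

Children without a history of severe malaria fall outside this rule and form the low-risk group.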
The remaining 87 children in the sample were recruited from a community database kept at our research unit. Although none of these children had a history of severe malarial disease, all children in our region have malaria by the age at which the children were assessed. The latter group was thus designated as the mild, or low-risk, malaria group. The children with low-risk malaria were matched to the children with severe malaria on age, gender, and socioeconomic status, as defined by the mother's ability to speak English and maternal educational level. Sample characteristics are presented in Table 1.
As in Phase 2, a field worker visited each household to inform the families about the study and request participation. Children were assessed in three sessions. The first session took place at Kilifi District Hospital and involved screening for vision and hearing problems. The next two sessions, each lasting about 2 hr, were conducted at schools near the children's homes. The tests developed in Phase 2 were administered in a fixed order. The CBQFP was completed by means of parent interview in tandem with child testing. Parents also provided information regarding children's school history and general health status, as well as measures of family socioeconomic status. Respondents were most often the children's mothers.
Social and background factors that were considered in predicting cognitive performance included school attendance (yes or no), gender, English-speaking mother (yes or no), and father presence in the home (yes or no). Although measures of housing type and ownership of material goods have been used in other studies of outcomes of cerebral malaria in Africa (Boivin et al., 1996), these measures failed to distinguish families within our population. Results of preliminary analyses confirmed that each of these factors was associated with one or more of the cognitive measures. Correlations are listed in Table 2. There were no significant differences in these background factors between the groups, although there was a trend for higher proportions of non-English-speaking mothers in the high-risk group.
As a result of the piloting, we discarded some tests, removed or substituted test items, and identified the need for more extended pre-test training (see Appendix A). Based upon the distribution of scores obtained in piloting the tests, we added simpler items and included K-ABC subtests designed for preschoolers. Performance levels were enhanced by providing detailed preparatory instructions, including explanations for errors and successes, and by using familiar materials and representations (Brislin et al., 1973; Budoff, 1987; Carlson & Wiedl, 1979; Hedden et al., 2002; Lidz & Thomas, 1987).
Test–retest reliabilities for the final version of the tests were as follows: Face Recognition = .91 (p < .01), Visual Search = .84 (p < .01), Gestalt Closure = .81 (p < .01), Magic Window = .76 (p < .01), Pegboard = .76 (p < .01), Word Order = .75 (p < .01), Construction = .73 (p < .01), Number Recall = .70 (p < .01), Arithmetic = .63 (p < .01), Picture Vocabulary = .55 (p < .01), Hand Movements = .50 (p < .05), and Matrix Analogies = .35 (ns). Reliability thus was acceptable (r ≥ .7, Kline, 1993) for the majority of the measures.
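As a quick check on the summary statement above, the reported coefficients can be tallied against the r ≥ .7 criterion. The values below are those given in the text; the variable names are ours and purely illustrative.

```python
# Test-retest coefficients as reported in the text.
RETEST_R = {
    "Face Recognition": 0.91,
    "Visual Search": 0.84,
    "Gestalt Closure": 0.81,
    "Magic Window": 0.76,
    "Pegboard": 0.76,
    "Word Order": 0.75,
    "Construction": 0.73,
    "Number Recall": 0.70,
    "Arithmetic": 0.63,
    "Picture Vocabulary": 0.55,
    "Hand Movements": 0.50,
    "Matrix Analogies": 0.35,
}
ACCEPTABLE_R = 0.70  # criterion cited in the text (Kline, 1993)

acceptable = [name for name, r in RETEST_R.items() if r >= ACCEPTABLE_R]
below = [name for name, r in RETEST_R.items() if r < ACCEPTABLE_R]

print(f"{len(acceptable)} of {len(RETEST_R)} measures meet r >= {ACCEPTABLE_R}")
print("Below criterion:", ", ".join(below))
```

By this tally, eight of the twelve measures meet the criterion, and the four that fall short are Arithmetic, Picture Vocabulary, Hand Movements, and Matrix Analogies.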
Due to the need to make further modifications to the language assessment, we were unable to carry out test-retest reliabilities for Pragmatic Errors. Our speech samples, moreover, included a mean of only 37 utterances, far fewer than the mean of 180 obtained by Damico et al. (1980, 1983). Despite this limitation, application of the cut-off score used by these researchers to identify children with language disorder (>30% of utterances with errors) yielded a rate of disorder in our low-risk malaria group that was similar to the rate reported in their work. The latter finding suggests that the procedure was useful in identifying children with language problems.
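The cut-off described above amounts to a simple proportion rule, sketched below. The function names and the error counts are hypothetical; only the 30% threshold and the sample size of 37 utterances (the mean in our speech samples) come from the text.

```python
ERROR_RATE_CUTOFF = 0.30  # more than 30% of utterances with errors (from the text)


def error_rate(utterances_with_errors: int, total_utterances: int) -> float:
    """Proportion of utterances in a speech sample that contain pragmatic errors."""
    if total_utterances <= 0:
        raise ValueError("speech sample must contain at least one utterance")
    return utterances_with_errors / total_utterances


def exceeds_cutoff(utterances_with_errors: int, total_utterances: int) -> bool:
    """True when the error rate is strictly above the cut-off."""
    return error_rate(utterances_with_errors, total_utterances) > ERROR_RATE_CUTOFF


# Hypothetical error counts in a 37-utterance sample.
print(exceeds_cutoff(12, 37))  # 12/37 is about 0.32
print(exceeds_cutoff(11, 37))  # 11/37 is about 0.30, just under the cut-off
```

With samples this short, a single utterance can move a child across the threshold, which is one reason the limited sample length is noted as a limitation.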
As the CBQFP was administered on only one occasion, we were unable to examine test-retest reliability for this measure. Because the time between child test sessions was relatively short, stability estimates would have been inflated by parents' recall of their previous responses. However, we were able to evaluate inter-rater reliability by comparing scoring of the interview responses by the interviewer with scoring of interview scripts by the first author. Results of these comparisons confirmed a high degree of consistency across raters (r = .92). Internal reliability was also acceptable, with a standard item alpha of .61.
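For readers unfamiliar with internal consistency estimates, the following sketch computes Cronbach's alpha from item-level ratings. It is an assumption that this raw-score form approximates the coefficient reported above; the "standard item alpha" in the text may instead be the standardized variant based on inter-item correlations. The ratings shown are hypothetical.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha; rows holds one list of item scores per respondent."""
    k = len(rows[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Sample variance of each item across respondents.
    item_vars = [variance([row[i] for row in rows]) for i in range(k)]
    # Sample variance of the total (summed) score.
    total_var = variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical ratings: five parents, three questionnaire items scored 0-2.
ratings = [
    [0, 1, 1],
    [1, 1, 2],
    [2, 2, 2],
    [0, 0, 1],
    [2, 1, 2],
]
alpha = cronbach_alpha(ratings)
print(round(alpha, 2))
```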
The construct validity of the modified K-ABC test battery was investigated by conducting a principal components analysis with varimax rotation on seven of the subtests. Matrix Analogies was excluded from analysis because of this test's low reliability. To determine if the other cognitive assessments were related to skills assessed by the K-ABC or if they measured distinct abilities, a second principal components analysis was carried out on a larger set of cognitive measures. Measures entered into this second analysis, in addition to the K-ABC subtests, included Picture Vocabulary, Visual Search, Pegboard, and Pragmatic Errors. Tests with factor loadings of greater than .4 were used to interpret the factors. Z-score transformations of raw scores for the 157 children in the medium-risk and low-risk malaria groups were used in both factor analyses. Due to concerns about floor effects limiting variability in the high-risk group, data from this group were excluded from analysis.
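The analytic steps just described (z-score transformation, principal components extraction, varimax rotation, and interpretation of loadings greater than .4) can be sketched with NumPy as follows. This is a minimal illustration on simulated data, not a reproduction of our analysis: the data, the two-factor structure, and the function names are assumptions, and the rotation is a plain varimax without Kaiser normalization, which statistical packages may apply by default.

```python
import numpy as np


def pca_loadings(z, n_components):
    """Unrotated component loadings from the correlation matrix of z (n x p)."""
    corr = np.corrcoef(z, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)
    order = np.argsort(eigvals)[::-1][:n_components]  # largest eigenvalues first
    return eigvecs[:, order] * np.sqrt(eigvals[order])


def varimax(loadings, max_iter=100, tol=1e-6):
    """Orthogonal varimax rotation of a p x k loading matrix."""
    p, k = loadings.shape
    rotation = np.eye(k)
    criterion = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        target = rotated**3 - rotated @ np.diag((rotated**2).sum(axis=0)) / p
        u, s, vt = np.linalg.svd(loadings.T @ target)
        rotation = u @ vt
        new_criterion = s.sum()
        if criterion != 0 and new_criterion / criterion < 1 + tol:
            break
        criterion = new_criterion
    return loadings @ rotation


# Simulated scores: 157 cases, six tests, two underlying abilities (hypothetical).
rng = np.random.default_rng(0)
n = 157
ability_a = rng.normal(size=n)
ability_b = rng.normal(size=n)
raw = np.column_stack(
    [ability_a + 0.5 * rng.normal(size=n) for _ in range(3)]
    + [ability_b + 0.5 * rng.normal(size=n) for _ in range(3)]
)

z = (raw - raw.mean(axis=0)) / raw.std(axis=0, ddof=1)  # z-score transformation
unrotated = pca_loadings(z, n_components=2)
rotated = varimax(unrotated)
salient = np.abs(rotated) > 0.4  # loadings used to interpret each factor
print(np.round(rotated, 2))
print(salient)
```

Because the rotation is orthogonal, it redistributes variance between the factors without changing the proportion of each test's variance that the solution explains.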
The results of the factor analyses are presented in Tables 3 and 4. Two factors emerged from the first principal components factor analysis, accounting for 63% of the variance in performance (see Table 3). Subtests loading on the first and second factors were those previously identified as measuring Simultaneous Processing and Sequential Processing, respectively (Kaufman & Kaufman, 1983). The first factor accounted for 36% of variance and the second for 27%. The second principal components analysis yielded a three-factor solution, accounting for 60% of the variance (see Table 4). The Simultaneous-Sequential distinction was maintained in this analysis, with Visual Search and Pragmatic Errors comprising a third factor. Because the latter tasks require attention, organization, and planning, the third factor was interpreted as an index of executive function. The loading of the Pegboard Test on the Sequential factor is consistent with the fact that this task requires sequential motor movements. The secondary loading of Construction on the Sequential factor suggests that this procedure requires both spatial and motor sequencing skills.
To examine the sensitivity of the tests to the combined effects of biological and social risks, data from Phase 3 were analyzed using analysis of covariance (ANCOVA). The dependent measures for these analyses were raw test scores. Additional measures included K-ABC Simultaneous and Sequential composites, formed by averaging the sample z scores for the tests with loadings greater than .4 on these factors, and a K-ABC Total composite, formed by averaging the Simultaneous and Sequential composites. Preliminary analyses revealed several interactions between Disease Group (low-, medium-, and high-risk malaria) × School Attendance (yes or no); hence school attendance was included as a fixed factor. Examination of the interaction of these factors permitted a test of the moderating influence of school attendance on disease effects, as justified by findings of exacerbated disease effects in socially at-risk children (Taylor & Alden, 1997). The covariates in each analysis were those additional social risk factors (gender, English-speaking mother, father presence in home) that accounted for independent variance in a given cognitive measure, over and above the effects of other covariates (Jacobson & Jacobson, 1995). Post-hoc tests were conducted on covariate-adjusted scores using single degree of freedom interaction contrasts (Jaccard & Guilamo-Ramos, 2002). The predictive validity of the cognitive tests was evaluated by computing partial correlations of each of the tests with the modified K-ABC Arithmetic subtest and the CBQFP total problem score, controlling for school attendance.
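The composite scores described above can be illustrated with a short sketch. The raw scores are hypothetical, and the assignment of subtests to the Simultaneous and Sequential composites below is an assumption made for illustration only; the actual assignment followed the factor loadings reported in Table 3.

```python
from statistics import mean, stdev


def z_scores(values):
    """Standardize raw scores against the sample mean and standard deviation."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def composite(score_table, subtests):
    """Average the sample z scores across the named subtests, child by child."""
    z_by_test = [z_scores(score_table[name]) for name in subtests]
    return [mean(child) for child in zip(*z_by_test)]


# Hypothetical raw scores for four children.
raw_scores = {
    "Face Recognition": [7, 9, 5, 11],
    "Gestalt Closure": [4, 6, 3, 8],
    "Word Order": [5, 4, 6, 9],
    "Number Recall": [6, 5, 4, 8],
    "Hand Movements": [3, 5, 2, 6],
}

# Assumed subtest groupings, for illustration only.
simultaneous = composite(raw_scores, ["Face Recognition", "Gestalt Closure"])
sequential = composite(raw_scores, ["Word Order", "Number Recall", "Hand Movements"])

# Total composite: average of the two composites for each child.
total = [mean(pair) for pair in zip(simultaneous, sequential)]
print([round(t, 2) for t in total])
```

Averaging the two composites, rather than all subtests at once, gives the Simultaneous and Sequential dimensions equal weight in the Total score regardless of how many subtests contribute to each.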
As shown in Table 5, results from ANCOVA revealed main effects of school attendance on Magic Window, Word Order, Picture Vocabulary, and Pragmatic Errors. Schooled children outperformed unschooled children on each of these measures. Main effects were not found for malaria group. Effect sizes were moderate to large for school attendance and small for group.
Analysis also revealed several Group × School attendance interaction effects. Follow-up tests of these interactions indicated that the unschooled high-risk group had lower scores than unschooled children in the other two groups, but that group differences were not significant for schooled children. This pattern held for Face Recognition, Gestalt Closure, Hand Movements, Number Recall, Visual Search, Pegboard, and the three K-ABC composite scores (Simultaneous, Sequential and Total). Further post-hoc testing revealed that, for the high-risk group, schooled children did better than unschooled children on all cognitive measures. Subgroup means are presented in Table 6.
Test performance was also associated with all three covariates. Children whose mothers spoke English had higher scores on Gestalt Closure [F(1,166) = 5.9, p < .05]; Magic Window [F(1,166) = 6.4, p < .05]; Picture Vocabulary [F(1,166) = 8.7, p < .01]; K-ABC Simultaneous [F(1,165) = 6.77, p < .05]; and K-ABC Total [F(1,165) = 4.06, p < .05]. Father absence was related to lower scores on Visual Search [F(1,124) = 5.7, p < .05]. Finally, boys obtained higher scores than girls on Construction [F(1,167) = 18.2, p < .01]; Word Order [F(1,167) = 7.0, p < .01]; K-ABC Simultaneous [F(1,165) = 5.57, p < .05]; and K-ABC Total [F(1,165) = 4.21, p < .05].
Controlling for school attendance, all test scores were correlated with performance on the Arithmetic measure. Most scores were also associated with behavior ratings on the CBQFP. Higher cognitive performance predicted higher arithmetic scores and fewer behavior problems (see Table 7).
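A first-order partial correlation of the kind reported here can be computed from the three pairwise Pearson correlations. The sketch below uses hypothetical scores for six children; the function and variable names are ours, and the code is meant only to show the form of the calculation.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def partial_r(x, y, control):
    """Correlation of x and y with the linear effect of control removed from both."""
    rxy = pearson_r(x, y)
    rxc = pearson_r(x, control)
    ryc = pearson_r(y, control)
    return (rxy - rxc * ryc) / sqrt((1 - rxc**2) * (1 - ryc**2))


# Hypothetical data: cognitive score, arithmetic score, school attendance (1 = yes).
cognitive = [12, 15, 9, 20, 14, 18]
arithmetic = [3, 5, 2, 7, 4, 5]
schooled = [0, 1, 0, 1, 0, 1]

print(round(pearson_r(cognitive, arithmetic), 2))
print(round(partial_r(cognitive, arithmetic, schooled), 2))
```

When the control variable is unrelated to both measures, the partial correlation equals the simple correlation; the two diverge to the extent that school attendance is associated with the test scores.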
Our findings demonstrate the utility of a cross-cultural adaptation of the K-ABC and of other tests not originally designed for the children in our region. We followed a process that required considerable investment of time and resources and that culminated in fundamental changes in content and procedure. Alterations included modifications of the language of instruction, materials, task structure, and test demands. These modifications extended the applicability of testing to children in a rural region of Kenya and resulted in greater task engagement. Similar procedural modifications made by other investigators have yielded enhancements in test performance relative to unmodified test formats (Cazden & John, 1971; Chione & Buggie, 1993; Cole & Scribner, 1974; Kamara & Easley, 1977; Laboratory of Comparative Human Cognition, 1979; Serpell, 1979).
Consistent with previous observations of the cultural specificity of non-verbal tests (Anastasi, 1988; Naglieri & Prewett, 1990; Parmar, 1989), we had particular difficulty adapting the K-ABC Matrices and Triangles subtests (see Appendix A). The Matrices task failed to engage the children and was the least reliable of the subtests. Jahoda et al. (1976) also observed weaknesses in rural, unschooled children on tasks requiring the interpretation of complex pictures or analysis of spatial relationships within pictures. These weaknesses have been attributed to lack of familiarity with the material, rather than to deficits in perceptual or reasoning skills (Gregory, 1966; Wober, 1975). The difficulties we found in adapting Matrices may reflect a similar lack of experience with abstract visuoperceptual tasks (Kline, 1993). Future efforts to adapt these tests to non-Western cultures might include the provision of extended training and feedback on task requirements (Carlson & Wiedl, 1979).
The reluctance of the children to engage in Triangles was so extreme that we were compelled to design a replacement task. Giordani et al. (1996) made a similar observation in testing children in the DR Congo who were older and had more school experience than our children. Our data suggest that the replacement task, Construction, was a good substitute. This task loaded on the Simultaneous factor, as does Triangles, and we found a similar cross-loading on Sequential Processing (Kaufman & Kaufman, 1983).
Test materials that have been translated in unmodified form have characteristically revealed significant differences between Simultaneous and Sequential Scores. In testing children in the DR Congo, Giordani et al. (1996) found a mean discrepancy of 17 points in favor of Sequential processing. Similar splits of 20 and 14 points, respectively, have been reported in studies of rural Lao children (Boivin et al., 1996) and of Korean children (Moon, 1989). These splits have also been attributed to lack of exposure to objects or events depicted in the Simultaneous subtest materials, rather than underlying differences in cognitive process, further justifying our radical modification of the test materials.
Despite the changes made to test content and procedures, factor analysis identified Simultaneous and Sequential processing dimensions similar to those reported by Kaufman and Kaufman (1983). Factor analysis of a translated but largely unmodified administration of the K-ABC to a Congolese sample by Giordani et al. (1996) yielded a four-factor solution. However, these researchers also found that a two-factor solution was similar to that reported by Kaufman and Kaufman. The more fundamental changes made in designing the Kilifi battery thus resulted in a somewhat better fit to the original factor structure. Moreover, the reliabilities of all but two of our subtests exceeded the highest reliability reported in the study by Giordani et al.
Results from the factor analysis that included our additional tests suggested that Visual Search and Pragmatic Errors loaded on a factor that was distinct from the two K-ABC factors. The demands of both tasks on response organization and mental planning raise the possibility that this construct may tap aspects of executive function. Measures of social discourse skills and perceptual search are associated with other tests of executive function, and deficits in these skills have been linked to frontal lobe damage (Brookshire et al., 2000; Dennis et al., 2001; Lewis, 2001; Pennington, 1997). The results of follow-up studies of children with early brain insults suggest that tests of executive function provide useful measures of biological risks (Dennis et al., 2001; Taylor et al., 1996). For the present purposes, however, an important implication of the separate loadings of Visual Search and Pragmatic Errors is the justification this finding provides for broadening assessments beyond the dimensions measured by traditional test batteries. The addition of tests that tap different aspects of attention may increase sensitivity to disease effects (Boivin, 2002; Mirsky & Duncan, 2001). Pragmatic language evaluations might include elicitation and analysis of narratives (Strong, 1998), storytelling dyads, and parental interview schedules (Carter et al., 2003).
Associations of test performance with the severity of malarial disease and social risk factors offered further support for test validity. Disease effects were revealed by the lower scores obtained by the high-risk malaria group compared with the low-risk group on tests of sequential memory (Hand Movements, Word Order, and Number Recall), executive function (Visual Search), and visuomotor speed (Pegboard). The fact that disease effects were found only in the subgroup of unschooled children with high-risk malaria clarifies findings from previous analyses (Holding et al., 1999). By taking both disease severity and school attendance into account in the analysis, we were able to demonstrate the effects of disease on several K-ABC subtests. In our original study, which did not consider these factors, disease effects were found for an overall impairment index but not for individual subtests. Indications that disease sequelae were confined to the subgroup of children with high-risk severe malaria, defined in terms of multiple clinical abnormalities or neurological complications at discharge, are consistent with past investigations of this disease and other childhood encephalopathies (Boivin, 2002; Dougbartey et al., 1998; Muntendam et al., 1996; Taylor et al., 1990). Either less severe forms of cerebral malaria do not affect cognition, or our measures or study design did not provide sufficient statistical power for the detection of more subtle sequelae.
Of the social background factors considered, school attendance most consistently predicted children's test scores. Formal education and literacy may affect cognitive development and test taking in a number of ways. Formal schooling is associated with enhanced cognitive development generally and with verbal skills in particular (Cahan & Cohen, 1989; Huttenlocher et al., 1998; Rutter, 1985). Previous studies comparing schooled and unschooled samples have found differences in the level of performance on formal assessment tasks, in the strategies applied to reasoning and memory tasks, and even in brain structure (Castro-Caldas et al., 1999; Cole & Scribner, 1974; Dash & Das, 1984; Luria, 1971; Olson, 1976; Wagner, 1974). The superior performance of our school-going children also may have reflected an increased confidence and willingness to attempt the tasks, possibly as a consequence of greater familiarity with test content and expectations (Kamara & Easley, 1977; Miller-Jones, 1989; Rogoff et al., 1984).
Biases affecting parents' decisions to send their children to school may also account for associations between school attendance and test scores (Ceci, 1991). Amongst the Mijikenda, the decision to send a child to school is based on both economic resources and on parents' perceptions that their children are mature enough to benefit from this experience (Holding & Katana, 1997). Parents' estimates of children's cognitive functioning may be particularly relevant to school attendance in our region, where dates of birth are not routinely kept by parents and where there are few formal records against which to verify parent-based judgments of children's chronological ages. Children who are perceived by their parents as having cognitive weaknesses may simply not have the option of attending school. To the extent that such a bias influenced school attendance for the present sample, children with disease-related cognitive deficits would have been less likely to attend school than other children. Attendance bias, rather than the effects of schooling per se, may thus have accounted for the lack of group differences in the schooled subset of the sample. Consistent with this possibility, children in the high-risk group were less likely to go to school than children in the other two groups. Whatever their origin, the moderating effects of schooling on disease effects demonstrate the importance of considering background factors in evaluating disease outcomes (Taylor & Alden, 1997).
Associations of test performance with father absence and mother's ability to speak English provide further evidence for the effects of social risk on outcome. These variables represent multiple economic and sociocultural characteristics of the child's environment. Father absence, for example, reflects both limited economic resources and family instability, factors with established links to child cognitive and behavioral outcomes (Ackerman et al., 1999). The mother's ability to speak English is a proxy for both maternal educational experience and achievement, and it may also be related to family wealth. An association between maternal education and child outcome has been reported in diverse cultures (Boivin et al., 1996; Stevenson et al., 1990; Wang et al., 1995). One explanation for this relationship is that mothers who remain in school for longer periods are more likely to expose their children to school-type materials and tasks. These experiences, in turn, may enhance children's acquisition of test-relevant cognitive skills or their ability to comply with test demands (Greenfield, 1997).
Associations between test scores and measures of academic and behavioral competency documented the external validity of the neuropsychological battery. Specifically, higher scores on a number of tests predicted both higher scores on Arithmetic and lower parent ratings of behavior problems on the CBQFP, even after taking school attendance into account. As in previous studies of North American samples, neuropsychological assessments of our children proved useful both in assessing disease consequences and in identifying predictors of competence in activities of daily living (Berninger & Rutberg, 1992; Morrison & Siegel, 1991; Rourke, 1982; Taylor et al., 1996). The association of test performance with behavior problems has special relevance in rural Africa, where children's behavioral adjustment may be more highly valued by the community than their academic skills (Dasen, 1988; Durojaiye, 1993; Harkness & Super, 1977; Serpell, 1993; Wober, 1975).
One of the limitations of the study is that we sampled only a restricted range of cognitive functions and daily living skills. Our battery did not provide a comprehensive assessment of the neuropsychological domains tapped by most Western neuropsychological batteries (Fletcher et al., 1995; Yeates & Taylor, 2001). For example, we included only one test designed to measure attention (Visual Search). Additional assessments of attention and executive function may have enhanced the sensitivity of our test battery to disease effects. Measures of the environment and the child's competence in meeting the demands of everyday living were also of limited scope. The latter limitation is shared with North American and European investigations, which typically have examined only a narrow range of achievement and behavior measures. Cognitive skills are well-recognized predictors of academic achievement (Brown & French, 1979; Campione, 1989; Frisby, 1998; Stevenson et al., 1985; Vernon, 1967; Zeidner, 1987). Few investigations, however, have explored associations of these skills with other aspects of everyday functioning, such as the ability to assist in the raising of children or in maintaining family livelihood. Assessment of more culturally specific skills, including vocational and community functioning, requires a detailed knowledge of what Super and Harkness (1986) describe as the child's “developmental niche.” According to their conceptualization, competencies needed for successful adaptation are determined by physical and socio-cultural demands, customs of child rearing, and parental expectations. Examples of these assessments include the measures of practical abilities devised by Sternberg (Sternberg et al., 2001; Sternberg & Grigorenko, 2002).
Some investigators argue that measures of psychological development may be so culturally specific as to preclude the possibility of measuring a common set of cognitive constructs (Greenfield, 1997; Rogoff & Chavajay, 1995). Acceptance of this view would severely constrain efforts to determine if environmental or disease effects generalize across cultures and to compare the effects of different treatment strategies or contextual influences on disease outcomes. Assessment of common cognitive abilities across cultures would enable broader sampling of disease effects and enhance understanding of disease manifestations. However, such efforts will require the demonstration of cross-cultural equivalence of test constructs. The present approach to the challenge of cultural specificity was to determine if test adaptations could be made without altering underlying measurement constructs. The utility of this approach will be supported if test adaptations are sensitive to the effects of a given disease, and if similar disease effects can be detected across cultures. The success of this approach will be further supported by findings showing that test procedures are sensitive to the effects of a variety of diseases and neurodevelopmental disorders (Olness, 2003). Another approach is to focus more exclusively on elemental cognitive processes, such as response speed or visual working memory. The Cognitive Abilities Tests (CAT; Detterman, 1988) is an example of the latter method that has proved useful in assessing disease outcomes in a number of different cultural contexts (Cueto et al., 1998; Lozoff et al., 2000; E. Pollitt, personal communication, April 3, 2002). Performance on the elementary tasks included in the CAT is similarly distributed across different cultural groups. Furthermore, when considered conjointly, these tests may help to account for variability in more complex cognitive functions (Detterman & Thompson, 1997; Fagan, 2000; Nell, 1999).
Although our findings demonstrate the utility of the Kilifi test adaptations and suggest that test modifications can be made without sacrificing construct validity, we do not claim to have established the existence of common latent structures of the mind (Sternberg & Grigorenko, 2002; Vernon, 1971). We do not know if children brought up in different cultures use similar strategies to solve similar problems. Addressing this fundamental issue will require that we submit our test methods to the rigors of experimental manipulation and more direct cross-cultural comparisons. A similar memory process, for example, would be suggested if, after controlling for the effects of differential experiences on task performance, the effects of manipulations in task parameters (e.g., quality of stimulus presentation or information load) were the same across cultural groups.
We hope our success in measuring neuropsychological outcomes of cerebral malaria in a rural area of Kenya will encourage others to assess more systematically the outcomes of the numerous environmental and health-related risks to which thousands of children in developing countries are exposed annually. The sensitivity of our test battery to both biological and social risk factors suggests that the validity of neuropsychological tests extends beyond the culture for which the tests were originally designed. Cultural influences on test performance cannot be avoided, but can be accommodated in a manner that preserves essential psychometric properties and allows comprehensive assessment of multiple cognitive outcomes.
This study was supported by KEMRI, the Wellcome Trust, and The Directors Initiative Fund of the UNDP/World Bank/WHO Special Programme for Research and Training in Tropical Diseases (TDR), Dr. Holding, principal investigator. We would particularly like to thank the field team, H. Katana, and Mzee Magongo. Thanks also go to the staff of the KEMRI Unit for their clinical support, and Cecile Gunning for her contribution to the development of the CBQFP. We are grateful to Tore Godal (WHO/TDR) and Professor Kevin Marsh (Director, Wellcome Trust Research Laboratories Kilifi) for their support in launching this project, and to Professors K. Connolly and R. Serpell for their comments on an earlier draft of the paper. This paper is published with the permission of the Director of KEMRI.
This appendix provides a summary of the strategies applied and changes adopted in developing the Kilifi test battery. For tests for which concerns remain even after our modifications, we recommend further strategies for test development.
1. K-ABC Magic Window requires the child to name a pictured object whilst it is rotated behind a narrow slit that displays the picture in segments. Modifications to this subtest included substituting familiar drawings for the original ones (replacements include a watch for the clock, a cow for the elephant, a cap for the hat, and a ball for the apple) and showing children the complete drawings as feedback following the first three items.
2. In K-ABC Face Recognition, the child selects faces from group photographs following a brief presentation of individual faces. This test was modified by substituting photographs of persons from our region for the original faces. Photographs are much sought after in this culture, but are not commonly available. We allowed unlimited scanning of the first sample item to orientate the children to the task. The orientation procedure and substitution of photographs enhanced task performance.
3. K-ABC Gestalt Closure requires the child to identify and name a series of incomplete silhouette drawings of everyday objects. Substitutions were made for unfamiliar items, such as the camera and typewriter. More familiar exemplars of other items, such as the chair and dog, were also substituted for the original drawings.
4. K-ABC Matrix Analogies involves the selection, from a multiple-choice array, of a design that completes a 2 × 2 visual analogy. We modified this subtest by replacing the initial picture analogies with other simple pattern analogies, as the picture analysis proved too difficult for our children. As a further alteration, we did not apply the orientation rule. Observation of children's responses indicated that the purpose of the task was unclear to them, and revisions extending the instructions thus were made. We recommend that future applications of this task take into account the relative cultural specificity of the material and the demands of the task, and include an extended teaching phase. Particular care needs to be taken with children with little or no school experience. Further investigations should also examine the cognitive construct measured by this task. It is unclear, for example, if the task taps the ability to learn or identify new rules and relationships or if it is more sensitive to attentional skills.
5. In K-ABC Triangles, the child uses triangular rubber pieces, colored blue on one side and yellow on the other, to construct a design depicted in two dimensions. All children included in the early piloting of this task (from two nursery schools) had difficulties in using the rubber triangles. Several were unwilling to even touch the materials and others had difficulty in manipulating them. The rubber triangles were thus replaced with more familiar materials as suggested by a task described by Rutter et al. (1970). The revised task, renamed Construction, requires the child to assemble a shape with wooden sticks to match the shape presented in a three-dimensional template. Test-retest reliabilities were initially low, but test performance and reliability improved when the items were re-ordered by level of difficulty and the number of items was extended.
6. K-ABC Hand Movements requires the child to repeat a series of hand movements in the same sequence as the examiner. Although the children appeared to understand task demands, reliability was initially low. Informal observation suggested that the length of the task might have drawn on children's ability to sustain attention. The development of a format for this test similar to that employed in the British Ability Scales (Elliot et al., 1983) reduced administration time and improved reliability. In this format, the items are grouped by level of difficulty and credit is given for all the items at one level so long as the child is able to pass the first two items at that level. The revised format reduces the number of items that a child must complete to reach the more challenging levels of the task.
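The level-based crediting rule described for the revised Hand Movements format can be sketched in Python as follows. This is only an illustration: the function name, the argument names, and the four-item level are hypothetical, and the handling of a child who does not pass the first two items is an assumption, since the text does not state it.

```python
from typing import Sequence


def level_credit(passed: Sequence[bool], items_in_level: int) -> int:
    """Credit earned at one difficulty level.

    passed: pass/fail results for the items actually administered at this
    level, in presentation order.
    """
    if len(passed) >= 2 and passed[0] and passed[1]:
        # Rule from the text: passing the first two items earns credit
        # for every item at the level.
        return items_in_level
    # Assumption (not stated in the text): otherwise, credit only the
    # items the child actually passed.
    return sum(1 for p in passed if p)


# Hypothetical level containing four items
print(level_credit([True, True], 4))                # 4
print(level_credit([True, False, True, False], 4))  # 2
```

Under this reading, a child who passes the first two items moves to the next level after two trials rather than four, which is how the format shortens administration time.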
7. In K-ABC Number Recall, the child repeats a series of numbers spoken by the examiner. As number words in the Mijikenda languages tend to have more syllables than their English equivalents, we excluded the longer Mijikenda number words from our subtest.
8. In K-ABC Word Order, the examiner reads a list of words to the child, who then has to point to picture representations of the words in the same order in which they were read. Although the original words were appropriate for our children, some of the pictures were replaced by more familiar exemplars to improve recognition (e.g., the shape of the house and cup were altered).
9. In K-ABC Arithmetic, test items are related by a common theme, a trip to the zoo, with colored pictures used as prompts. As a zoo is an unfamiliar concept in our setting, a bus journey to a wedding was used instead. The test items retained after piloting of the task assessed counting, matching of quantity, understanding of ordinal aspects and conservation of number, and applications of subtraction, division, and multiplication. Items excluded were those assessing knowledge of shape words and number symbol identification. Each item was assigned a score of 1–3, representing three levels of difficulty. A score of 3 was assigned if no prompts were provided, a score of 2 if verbal and gestural prompts were given, and a score of 1 if more direct guidance was needed to help the child solve the problem. Sticks were provided to aid in counting and their use actively encouraged.
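The three-level item scoring described for the adapted Arithmetic test can be sketched as follows. The mapping from level of prompting to item score follows the text; the label strings, the function name, and the summing of item scores into a total are assumptions made for this illustration.

```python
# Item score by the amount of help needed (mapping taken from the text).
SCORE_BY_PROMPT = {
    "none": 3,             # solved with no prompts
    "verbal_gestural": 2,  # verbal and gestural prompts given
    "direct_guidance": 1,  # more direct guidance needed
}


def arithmetic_total(prompt_levels):
    """Sum the item scores. Summing is an assumption of this sketch."""
    return sum(SCORE_BY_PROMPT[level] for level in prompt_levels)


# Hypothetical record of four items for one child
print(arithmetic_total(["none", "verbal_gestural", "none", "direct_guidance"]))  # 9
```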
10. The 40-item Picture Vocabulary Test assessed vocabulary. In this test each stimulus word is accompanied by four black and white drawings, including the correct depiction of the spoken word and three foils. The test is similar to the Peabody Picture Vocabulary Test (Dunn & Dunn, 1981), but includes different pictures and vocabulary items. Test items were based on work done in Kenya by Sigman et al. (1989, 1991; personal communication, July 1993). Further modifications of the Sigman pictures were made to increase familiarity (e.g., pictures of coastal houses were substituted for those of houses found in the interior of Kenya, which have different shapes and use different roofing materials).
11. To assess visual attention, we administered a task similar in design to that used by Baddeley et al. (1995), referred to as Visual Search. In this task, children are presented with a sheet of silhouette drawings organized in rows. The majority of the drawings were the same ones we developed for the K-ABC Word Order task. The task requires that the child draw a line through a target picture, but not through the other pictures. After demonstrating the procedure, the child completes a practice sheet, with immediate prompting to correct mistakes and to emphasize the importance of response speed. Three sheets are presented, with the target stimulus appearing randomly either once or twice in each row. To account for both speed and accuracy, performance is scored according to the formula: (total possible responses − errors)/total time taken.
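As a small illustration of the Visual Search scoring formula, the sketch below computes the score for a single child. The function and argument names, the use of seconds as the time unit, and the example values are assumptions; only the formula itself comes from the text.

```python
def visual_search_score(total_possible: int, errors: int, total_time: float) -> float:
    """Speed-accuracy score: (total possible responses - errors) / total time taken."""
    if total_time <= 0:
        raise ValueError("total_time must be positive")
    if errors < 0:
        raise ValueError("errors cannot be negative")
    return (total_possible - errors) / total_time


# Hypothetical child: 30 possible responses, 3 errors, 90 seconds across the three sheets
print(visual_search_score(30, 3, 90.0))  # 0.3
```

Because time is in the denominator, a child who makes the same number of errors but works more slowly receives a lower score.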
12. Relative to the syntax or grammar of a language, many aspects of language pragmatics have commonalities across language groups. A measure of Pragmatic Errors developed by Damico et al. (1980, 1983) provides an approach to assessment appropriate in evaluating children whose mother tongue is different from that of the examiner. In our application of this method, speech samples were recorded and the recordings then reviewed to identify pragmatic errors within defined “speech utterances.” Errors were coded as delays before responding, linguistic non-fluencies, revisions, non-specific vocabulary, inappropriate responses, poor topic maintenance, and need for repetition.
To elicit speech samples, we first explored the use of story telling sessions. Children from cultures with an oral tradition are likely to be familiar with this format (Olson, 1976). However, we found children reticent to speak in adult company, with only 22 of 56 children (39%) in our sample producing a story of any length. Three children refused to speak at all during testing, despite performing adequately on the K-ABC measures that did not require verbalization. We therefore developed a multi-step procedure for collecting speech samples.
During the initial cognitive assessment session, the child was introduced to the Dictaphone machine, as often this was the first time the children had listened to recordings of their own voices. The child was then slowly encouraged, through a series of increasingly more open-ended tasks, to tell a story and to speak freely to the assessor. After being asked some prepared questions about the child's home, the child was asked to tell riddles (Ngub'usim, 1988), a common pastime for children in our community. Story telling was then introduced, and if children failed to volunteer a story, or if the story was brief, they were told a prepared story accompanied by pictures. Further sets of pictures were used to encourage the children to produce their own stories. Children and their parents were asked to prepare a further story for the subsequent visit by practicing “a long story” with brothers and sisters at home. On the second visit, the examiner taped and transcribed the children's spontaneous stories or, failing that, the stories they produced to accompany pictures presented by the examiner. Scoring was based on these taped language samples.
Despite our efforts, reticence to speak freely with the examiner present continued to be a major barrier to the elicitation of lengthy speech samples. Two possible future approaches might be to extend the data collection over several sessions, or to elicit information on pragmatics from parental interview (e.g., Carter et al., 2003).
13. Behavior was assessed by the Child Behaviour Questionnaire for Parents (CBQFP), a 15-item scale similar in content and format to the Behaviour Screening Questionnaire (BSQ, Richman et al., 1982). CBQFP items describe the child's development in the areas of social cognition, self-help skills, independence, tolerance for change/frustration, attention/concentration, antisocial behavior, and emotional lability. The questionnaire is administered in an interview format, with parental responses recorded by the interviewer for later scoring. Responses are coded on a 3-point scale (3: always a problem; 2: sometimes a problem; 1: not a problem).
The CBQFP was developed from an initial list of over 50 items compiled from several sources, including Vineland Adaptive Behavior Scales (Sparrow et al., 1984), the BSQ (Richman et al., 1982), and the Screening Test for Children from 6 Months to 6 Years (Kenya Institute of Science and Education, 1984). Items were translated into Kiswahili and Kigiriama through a system of back translation. Item selection and an appropriate interview format were developed through a series of 40 interviews monitored by a panel consisting of a research assistant fluent in the local vernacular and two psychologists. At the end of each interview, the parent was encouraged to comment on the content and procedure. Parent responses and comments were considered in developing the final format of the interview.
Items were discarded if it was not possible to find an appropriate translation (e.g., we were not able to identify equivalents for “gluttony” and “teasing”) or if more than 12% of the pilot sample of parents gave no response. Items to which parents were unresponsive included questions concerning lying, respect for elders, stealing, having confidence, understanding danger, avoiding eye contact, and using bad language. Many of the items pertained to “moral development” and their meaning may have been unclear when translated. Parents may also have been reticent to discuss these issues with strangers. Differences in responses according to child age and gender were investigated, and items that showed some variability at the 6-year-old level and were appropriate for both boys and girls were retained.
Our experience with an initial set of 40 questionnaires indicated that parents found the 3-point scale too rigid. A more conversational approach was thus adopted in which the interviewer was supplied with a series of related prompts to guide the parent's description of their child's behavior. Parental responses were recorded for later coding by the interviewer. Parents were more willing to share their impressions of the child using this format, rather than to simply give yes or no responses. A final draft was piloted on 10 additional parents to ensure ease of administration and clear and consistent recording of parent responses.
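The two discard criteria used in screening candidate CBQFP items (no appropriate translation, or more than 12% of pilot parents giving no response) can be expressed as a simple check. The sketch below is illustrative only: the function name, argument names, and example counts are hypothetical, and it does not cover the later review of variability by child age and gender.

```python
MAX_NO_RESPONSE = 0.12  # cutoff from the text: discard if more than 12% gave no response


def retain_item(has_translation: bool, n_no_response: int, n_parents: int) -> bool:
    """Return True if a candidate item passes both discard criteria."""
    if n_parents <= 0:
        raise ValueError("n_parents must be positive")
    if not has_translation:
        return False
    return n_no_response / n_parents <= MAX_NO_RESPONSE


# Hypothetical counts from a pilot sample of 40 parents
print(retain_item(True, 4, 40))   # True  (10% gave no response)
print(retain_item(True, 5, 40))   # False (12.5% gave no response)
print(retain_item(False, 0, 40))  # False (no appropriate translation)
```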