INTRODUCTION
According to a recent (December 2005) report by the Chinese Ministry of Health, the United Nations Program on HIV/AIDS (UNAIDS) and the World Health Organization (WHO), the HIV epidemic in China has been growing exponentially over the past several years. The reported estimates were that 650,000 people are living with HIV infection in China, with 70,000 new cases of infection occurring in 2005. Also, although the HIV epidemic in China began among injection drug users and commercial sex workers, it has been spreading from these high risk groups to the general population (China Ministry of Health and UN Group report, 2003). Because China is the most populous country on earth, with more than 1.3 billion people, the public health consequences of the Chinese HIV epidemic are of substantial international concern.
HIV enters the central nervous system (CNS) early after infection, and frequently is associated with structural and functional brain abnormalities (Grant et al., 1987; Masliah et al., 1992; Navia et al., 1986). Risks for significant brain involvement are highest in the more advanced stages of HIV disease (approximately 50%), but such changes also occur in approximately a third of early, medically asymptomatic carriers (Heaton et al., 1995; White et al., 1995).
Much of the research concerning HIV-related brain involvement has used standardized neuropsychological (NP) tests as outcome measures (Grant & Martin, 1994; White et al., 1995). The sensitivity of NP testing to brain disorders of diverse etiologies is well established (Lezak et al., 2004). Within the context of HIV infection, results of NP testing have shown robust associations with structural and functional brain imaging (Jernigan et al., 1993; Stout et al., 1998), as well as with postmortem neuropathology findings (Cherner et al., 2002; Everall et al., 1999; Masliah et al., 1997; Moore et al., 2006). Moreover, NP impairment in HIV-infected persons has been shown to be an independent predictor of early mortality (Ellis et al., 1997; Mayeux et al., 1993) and to be strongly predictive of a wide variety of difficulties in activities of daily living (Heaton et al., 2004a).
With a few notable exceptions (Maj et al., 1993, 1994; Yang et al., 1999; Yepthomi et al., 2006), most NP research with HIV+ populations has been conducted in Western countries, usually with NP tests that were standardized in English. It is unclear whether their results will generalize to other parts of the world, where the HIV epidemic has reached massive proportions, and where the specific manifestations of HIV-associated brain disease are largely unknown.
Chan et al. (2003) recently reviewed 123 NP studies (none of which examined neuroAIDS) that were conducted in Asian countries between 1981 and 2002. Studies in mainland China and Hong Kong primarily used measures that were originally standardized in the United States and then translated, sometimes in modified form, for use with Chinese populations. These included well-known instruments, such as the Wechsler Adult Intelligence Scale-Revised (WAIS-R), the Wechsler Memory Scale (WMS), Chinese adaptations of the Halstead-Reitan Battery and Luria-Nebraska Battery, the Wisconsin Card Sorting Test, the Category Fluency Test, Color Trails, the Hiscock Forced Choice Digit Memory Test (for assessing effort), and various aphasia batteries and dementia screening instruments. In an effort to identify tests that might have immediate applicability in clinical and research applications, Chan et al. determined whether measures (1) had evidence of accurate translation (by independent back-translation into English), (2) used local norms that included Asian samples of more than 50 subjects, and (3) provided evidence of convergent or discriminant validity, and cross-cultural equivalence. The authors reported that only two tests used in mainland China and three in Hong Kong met these criteria. These included two dementia screening tests, two clinical memory tests, and a category fluency test.
The authors concluded that, although many Western NP tests have been used in China, adequate local/national norms and evidence of cross-cultural equivalence of the tests are frequently lacking. This finding may have more to do with a dearth of research than limitations of the tests. Nevertheless, the relative lack of proven instruments and local norms is a major challenge in designing neurobehavioral research in China.
Starting in early 2003, the HIV Neurobehavioral Research Center (HNRC) group, from the University of California at San Diego (UCSD), began discussing with investigators from the China Comprehensive International Program for Research on AIDS (CIPRA) the possibility of establishing a collaboration to conduct neuroAIDS studies in China. To determine the feasibility of such a study, a pilot project was initiated to examine the adaptability and validity of neurocognitive assessments used at HNRC, in HIV-infected persons (HIV+) in China. In this article, we report our findings on these neurocognitive assessments in HIV+ and demographically matched HIV-uninfected persons (HIV−) in China, and compare their results with those from HIV+ and demographically matched HIV− persons in the United States. As such, this study is the first to examine effects of HIV on neurocognitive status in a Chinese sample, and also the first to compare results between China and the United States to explore the cross-cultural validity of the NP instruments used.
METHODS
The China HIV Neurocognitive Feasibility Study was launched in early 2004 after more than 1 year of preparatory work. This work included face-to-face meetings between investigators from both countries, agreement on research objectives, discussion of feasibility and sampling design, selection of research instruments, obtaining permission from U.S. test publishers to translate and adapt Western tests for use in the research, translation and back-translation of the selected English-language instruments, revision and modification of culturally inappropriate items, and training of examiners in China. The study was approved by the Institutional Review Boards of the China Center for Disease Control (CDC), the National Center for AIDS (NCAIDS), and UCSD. Written informed consent was obtained from all participants after the research procedure had been fully explained to them.
Participants
The Chinese participants were HIV+ men and women 18 to 50 years of age, and demographically comparable HIV− controls. Individuals with a history of non–HIV-related confounding neuromedical factors that might potentially cause impairment of neurocognitive function were excluded from the study. These exclusion criteria included head injury with unconsciousness greater than 30 min, non–HIV-related neurological disorders (e.g., epilepsy, stroke), psychotic disorders (schizophrenia and bipolar disorder), and potentially significant levels of current substance use, defined as more than two alcoholic drinks per day over the past 30 days, or use of any illegal drugs three times or more per week in the past 30 days. One HIV+ participant was excluded due to illiteracy in Mandarin (which was not his first language), and one HIV− participant was excluded due to visual disability.
To sample from both urban and rural populations in China, this study was conducted in two different areas: the city of Beijing and rural Anhui province. Participants in Beijing were enrolled from the HIV clinic at You An Hospital, and those in Anhui were recruited from a local HIV screening station sponsored by the China Comprehensive AIDS Response (CARES) program. Recruitment for this study was conducted via word-of-mouth, posted announcements, flyers distributed by the local CDC clinics, and nurse recruiters at HIV clinical sites. All participants signed the study consent before they were formally enrolled in this study. The Beijing site was chosen because two investigators (Drs. Wu and Yu) were physically located in Beijing and able to provide immediate supervision during this feasibility study. After the initial group of 21 HIV+ and 16 HIV− participants had been recruited and tested in Beijing, and this experience was deemed satisfactory, the protocol was field-tested with 7 more HIV+ and 7 more HIV− participants in rural Anhui province, which is the chosen site for a larger follow-up investigation. Altogether, 28 HIV-infected participants and 23 HIV− controls completed the feasibility study. All HIV− participants received the HIV Quick Test (OraSure Technologies, Inc., Bethlehem, PA) to confirm their HIV status before being enrolled.
To explore the generalizability of HIV effects on Western test instruments across cultures, U.S. HIV+ and HIV− participants also were included. These participants were selected from larger samples from the HNRC to be as comparable as possible to the China samples with respect to demographic characteristics, stage of HIV disease, treatment status, and absence of non–HIV-related risk factors for neurocognitive impairment (see previously described criteria of exclusion).
The demographic and clinical characteristics of all groups are presented in Table 1. Chinese and U.S. groups were comparable for age and gender. Despite our effort to find education-comparable U.S. samples, education levels in China tend to be lower and we were unable to match across countries on this variable. However, there were no differences between HIV+ and HIV− groups within either country.
Most HIV+ individuals in both countries had advanced disease and met U.S. CDC criteria for AIDS (CDC, 1993). Also, most HIV+ participants were being prescribed antiretroviral treatment, although the U.S. sample was more immunosuppressed (lower CD4 count, see Table 1). However, in both the Chinese and U.S. samples, current CD4 cell counts were not correlated with a summary measure of NP performance (The Global Deficit Score, GDS defined below). Lastly, results on the Beck Depression Inventory (BDI; Beck, 1976) and the Patient's Assessment of Own Functioning Inventory (PAOFI, Chelune et al., 1986) indicate that both HIV+ groups, relative to their respective HIV− controls, had comparably increased depressive symptoms and self-reported cognitive difficulties in their everyday lives (Table 1).
Procedure
In addition to providing demographic information, every participant underwent a comprehensive neurocognitive evaluation, structured psychiatric examination, as well as assessment of daily functioning. In HIV+ participants, HIV staging and recent CD4 counts were also recorded.
Neuropsychological Evaluation
Test selection and adaptation
Together, the U.S. and Chinese investigators evaluated the entire HNRC neurobehavioral protocol to gauge each test's appropriateness for use in China. Some of the tests (or very similar instruments) had been used in China for some time and were known to be acceptable. Most of the other instruments in the HNRC battery also were judged to be acceptable with few or no modifications (other than translating the tests into Mandarin). It was anticipated that some participants would have little or no education; only one of the tests requires participants to read (and just a few simple words). A few tests require some simple counting or addition of single-digit numbers; the Chinese investigators thought all participants could do this, because even those with minimal education perform simple math and counting in their daily lives. Once the instruments were agreed upon, we contacted the publishing companies and obtained permission to translate these instruments and instructions into Mandarin. The Chinese investigators translated these instruments from English to Mandarin. Back-translation was performed independently by a professional translator with no prior knowledge of these instruments.
The test battery selected for the current research taps multiple cognitive–motor ability domains that repeatedly have been found to be affected by HIV-associated brain disease in the United States (listed in Table 2). The battery was carefully reviewed by Chinese mental health professionals, who considered the U.S. tests to be culturally appropriate for the study populations in China. With the permission of the test publishers, minor adaptations of certain items were made to improve their familiarity/understandability by Chinese subjects (e.g., a few items from the word list of the Hopkins Verbal Learning Test-Revised). Because Mandarin writing uses characters instead of letters, in the Chinese battery, we substituted three tests of the same ability domain that do not use English letters: WMS-III Spatial Span for WAIS-III Letter-Number Sequencing; Color Trails II for Trails B; and Action Fluency for Letter Fluency (see test references in Table 2).
Training and data quality assurance
To implement the feasibility study, the HNRC team traveled to Beijing in January 2004 to train five Mandarin bilingual examiners from the Institute of Mental Health at Peking University. Before this training session, training videotapes, test manuals, and test equipment and forms were sent to Beijing for review and practice. During the 1-week training session in Beijing, each test was demonstrated, and its purpose and administration nuances were discussed. Several rounds of “mock testing” were conducted. Certification sessions subsequently took place using staff volunteers from the hospital as test subjects. All certifications were done in Mandarin and were monitored by bilingual investigators from both countries. All five trainees were certified during the training week. The examiners continued to practice test administration before initiation of the feasibility study, and these test protocols were checked and double-scored by the U.S. team.
Assessment of Mood and Subjective Complaints
One of the purposes of this study was to obtain preliminary information on the relationships among mood disturbances (see detailed protocol in Jin et al., 2006), subjective complaints of cognitive difficulties in everyday life, and objective test evidence of neurocognitive impairment. Mood symptoms were assessed using the Mandarin version of the BDI-I, a 21-item self-report scale (Beck, 1976; Ping, 1993; Zheng, 1987). As in the English version, each item of the BDI has four response options (0 to 3) of graded severity. Subjective neurocognitive complaints were assessed using the PAOFI, which was translated into Mandarin and back-translated to check accuracy (Chelune et al., 1986). The PAOFI includes 33 items on which participants rate how often they experience neurobehavioral difficulties in their everyday lives, in the domains of memory, language and communication, sensory–perceptual and motor skills, and higher level cognitive functions. Ratings use a six-point scale: almost never, very infrequently, once in a while, fairly often, very often, and almost always. The score used is the number of items on which participants reported experiencing difficulties “fairly often,” “very often,” or “almost always” (Chelune et al., 1986). Employment status was derived from question 35 of the PAOFI questionnaire.
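The PAOFI scoring rule described above amounts to counting the items rated at or above "fairly often." A minimal sketch of that count follows, assuming responses are stored as the English response labels; the function and constant names are hypothetical and are not part of the published instrument.

```python
# Response options of the six-point frequency scale, ordered from least to most frequent.
PAOFI_SCALE = [
    "almost never",
    "very infrequently",
    "once in a while",
    "fairly often",
    "very often",
    "almost always",
]

# Items rated at this level or higher count as a reported difficulty.
DIFFICULTY_THRESHOLD = PAOFI_SCALE.index("fairly often")


def paofi_complaint_count(responses):
    """Return the number of items endorsed as 'fairly often' or more frequent.

    `responses` is a sequence of response labels, one per item.
    An unrecognized label raises ValueError.
    """
    count = 0
    for response in responses:
        level = PAOFI_SCALE.index(response.strip().lower())
        if level >= DIFFICULTY_THRESHOLD:
            count += 1
    return count
```

For example, a participant who answers "fairly often" on two items and "almost never" on all others would receive a score of 2.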
Quality Control of Data
To ensure data quality, we conducted weekly phone conferences immediately after the field study started, to discuss and answer any questions generated from the field testing. Daily e-mail communications also were used to answer and solve any testing-related questions. Completed test booklets were copied and sent by overnight mail to the U.S. weekly for review by bilingual HNRC personnel. If any illogical answers or unclear responses were found, those cases were reviewed with testers in China during the weekly phone conference. Further verification and corrections were made before the data were coded and entered into the HNRC database in the United States.
Data Analysis
To create domain and overall neuropsychological summary measures, scores on the individual tests were placed on a common metric. For this, we chose to use demographically uncorrected scaled scores, which in large U.S. normative samples have a mean of 10 and a standard deviation of 3 (e.g., Heaton et al., 2004b). Although the scaled scores on tests from the Wechsler Intelligence and Memory scales normally are age-corrected, we avoided this by using the mean age for the study group (i.e., 35) instead of the participants' actual ages. Average scaled scores for the tests within each ability domain and across the entire test battery were then analyzed using two-way analyses of variance (ANOVAs), to explore effects of HIV status, country, and possible interactions between these factors. Of most interest within these analyses are the effects of HIV status, and the Country-by-HIV interactions. If the latter interactions were found to be significant, this finding would suggest that any NP effects of HIV were different in the two countries. Given the exploratory nature of these analyses, the alpha level was designated at 5%, two-tailed.
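As a rough illustration of placing test scores on a common metric, the sketch below converts raw scores to the mean-10, SD-3 scaled-score metric with a simple linear z-transformation and then averages them by domain and across the battery. The linear conversion is an assumption made for illustration only: published scaled scores are normally obtained from normative conversion tables, which are not reproduced here. The function names and any normative values passed to them are hypothetical.

```python
from statistics import mean


def to_scaled_score(raw, norm_mean, norm_sd, higher_is_better=True):
    """Map a raw score onto a metric with mean 10 and SD 3 (linear approximation).

    For timed tests, where a larger raw value means worse performance,
    pass higher_is_better=False so that better performance still yields
    a higher scaled score.
    """
    z = (raw - norm_mean) / norm_sd
    if not higher_is_better:
        z = -z
    return 10.0 + 3.0 * z


def domain_means(scaled_by_test, domains):
    """Average the scaled scores of the tests that belong to each domain.

    `scaled_by_test` maps test name -> scaled score.
    `domains` maps domain name -> list of test names.
    """
    return {
        domain: mean(scaled_by_test[test] for test in tests)
        for domain, tests in domains.items()
    }


def global_mean(scaled_by_test):
    """Average the scaled scores across the entire battery."""
    return mean(scaled_by_test.values())
```

Averaging at the scaled-score level, rather than on raw scores, is what allows tests with different units (seconds, words, items correct) to contribute equally to a domain or global summary.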
RESULTS
Table 3 summarizes the results of all four groups on the individual NP test raw scores.
HIV Effects in the China Groups
On every individual NP test measure, the mean of the Chinese HIV+ group was worse than that of the Chinese HIV− group, regardless of whether raw scores or scaled scores were used. Medium HIV effect sizes (range, .42 to .65) were noted for both raw scores and scaled scores on 7 of the 14 individual test measures (Verbal fluency for Action, WAIS-III Digit Symbol, Trails A, WMS-III Spatial Span, Color Trails-II, Hopkins Verbal Learning Test-Revised, and Hopkins Verbal Learning Test-Revised Total Learning; see Figure 1). The Global mean scaled score also yielded a medium HIV effect size (0.55), and HIV effect sizes for domain scaled scores ranged from 0.22 to 0.60. Moreover, across all 14 measures, the effect sizes for raw scores and scaled scores were highly correlated (r = .97; p < .0001; Figure 1).
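The text does not state which effect size formula was used. As a sketch under the assumption that the reported values are standardized mean differences (Cohen's d with a pooled standard deviation), an HIV effect size for one test could be computed as follows; the function name is hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Standardized mean difference between two independent groups.

    Uses the pooled sample variance (n - 1 denominators). A positive value
    means group_a scored higher on average than group_b.
    """
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)
```

Passing the HIV− scores as `group_a` and the HIV+ scores as `group_b` gives a positive d when the HIV+ group performs worse on a higher-is-better score.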
Effects of HIV and Country on NP Mean Scaled Scores
Table 4 summarizes the Global and Domain mean scaled scores of the four groups, as well as the results of the 2 × 2 (HIV status by country) ANOVAs on these measures. The ANOVAs revealed robust HIV effects for the Global score and all Domain scores. Significant country effects were found for two ability domains: Verbal Fluency and Speed of Information Processing. The U.S. group obtained higher scores on these domains, but there were no significant country effects on the global NP summary score or on the remaining five NP Domain scores. Finally, there were no significant HIV-by-Country interaction effects in any of these ANOVAs.
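The logic of a 2 × 2 (HIV status by country) ANOVA can be illustrated with a small sketch. With two levels per factor, each effect has one degree of freedom and can be tested as a contrast among the four cell means against the pooled within-cell error. The code below takes that approach using unweighted cell means, which is one common way to handle unequal group sizes. The source does not state which sums-of-squares convention or software was used, so this choice is an assumption, and the function and key names are hypothetical.

```python
from statistics import mean

# Fixed cell order used for the contrast coefficients below.
CELL_ORDER = [("neg", "CN"), ("pos", "CN"), ("neg", "US"), ("pos", "US")]

CONTRASTS = {
    "hiv": [-1, 1, -1, 1],         # HIV+ minus HIV-, summed over countries
    "country": [-1, -1, 1, 1],     # U.S. minus China, summed over HIV status
    "interaction": [1, -1, -1, 1], # does the HIV difference vary by country?
}


def two_by_two_anova(cells):
    """Return (F statistics by effect, error degrees of freedom).

    `cells` maps (hiv_status, country) -> list of scores, with keys as in
    CELL_ORDER. Each F has 1 numerator df and the returned error df.
    """
    means = [mean(cells[key]) for key in CELL_ORDER]
    sizes = [len(cells[key]) for key in CELL_ORDER]

    # Pooled within-cell error variance.
    sse = sum(
        (score - cell_mean) ** 2
        for key, cell_mean in zip(CELL_ORDER, means)
        for score in cells[key]
    )
    df_error = sum(sizes) - len(CELL_ORDER)
    mse = sse / df_error

    f_stats = {}
    for effect, coefs in CONTRASTS.items():
        estimate = sum(c * m for c, m in zip(coefs, means))
        ss_effect = estimate ** 2 / sum(c ** 2 / n for c, n in zip(coefs, sizes))
        f_stats[effect] = ss_effect / mse
    return f_stats, df_error
```

A p-value for each effect would then come from the F distribution with 1 and `df_error` degrees of freedom (for example, via `scipy.stats.f.sf`). A non-significant interaction F, as reported here, indicates that the HIV+ versus HIV− difference is of similar size in the two countries.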
To explore the effect of slightly different NP test measures between the U.S. groups and the Chinese (See Tables 2 and 3 for details), we re-computed the Global mean scaled scores deleting these tests (see prorated Global score in Table 4). We found similar results on the 2 × 2 ANOVA: a robust HIV effect, and no significant country effect or Country-by-HIV interaction.
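The prorated Global score can be thought of as the same battery-wide average taken over only the tests common to both batteries. A minimal sketch follows, assuming each participant's scaled scores are held in a dictionary keyed by test name; the keys shown are hypothetical labels, not identifiers from the study database.

```python
from statistics import mean

# Hypothetical labels for the tests that differed between the two batteries.
SUBSTITUTED_TESTS = frozenset({
    "spatial_span", "letter_number_sequencing",
    "color_trails_2", "trails_b",
    "action_fluency", "letter_fluency",
})


def prorated_global_mean(scaled_scores, excluded=SUBSTITUTED_TESTS):
    """Average scaled scores after dropping the tests named in `excluded`."""
    kept = [score for test, score in scaled_scores.items() if test not in excluded]
    if not kept:
        raise ValueError("no tests remain after exclusion")
    return mean(kept)
```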
Self-Report of Depressed Mood and Functional Difficulties
Our Chinese HIV+ group reported considerably more depressed mood than their HIV− controls, but did not differ from their U.S. HIV+ counterparts (see Table 1). Similar findings were observed on the PAOFI: both HIV+ samples reported significantly more cognitive complaints than the HIV− samples, whereas the Chinese and U.S. HIV+ groups did not differ from each other (see Table 1).
NP performances (Global mean Scaled Score) were relatively independent of depressed mood (i.e., BDI) in both HIV+ samples (r = −.19; p = .32 for Chinese HIV+, and r = −.22; p = .20 for U.S. HIV+). On the other hand, cognitive complaints on the PAOFI were very strongly associated with depressive symptoms as reported on the BDI, in both HIV+ samples (r = .55; p < .003, for Chinese HIV+; and r = .70; p < .0001 for U.S. HIV+). Cognitive complaints (PAOFI) were significantly related to NP performance (Global mean scaled score) in the China group (r = −.47; p = .02; for Chinese HIV+) and showed a trend in the U.S. group (r = −.30; p = .09; for U.S. HIV+). Lastly, employment status in the Chinese HIV+ group showed a modest correlation with NP performance (r = .35; p = .06, between employment status and Global mean scaled score).
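The associations reported above are correlation coefficients between pairs of participant-level scores. As a sketch, assuming they are Pearson product-moment correlations (the text does not name the coefficient), r can be computed directly from two equal-length lists of scores; the function name is hypothetical.

```python
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x, mean_y = mean(x), mean(y)
    # Sums of cross-products and squared deviations around the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)
```

With this sign convention, a negative r between PAOFI complaint counts and Global mean scaled scores means that participants with more complaints tended to have lower NP performance.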
DISCUSSION
The above results, while admittedly tentative, are encouraging in several respects. First, the NP test battery that was chosen and adapted for use in this study was understood and accepted by the Chinese HIV+ and HIV− participants, from both urban and rural areas of the country, whose education levels ranged widely (4 to 18 years). Second, the battery, which was translated into Mandarin, appears to hold considerable promise for identifying and characterizing behavioral effects of HIV brain involvement in China. Finally, the consistent HIV effects across countries and the modest country effects in the current study (Table 4) suggest that these tests may have reasonable cross-cultural equivalence. Further psychometric analyses with much larger samples of neurologically normal and abnormal people are needed to confirm this.
The NP test batteries used in the two countries were identical except for three tests that were substituted in the Chinese battery to increase cross-cultural suitability (Spatial Span for Letter-Number Sequencing; Color Trails II for Trail Making Test B; and Action Fluency for Letter Fluency). The first substitution was guided by factor analyses of the WMS showing that Spatial Span and Letter-Number Sequencing load on the same factor (WMS-III, 1997). In addition, Color Trails II was designed to tap similar sequencing and processing requirements as Trail Making Test B (Maj et al., 1993). Finally, Action and Letter Fluency are both tests of word generation that tap into frontal neural systems (Woods et al., 2005). Despite these similarities, slight differences in tests may have influenced the HIV effects observed in the China and U.S. cohorts. To check for this possibility, we re-computed the overall mean scaled scores without these tests and found the same pattern of results (Table 4).
Country effects on two NP Domain scores are difficult to interpret for two reasons: the significant education differences between the U.S. and China groups probably influenced these results to some degree, and different tests were used for the Verbal Fluency domain in the two countries (Letter Fluency for U.S. and Action Fluency for China). Nevertheless, the most important findings in these ANOVAs were the consistently significant HIV effects for the total test battery and all seven ability domains, and the absence of any significant interaction effects. These results suggest comparable, significant effects of HIV on NP functioning in both countries.
Closer inspection of the pattern of HIV-related deficits reveals that NP impairment was dominated by deficits in abstraction/executive function, information processing speed, as well as learning in Chinese HIV+ individuals (Figure 1). This pattern of deficits is generally consistent with what has been shown in Western studies (Cysique et al., 2006; Grant & Martin, 1994; Reger et al., 2002).
High rates of depressed mood in HIV-infected persons were not significantly associated with NP performance in the United States or China. These preliminary results are consistent with a large body of literature showing that depressive symptoms do not account for neurocognitive impairment in HIV-infected individuals (Cysique et al., 2007; Goggin et al., 1997). In contrast, depressive complaints were strongly associated with self-report of neurobehavioral problems in both countries. This finding also is consistent with what has been reported in North American HIV cohorts (Carter et al., 2003; Rourke et al., 1999).
Consistent with U.S. findings on the impact of HIV-associated neurocognitive impairment on instrumental activities of daily living (Heaton et al., 2004a) and employment (Albert et al., 1995; Heaton et al., 1994), we found, in the Chinese HIV+ group, a correlation between NP functioning and complaints of cognitive difficulties in everyday life (i.e., PAOFI), and also a modest association between unemployment status and overall level of NP performance. However, given the major cultural and lifestyle differences in these two countries, much more research is needed to understand the functional consequences of general and specific neurocognitive deficits in China. In particular, because NP impairment has been shown to predict poor medication adherence among HIV+ individuals in the United States (Hinkin et al., 2004), it would be important to determine whether NP-impaired HIV+ people in China are at similar risk for poor medication management and possibly worse antiretroviral treatment outcome (d'Arminio Monforte et al., 1998).
A major limitation of this pilot study is the absence of established normative standards for Chinese people on the NP test battery. Although NP norms with appropriate demographic corrections were available for the U.S. participants, it is considered likely that demographic effects are different in China. For example, very low levels of education and illiteracy are not represented in the U.S. normative samples, but such backgrounds still are relatively common in China (especially in the rural areas). As a consequence, larger education effects might be expected within our China study groups. Unfortunately, our sample sizes were much too small to address this problem in the current study. Future large scale studies of healthy Chinese with diverse backgrounds will be needed to develop norms that can be used confidently to classify disease-related “impairments” in individual cases. However, the robust HIV effect sizes in our demographically matched Chinese samples (Figure 1) strongly suggest that these tests have potential for identifying individuals within the Chinese population who are suffering from CNS complications of HIV. Thus, our results would support continued work to better understand the nature and causes of NP performance differences in the normal Chinese population, as well as in people affected by HIV infection and other diseases of the CNS.
ACKNOWLEDGMENTS
This study was supported by the NIMH grant 5 P30 MH62512 (Dr. Grant, P.I.). The HIV Neurobehavioral Research Center (HNRC) is supported by Center award MH 62512 from NIMH. *The San Diego HIV Neurobehavioral Research Center (HNRC) group is affiliated with the University of California, San Diego, the Naval Hospital, San Diego, and the Veterans Affairs San Diego Healthcare System, and includes: Director: Igor Grant, M.D.; Co-Directors: J. Hampton Atkinson, M.D., Ronald J. Ellis, M.D., Ph.D., and J. Allen McCutchan, M.D.; Center Manager: Thomas D. Marcotte, Ph.D.; Naval Hospital, San Diego: Braden R. Hale, M.D., M.P.H. (P.I.); Neuromedical Component: Ronald J. Ellis, M.D., Ph.D. (P.I.), J. Allen McCutchan, M.D., Scott Letendre, M.D., Edmund Capparelli, Pharm.D., Rachel Schrier, Ph.D.; Neurobehavioral Component: Robert K. Heaton, Ph.D. (P.I.), Mariana Cherner, Ph.D., Steven Paul Woods, Psy.D.; Neuroimaging Component: Terry Jernigan, Ph.D. (P.I.), Christine Fennema-Notestine, Ph.D., Sarah L. Archibald, M.A., John Hesselink, M.D., Jacopo Annese, Ph.D., Michael J. Taylor, Ph.D., Brian Schweinsburg, Ph.D.; Neurobiology Component: Eliezer Masliah, M.D. (P.I.), Ian Everall, FRCPsych., FRCPath., Ph.D., T. Dianne Langford, Ph.D.; Neurovirology Component: Douglas Richman, M.D. (P.I.), David M. Smith, M.D.; International Component: J. Allen McCutchan, M.D. (P.I.); Developmental Component: Ian Everall, FRCPsych., FRCPath., Ph.D. (P.I.), Stuart Lipton, M.D., Ph.D.; Clinical Trials Component: J. Allen McCutchan, M.D., J. Hampton Atkinson, M.D., Ronald J. Ellis, M.D., Ph.D., Scott Letendre, M.D.; Participant Accrual and Retention Unit: J. Hampton Atkinson, M.D. (P.I.), Rodney von Jaeger, M.P.H.; Data Management Unit: Anthony C. Gamst, Ph.D. (P.I.), Clint Cushman, B.A. (Data Systems Manager), Michelle Frybarger, B.A., Daniel R. Masys, M.D. (Senior Consultant); Statistics Unit: Ian Abramson, Ph.D. (P.I.), Christopher Ake, Ph.D., Deborah Lazzaretto, M.S.
The views expressed in this article are those of the authors and do not reflect the official policy or position of the Department of the Navy, Department of Defense, nor the United States Government.