INTRODUCTION
Leishmaniasis, caused by the Leishmania parasite, is a worldwide endemic disease, with an estimated disease burden of 2357000 disability-adjusted life years and 59000 deaths per year (WHO, 2002). This disease ranges in severity from a healing skin ulcer to an overwhelming visceral form. The visceral form, the most severe form, is caused by the species of the Leishmania donovani complex, and principally by Leishmania infantum and L. donovani. These species are associated with different epidemiology, ecology and pathology: L. infantum is anthropozoonotic with a dog reservoir and can produce visceral and cutaneous forms in humans, whereas L. donovani is largely anthroponotic and produces mainly the visceral form.
Several pathogenic protozoan parasites, such as Leishmania, express multiple cysteine protease (CP) enzymes. These enzymes belong to the papain superfamily and, within the Leishmania genus, 3 types of CPs have been identified: CPA and CPB, which have homology to mammalian cathepsin L, and CPC, which has homology to mammalian cathepsin B (Mottram et al. 1997, 1998; Sajid and McKerrow, 2002). Cysteine proteases are important for Leishmania survival, host cell infection, and evasion of the host immune response, and they have attracted considerable interest as targets for the design of new chemotherapy (Alexander et al. 1998; Beyrodt et al. 1997; Coombs and Mottram, 1997; Frame et al. 2000; McGrath et al. 1995; Mottram et al. 1996; Sajid and McKerrow, 2002; Souza et al. 1992). Furthermore, cysteine proteases are putative virulence factors of Leishmania parasites. The genes cpa and cpc are single-copy, whereas cpb is multicopy. Differences in copy number and nucleotide sequences exist among the different Leishmania species. For example, the Leishmania mexicana cpb genes, the most studied, are located in a single locus of 19 copies arranged in a tandem repeat (Mottram et al. 1996, 1997), whereas the cpb locus of the L. donovani complex seems to be composed of 5 copies. Indeed, Mundodi et al. (2002) compared 1 L. donovani strain and 1 L. chagasi (syn. L. infantum) strain and revealed at least 5 tandemly arranged genes. They observed that the cpb cluster of the L. donovani strain contains a cpbF copy which is absent from the cluster of the L. infantum strain. Furthermore, another copy called cpbE is distant from the cluster in the L. donovani strain, whereas it belongs to the cluster in the L. infantum strain (Mundodi et al. 2002). To explore the L. donovani complex, we focused our study on cpbE and cpbF. We sequenced them in a sample of strains representative of the clinical (strains isolated from cutaneous leishmaniasis and visceral leishmaniasis) and genetic diversity of L. donovani/L. infantum.
Phylogenetic analysis and protein predictions were conducted in order to compare these copies with other trypanosomatid cpb and those within the L. donovani complex. Evolutionary interpretations and potential clinical implications are discussed.
MATERIALS AND METHODS
Parasites
A sample of 15 strains representative of geographical and genetic diversity within the L. donovani complex was studied (Table 1). The 15 strains were isolated from human visceral or cutaneous leishmaniasis or from phlebotomine sand flies, and belong either to the L. donovani species (6 strains) or to the L. infantum species (9 strains). All the Leishmania strains were typed by multilocus enzyme electrophoresis (MLEE) by the WHO reference centres of Montpellier, France (MON) and London (LON).

Cell culture
Promastigote cultures were maintained at 26 °C by weekly subpassages in RPMI 1640 medium, buffered with 25 mM HEPES, 2 mM NaHCO3 and supplemented with 10% heat-inactivated fetal calf serum, 2 mM glutamine, 100 U/ml penicillin and 100 μg/ml streptomycin. Cultures were harvested by centrifugation and stored at −80 °C until DNA extraction.
Specific PCR of cpbEF copies
Genomic DNA was extracted from parasite pellets by phenol/chloroform extraction. The optimal conditions for cpbEF amplification in 30 μl were: 6 pmol of each primer (forward: 5′-CGTGACGCCGGTGAAGAAT-3′; reverse: 5′-CGTGCACTCGGCCGTCTT-3′), 4·5 nmol dNTPs, 1 U Taq polymerase (Roche Diagnostics), 3 μl of Buffer 10X and 10 ng of genomic DNA. Thirty cycles were necessary for amplification (denaturation 30 s at 94 °C, annealing 1 min at 62 °C and elongation 1 min at 72 °C) followed by 10 min at 72 °C. The amplification reactions were analysed by agarose gel electrophoresis, followed by ethidium bromide staining and visualization under UV light. Two lengths of amplification products were generated: 702 bp for cpbE and 741 bp for cpbF.
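As a reading aid, the reagent amounts quoted above can be converted into final concentrations in the 30 μl reaction. The sketch below is only a unit conversion; the function and variable names are illustrative, and it assumes that the 4·5 nmol of dNTPs refers to the total amount added (the text does not say whether it is per nucleotide).

```python
def final_concentration_um(amount_pmol: float, volume_ul: float) -> float:
    """Return a concentration in micromolar (1 pmol/ul equals 1 uM)."""
    return amount_pmol / volume_ul


REACTION_VOLUME_UL = 30.0

# 6 pmol of each primer in 30 ul
primer_um = final_concentration_um(6.0, REACTION_VOLUME_UL)

# 4.5 nmol dNTPs = 4500 pmol (assumed here to be the total amount added)
dntp_um = final_concentration_um(4.5 * 1000.0, REACTION_VOLUME_UL)

# 3 ul of 10X buffer diluted into 30 ul
buffer_fold = 10.0 * 3.0 / REACTION_VOLUME_UL

# 10 ng of genomic DNA in 30 ul
dna_ng_per_ul = 10.0 / REACTION_VOLUME_UL

print(f"primer: {primer_um:.2f} uM each")
print(f"dNTPs: {dntp_um:.0f} uM")
print(f"buffer: {buffer_fold:.0f}X")
print(f"template: {dna_ng_per_ul:.2f} ng/ul")
```

Under these assumptions each primer is at 0·2 μM, the dNTPs at 150 μM and the buffer at 1X.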
Cloning and sequencing
For the 15 strains of the L. donovani complex, the PCR products were cloned into pGEM-T vector (pGEM-T Easy Vector System I, Promega) and transformed into competent E. coli (JM109). Nucleotide sequences were obtained by automated sequencing (ABI PRISM™ 310 Genetic Analyzer, Applied Biosystems) and chromatograms were analysed with Chromas 2.23 (Technelysium Pty Ltd, 1998–2002).
Sequence analysis
For phylogenetic analysis, our cpbE/cpbF sequences were compared with 4 available sequences of L. donovani (GenBank Accession no. AF309627), L. infantum (AF217087 and AJ628943) and L. major cpb (U43706). Nucleotide and protein sequences were aligned using the multiple sequence alignment program ClustalX (version 1.81, June 2000) (Thompson et al. 1997). Phenetic and phylogenetic analyses were performed with the PHYLIP package: we used distance methods with the DNADIST, PROTDIST and NEIGHBOR programs and parsimony methods with the SEQBOOT, DNAPARS, PROTPARS and CONSENSE programs. For the phenetic analysis, various distance types were used to build the trees: Kimura two-parameter (Kimura, 1980), Jukes-Cantor (Jukes and Cantor, 1969) and Maximum Likelihood (Felsenstein, 1981). For the parsimony analysis, bootstrap analyses were performed with 1000 replicates to estimate the robustness of the nodes. All the trees were constructed using TreeDyn software (Chevenet et al. 2006; http://www.treedyn.org). The genotypic diversity rate (for either cpbE or cpbF sequences) was obtained by dividing the number of divergent sequences by the total number of sequences analysed.
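To make the two named nucleotide distances concrete, the following minimal sketch computes the Jukes-Cantor and Kimura two-parameter distances for a pair of aligned sequences using their standard closed-form expressions. It is an illustration only and not the DNADIST implementation used in this study; the function names and the choice to skip columns containing gaps or ambiguous bases are assumptions.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def count_differences(seq_a: str, seq_b: str) -> tuple[int, int, int]:
    """Return (compared sites, transitions, transversions) for two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    sites = transitions = transversions = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguous bases (an assumption of this sketch)
        sites += 1
        if a == b:
            continue
        pair = {a, b}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if sites == 0:
        raise ValueError("no comparable sites")
    return sites, transitions, transversions


def jukes_cantor(seq_a: str, seq_b: str) -> float:
    """Jukes-Cantor distance: d = -3/4 * ln(1 - 4p/3), p = proportion of differing sites."""
    sites, ts, tv = count_differences(seq_a, seq_b)
    p = (ts + tv) / sites
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def kimura_2p(seq_a: str, seq_b: str) -> float:
    """Kimura two-parameter distance: d = -1/2 * ln((1 - 2P - Q) * sqrt(1 - 2Q))."""
    sites, ts, tv = count_differences(seq_a, seq_b)
    p_ts = ts / sites
    q_tv = tv / sites
    return -0.5 * math.log((1.0 - 2.0 * p_ts - q_tv) * math.sqrt(1.0 - 2.0 * q_tv))


if __name__ == "__main__":
    # Toy sequences differing by a single A/G transition (not data from this study).
    print(jukes_cantor("ACGTACGTAC", "GCGTACGTAC"))
    print(kimura_2p("ACGTACGTAC", "GCGTACGTAC"))
```

Both formulas become undefined when the sequences are too divergent (the logarithm argument reaches zero or below), which is not a concern for closely related sequences such as those compared here.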
The DNA sequences were also analysed considering the mutation sites and several available sequences were used to compare CPBEF sequences with other CPBs of Leishmania such as L. chagasi ldccys1 (GenBank Accession no. AF004592), L. mexicana lmcpbr (Z14061), L. mexicana cpb18 (Y09958), L. mexicana lmcpb2.8 (Z49962), L. mexicana cpb1 (Z49963), L. mexicana cpb2 (AJ319727) and L. mexicana cpb19 (Z49965).
Predicted protein structure
The impact of amino acid mutations on the 3D protein structure was studied with Swiss-Model (swissmodel.expasy.org/spdbv). The crystal structure of cruzain (McGrath et al. 1995) was used as the reference structure. Potential N-glycosylation sites were checked using NetNGlyc 1.0 (www.cbs.dtu.dk/services/NetNGlyc). Predicted MHC Class-I and II binding regions (T-cell epitopes) were analysed using the following web servers: MAPPP (MHC-1 Antigenic Peptide Processing Prediction – www.mpiib-berlin.mpg.de/MAPPP), SYFPEITHI (www.syfpeithi.de), RANKPEP (mif.dfci.harvard.edu/Tools/rankpep.html), and ProPred (www.imtech.res.in/raghava/propred/). Concerning B-cell epitopes, prediction of antigenic peptides was based on amino acid residues in experimentally known segmental epitopes (Prediction Antigenic Peptides – www.mifoundation.org/Tools/, ABCpred – www.imtech.res.in/raghava/abcpred/index.html) and on antigenic peptides, which should be located in solvent-accessible regions identified from 3D structures.
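For readers unfamiliar with N-glycosylation prediction, candidate sites are conventionally the Asn residues of the sequon Asn-X-Ser/Thr, where X is any residue except Pro. The sketch below only locates such sequons with a regular expression; it is not the NetNGlyc method, which additionally scores each candidate with a trained predictor. The function name and the example peptide are hypothetical.

```python
import re

# N-X-[S/T] with X != P; the lookahead lets overlapping sequons be reported.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_sequons(protein: str) -> list[int]:
    """Return 1-based positions of Asn residues that start an N-X-[S/T] sequon."""
    return [match.start() + 1 for match in SEQUON.finditer(protein.upper())]


if __name__ == "__main__":
    # Hypothetical peptide: N2 (NGT) and N9 (NCT) qualify, N6 (NPS) does not.
    print(find_sequons("MNGTANPSNCTQ"))
```

Positions returned by such a scan are relative to the sequence supplied, so they need to be shifted to the mature-domain numbering before being compared with the sites reported in the Results.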
RESULTS
PCR products were sequenced for the entire sample (15 strains). All sequences were submitted to GenBank™ (see GenBank Accession numbers in Table 1). For the 6 L. donovani strains, only the cpbF copy was obtained, whereas the 9 L. infantum strains contained only cpbE, except 1 strain. Indeed, this strain, LIPA59 from Algeria and isolated from the cutaneous form, had a mixed pattern comprising the 2 cpb copies E and F. This pattern (E and F) has been obtained from different cultures of a given LIPA59 clone (obtained by micromanipulation) and also from different clones. Thus, this pattern could not be due to a mixture of 2 strains or just to a contamination. Consequently, 16 sequences were obtained in this study. Other DNA samples from the L. donovani complex (L. archibaldi, L. donovani from India and L. infantum from Greece and Tunisia) have been amplified (data not shown) and gave similar results: L. infantum only contained the cpbE copy whereas L. donovani and L. archibaldi contained the cpbF copy. Their amplification products have not been sequenced.
Phylogenetic analysis
To obtain a comprehensive view of cpbEF polymorphism within the L. donovani complex, we present here the dendrogram constructed using the parsimony method on DNA sequences (Fig. 1); a coloured version with direct links to GenBank is available at http://www.treedyn.org/hide/hide2005b.html. The 2 sequences of L. infantum LIPA59 were excluded because they were too short, and the gaps at the 5′ and 3′ ends were removed so that the phylogenetic analysis was performed on the region common to all the sequences considered. All the phylogenetic analyses detailed in the Materials and Methods section (based either on genetic distances or on characters), using DNA or protein sequences, gave congruent results. Thus, we present here only the dendrogram built by the parsimony method on DNA sequences after bootstrapping. In Fig. 1, the L. major cathepsin L-like sequence (GenBank Accession no. U43706) was used as outgroup. The dendrogram showed that L. donovani was more polymorphic than L. infantum, as illustrated by genotypic diversity rates of 0·857 (6/7) and 0·3 (3/10), respectively. L. infantum strains clustered together (including AF217087 and AJ628943) with a bootstrap value of 100 and appeared to be a subgroup of L. donovani. Within the L. donovani species, there were 3 groups. The strains GILANI (Sudan), HU3 (Ethiopia) and IS2D (GenBank Accession no. AF309627, Sudan) belonged to the external group (bootstrap value: 90·6). The remaining strains were separated from this group with a weak bootstrap value (55·3) and comprised, on the one hand, 2 Kenyan strains (MRC(L)3 and LRC-L57) and, on the other hand, the HUSSEN (Ethiopia) and WangJie1 (China) strains. This last group was joined with the L. infantum sample (bootstrap value: 33·5). Within L. infantum, only 2 of the 4 strains isolated from human cutaneous forms (BCN1 and LEM1098) were separated from the other strains.
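The genotypic diversity rates quoted above follow directly from the definition given in the Materials and Methods (number of divergent sequences divided by the total number of sequences analysed). The short sketch below reproduces the arithmetic from the reported counts; the function name is illustrative, and the counts are taken as stated in the text rather than recomputed from the alignment.

```python
def genotypic_diversity_rate(divergent: int, total: int) -> float:
    """Number of divergent sequences divided by the total number of sequences."""
    if total <= 0 or not 0 <= divergent <= total:
        raise ValueError("counts must satisfy 0 <= divergent <= total and total > 0")
    return divergent / total


# Counts as reported in the text.
l_donovani_rate = genotypic_diversity_rate(6, 7)
l_infantum_rate = genotypic_diversity_rate(3, 10)

print(round(l_donovani_rate, 3))
print(round(l_infantum_rate, 3))
```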

Fig. 1. Phylogenetic analysis of cpbEF within the Leishmania donovani complex. Dendrogram constructed with cpbE and cpbF sequences using the parsimony method. Four additional strains were added (*): L. major U43706 was used as outgroup and 3 sequences available on GenBank: AF309627 (L. donovani cpbF), AF217087 and AJ628943 (L. infantum cpbE). Bootstrap values are shown below the branches. Please go to http://www.treedyn.org/hide/hide2005b.html for coloured dendrogram with direct links to GenBank.
Sequence analysis of cpbE and cpbF copies
Blast analysis using L. donovani cpbF (GenBank Accession no. AF309627, 1185 bp) revealed that our cpbF PCR product started at nucleotide 411 and finished at 1151. For L. infantum, the cpbE PCR product went from nucleotides 411 to 1112 of AF217087 and AJ628943 sequences (1146 bp). After translation in protein sequences, the cpbEF PCR products did not contain the 136 first amino acids comprising the pre-pro-region and the 12 first amino acids of the mature domain (MD). These PCR products also excluded the 11 last amino acids pertaining to the COOH-terminal extension (CTE). Consequently, after translation, our PCR products corresponded to a protein sequence of 247 aa for CPBF and 234 aa for CPBE, both comprised between the 13th amino acid (aa) of the mature domain (MD) and the 41st aa of the CTE. This PCR did not generate amplification for the other Leishmania species samples, representative of the genetic diversity of the genus, and for Trypanosoma cruzi and T. brucei (unpublished data).
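The coordinates above can be cross-checked against the amplicon sizes and the translated lengths with simple inclusive-interval arithmetic. The sketch below uses only the positions quoted in the text; the helper name is illustrative.

```python
def span_length(start: int, end: int) -> int:
    """Length of an inclusive 1-based interval."""
    return end - start + 1


cpbf_bp = span_length(411, 1151)   # cpbF product on AF309627
cpbe_bp = span_length(411, 1112)   # cpbE product on AF217087 / AJ628943
deletion_bp = cpbf_bp - cpbe_bp    # size difference between the two products

print(cpbf_bp, cpbf_bp // 3)          # base pairs and translated amino acids for CPBF
print(cpbe_bp, cpbe_bp // 3)          # base pairs and translated amino acids for CPBE
print(deletion_bp, deletion_bp // 3)  # deleted base pairs and amino acids
```

The lengths agree with the 741-bp and 702-bp products, the 247-aa and 234-aa translations, and a difference of 39 bp, i.e. 13 codons.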
Considering the entire sample, we identified 1 gap and 40 mutation sites among the 16 cpbEF sequences, 10 of which led to synonymous mutations (Fig. 2). The cpbE and cpbF copies differed by a 39-bp segment of the mature domain, which is deleted in cpbE, and by 2 mutations (in the black boxes in Fig. 2): 1 synonymous mutation in the MD and 1 mutation in position 245 of the CTE (nucleotide 699 in Fig. 2), which generated a Proline (CCA) for L. infantum and a Leucine (CTA) for L. donovani. Positive selection analysis using the maximum likelihood method (Yang, 1997, 2002) revealed that the second mutation site (P245L) was under positive selection (unpublished data). Within the L. infantum species, the cpbE sequences showed weak polymorphism, mostly confined to 2 strains isolated from cutaneous forms (LEM1098: 3 mutations, BCN1: 14 mutations). Some of these mutations generated amino acids common to L. mexicana CPBs (Fig. 3).
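The distinction between synonymous and non-synonymous mutations used above can be illustrated by translating the two codons involved. The sketch below builds the standard genetic code and classifies a codon change; it assumes the standard nuclear code applies to these genes, and the function name is illustrative.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def classify_substitution(codon_a: str, codon_b: str) -> tuple[str, str, str]:
    """Translate two codons and report whether the change is synonymous."""
    aa_a = CODON_TABLE[codon_a.upper()]
    aa_b = CODON_TABLE[codon_b.upper()]
    kind = "synonymous" if aa_a == aa_b else "non-synonymous"
    return aa_a, aa_b, kind


if __name__ == "__main__":
    # Codons quoted in the text for L. infantum (CCA) and L. donovani (CTA).
    print(classify_substitution("CCA", "CTA"))
```

The CCA/CTA pair translates to Pro and Leu, consistent with the P245L replacement described in the text.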

Fig. 2. Multiple alignment of cpbEF sequences for the 15 strains pertaining to the Leishmania donovani complex. For L. infantum strain LIPA59, both cpbE and cpbF were sequenced. The 40 mutation sites are indicated in grey-shaded boxes except for the 39-bp gap and the 2 mutations that discriminate between cpbE and cpbF, indicated in black-shaded boxes. Considering the complete cpbEF genes, these sequences range from the 411th nucleotide (in the mature domain) to the 1151st for cpbF and to the 1112th for cpbE (in the COOH-terminal extension). indicates the 5′ CTE.

Fig. 3. Comparison of the amino acid sequences of different CPB isoforms: CPBF (referred to as F plus strain name) and CPBE (referred to as E plus strain name) obtained in this study, L. donovani CPBA AF309626 (LdicpbA) and 6 L. mexicana sequences (LMCPB1, LMCPB2, LMCPB18, LM-Z14061 (LMCPBR), LMCPB2.8, LMCPB19). Dots correspond to identical amino acids. The mature domain and the first 41 amino acids of the COOH-terminal extension (in italics) are represented. Cysteine residues are underlined (C), the three potential N-glycosylation sites are surrounded (N) and the 13-aa deletion in L. infantum CPBE is indicated in the black-shaded box. Residues 18, 60, 61, 64 and 84 are indicated in grey-shaded boxes. An asterisk (*) indicates the catalytic triad (Cys25-His163-Asn183). Potential T-cell/B-cell epitopes and their position on the amino acid sequence are indicated with arrows. Mutations in the 2 dermotropic strains (BCN1 and LEM1098) are indicated in dark grey-shaded boxes.
Among the 19 mutation sites observed for L. infantum, there was only 1 mutation on the LEM663 sequence (synonymous mutation) and 1 on the ITMAP263 sequence (synonymous mutation). Concerning BCN1, 13 out of the 14 mutations were localized at the 3′ end of the mature domain. The 5 other L. infantum cpbE sequences were identical (LIPA59, LEM716, LEM75, WR285, LEM356). The L. donovani species appeared more polymorphic with 21 mutation sites. Only 2 strains had the same cpbF sequence (HU3 and GILANI) and were identical to L. donovani cpbF AF309627. As the L. major genome has been completely sequenced, a blast analysis revealed that it did not contain cpbEF copies. After translation, the 39-bp fragment of cpbF corresponded to the amino acid sequence GVLTSCAGDALNH. It was identified for all the cpb sequences of the trypanosomatid family available on GenBank as L. donovani cpbA (another copy of the cpb cluster), L. mexicana cpb1, cpb2, cpb2.8, cpb18, cpb19, L. pifanoi lpcys2, T. cruzi cruzipain and L. major cpb-like (data not shown). On the other hand, a blast analysis on CTE revealed that the cpbEF CTE (156 nucleotides) had no homology with the other CTE regions (Fig. 3).
Predicted protein analysis
The sequences of cpbEF obtained were used to predict mature proteins except for the 12 first amino acids, which were absent from our sequences. We compared the different CPB isoforms within the L. donovani complex and also with L. major CPB, L. mexicana CPB and T. cruzi cruzain.
Residue composition
For the modelling analysis, the CTE was not included in the model, as the structure of this domain has not been solved. Three-dimensional structures of CPBF (Fig. 4A) and CPBE (Fig. 4B) were predicted by comparative protein modelling, together with that of CPBE from strain BCN1 (Fig. 4C) in order to visualize its numerous mutations. Mutations in the BCN1 strain seemed to have some impact on the conformation, but no relevant differences were found (Fig. 4C). The location of cysteine residues (Cys) on the protein surface was different between CPBE and CPBF, perhaps because of the deleted sequence. Within the mature domain, there were 7 Cys for the L. donovani CPBF and 6 for the L. infantum CPBE because Cys156 belongs to the deletion of 13 aa (Fig. 3). For each species, there were also 2 Cys residues specific to the L. donovani complex in the CTE. CPBF contained 3 disulfide bonds, like T. cruzi cruzain, in positions Cys22-Cys63, Cys56-Cys101 and Cys156-Cys204 (Fig. 4A). This last bond was absent in CPBE because Cys156 belongs to the deleted sequence GVLTSCAGDALNH (Fig. 4B and C). This sequence was located on the protein surface near the catalytic domain and appeared similar between cruzain and CPBF. There was 100% identity between L. donovani cpbA AF309626 and cpbF for this sequence, at both the protein and the DNA level. Moreover, this sequence contained a histidine residue (His163) belonging to the catalytic triad involved in the protease activity of T. cruzi cruzain (McGrath et al. 1995) and of L. mexicana CPB (Juliano et al. 2004). This triad (Cys25-His163-Asn183) was found in CPBF, CPBA and the various CPB isoforms of L. mexicana (CPB1, CPB2, CPB2.8, CPB18, CPB19) but not in CPBE, which lacks His163 (Fig. 3).
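The residue numbers quoted for the deleted segment can be verified by anchoring the peptide GVLTSCAGDALNH on Cys156 and numbering outwards. The sketch below does this and also reports the overlap with the B-cell epitope region 153–169 mentioned later in the Results; the function and variable names are illustrative.

```python
DELETED_SEGMENT = "GVLTSCAGDALNH"


def segment_numbering(segment: str, anchor_residue: str, anchor_position: int) -> dict[int, str]:
    """Number a peptide so that the first occurrence of anchor_residue sits at anchor_position."""
    offset = segment.index(anchor_residue)
    start = anchor_position - offset
    return {start + i: residue for i, residue in enumerate(segment)}


# Anchor the single cysteine of the segment on position 156 of the mature domain.
numbering = segment_numbering(DELETED_SEGMENT, "C", 156)
first, last = min(numbering), max(numbering)

# Residues of the segment that fall inside the B-cell epitope region 153-169.
epitope_overlap = [pos for pos in numbering if 153 <= pos <= 169]

print(first, last)            # span of the deleted segment
print(numbering[last])        # residue at the last position
print(len(epitope_overlap))   # number of epitope positions removed by the deletion
```

With Cys at position 156, the segment spans residues 151 to 163 and its terminal histidine falls on position 163, in agreement with the loss of both Cys156 and His163 in CPBE.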

Fig. 4. Homology-based protein model of the mature domain (MD) of Leishmania donovani cpbF (A), L. infantum cpbE (B) and L. infantum BCN1 cpbE from cutaneous lesion (C). The blue chain (containing a histidine residue of the catalytic site triad) on (A) corresponds to the 13-aa deletion of L. infantum. The modelling server has numbered the Val1 as Val5; as a consequence, the residue numbers have a difference of 4 residues compared to those in the text. The 5′ end of the MD (Val5) and the 3′ end of the MD (Pro207-Pro220) are indicated, as are the cysteine residues. Note that L. infantum (B and C) does not contain Cys160 (pink cloud on A) and Cys183 (orange cloud on A). There are 3 disulfide bonds for A (Cys160-Cys208 (pink-green), Cys60-Cys105 (green-dark blue), Cys26-Cys67 (pink-blue)) and only 2 for B and C (Cys60-Cys105, Cys26-Cys67).
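Because the model numbering is shifted by 4 residues relative to the text, the disulfide bonds listed in the text and in the legend of Fig. 4 can be reconciled with a constant offset. The sketch below applies that offset to the three bonds given in the text; the names are illustrative.

```python
MODEL_OFFSET = 4  # Val1 in the text is numbered Val5 by the modelling server


def to_model_numbering(text_position: int) -> int:
    """Convert a residue number used in the text to the numbering of the model."""
    return text_position + MODEL_OFFSET


# Disulfide bonds of CPBF as numbered in the text.
text_bonds = [(22, 63), (56, 101), (156, 204)]
model_bonds = [(to_model_numbering(a), to_model_numbering(b)) for a, b in text_bonds]

print(model_bonds)
```

The converted pairs correspond to Cys26-Cys67, Cys60-Cys105 and Cys160-Cys208 in the figure legend.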
A few amino acid variations between CPB isoenzymes are important in modifying the substrate specificities (Juliano et al. 2004; Judice et al. 2005). In Fig. 3, we compared these amino acids among, on the one hand, CPBF, CPBA and CPBE for the donovani complex and, on the other hand, CPB1, CPB2, CPB2.8, CPB18 and CPB19 for L. mexicana. The residues in positions 18, 60 and 61 were identical between L. donovani CPBs (A, E and F) and L. mexicana CPB18 (i.e. Asn18, Asp60 and Asn61) but different for all the other L. mexicana CPBs, whereas Asn64 and Phe84 were specific to the L. donovani complex CPBs.
N-glycosylation sites and potential T-cell/B-cell epitopes
CPBE and CPBF contained 3 potential N-glycosylation sites in the mature domain. The first, Asn103, also exists for Lpcys2 (Boukai et al. 2000a) and for the different L. mexicana CPBs (Fig. 3), whereas Asn122 and Asn171 were found only on protein sequences of the L. donovani complex (Fig. 3). In addition, after blast analysis on the complete L. major genome, 8 cpb-like copies were revealed and all contained a potential N-glycosylation site in position Asn171 (data not shown). Thus, only Asn122 appears to be specific to the L. donovani complex CPBs but not to CPBE/CPBF since it is present for L. donovani CPBA (Fig. 3).
Considering now the complete protein sequence, these predicted CPB isoforms appeared fairly rich in potential T-cell and B-cell epitopes (Fig. 3). Indeed, 2 regions that could provide 3 HLA class I and II epitopes were specific to the L. donovani complex: the HLA-B7 epitope (in position 119–128 of the MD), the HLA-A2 epitope (in position 236–246 of the CTE) and the HLA-DR1 epitope (in position 234–342 of the CTE). The HLA-B7 epitope was also present in CPBA, but HLA-A2 and HLA-DR1 were specific to the CPBE/CPBF CTE. CPBs of the L. donovani complex were also characterized by the absence of a few epitopes. In fact, L. mexicana CPBs contained a potential HLA-DR5 epitope (in position 90–98 of the MD) and a potential HLA-A2 epitope (in position 90–98), both of which are absent from L. donovani CPBs (Fig. 3) but present in L. major CPB-like proteins and in L. pifanoi LPCYS2 (data not shown). Concerning potential B-cell epitopes, 1 region (in position 178–193 of the MD) contained a potential epitope specific to the L. donovani complex (CPBA, CPBE and CPBF) and another (in position 153–169) was present in all the CPBs analysed (including those of L. mexicana and L. major) except CPBE, because this region contained the 13-aa deleted sequence.
DISCUSSION
Phylogenetic analysis
This study presents an analysis of cpbE/cpbF sequences for the L. donovani complex. The PCR products were sequenced and revealed that the cpbE product is specific to L. infantum, whereas cpbF is specific to L. donovani. Mundodi et al. (2002) reported the presence of cpbE in one L. donovani strain, whereas this copy was not detected in any L. donovani strain in our study. Even so, our PCR is able to reveal both cpbE and cpbF in the same reaction since a mixed pattern was obtained for L. infantum LIPA59, which contains the 2 copies. The phylogenetic tree constructed on the basis of gene sequences follows the species classification since L. infantum is individualized from L. donovani. We can also note that the GILANI (MON30) strain, initially typed as L. infantum by MLEE, is confirmed as L. donovani by this analysis (Oskam et al. 1998; Quispe Tintaya et al. 2004; Zemanova et al. 2004). Nevertheless, the L. infantum sample shows a low level of polymorphism and is included in the L. donovani sample. The polymorphic character of cpbF, the weak polymorphism within L. infantum cpbE, and the phylogenetic classification of the latter as a subgroup of the L. donovani species support the ancestral status of L. donovani already proposed by several authors (Ibrahim and Barker, 2001; Pratlong et al. 2001). This suggests that L. infantum has recently descended from a L. donovani clone and may have evolved and adapted to the canine reservoir (Ibrahim, 2002). The differences between cpbE and cpbF, such as the deletion in cpbE, appear to reflect this event. Thus, these differences could be directly or indirectly related to the zoonotic character of L. infantum and the anthroponotic character of L. donovani.
Concerning the clinical polymorphism, even if cutaneous leishmaniasis most probably has a multi-factorial origin – a combination of environmental, parasite, host and vector factors – it is interesting to note that the polymorphism within L. infantum CPBE is only caused by mutations on strains isolated from the cutaneous form. Furthermore, most of these ‘cutaneous’ mutations led to amino acids identical to those of L. mexicana CPBs, a dermotropic species. Nevertheless, among the 5 L. infantum dermotropic strains, 3 of them revealed CPBE sequences similar to those of other L. infantum strains (isolated from visceral form). If we consider that the parasite could play a role in expression of clinical signs, these results lead us to several hypotheses. First, the original strain of L. infantum could have been a ‘dermotropic strain’ which consequently might evolve for a much longer time than the ‘viscerotropic strain’. This hypothesis is congruent with the presence of common amino acids between L. infantum dermotropic strains (BCN1 and LEM1098) and L. mexicana. Second, the dermotropic character results from an adaptation of the parasite to its human host. This last proposition is congruent with the diversity observed within dermotropic strains: each mutation may result from a complex interaction between the parasite and its environment, the immune host response and host genetic factors. Another point concerns the presence of both cpbE and cpbF copies for L. infantum LIPA59 isolated from human cutaneous lesion. This strain might result from a hybridization process between the two species L. infantum and L. donovani resulting from a sexual recombination. Other results based on microsatellite genotyping and sequencing of other cp copies (cpa and cpb) as well as lpg2, amastin and iunh, have confirmed the heterozygous status of this strain LIPA59 (unpublished data).
CPBE and CPBF comparison
CPBs are composed of 3 regions: a pre-pro-domain (from amino acid (aa) 1 to aa 124), a mature domain (from aa 125 to aa 342) and a CTE (variable size according to the CPB isoform). Sequence analysis of these CPBE/CPBF sequences and blast analysis with other trypanosomatids, including other L. donovani complex CPBs such as CPBA, revealed that CPBE and CPBF are specific to the L. donovani complex. Considering epitopes, our study revealed potential B-cell and T-cell epitopes specific to the L. donovani complex as well as potential T-cell epitopes common to different Leishmania CPBs (L. major, L. mexicana, L. pifanoi) but absent for the L. donovani complex. The implication of these T and B epitopes in host immune response toward leishmaniasis is currently unknown, so further work is necessary to understand the functional consequences of this epitopic diversity. Results obtained on T-cell epitopes are very interesting because cell-mediated immune responses play a role in both protective and counter-protective immune responses (Sacks and Noben-Trauth, 2002).
Concerning the protein conformation, mutations observed in the anthropozoonotic agent L. infantum generate important consequences on the protein structure. CPBE presents a deletion in the mature domain affecting essential characteristics existent in other CPBs. Indeed, because of the CPBE deletion, CPBE misses a potential B-cell epitope in position 153–169. Furthermore, the protease activity of all papain-like proteases is associated with the catalytic triad consisting of a nucleophilic cysteine, a histidine and an asparagine (for example Cys25-His163-Asn183 for Leishmania) (Selzer et al. 1997; Juliano et al. 2004). Our study has revealed that, because of CPBE deletion, His163 is absent, as is 1 disulfide bond (absence of Cys156). A similar phenomenon was observed in T. cruzi where the absence of Cys318 leads to the loss of a disulfide bond (Cazzulo et al. 1992). Furthermore, these modifications are close to the catalytic site and it could be suggested that CPBE could have different substrate preferences from those of CPBF or cruzain. Thus, thanks to their differences, these 2 isoforms, CPBE and CPBF, may interact differently with the host and consequently reflect the difference in host specificity between L. infantum and L. donovani. These observations are in agreement with the notion developed by Mottram et al. (1997) who suggest that individual CPB isoforms have a distinct interaction with the host. Within the L. donovani complex, at least 5 copies of cpb seem to exist (Mundodi et al. 2002), and independent analysis of enzymatic activities, cellular localization and regulation of each isoform must be conducted in order to enhance our knowledge of these proteins.
Comparison of CPBE/CPBF with other Leishmania CPBs
We also compared our results with CPBs of other Leishmania complexes, most particularly with L. mexicana CPBs, which are the most frequently studied. CPBs are known to be glycosylated at various sites (Parodi et al. 1995) and the comparison with other CPB isoforms shows that CPBE/Fs are different: first by the presence of specific potential N-glycosylation site (Asn122), second by additional potential cell epitopes and characteristic CTEs. The 3D model's prediction showed accessibility on the protein surface of this Asn122. Consequently, this site could be important for the protein structure. Directed mutagenesis on potential N-glycosylation sites was performed on L. pifanoi cpb (Lpcys2) and revealed that glycosylation is not involved in the targeting to the lysosome (Boukai et al. 2000a). They seem instead to play an important role in folding, stabilization or protease protection (Dwek, 1998). These synapomorphic characteristics could be associated with the specific epidemiology of the L. donovani complex (visceral tropism, etc) compared to other species complexes as several authors have highlighted the important role of CPs in pathogenic processes (Alexander et al. 1998; Mottram et al. 1998; Mahmoudzadeh-Niknam and McKerrow, 2004). Juliano et al. (2004) have demonstrated that some amino acids (60, 61, 64 and 84) are important for substrate specificities for L. mexicana. Comparison of CPBE/CPBF with L. mexicana regarding residues 18, 60 and 61 revealed that CPBE/CPBFs were closer to CPB18. Unfortunately, little is known about CPB18 specificities except that it is expressed in the intracellular amastigote stage and encodes a 47·9-kDa protein that has a high degree of sequence homology with CPB2.8 (Mottram et al. 1997). By contrast, residues 64 and 84 are specific to L. donovani complex CPBs. 
Juliano and coworkers (2004) revealed that residues 60, 61 and 64 are located at α-helices that form the wall of the active site cleft and that changes in these residues would modify the electrostatic environment and thus possibly CPB enzymatic activities. Consequences of these amino acid changes on substrate specificity, enzyme activity, etc. require further study.
The CPBE/CPBF COOH-terminal extension
The CPBs of trypanosomatids are characterized by the presence of an unusual CTE, absent from mammalian cysteine proteases (Boukai et al. 2000b). We also showed that CPBE/CPBF have a specific CTE, different from the other CPB CTEs of Leishmania, and that the CPBE/CPBF CTE contains potential epitope sites. The presence of these particular epitopes suggests the involvement of this region in host–parasite interactions, especially with the immune system. This is in agreement with the data obtained by several authors. Chang and McGwire (2002) and Nakhaee et al. (2004) hypothesized that the cpbEF CTE could act as a pathoantigen and thus might be involved in some clinical manifestations of visceral leishmaniasis. Nakhaee et al. (2004) also showed the importance of the L. infantum cpb CTE as an immune response target in canine leishmaniasis.
Finally, cysteine protease genes are very complex and significant differences exist depending on the Leishmania species. Mottram et al. (1997) suggested that the individual CPB isoforms have distinct roles in the parasite's interaction with its host. Consequently, it is important now to focus our work by studying each cpb copy independently, also considering the development stage and the parasite species involved. Furthermore, concerning CPBE/CPBF, the synapomorphic characteristics such as specific CTE suggest that these proteins must be studied further in order to understand their potential involvement in visceral leishmaniasis.
M.H. is sponsored by the CNRS, R.B.G. and A.L.B. by the IRD. We are grateful to D. Sereno for a critical reading of this article and to F. Chevenet for his help with Treedyn software. We gratefully thank I. Mauricio, G. Schonian, K. Soteriadou and F. Pratlong, who provided us with some Leishmania strains or DNA. The English version of this manuscript was revised by L. Northrup.