INTRODUCTION
Cystic echinococcosis (CE), a zoonotic disease caused by the larval stage of the tapeworm Echinococcus granulosus sensu lato (s. l.), is a significant global public health concern (Eckert et al. 2001). CE is listed among the most severe parasitic diseases in humans, ranking second in the list of food-borne parasites globally (FAO/WHO report, 2012) and representing one of the 17 Neglected Tropical Diseases prioritised by the World Health Organisation (Daumerie et al. 2010). The life cycle of the parasite involves mainly dogs and wild carnivores as definitive hosts (e.g. Moks et al. 2006; Deplazes et al. 2011; Laurimaa et al. 2015), which harbour the adult worms in the intestine. A wide range of domestic and wild mammals, but also humans, can serve as intermediate hosts (Eckert et al. 2001). Proglottids containing eggs, or free eggs, are passed to the environment in the faeces of the definitive host, and a suitable intermediate host becomes infected after ingesting the eggs. The hydatid cysts develop in the intermediate host, mainly in internal organs such as liver and lungs. The cycle is completed if a fertile hydatid cyst of an infected intermediate host is eaten by a suitable carnivore (Haag et al. 1999; Eckert et al. 2001).
Echinococcus granulosus s. l. exhibits considerable intraspecific variability in terms of genetic diversity, host range, infectivity to humans, pathogenicity, antigenicity and developmental rate (Eckert et al. 2001). Molecular studies have identified a number of genotypes/species within the E. granulosus complex (Bowles et al. 1992, 1994; Thompson and McManus, 2002; Lavikainen et al. 2003; Thompson, 2008; Knapp et al. 2011) that are closely related to other species in the genus Echinococcus (Knapp et al. 2015). Traditionally, the complex is considered to consist of genotypes G1–G10, but the taxonomy is currently under debate (Saarma et al. 2009; Knapp et al. 2011; Nakao et al. 2015; Romig et al. 2015). It has been proposed that some of these genotypes deserve species status: E. granulosus sensu stricto (s. s.; genotypes G1–G3), E. equinus (G4), E. ortleppi (G5) and E. canadensis (G6–G10) (Thompson and McManus, 2002; Nakao et al. 2007; Knapp et al. 2011). Genotype G9 is not considered valid (Kedra et al. 1999).
Cystic echinococcosis is a widespread problem in Europe despite efforts to control it, and the parasite maintains constant prevalence in areas where extensive farming is common (Giannetto et al. 2004; Carmena et al. 2008; Garippa and Manfredi, 2009; Cardona and Carmena, 2013). The highest rates of ovine hydatidosis in Europe have been reported in Romania, Greece, Turkey and central-southern Italy (particularly the islands of Sardinia and Sicily), where the prevalence in livestock ranged from 30·2 to 75·3% (Altintas, 2003; Giannetto et al. 2004; Scala et al. 2006; Varcasia et al. 2006; Mitrea et al. 2014; Chaligiannis et al. 2015). The spread of the parasite is promoted by slaughterhouses with poor control over waste management, home slaughtering, low public awareness of the disease, high numbers of stray dogs and poor sanitation (Dakkak, 2010; Varcasia et al. 2011).
Echinococcus granulosus s. s. genotype G1, also known as the common sheep strain, is widely distributed in southern Europe with the highest prevalence in the Mediterranean countries (Romig et al. 2006; Casulli et al. 2012). In northern and north-eastern Europe this genotype is rare, though it has recently been found in a cat in St. Petersburg, Russian Federation (Konyaev et al. 2012) and in urban dogs in Tartu, Estonia (Laurimaa et al. 2015). The genotype has also been identified in humans (Finland, Norway), but the diagnosed patients were immigrants mainly from the Near East or African countries (A. Lavikainen, pers. comm.). In northern and north-eastern European countries such as Finland, Sweden, Estonia and Latvia, genotypes G8 and G10 dominate (Lavikainen et al. 2003, 2006; Moks et al. 2006, 2008; Marcinkute et al. 2015; Oksanen and Lavikainen, 2015). In the Mediterranean countries, genotype G1 has been reported in definitive hosts such as dogs or wolves in Albania, Spain, Italy, Greece and Turkey (Sobrino et al. 2006; Xhaxhiu et al. 2011) and also in a wide range of intermediate hosts: human, cattle, sheep, pig, wild boar, goat and buffalo (González et al. 2002; Daniel-Mwambete et al. 2004; Varcasia et al. 2006, 2007; Busi et al. 2007; Casulli et al. 2008; Martin-Hernando et al. 2008; Vural et al. 2008; Dore et al. 2014). In other European countries, G1 has been reported in dogs, jackals or wolves in Austria, Portugal, Kosovo, Bulgaria and Romania (Breyer et al. 2004; Sherifi et al. 2011) and in intermediate hosts such as humans, pigs, cattle or sheep (Breyer et al. 2004; Bart et al. 2006; Badaraco et al. 2008; Beato et al. 2010; Schneider et al. 2010). The genotype has also been described in horses in Italy (Varcasia et al. 2008), in horses, mules and donkeys in Turkey (Utuk and Simsek, 2013; Simsek and Cevik, 2014; Simsek et al. 2015) and in red deer in Romania (Onac et al. 2013). In addition to being widespread among wild and domestic animals in Europe, genotype G1 is the genotype most frequently implicated in human infections (88% worldwide; Alvarez Rojas et al. 2014) and therefore deserves particularly close attention.
To date, although numerous studies have analysed the genetic diversity and population structure of E. granulosus s. s. (Nakao et al. 2010; Casulli et al. 2012; Yanagida et al. 2012; Andresiuk et al. 2013; Yan et al. 2013; Boufana et al. 2015; Romig et al. 2015), data covering large geographical areas are scarce. The largest geographical coverage in Europe is provided by Casulli et al. (2012), who analysed the genetic variability of E. granulosus s. s. in Italy, Bulgaria, Romania and Hungary. However, the analytical power has remained low in most studies (Europe and elsewhere) as the analyses have largely been based on short sequences of mitochondrial DNA, most often on a single gene, e.g. the full cytochrome c oxidase subunit 1 gene (cox1) (Yanagida et al. 2012; Romig et al. 2015) or a partial sequence of cox1 or nad1 (e.g. Casulli et al. 2012; Andresiuk et al. 2013). Analysing a significantly larger portion of the mitochondrial genome could potentially yield more detailed insight into the genetic variability and phylogeography of E. granulosus s. s.
The objectives of the present study were to: (i) investigate the genetic diversity and phylogeography of E. granulosus genotype G1 in part of its distribution range in Europe, and (ii) compare the results derived from the 8274 bp of the mitochondrial genome with previously used shorter sequences (351 and 1674 bp of cox1) and highlight major differences.
MATERIALS AND METHODS
Parasite material
Two hundred and fifty E. granulosus s. s. samples were initially analysed, of which 106 yielded positive polymerase chain reaction (PCR) products with all primer pairs (the remaining samples did not yield positive PCR, most probably due to low DNA quality). Samples were obtained during routine meat inspections or from hospital cases and were ethanol-preserved at −20 °C until further use. We confirmed the identity of G1 genotypes based on phylogenetic comparison with other E. granulosus genotypes according to Bowles et al. (1992). However, genotype G3 samples (n = 15) could be distinguished with confidence from genotype G1 samples based on 8274 bp of mtDNA (Kinkar et al. unpublished data), and were excluded from the analysis. Thus, a total of 91 genotype G1 samples were analysed in this study, originating from 6 intermediate host species (cattle, sheep, pig, goat, wild boar and human) in 7 European countries: Turkey (n = 69), Spain (n = 10), Italy (n = 7), Albania (n = 2), Romania (n = 1), Greece (n = 1), Finland (Algeria) (n = 1) (Fig. 1; Table 1). Although a relatively large proportion of the final samples in this study originates from Turkey, this area lies near the ancient domestication centre of ruminants such as sheep and cattle; it is therefore likely to represent a large part of G1 genetic diversity in Europe and can provide valuable insight into the phylogeography of G1.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20170127144956-29406-mediumThumb-S0031182016001530_fig1g.jpg?pub-status=live)
Fig. 1. Geographic locations of Echinococcus granulosus s. s. genotype G1 samples (N = 91; red) from Europe analysed in this study. Additional distribution range of G1 in Europe is represented in pink.
Table 1. Data for 91 Echinococcus granulosus s. s. genotype G1 isolates analysed in this study.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20170127144956-59450-mediumThumb-S0031182016001530_tab1.jpg?pub-status=live)
Note that the G1 isolate identified in Finland was from a patient originating from Algeria.
DNA extraction, PCR amplification and sequencing
DNA was extracted from protoscoleces or cyst membranes using the High Pure PCR Template Preparation Kit (Roche Diagnostics, Mannheim, Germany), following the manufacturer's protocols. To analyse a large portion of the mitochondrial genome, 10 novel primer pairs were designed (Table 2). PCR reactions were carried out in a total volume of 20 µL, using 1 × BD Advantage-2 PCR buffer (BD Biosciences, Franklin Lakes, NJ, USA), 0·2 mM dNTP (Fermentas, Vilnius, Lithuania), 0·25 µM of each primer, 1 U Advantage-2 Polymerase mix (BD Biosciences) and 20–50 ng of purified genomic DNA. A touchdown protocol was used for PCR: initial denaturation at 95 °C for 1 min, followed by 10 cycles of 95 °C for 20 s, 55 °C for 45 s (annealing temperature progressively reduced by 0·5 °C in each cycle) and 68 °C for 2 min; followed by 25 cycles of 95 °C for 20 s, 50 °C for 45 s, 68 °C for 2 min; and finishing with a final elongation step at 68 °C for 3 min. Of the amplified PCR products, 10 µL were examined by electrophoresis on a 1·2% agarose gel. The remaining 10 µL of the PCR products were purified with 1 unit of shrimp alkaline phosphatase/exonuclease I (Fermentas, Vilnius, Lithuania). The mixture was incubated at 37 °C for 30 min and then heated to 80 °C for 15 min to inactivate the enzymes.
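The touchdown cycling described above can be laid out as an explicit per-cycle annealing schedule. The following is a minimal Python sketch, not part of the original protocol; the function name `touchdown_schedule` and its parameters are hypothetical, and it assumes that the first touchdown cycle anneals at 55 °C and each subsequent touchdown cycle is 0·5 °C lower.

```python
def touchdown_schedule(start_anneal=55.0, step=0.5, touchdown_cycles=10,
                       final_anneal=50.0, final_cycles=25):
    """Return a list of (cycle_number, annealing_temperature_C) tuples.

    Assumption (not stated explicitly in the text): cycle 1 anneals at
    start_anneal and each later touchdown cycle drops by `step`.
    """
    schedule = []
    # Touchdown phase: annealing temperature decreases every cycle.
    for i in range(touchdown_cycles):
        schedule.append((i + 1, start_anneal - step * i))
    # Fixed phase: constant annealing temperature.
    for j in range(final_cycles):
        schedule.append((touchdown_cycles + j + 1, final_anneal))
    return schedule


if __name__ == "__main__":
    for cycle, temp in touchdown_schedule():
        print(f"cycle {cycle:2d}: anneal at {temp:.1f} °C")
```

Under this reading, the touchdown phase ends one step above the annealing temperature of the 25 fixed cycles, so the transition between the two phases is continuous.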
Table 2. Primers used for E. granulosus s. s. G1 mtDNA analysis; positions are according to NC_008075 in GenBank.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20170127144956-82216-mediumThumb-S0031182016001530_tab2.jpg?pub-status=live)
Sequencing was performed using the same primers as for the initial PCR amplification (Table 2) with BigDye Terminator v3.1 Cycle Sequencing Kit (Applied Biosystems, Foster City, California, USA), following the manufacturer's protocols. Cycling parameters were 96 °C for 1 min, followed by 25 cycles of 96 °C for 10 s, 50 °C for 15 s and 60 °C for 4 min. Sequences were resolved on the ABI 3130xl sequencer (Applied Biosystems). All sequences were deposited in GenBank and are available under accession numbers KU925351–KU925433.
Data analysis
Sequences were assembled in CodonCode v4.2.7, manually corrected in BioEdit v7.2.5 and aligned with an E. granulosus genotype G1 sequence available in GenBank (NC_008075) (Yang et al. 2005) using Clustal W. Phylogenetic networks were calculated using Network v4.612 (Bandelt et al. 1999) (http://www.fluxusengineering.com/, Fluxus Technology Ltd., 2004). Networks were constructed for 3 different alignments: (1) 8274 bp of mtDNA; (2) the complete sequence of the cox1 gene (1674 bp, according to AB786664; Nakao et al. 2013); (3) a reduced dataset of 351 bp, a fragment of the cox1 gene used previously in E. granulosus phylogeographic analysis in Europe (according to JF513058; Casulli et al. 2012; note that the majority of publicly available G1 sequences are 300–400 bp long).
The total length of all amplicons was >10 kb. However, after alignment, manual correction and trimming, the final length of aligned mtDNA sequences used for further analysis was 8274 bp (the sequence lengths varied between 8269 and 8274 bp). This included 15 full length gene coding areas: cytochrome b (cytb 717–1784; positions according to NC_008075), NADH dehydrogenase 4L (nd4l 1798–2058), ATP synthase subunit 6 (atp6 3473–3985), NADH dehydrogenase 1 (nad1 5100–5993), cytochrome c oxidase subunit 1 (cox1 6760–8367), 9 tRNA-encoding genes (tRNA-Gln 3282–3343, tRNA-Phe 3343–3405, tRNA-Met 3402–3467, tRNA-Val 4900–4962, tRNA-Ala 4968–5031, tRNA-Asp 5032–5096, tRNA-Asn 6010–6075, tRNA-Thr 8358–8422, tRNA-Cys 9400–9462) and small-subunit ribosomal RNA (ssu-rRNA 9463–10162); and 6 gene fragments: NADH dehydrogenase subunit 4 (nd4 2019–2091; 2518–3278), NADH dehydrogenase subunit 2 (nd2 3994–4176; 4356–4361; 4430–4875), cytochrome c oxidase subunit 2 (cox2 10182–10574), 2 tRNA encoding genes (tRNA-His 667–714, tRNA-Pro 6082–6086), and lsu-rRNA (8423–8495; 8789–9399).
The population diversity indices (number of haplotypes, haplotype diversity and nucleotide diversity) were calculated using DnaSP v5.10.01 (Librado and Rozas, 2009). Neutrality indices Tajima's D (Tajima, 1989) and Fu's Fs (Fu, 1997) and pairwise fixation index (Fst) were calculated using the population genetics package Arlequin 3·1 (Excoffier et al. 2005). Indices were calculated separately for total population, different localities and hosts. The minimum sample size for localities and hosts that were included in the analysis was five.
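For readers unfamiliar with the diversity indices named above, the standard definitions can be sketched in a few lines of Python. This is an illustration only and not the DnaSP implementation: the function names and the toy alignment are hypothetical, and the sketch assumes equal-length, gap-free aligned sequences. Haplotype diversity is computed as Hd = n/(n − 1) × (1 − Σp²), where p is the frequency of each haplotype, and nucleotide diversity (π) as the mean number of pairwise differences per site.

```python
from collections import Counter
from itertools import combinations


def haplotype_diversity(seqs):
    """Nei's haplotype diversity: n/(n-1) * (1 - sum of squared frequencies)."""
    n = len(seqs)
    counts = Counter(seqs)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1 - sum_sq)


def nucleotide_diversity(seqs):
    """Mean number of pairwise differences per site (gap-free alignment assumed)."""
    n = len(seqs)
    length = len(seqs[0])
    total_diffs = sum(
        sum(a != b for a, b in zip(s, t)) for s, t in combinations(seqs, 2)
    )
    n_pairs = n * (n - 1) / 2
    return total_diffs / n_pairs / length


if __name__ == "__main__":
    # Hypothetical toy alignment, not data from this study.
    toy = ["ACGTACGT", "ACGTACGA", "ACGTACGA", "ACCTACGT"]
    print("Hd =", haplotype_diversity(toy))
    print("pi =", nucleotide_diversity(toy))
```

A population in which almost every sequence is a distinct haplotype differing from the others at only a few sites gives Hd close to 1 together with a small π.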
RESULTS
Variations in nucleotide sequences
A total of 8274 bp of mtDNA was successfully sequenced for 91 E. granulosus G1 samples (out of 250) from seven European countries (Albania, Finland, Greece, Italy, Romania, Spain and Turkey), covering the majority of the G1 range in Europe. The geographical origin of the samples is shown in Fig. 1. Phylogenetic networks were constructed considering both indels and point mutations. The total number of variable sites was 288.
mtDNA networks
The results of this study demonstrated extremely high genetic diversity of E. granulosus genotype G1 in Europe. The 91 analysed sequences were divided into 83 haplotypes: among these, 62 were found in Turkey, 10 in Spain and 6 in Italy (Table 3). The structure of the phylogenetic network is shown in Fig. 2. The average number of mutational steps was 12 and the maximum 27 (Alb2 and Tur45). No predominant haplotype was found in the phylogenetic network; most haplotypes were singletons (n = 76). Five haplotypes (Tur45, Tur10, Tur35, Tur56 and Ita3) included two samples and one haplotype (Tur3) included 4 samples.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20170127144956-28469-mediumThumb-S0031182016001530_fig2g.jpg?pub-status=live)
Fig. 2. Phylogenetic network of Echinococcus granulosus s. s. genotype G1 based on 8274 bp of mtDNA. Circles represent haplotypes. Haplotype names and colours represent different geographical origins: Tur (yellow) – Turkey, Rom (dark blue) – Romania, Fin-Alg (light blue) – Finland (a patient from Algeria), Alb (orange) – Albania, Gre (light yellow) – Greece, Spa (gray) – Spain, Ita (green) – Italy. Small black circles are median vectors (i.e. hypothetical haplotypes: haplotypes not sampled or extinct). Host species are indicated with letters (B – bovine, S – sheep, H – human, P – pig, W – wild boar, G – goat). The number inside haplotype circles indicates the frequency of the haplotype.
Table 3. Diversity and neutrality indices for E. granulosus s. s. genotype G1 in Europe based on 8274 bp of mtDNA. The Southern European samples (South Eur) included all samples except Turkish and Finnish (Algerian).
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20170127144956-83861-mediumThumb-S0031182016001530_tab3.jpg?pub-status=live)
n, number of isolates examined; Hn, number of haplotypes; Hd, haplotype diversity; π, nucleotide diversity; D, Tajima's D; Fs, Fu's Fs; s.d., standard deviation.
** Highly significant P value (P < 0·000001).
* Significant P value (P < 0·05).
As expected, we found that numerous geographically distant samples were also genetically distant, for example Spanish and Albanian haplotypes Spa2 and Alb2 (separated by 25 mutations), also Turkish and Spanish haplotypes Tur41 and Spa1 (separated by 20 mutations) and Turkish and Italian haplotypes Tur12 and Ita6 (separated by 18 mutations). Also, numerous geographically close samples were genetically closely related, for example Turkish haplotypes Tur11 and Tur13 (separated by 1 mutation) and Italian haplotypes Ita4 and Ita2 (separated by 2 mutations).
However, numerous samples collected from geographically close localities showed remarkably high genetic diversity and distance. Turkish samples collected from Erzurum and Elazig provinces in Eastern Turkey, demonstrated high genetic variation despite the geographical proximity. For example, haplotypes Tur12 and Tur26 from Erzurum were separated by 24 mutations and Tur43 and Tur58 from Elazig by 20 mutations. Spanish samples obtained from Central Spain were highly divergent as well, for example, haplotypes Spa2 and Spa4 were separated by 20 mutations.
Moreover, numerous samples from geographically distant localities were genetically closely related, i.e. several monophyletic groups comprised samples from different countries. These include an Albanian and Turkish monophyletic group (Alb2, Alb1, Tur8, Tur28, Tur61, Tur54), a Greek and Turkish group (Gre1, Tur58, Tur4) and a Romanian and Turkish group (Rom1 and all Turkish samples derived from the central haplotype Tur35). Also, two monophyletic groups comprised samples from Spain and Turkey (Spa2, Tur17, Tur25, Tur12, Tur45, Tur63 and Spa10, Tur10) and one group included one Italian (Ita4), one Spanish (Spa7) and one Finnish/Algerian (Fin1) sample.
No host-specific structure was detected. Cattle and sheep samples were frequently genetically closely related, for example haplotype Tur35 consists of samples from sheep and cattle. Human G1 haplotypes were not genetically closest to one another, but to those of cattle and sheep. Haplotypes obtained from wild boar, pig and goat were genetically closest to haplotype Ita2 obtained from sheep (6, 4 and 6 mutations, respectively).
In the networks based on reduced datasets of 1674 and 351 bp in length, the sequences were divided into 49 and 11 haplotypes respectively, of which two were predominant in both networks (Fig. 3). In comparison between 8274 and 1674 bp datasets, some haplotypes were positioned into different haplogroups, e.g. Spa7 and Fin1, whereas haplotypes Spa4, Spa10, Tur6, Tur9, Tur42 and Tur43 assumed different phylogenetic relations to each other (Figs 2 and 3).
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20170127144956-75975-mediumThumb-S0031182016001530_fig3g.jpg?pub-status=live)
Fig. 3. Phylogeographic networks of Echinococcus granulosus s. s. genotype G1, using exactly the same set of samples as in Fig. 2, but shorter sequences: (A) complete sequence of cox1 gene, 1674 bp; (B) partial sequence of cox1 gene, 351 bp.
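The loss of resolution with shorter fragments can be illustrated with a small Python sketch that counts distinct haplotypes in a full alignment and in sub-windows of it. The function name `count_haplotypes` and the toy sequences are hypothetical and serve only to show the principle; they are not data from this study.

```python
def count_haplotypes(seqs, start=0, end=None):
    """Count distinct haplotypes within the alignment window [start, end)."""
    return len({s[start:end] for s in seqs})


# Hypothetical toy alignment: variable sites at positions 3, 7 and 11.
TOY_ALIGNMENT = [
    "AAAACCCCGGGG",
    "AAATCCCCGGGG",
    "AAAACCCTGGGG",
    "AAAACCCCGGGA",
    "AAATCCCCGGGA",
]

if __name__ == "__main__":
    print("full length:", count_haplotypes(TOY_ALIGNMENT))
    print("first 8 sites:", count_haplotypes(TOY_ALIGNMENT, 0, 8))
    print("first 4 sites:", count_haplotypes(TOY_ALIGNMENT, 0, 4))
```

Sequences that differ only outside the analysed window collapse into a single haplotype, which is why a short fragment tends to produce a few dominant haplotypes where a longer alignment resolves many.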
Diversity and neutrality indices
Haplotype diversity was extremely high in the overall population (Hd = 0·997), whereas nucleotide diversity was rather low (π = 0·00143) (Table 3). High haplotype diversity and low nucleotide diversity were also observed in the Italian, Spanish and Turkish subpopulations, ranging from 0·952 to 1·000 and 0·00068 to 0·00147, respectively. The Italian population showed the lowest values for both indices. High haplotype and low nucleotide diversities were also observed in cattle and sheep (Hd = 0·999, π = 0·00152 and Hd = 0·991, π = 0·00131, respectively). In comparison with the two shorter datasets, haplotype diversity was almost equally high for the 8274 bp dataset and the full cox1 gene (1674 bp; Hd = 0·920; Table S1), whereas it was considerably lower for the 351 bp dataset (Hd = 0·596; Table S2). Low nucleotide diversities were observed for both of the reduced datasets: π = 0·00196 based on the full cox1 gene (1674 bp) and π = 0·00219 for the partial cox1 gene (351 bp; Tables S1 and S2).
Neutrality indices such as Tajima's D and Fu's Fs were significant for most of the analysed variants (Table 3). The highest negative values were detected for the overall population and for Turkish samples. Cattle and sheep populations also showed high negative values. Tajima's D was nonsignificant for the Italian samples.
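To make the sign of Tajima's D easier to interpret, the statistic can be computed from first principles following the formula of Tajima (1989). The Python sketch below is illustrative only and is not the Arlequin implementation; the function names and toy alignments are hypothetical, gap-free alignments are assumed, and no significance test is performed.

```python
from itertools import combinations
from math import sqrt


def mean_pairwise_differences(seqs):
    """Average number of differing sites over all pairs of sequences."""
    pairs = list(combinations(seqs, 2))
    return sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs) / len(pairs)


def tajimas_d(seqs):
    """Tajima's D for a gap-free alignment (sketch; no P value is computed)."""
    n = len(seqs)
    # Number of segregating (variable) sites.
    seg_sites = sum(1 for column in zip(*seqs) if len(set(column)) > 1)
    if seg_sites == 0:
        raise ValueError("Tajima's D is undefined without segregating sites")
    k = mean_pairwise_differences(seqs)
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    return (k - seg_sites / a1) / sqrt(e1 * seg_sites + e2 * seg_sites * (seg_sites - 1))


# Hypothetical toy data: every sequence carries one private mutation.
STAR_LIKE = ["TAAAAA", "ATAAAA", "AATAAA", "AAATAA", "AAAATA", "AAAAAT"]
# Hypothetical toy data: two equally frequent, deeply divergent haplotypes.
BALANCED = ["AAAAAA"] * 3 + ["TTTTTT"] * 3

if __name__ == "__main__":
    print("star-like:", tajimas_d(STAR_LIKE))
    print("balanced:", tajimas_d(BALANCED))
```

An excess of rare variants, as in the star-like toy alignment, gives a negative D, which is the pattern usually read as a signal of population expansion; variants at intermediate frequencies give a positive D.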
Fixation indices
Low Fst values were observed among different localities (Table S3). The Fst value for the 8274 bp dataset was statistically significant only between Spain and Turkey (Fst = 0·04130, P < 0·05). Relatively low Fst values (Fst = 0·01180, P < 0·05) were also recorded between cattle and sheep subpopulations.
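As a rough guide to what a pairwise fixation index measures, the sketch below uses a simple sequence-based estimator, Fst = 1 − Hw/Hb, where Hw is the mean number of pairwise differences within populations and Hb the mean between populations. This is a swapped-in, simplified estimator chosen for illustration and is not the procedure implemented in Arlequin; all names and toy data are hypothetical, and each population must contain at least two gap-free aligned sequences.

```python
from itertools import combinations, product


def hamming(a, b):
    """Number of differing sites between two aligned sequences."""
    return sum(x != y for x, y in zip(a, b))


def mean_within(pop):
    """Mean pairwise differences among sequences of one population."""
    pairs = list(combinations(pop, 2))
    return sum(hamming(a, b) for a, b in pairs) / len(pairs)


def mean_between(pop1, pop2):
    """Mean pairwise differences between sequences of two populations."""
    pairs = list(product(pop1, pop2))
    return sum(hamming(a, b) for a, b in pairs) / len(pairs)


def pairwise_fst(pop1, pop2):
    """Simplified Fst = 1 - Hw / Hb (illustrative estimator only)."""
    h_w = (mean_within(pop1) + mean_within(pop2)) / 2
    h_b = mean_between(pop1, pop2)
    if h_b == 0:
        raise ValueError("Fst is undefined when all sequences are identical")
    return 1 - h_w / h_b


if __name__ == "__main__":
    # Hypothetical toy populations, not data from this study.
    mixed_a = ["AAAA", "AAAT"]
    mixed_b = ["AAAA", "AATA"]
    print("no differentiation:", pairwise_fst(mixed_a, mixed_b))
    print("complete differentiation:",
          pairwise_fst(["AAAA", "AAAA"], ["TTTT", "TTTT"]))
```

Values near zero indicate that sequences drawn from different populations are, on average, no more different than sequences drawn from the same population.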
DISCUSSION
The results of this study demonstrated extremely high haplotype diversity of E. granulosus s. s. genotype G1 in Europe (Fig. 2): 91 analysed samples were divided into 83 haplotypes (overall haplotype diversity 0·997). From earlier studies it is known that G1 has the highest host variability among all E. granulosus genotypes, capable of infecting numerous taxa, including wild and domesticated mammals and humans (Bowles et al. 1992; Eckert et al. 2001). It is likely that the high genetic variation observed in this study reflects, at least to some extent, the ability of G1 isolates to infect such a wide range of hosts. This can be regarded as a warning sign, suggesting that associations with new species may easily form if G1 distribution widens in Europe.
Haplotype diversity was very high not only globally but also locally. For example, haplotype diversity indices were 1·0 or close to that number in Italian, Spanish and Turkish G1 populations (Table 3), pointing to a very high degree of genetic diversity of genotype G1 across the Mediterranean countries, the main distribution area for G1 in Europe. The genetic diversity of E. granulosus G1 would be expected to be highest at the domestication centre and to decline with distance from it. However, the phylogenetic structure of G1 observed in this study does not follow this pattern. The Anatolia region, roughly corresponding to the Asian part of Turkey, lies in the immediate vicinity of the Fertile Crescent, both considered as part of a domestication centre for the majority of livestock. Anatolia is also known as one of the earliest centres in Europe from which livestock were distributed westward along the Mediterranean coast, and only later towards the north (Chessa et al. 2009). Sheep and cattle were among the first livestock species to be domesticated, about 11–10 thousand years ago, in this area, from where they were transported by humans to the Mediterranean region shortly after domestication (Zeder, 2008). For example in sheep, the most frequent intermediate host for E. granulosus G1, recent data based on ancient DNA analysis have revealed that the proportion of rarer haplotypes has declined during the expansion of sheep from the Near Eastern domestication centre towards Europe (Rannamäe et al. 2016). As the life cycle of E. granulosus genotype G1 is maintained mainly by domestic animals, its distribution is subject to anthropogenic effects, most likely extensive animal trade along the Mediterranean shore, resulting in a high degree of genetic diversity across this region. Although wild animals can also distribute E. granulosus G1, animal transportation can help to spread the parasite at a significantly higher pace. Moreover, the narrow landbridge connecting Turkey to the rest of Europe has posed, at least to some extent, a migration barrier for wild animals.
The importance of animal trade is further supported by the lack of genetic segregation between different countries. Several Turkish samples were more closely related to Spanish, Romanian, Albanian and Greek samples than to other, geographically close Turkish samples (Fig. 2). Furthermore, low Fst values between different localities (e.g. Spain and Turkey, Fst = 0·041, P < 0·05) suggest relatively moderate genetic divergence between Mediterranean countries. Therefore, these observed phylogeographical patterns might also be shaped by livestock trade that has facilitated the parasite dispersal over vast areas. Demographic analysis also supported this hypothesis. High haplotype diversity coupled with relatively low nucleotide diversity values observed in this study (Hd = 0·997, π = 0·0014 for overall population) suggest rapid demographic expansion, supported by significant negative values of neutrality indices Tajima's D (−2·69) and Fu's Fs (−24·32) (Avise, 2000). In addition to the efficient distribution of livestock (infected with G1) by humans, population bottlenecks can also cause rapid demographic expansion. However, the relatively high divergence of haplotypes is better explained by livestock trade, since a demographic bottleneck would rather result in a star-like network structure where the majority of haplotypes are identical or very closely related and geographically linked.
The effect of large-scale animal trade on E. granulosus haplotype distribution has also been discussed by others (e.g. Casulli et al. 2012; Yanagida et al. 2012). Casulli et al. (2012) considered the effect of animal trade negligible compared with thousands of years of diffusion. The phylogeography of E. granulosus G1 based on the high-resolution network in this study suggests that the observed pattern is likely due to both factors: trade and diffusion. However, their roles in shaping the genetic diversity and distribution of genotype G1 in Europe remain largely unresolved and require further investigation using more elaborate sampling and coverage of the entire G1 distribution range in Europe.
The results of this study indicated the absence of host-specific phylogeographic structure in G1 (Fig. 2), supported also by the low Fst value (0·0118, P < 0·05) between cattle and sheep. As the samples in this study were mostly from livestock animals, the rapid expansion of G1 isolates observed in this study has most likely been facilitated by the intensive (shepherd) dog-livestock transmission cycle. These results support efficient transmission of G1 between different hosts via dogs (and to a lesser extent by other definitive hosts) and suggest that different host species are not particularly susceptible to any specific mtDNA haplotype. Analysis of the nuclear genome is required to address this question in more detail.
On the phylogenetic network (Fig. 2), haplotype Ita2 from southern Italy and haplotype Tur35 from eastern Turkey both assumed central positions, suggesting that they are ancestral to many other haplotypes (note, however, that samples from Turkey are in excess compared with other regions). The ancestral position of these haplotypes might reflect the early arrival of E. granulosus with sheep and other livestock in Europe via eastern Turkey, which lies in the immediate vicinity of a domestication centre for the majority of livestock species, and via southern Italy. However, this scenario remains to be tested further with a larger set of samples.
The main value of this study lies largely in the high-resolution approach based on relatively long mtDNA sequences. We were also able to provide preliminary results on what valuable information could be lost when using much shorter sequences, which is useful for future research. However, it is important to note that in this study samples from Turkey were in excess compared with other regions, and cattle and sheep samples were in excess compared with other hosts. Therefore, the results of this study are biased towards Turkey, which should be taken into account. On the other hand, the relatively large number of samples from Turkey represents a value in itself, since this area, as part of a domestication centre for the majority of livestock, is likely to represent a large part of G1 genetic diversity in Europe and can therefore provide valuable insight into the phylogeography of G1. Also, as cattle and sheep are the most common hosts for genotype G1, it was inevitable that the samples that we analysed originated mostly from these species.
The longer sequences used in this study provided considerably higher resolution than the shorter sequences. The networks based on shorter sequences both revealed two dominant haplotypes, whereas the network based on longer sequences showed no dominant haplotypes. The shortest dataset, based on 351 bp, separated 6 Turkish and 2 Spanish haplotypes and positioned all 7 Italian samples into the central haplotype (Fig. 3). The network based on 1674 bp separated 35 Turkish, 6 Spanish and 2 Italian haplotypes. By contrast, in the 8274 bp network, Turkish samples were divided into 63 haplotypes, Spanish samples were all fully resolved and divided into 10 haplotypes, and Italian samples were divided into 6 haplotypes (Fig. 2).
Although the resolution of the phylogenetic network was significantly higher for the 8274 bp dataset than for the shorter mtDNA datasets, the haplotype diversity index for the 1674 bp dataset was only slightly lower than that for the 8274 bp dataset (Hd = 0·920 and Hd = 0·997, respectively) (Tables 3 and S1). It is interesting to note that nucleotide diversity increased with shorter sequences (Tables 2, S1 and S2), indicating that the average diversity of the cox1 gene is higher than that of the 8274 bp of mtDNA as a whole. For the 8274 bp dataset, haplotype diversities were equally high for Turkey (part of the domestication area) and for Southern Europe, indicating that the genetic diversity of G1 has remained high after the expansion from the domestication area. However, with shorter sequences, haplotype diversities were lower in Southern Europe than in Turkey, suggesting that a single mtDNA gene or its fragment may not be sufficient to reveal the level of genetic diversity of G1 in different localities.
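The opposite behaviour of the two indices under sequence truncation can be illustrated with a minimal sketch. It assumes Nei's haplotype diversity, Hd = n/(n − 1) × (1 − Σp²), and nucleotide diversity π defined as the mean number of pairwise differences per site; the function names and the toy alignment are hypothetical and do not represent data from this study.

```python
from collections import Counter
from itertools import combinations


def haplotype_diversity(seqs):
    """Nei's haplotype diversity: n/(n-1) * (1 - sum of squared frequencies)."""
    n = len(seqs)
    counts = Counter(seqs)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1 - sum_sq)


def nucleotide_diversity(seqs):
    """Mean number of pairwise differences per site (pi)."""
    length = len(seqs[0])
    pairs = list(combinations(seqs, 2))
    diffs = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs)
    return diffs / (len(pairs) * length)


# Hypothetical toy alignment of 12 sites; the last 4 sites stand in for a
# short, relatively variable marker within a longer sequence.
toy = ["AAAAAAAAAAAA", "AAAAAAAAAAAT", "AAAAAAAAATAA", "AAACAAAAAAAA"]
fragment = [s[8:] for s in toy]

for label, data in (("full", toy), ("fragment", fragment)):
    print(label, len(set(data)), "haplotypes",
          round(haplotype_diversity(data), 3),
          round(nucleotide_diversity(data), 3))
```

In this toy case the fragment merges two haplotypes, so Hd drops, while π rises because the retained sites are more variable on average than the full alignment. This is the same qualitative pattern as described above for the cox1-based datasets.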
There were also significant differences regarding the origin and prevalence of central ancestral haplotypes. All three networks based on different sequence lengths revealed two ancestral haplotypes. However, in networks based on shorter sequences, a significant number of samples were positioned into the central ancestral haplotypes: 23 and 9 samples based on the full cox1 gene, and 52 and 25 samples based on 351 bp, respectively (Fig. 3). Both networks based on shorter sequences suggest a wide geographical spectrum of samples in the ancestral haplotypes, whereas the dominant haplotypes in both of these networks were fully resolved in the 8274 bp network (Fig. 2), demonstrating that Ita2 and Tur35 are the ancestral haplotypes, each originating from a specific country. This represents a good example of how complex haplotypes can be resolved to the highest degree, revealing the ancestral sequences at which all others coalesce. Furthermore, in both networks based on shorter sequences, the most dominant haplotype is identical to the haplotype EG1 (Casulli et al. Reference Casulli, Interisano, Sreter, Chitimia, Kirkova, La Rosa and Pozio2012), which has been found to be highly prevalent worldwide (Nakao et al. Reference Nakao, Li, Han, Ma, Xiao, Qiu, Wang, Yanagida, Mamuti and Wen2010; Yanagida et al. Reference Yanagida, Mohammadzadeh, Kamhawi, Nakao, Sadjjadi, Hijjawi, Abdel-Hafez, Sako, Okamoto and Ito2012; Boufana et al. Reference Boufana, Lahmar, Rebaï, Safta, Jebabli, Ammar, Kachti, Aouadi and Craig2014, Reference Boufana, Lett, Lahmar, Buishi, Bodell, Varcasia, Casulli, Beeching, Campbell, Terlizzo, McManus and Craig2015). However, the 8274 bp dataset showed that this haplotype is actually genetically highly diverse and was fully resolved, revealing the single ancestral haplotype Ita2 (Fig. 2).
The networks also show that the longer sequences have significantly more power to reveal the genetic relations between different haplotypes as the longer sequences positioned a number of haplotypes differently compared with shorter ones. For example, haplotypes Spa4, Tur43, Spa7 and Fin1 assumed different phylogenetic relations to each other (Figs 2 and 3). Based on 8274 bp, haplotypes Spa7 and Fin1 originate from the central Italian haplotype Ita2, whereas the network based on the full cox1 gene suggests that the same haplotypes originate from the Turkish central haplotype Tur35. Furthermore, based on 351 bp, they were positioned into both of the ancestral haplotypes – Fin1 into the central dominant haplotype that contains Italian samples and Spa7 into the other ancestral haplotype. Also, based on 1674 bp, haplotype Tur43 was most closely related to Spanish haplotype Spa4, whereas based on 8274 bp, the haplotype formed a monophyletic group of 4 Turkish samples most closely related to central Italian haplotype Ita2.
Our results demonstrate that using longer mtDNA sequences for phylogeographic analysis has clear advantages over the commonly used shorter sequences. The same has been demonstrated for other species, e.g. the brown bear (Keis et al. Reference Keis, Remm, Ho, Davison, Tammeleht, Tumanov, Saveljev, Männil, Kojola, Abramov, Margus and Saarma2013): analysis of complete mitochondrial genomes allowed the identification of spatio-temporal population processes that had not previously been detected using shorter mtDNA sequences, not even those of ca 2 kb (Korsten et al. Reference Korsten, Ho, Davison, Pähn, Vulla, Roht, Tumanov, Kojola, Andersone-Lilley, Ozolins, Pilot, Mertzanis, Giannakopoulos, Vorobiev, Markov, Saveljev, Lyapunova, Abramov, Männil, Valdmann, Pazetnov, Pazetnov, Rõkov and Saarma2009). Therefore, analysis of the genetic diversity and evolutionary trajectories of E. granulosus and other parasites is likely to benefit significantly from large-scale mitochondrial and nuclear genome sequencing. In time, next-generation sequencing methods will most likely replace many of the Sanger-sequencing approaches, including those used for mitogenome analysis.
Our findings have obvious public health importance, as knowledge of E. granulosus s. s. genetic diversity and geographic distribution is fundamental to understanding how such life-threatening pathogens evolve. The level of genetic diversity forms a basis for future adaptations of pathogens, constituting a force towards the emergence of new host-parasite associations and potentially also the development of drug resistance (Morgan et al. Reference Morgan, Clare, Jefferies and Stevens2012). A better understanding of E. granulosus G1 phylogeography may thus contribute to the advancement of effective strategies to control the spread of hydatid disease.
SUPPLEMENTARY MATERIAL
The supplementary material for this article can be found at http://dx.doi.org/10.1017/S0031182016001530.
FINANCIAL SUPPORT
This work was supported by institutional research funding (IUT20-32) from the Estonian Ministry of Education and Research (to U.S.); by grant ESF-8525 from the Estonian Research Council (to U.S.); the European Union through the European Regional Development Fund (Centre of Excellence FIBIR) (to U.S.); the Estonian Doctoral School of Ecology and Environmental Sciences (to L.K. and T.L.); the European Community's Seventh Framework Programme under the grant agreement 602051 (Project HERACLES; http://www.Heracles-fp7.eu/) (to A.C.); and the Dutch Food and Consumer Product Safety Authority (NVWA) (to J.vd G.). The funding sources had no involvement in the preparation, ideas, writing, interpretation, or the decision to submit this article.