Introduction
Carassotrema is a genus of trematodes from the family Haploporidae Nicoll, 1914. The family is represented by small trematodes with a hermaphroditic sac enclosing the male and the female terminal genitalia and with one or two testes (Overstreet & Curran, 2005; Blasco-Costa et al., 2009; Besprozvannykh et al., 2018; Atopkin et al., 2018, 2020). Carassotrema spp. are intestinal parasites of various fish, usually freshwater cyprinids and sometimes estuarine mullets (Overstreet & Curran, 2005). The life cycle of Carassotrema spp. involves brackish-water snails Stenothyra spp., in which gymnocephalous cercariae are produced (Shameen & Madhavi, 1991).
Carassotrema koreanum Park, 1938, the type species of the genus, was first described from the Korean freshwater cyprinid fish Carassius auratus (Linnaeus, 1758) and originally assigned to the family Allocreadiidae (Park, 1938). Overstreet & Curran (2005) moved Carassotrema to the Haploporidae (the subfamily Waretrematinae Srivastava, 1937) and proposed that the haploporids evolved from a waretrematine ancestor in the coastal mullet. A ribosomal DNA (rDNA)-based phylogenetic analysis of the Haploporidae has supported this hypothesis (Atopkin et al., 2019).
The phylogenetic position of the Haploporidae is unresolved. Their position within the Xiphidiata, a digenean suborder created for trematodes whose cercariae possess a stylet, may be a phylogenetic misplacement because haploporids do not have a stylet (Olson et al., 2003). It has been suggested that the haploporids should be elevated to a new suborder, Haploporata (Pérez-Ponce de León & Hernández-Mena, 2019), an idea partially confirmed by the mitogenomic sequence data (Pérez-Ponce de León & Hernández-Mena, 2019; Atopkin et al., 2021).
Next-generation sequencing (NGS) data on the complete mitochondrial genome sequence of a haploporid trematode, Parasaccocoelium mugili Zhukov, 1971, have recently been obtained (Atopkin et al., 2021). That study offered the first phylogenetic analysis based on mitochondrial DNA (mtDNA) sequence data for the Haploporidae, and provided a basis for the accumulation and analysis of mitogenomic data on other haploporids as well as on digeneans in general. Complete mitochondrial genomes are currently known for 60 digenean species.
Complete mitochondrial genome sequence data also allow one to test the codon usage bias as an alternative molecular marker for systematics and to clarify possible relationships between the gene re-arrangement processes and the occurrence of synonymous nucleotide substitutions in Digenea. These relationships are known for some animal groups such as marine bivalves, snakes, ascidians and some hymenopterans (Shao et al., 2003). Studies made on different classes of Platyhelminthes have shown that guanine+cytosine (GC) bias has a great influence on synonymous codon and amino acid usage (Lamolle et al., 2019).
In this study we generated complete mtDNA and nuclear ribosomal operon sequences of C. koreanum ex C. auratus from the south of the Russian Far East. Our aims were to characterize the complete mitochondrial genome and the nuclear ribosomal operon of C. koreanum and to reconstruct its phylogenetic relationships with other digeneans using complete mtDNA sequence data.
Material and methods
Sample collection and DNA extraction
Adult worms were collected from the intestine of a single naturally infected C. auratus individual. The fish was caught in the estuary of the Kievka River, Primorsky Region (the south of the Russian Far East) during parasitological field work in July 2018. Trematodes were killed with hot water and then fixed in 96% ethanol. Total DNA was extracted from 30 pooled worms using a QIAamp DNA Investigator Kit (Qiagen, Hilden, Germany), according to the manufacturer's protocol. The total DNA concentration was measured with a Qubit 3.0 Fluorometer (Invitrogen, Waltham, Massachusetts, USA), and the DNA was then used for NGS at a final concentration of 2 ng/μl.
Preparing genome library for NGS
Libraries were prepared using an Ion Plus Fragment Library Kit and unique adapters (Ion Xpress, Waltham, Massachusetts, USA) with pre-fragmentation on a Covaris M220 Focused-ultrasonicator (Covaris, Woburn, Massachusetts, USA). The emulsion polymerase chain reaction and template preparation were performed on an Ion OneTouch 2 System (ThermoFisher Scientific, Waltham, Massachusetts, USA), followed by sequencing on an Ion S5 platform using an Ion 540 chip at the Far Eastern Federal University (Vladivostok, Russia). Ambiguous parts of the genome sequence were re-examined with Sanger sequencing using highly specific oligonucleotide primers developed for this study: Ck_ND5_5’F (5’-CGT CTG TGT GAG GGT TGT TG-3’) and Ck_ND5_5’R (5’-CAA TAG TCA AAG TCC TCA CAG CC-3’). Additionally, we re-examined the 5’-end of the nad5 gene of Parasaccocoelium mugili from our previous study with Sanger sequencing using specific primers developed for this purpose: Pm_ND5_5’F (5’-GAT GAT ATG GGT TGG TTT TAA G-3’) and Pm_ND5_5’R (5’-CCA AAG ATA AAA AGT CAA AAG-3’).
The quality of raw reads was checked using FastQC 0.11.9 (Babraham Bioinformatics, https://www.bioinformatics.babraham.ac.uk/projects/fastqc/), and the reads were then assembled using SPAdes 3.14.1 (Nurk et al., 2013) with correction of IonTorrent data using the IonHammer tool available in SPAdes. The scaffolds containing mitochondrial and ribosomal operon DNA data were manually assembled in MEGA X software (Kumar et al., 2018).
Mitochondrial genome annotation was performed with MITOS2 online software (http://mitos2.bioinf.uni-leipzig.de/index.py), and the mitochondrial genome was then manually assembled and aligned with that of P. mugili (Atopkin et al., 2021) in MEGA X. Tandem repeats were searched for with Tandem Repeats Finder software (https://tandem.bu.edu/trf/trf.html). The nucleotide sequences of both the ribosomal operon and the complete mitochondrial genome were deposited in GenBank under accession numbers ON598381 and ON598382.
Codon usage, gene variations and phylogenetic analyses
Nucleotide and amino acid sequence alignments were performed with the ClustalW algorithm in MEGA X. Poorly aligned regions were removed using the Gblocks Server (http://molevol.cmima.csic.es/castresana/Gblocks_server.html).
Phylogenetic analysis was performed on the basis of concatenated amino acid sequences with the maximum likelihood (ML) algorithm, implemented in PhyML 3.1 (Guindon & Gascuel, 2003), and Bayesian inference (BI), implemented in MrBayes 3.2.6 (Huelsenbeck et al., 2001). The ML analysis used the LG (Le and Gascuel) evolutionary model (Le & Gascuel, 2008), a subtree pruning and regrafting (SPR) tree topology search and random sequence addition. The Bayesian analysis used a protein model, a mixed set of substitution types, a mixed amino acid model and uninformative amino acid substitution rates as priors. The Markov chain Monte Carlo algorithm was run for 10,000,000 generations in two independent runs, sampling every 1000 generations. The significance of phylogenetic relationships was estimated with posterior probabilities (Huelsenbeck et al., 2001) for the Bayesian algorithm and with an approximate likelihood-ratio test using eBayes support (Anisimova & Gascuel, 2006) for the ML algorithm. Codon usage statistics were calculated for concatenated protein-coding gene sequence data with MEGA X. Sequence cluster analyses based on codon usage bias were performed with Statistica 13 software (TIBCO Software Inc., 2017) using the weighted pair group average joining (tree clustering) method with Euclidean distances. The statistical significance of codon frequencies for clustering was estimated with an analysis of variance within the k-means clustering procedure in Statistica 13.
Phylogenetic relationships were inferred and codon usage bias was calculated using sequences of our samples and of other trematode species from the National Center for Biotechnology Information (NCBI) GenBank database (table 1).
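The codon usage comparison described above can be illustrated with a minimal Python sketch. It is not the MEGA X or Statistica procedure used in this study: it only assumes that each species is represented by a vector of relative frequencies of the 64 codons and that species are compared by Euclidean distance. The function names and the toy sequences are hypothetical.

```python
from collections import Counter
from itertools import product
from math import sqrt

# All 64 codons in a fixed order, so that vectors from different species are comparable.
CODONS = ["".join(c) for c in product("ACGT", repeat=3)]
CODON_SET = set(CODONS)


def codon_frequencies(cds):
    """Return the relative frequency (%) of each of the 64 codons in a coding sequence."""
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3  # ignore an incomplete trailing codon
    triplets = [cds[i:i + 3] for i in range(0, usable, 3)]
    counts = Counter(t for t in triplets if t in CODON_SET)  # skip ambiguous codons
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no complete unambiguous codons in sequence")
    return [100.0 * counts[c] / total for c in CODONS]


def euclidean(u, v):
    """Euclidean distance between two codon frequency vectors."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


# Toy sequences (hypothetical, three codons each).
freq_a = codon_frequencies("TTTTTTATT")
freq_b = codon_frequencies("TTTATTATT")
distance = euclidean(freq_a, freq_b)
```

A pairwise distance matrix built in this way is the input that a tree clustering method would then operate on.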
Table 1. List of Digenea sequences from GenBank used in phylogenetic analysis.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20220727080342349-0370:S0022149X22000438:S0022149X22000438_tab1.png?pub-status=live)
Results
Annotation of ribosomal operon of C. koreanum
The ribosomal operon of C. koreanum was 10,644 bp in length, including ETS1 (1449 bp), 18S ribosomal RNA (rRNA) gene (1988 bp), ITS1 rDNA (558 bp), 5.8S rRNA gene (157 bp), ITS2 rDNA (274 bp), 28S rRNA gene (4152 bp) and ETS2 (2066 bp). The nucleotide composition of the ribosomal operon of C. koreanum was as follows: adenine (A) = 22.6%, thymine (T) (uracil (U)) = 27.3%, cytosine (C) = 22.5% and guanine (G) = 27.6%. It differed slightly from that of P. mugili in a higher content of thymine and cytosine and a lower content of adenine and guanine.
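As a quick arithmetic check, the region lengths and base percentages reported above can be summed. The short Python sketch below uses only the values given in the text; the variable names are hypothetical.

```python
# Lengths (bp) of the ribosomal operon regions as reported in the text.
regions = {
    "ETS1": 1449,
    "18S": 1988,
    "ITS1": 558,
    "5.8S": 157,
    "ITS2": 274,
    "28S": 4152,
    "ETS2": 2066,
}
operon_length = sum(regions.values())

# Nucleotide composition (%) as reported in the text.
composition = {"A": 22.6, "T": 27.3, "C": 22.5, "G": 27.6}
composition_total = sum(composition.values())

print(operon_length)  # 10644
```

The sum of the regions matches the reported operon length of 10,644 bp, and the four base percentages add up to 100%.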
General characteristics of mitochondrial genome of C. koreanum
The mitochondrial genome of C. koreanum contained 13,965 bp with 12 protein-coding genes, two ribosomal genes, 22 tRNA genes and a non-coding region (NCR) (fig. 1 and table 2). There were no alternative read variants in the NGS raw data, and no intraspecific variable positions were observed. The nucleotide composition in the C. koreanum mitochondrial genome was as follows: A = 19.2%, T (U) = 40.6%, C = 14.8% and G = 25.4%. Nucleotide pair frequency was 59.7% for adenine+thymine (AT) content and 40.2% for GC content, with a bias towards T over A (AT skew = −0.36) and G over C (GC skew = 0.26).
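The skew values above follow from the reported base composition if the conventional definitions are assumed: AT skew = (A − T)/(A + T) and the G over C skew = (G − C)/(G + C). The text does not state the formula, so this is an assumption, and the function name in the Python sketch below is hypothetical.

```python
def skew(x, y):
    """Strand compositional skew between two bases: (x - y) / (x + y)."""
    return (x - y) / (x + y)


# Base composition (%) of the complete mitochondrial genome, as reported in the text.
comp = {"A": 19.2, "T": 40.6, "C": 14.8, "G": 25.4}

at_skew = skew(comp["A"], comp["T"])  # negative value: T is more frequent than A
gc_skew = skew(comp["G"], comp["C"])  # positive value: G is more frequent than C

print(round(at_skew, 2), round(gc_skew, 2))  # -0.36 0.26
```

Under these definitions the rounded values agree with those reported for the complete genome.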
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20220727080342349-0370:S0022149X22000438:S0022149X22000438_fig1.png?pub-status=live)
Fig. 1. Organization of the complete mitochondrial genome of Carassotrema koreanum.
Table 2. Annotation of mitochondrial genome of Carassotrema koreanum.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20220727080342349-0370:S0022149X22000438:S0022149X22000438_tab2.png?pub-status=live)
NCR, non-coding region. a tRNA-Ser (S1) lacked the paired dihydrouridine (DHU) arm.
Protein-coding genes of mitochondrial genome
The total length of the 12 protein-coding genes was 10,146 bp. The arrangement of protein-coding genes was: cox3–cytb–nad4L–nad4–atp6–nad2–nad1–nad3–cox1–cox2–nad6–nad5. Start codons for protein-coding genes were ATG or GTG, except for the nad5 gene, which started with a TTG codon that is unusual for trematodes. The nucleotide composition of the assembled protein-coding part of the mitochondrial genome sequence was: A = 16.9%, T (U) = 42.8%, C = 14.7% and G = 25.6%. Nucleotide pair frequency was 59.7% for AT content and 40.3% for GC content, with a bias towards T over A (AT skew = −0.43) and G over C (GC skew = 0.27).
Codon usage statistics for C. koreanum agreed with the nucleotide composition ratio: the most common triplets contained T (U) and/or A bases. They were UUU (frequency = 9.7%), UUA (frequency = 3.9%) and AUU (frequency = 3.3%). In comparison with P. mugili, triplet UAU occurred less often in the mitochondrial genome of C. koreanum, while triplet GGG (G) occurred more frequently. A total of 3449 amino acids were encoded by the mitochondrial protein-coding genes of C. koreanum.
Phylogenetic analysis
The ML and BI analyses were based on an alignment of 2228 amino acids retained after Gblocks processing. The BI tree topology showed that digeneans could be subdivided into two large clades (fig. 2). Clade I consisted of 12 species of the family Schistosomatidae Stiles & Hassall, 1898, while Clade II comprised 47 species from 18 families representing seven suborders. Carassotrema koreanum was closely related to P. mugili. These two species formed a monophyletic subclade, corresponding to the suborder Haploporata. This subclade appeared as a sister taxon to the other digenean suborders, except Hemiurata Skrjabin & Guschanskaja, 1954. The suborder Xiphidiata Olson, Cribb, Tkach, Bray & Littlewood, 2003 was paraphyletic and appeared as two independent groups. The first group contained Brachycladium goliath (van Beneden, 1858) and species of Paragonimus Braun, 1899. This group was closely related to Opisthorchiata. Plagiorchis maculosus (Rudolphi, 1802) appeared as a sister to the Xiphidiata and Opisthorchiata spp. mentioned above. The second group of the Xiphidiata included representatives of Dicrocoeliidae Odhner, 1911 (Dicrocoelium dendriticum (Rudolphi, 1819), D. chinensis (Sudarikov & Ryjikov, 1951) Tang & Tang, 1978, Eurytrema pancreaticum (Janson, 1899)) and Eucotylidae Skrjabin, 1924 (Tamerlania zarudnyi Skrjabin, 1924, Prosthogonimus cuneatus (Rudolphi, 1802) Braun, 1901). This group appeared as a sister to Echinostomata La Rue, 1926, Opisthorchiata La Rue, 1957, Pronocephalata Olson, Cribb, Tkach, Bray & Littlewood, 2003 and the first group of the Xiphidiata. Two representatives of Diplostomata, Cyathocotyle prussica Mühling, 1896 and Clinostomum complanatum (Rudolphi, 1819) Braun, 1899, were closely related to Clade II and not to the other members of the suborder.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20220727080342349-0370:S0022149X22000438:S0022149X22000438_fig2.png?pub-status=live)
Fig. 2. Phylogenetic relationships of Carassotrema koreanum and other digenetic trematodes, reconstructed by means of Bayesian inference on the basis of the 2228 amino acid alignment available after Gblocks processing. Nodal support is indicated with posterior probabilities, calculated with the Bayesian algorithm. Scale bar shows the number of substitutions per site.
In general, the topology of the ML tree was similar to that of the Bayesian tree, demonstrating two main clades within Digenea (fig. 3). However, the closely related C. koreanum and P. mugili were united in a subclade with P. maculosus, and this subclade was sister to the Echinostomata.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20220727080342349-0370:S0022149X22000438:S0022149X22000438_fig3.png?pub-status=live)
Fig. 3. Phylogenetic relationships of Carassotrema koreanum and other digenetic trematodes, reconstructed by means of maximum likelihood on the basis of the 2228 amino acid alignment available after Gblocks processing. Nodal support is indicated with posterior probabilities, calculated with the approximate likelihood ratio test (eBayes support). Scale bar shows the number of substitutions per site.
Sequence cluster analysis based on codon usage bias
We estimated codon usage bias in the protein-coding part of the mitochondrial genomes of all species involved in our study and performed a cluster analysis of Digenea using these codon usage bias data. In general, ten trematode clusters could be distinguished (fig. 4). The species of the genus Schistosoma Weinland, 1858 (Diplostomata) formed a distinct cluster, and so did most members of Pronocephalata. The other eight clusters contained trematodes from different suborders. Carassotrema koreanum was placed in the same cluster as Metagonimus yokogawai (Katsurada, 1912) Katsurada, 1912 and Paragonimus westermanii (Kerbert, 1878), whereas P. mugili clustered with P. maculosus.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20220727080342349-0370:S0022149X22000438:S0022149X22000438_fig4.png?pub-status=live)
Fig. 4. Results of sequence cluster analysis based on codon usage bias of Digenea, using frequencies of 64 mitochondrial codons.
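The weighted pair group average joining used for this kind of tree clustering can be sketched in a few lines of Python. This is a generic WPGMA illustration under stated assumptions, not the Statistica implementation: the function name, the species labels and the distances are all hypothetical, and the distances would in practice be Euclidean distances between codon frequency vectors.

```python
from itertools import combinations


def wpgma(labels, dist):
    """Agglomerate `labels` by weighted pair group average joining (WPGMA).

    `dist` maps frozenset({a, b}) to the distance between labels a and b.
    Returns the merges in order as (cluster_a, cluster_b, distance) tuples.
    """
    d = dict(dist)
    clusters = list(labels)
    merges = []
    while len(clusters) > 1:
        # Pick the closest pair of current clusters.
        a, b = min(combinations(clusters, 2), key=lambda p: d[frozenset(p)])
        merged = (a, b)
        # WPGMA rule: distance to the new cluster is the simple mean of the
        # distances to its two children, regardless of their sizes.
        for c in clusters:
            if c != a and c != b:
                d[frozenset((merged, c))] = (d[frozenset((a, c))] + d[frozenset((b, c))]) / 2
        merges.append((a, b, d[frozenset((a, b))]))
        clusters = [c for c in clusters if c != a and c != b] + [merged]
    return merges


# Hypothetical pairwise distances between three species.
toy = {
    frozenset(("sp1", "sp2")): 1.0,
    frozenset(("sp1", "sp3")): 4.0,
    frozenset(("sp2", "sp3")): 6.0,
}
merges = wpgma(["sp1", "sp2", "sp3"], toy)
```

In the toy example, sp1 and sp2 join first at distance 1.0, and sp3 joins the resulting cluster at the averaged distance (4.0 + 6.0)/2 = 5.0.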
On the basis of this result, we used the k-means clustering method to reveal the codons that were statistically significant for the differentiation of all the studied digeneans and, separately, of the two haploporid species. Firstly, we specified two clusters for all trematode species: Schistosoma spp. + C. prussica (cluster #1) and other trematodes (cluster #2) (data not shown). The analysis of variance indicated that the frequencies of ten codons were not significant for this clustering (table 3). Next, we specified ten possible clusters for all the species based on the results of the phylogenetic and cluster analyses. We found that Schistosoma spp. were separated from all the other trematodes. This finding is supported by the results of the phylogenetic and cluster analyses. Carassotrema koreanum and P. mugili were placed into different clusters (table 4). An analysis of variance indicated that the frequencies of all codons but one (GCA) were statistically significant for this clustering (table 5).
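The per-codon analysis of variance reported in the tables can be sketched as a one-way ANOVA over the clusters. The Python example below is a generic illustration, assuming that each group holds the frequencies of one codon in the species assigned to one cluster; it is not the Statistica routine, the function name and toy values are hypothetical, and the P value (which requires the F distribution) is not computed.

```python
def one_way_anova(groups):
    """One-way ANOVA for a single codon.

    `groups` is a list of lists; each inner list holds the frequencies of the
    codon in the species of one cluster.
    Returns (between_ss, within_ss, df_between, df_within, f_statistic).
    """
    values = [x for g in groups for x in g]
    grand_mean = sum(values) / len(values)
    means = [sum(g) / len(g) for g in groups]
    # Between SS: variation of cluster means around the grand mean.
    between_ss = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Within SS: variation of species around their own cluster mean.
    within_ss = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = len(groups) - 1
    df_within = len(values) - len(groups)
    f_statistic = (between_ss / df_between) / (within_ss / df_within)
    return between_ss, within_ss, df_between, df_within, f_statistic


# Hypothetical frequencies of one codon in two clusters of three species each.
result = one_way_anova([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
```

For the toy data the between-cluster sum of squares is 13.5, the within-cluster sum of squares is 4.0, and F = 13.5 with 1 and 4 degrees of freedom. A large F means that the codon frequency separates the clusters well.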
Table 3. Results of analysis of variance for 64 codons within k-means cluster analysis of trematodes with two specified clusters.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20220727080342349-0370:S0022149X22000438:S0022149X22000438_tab3.png?pub-status=live)
Between SS, intergroup variance; Within SS, intragroup variance; df, degrees of freedom; F, Fisher's statistical criterion; P, significance. Codons with insignificant frequencies are bolded.
Table 4. Results of k-means cluster analysis for Digenea with ten specified clusters. Species of Haploporidae are bolded.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20220727080342349-0370:S0022149X22000438:S0022149X22000438_tab4.png?pub-status=live)
Table 5. Results of analysis of variance for 13 codons, significant for C. koreanum and P. mugili clustering.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20220727080342349-0370:S0022149X22000438:S0022149X22000438_tab5.png?pub-status=live)
Between SS, intergroup variance; Within SS, intragroup variance; df, degrees of freedom; F, Fisher's statistical criterion; P, significance. Codons with insignificant frequencies are bolded.
Finally, we performed cluster analysis for C. koreanum and P. mugili and identified 13 codons for which the differences between these two species were insignificant. Then we performed cluster analysis for all trematode species based on the frequencies of these 13 codons. In this analysis, C. koreanum and P. mugili belonged to the same cluster. At the same time, most of the schistosomes gathered within the same separate cluster, and so did all Pronocephalata (fig. 5). An analysis of variance indicated a high statistical significance for all 13 codons (table 6). There were three instances when two codons encoded the same amino acid: UUA and CUG for leucine, UCU and UCG for serine, AAU and AAC for asparagine.
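The three synonymous pairs named above can be recovered by grouping codons by the amino acid they encode. The Python sketch below uses a deliberately partial codon table covering only the six codons mentioned in the text; the input list is a hypothetical stand-in, because the full set of 13 codons is given only in the table, and GGG is included merely as an example of a codon outside the partial table.

```python
from collections import defaultdict

# Partial codon table: only the six codons named in the text.
AMINO_ACID = {
    "UUA": "Leu", "CUG": "Leu",
    "UCU": "Ser", "UCG": "Ser",
    "AAU": "Asn", "AAC": "Asn",
}


def synonymous_groups(codons):
    """Group codons by encoded amino acid and keep groups with two or more codons."""
    groups = defaultdict(list)
    for codon in codons:
        amino_acid = AMINO_ACID.get(codon)
        if amino_acid is not None:  # codons outside the partial table are skipped
            groups[amino_acid].append(codon)
    return {aa: members for aa, members in groups.items() if len(members) > 1}


# Hypothetical input: the six codons from the text plus one codon not in the table.
pairs = synonymous_groups(["UUA", "CUG", "UCU", "UCG", "AAU", "AAC", "GGG"])
```

With a complete codon table for the flatworm mitochondrial genetic code, the same grouping could be applied to all 13 codons.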
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20220727080342349-0370:S0022149X22000438:S0022149X22000438_fig5.png?pub-status=live)
Fig. 5. Results of sequence cluster analysis of Digenea based on codon usage bias of 13 mitochondrial codons.
Table 6. Results of analysis of variance for 64 codons within k-means cluster analysis of trematodes with ten specified clusters. Non-significant values are bolded.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20220727080342349-0370:S0022149X22000438:S0022149X22000438_tab6.png?pub-status=live)
Between SS, intergroup variance; Within SS, intragroup variance; df, degrees of freedom; F, Fisher's statistical criterion; P, significance.
Discussion
The results of our previous study of the mitochondrial genome-based phylogenetic relationships of P. mugili (Atopkin et al., 2021) agreed with the phylogenetic tree of Digenea from Olson et al. (2003), and there were no contradictions with the conclusions about the establishment of the suborder Haploporata for Haploporoidea Nicoll, 1914 proposed by Pérez-Ponce de León & Hernández-Mena (2019). The results of this study, which was based on an expanded species sample and two phylogenetic algorithms, provide further solid arguments in favour of the validity of the Haploporata. The Bayesian phylogenetic tree topology was the most reliable in this respect, indicating a separate phylogenetic branch for the two Haploporidae species within Digenea, while the ML tree demonstrated a close relationship of the two haploporid species with P. maculosus. The Xiphidiata was paraphyletic, which was expressed in the separation of the Dicrocoeliidae members from other representatives of this suborder on both BI and ML trees.
The suborder Xiphidiata was established on the basis of a phylogenetic analysis of concatenated complete 18S and partial 28S rRNA gene sequences, in which it appeared as a monophyletic clade (Olson et al., 2003). The Haploporidae and the Dicrocoeliidae were situated in the gorgoderoid subclade within the Xiphidiata. However, Olson et al. (2003) interpreted the position of the Haploporidae, which possess unarmed cercariae, as a phylogenetic misplacement. A clear paraphyly of the Xiphidiata was shown by the rDNA-based phylogenetic analysis of Pérez-Ponce de León & Hernández-Mena (2019), where the Haploporoidea, as well as some Allocreadioidea and Gorgoderoidea, were separated from the other Xiphidiata on the 18S rDNA-based tree. As a result, these authors proposed to elevate the Haploporoidea to the suborder Haploporata, resolving the paraphyly of the Xiphidiata. Nevertheless, the clade uniting the Dicrocoeliidae with some gorgoderoid families was separate from the other Xiphidiata.
The paraphyly of the Xiphidiata, expressed by a considerable differentiation of the Dicrocoeliidae from other members of the suborder, was shown in phylogenetic studies of Digenea based on mitochondrial DNA sequence data (Locke et al., 2018; Fu et al., 2019b; Wu et al., 2020; Atopkin et al., 2021). However, most of the authors paid no attention to this paraphyly, noting only a stable differentiation of all digeneans into two clades, the Plagiorchiida and the Diplostomida. In our opinion, there are still not enough complete mtDNA sequence data on trematodes to draw final systematic and phylogenetic conclusions concerning Digenea in general and the Xiphidiata in particular. However, the paraphyly of this suborder has been demonstrated in the latest phylogenetic studies based on individual rDNA sequences of more than 1000 species from 106 families and on complete mitochondrial genome sequence data for more than 50 species (Locke et al., 2018; Pérez-Ponce de León & Hernández-Mena, 2019; Fu et al., 2019b; Wu et al., 2020; Atopkin et al., 2021), as well as in our present work, which was based on 59 species. Therefore, we believe that the Xiphidiata in its present state cannot be recognized as a valid and self-sufficient taxon. Detailed morphological and molecular studies are needed to ascertain its structure. It cannot be ruled out that the key role of the presence of a stylet in cercariae in the validation of the suborder Xiphidiata will have to be reconsidered.
The gene arrangement of the whole mitochondrial genome sequence of C. koreanum was identical to that of P. mugili, except for the NCR, which was represented by a single region in C. koreanum and divided into three regions, including two tandem repeats TR1 and TR2, in P. mugili (Atopkin et al., 2021). The arrangement of protein-coding genes in the mtDNA of C. koreanum is the same as in the members of the Xiphidiata and in all the representatives of the other suborders of the Plagiorchiida La Rue, 1957 (Biswal et al., 2014; Liu et al., 2014a; Briscoe et al., 2016; Chang et al., 2016; Qian et al., 2018; Wang et al., 2018; Le et al., 2019; Suleman et al., 2019a; Atopkin et al., 2021) that are included in Clade II (figs 2 and 3).
The mitochondrial gene rearrangements in Digenea have been discussed in detail by Wu et al. (2020). In this study, we provided a new complete mtDNA sequence of C. koreanum, but our results do not essentially change the concept of the digenean gene arrangement, and so we do not discuss this aspect here.
In our previous work we proposed possible relationships between the gene re-arrangement processes and the occurrence of synonymous nucleotide substitutions in Digenea, and discussed the relevant studies made on Platyhelminthes (Atopkin et al., 2021). The results of the cluster analysis based on codon usage bias in the present study demonstrated some agreement with the results of the phylogenetic analysis. Schistosoma spp. were also differentiated from other members of Digenea, and Pronocephalata also appeared as a single cluster. Carassotrema koreanum and P. mugili had different positions on the cladogram based on the frequencies of 64 codons. In our view, this result indicates that the complete data on the codon usage bias have a different efficiency for species clustering in different taxonomic groups of trematodes. This assumption is supported by the cluster analysis of all digeneans based on 13 codons with close frequency values in C. koreanum and P. mugili (Haploporidae), in which these two species were grouped in the same cluster. Moreover, the 13 codons used in the analysis included three pairs of synonymous codons. This result means that synonymous substitutions in mitochondrial protein-coding genes are also important in the differentiation or clustering of separate groups of Digenea.
Additionally, some results of this study indicate that 13 mitochondrial codons are important for the molecular characterization of haploporids. Nevertheless, these results are preliminary and require verification. In particular, the mitochondrial genomes of C. koreanum and P. mugili must be compared with those of Skrjabinolecithum spp. and Elonginurus spp., respectively, in light of the recent studies on phylogenetic relationships of the Haploporidae (Atopkin et al., 2019). Moreover, we suggest that an analysis of the codon usage bias for each mitochondrial gene separately may provide more information about the molecular features of differentiation of trematode species. A detailed analysis of this kind would also be useful for evaluating and characterizing the mitochondrial gene sequences most suitable as molecular markers for the study of relationships within Digenea.
As mentioned above, there are three pairs of digenean mitochondrial DNA codons encoding one and the same amino acid: UUA and CUG encode leucine, UCU and UCG encode serine, and AAU and AAC encode asparagine. This result means that synonymous substitutions in mitochondrial protein-coding genes are important for differentiation of digenean groups.
We also obtained the first data on the TTG start codon for the nad5 gene from C. koreanum, a representative of the Haploporata. In most trematode mitochondrial genomes examined so far, this gene uses one of two start codons, ATG or GTG. TTG has been reported as the first codon of the mitochondrial nad5 gene in two Dicrocoelium spp. (see table 1). However, in the mitochondrial genome annotation of these species, the start codon of the nad5 gene is not determined (Liu et al., 2014a). Moreover, we re-examined the 5’-end of the nad5 gene of P. mugili with Sanger sequencing and found that this gene actually started with TTG, not with ATG as reported in our previous study (see Atopkin et al., 2021). Thus, TTG as the first codon of the mitochondrial nad5 gene can be considered as a homoplasy of Dicrocoelium, Parasaccocoelium and Carassotrema. Interestingly, these species play a key role in generating the paraphyly of the Xiphidiata in molecular-based phylogenetic studies (Olson et al., 2003; Pérez-Ponce de León & Hernández-Mena, 2019; Atopkin et al., 2021). At the same time, this character cannot be applied as a molecular marker for the Dicrocoeliidae, because in E. pancreaticum the nad5 gene starts with ATG. Therefore, it can be expected that in some other haploporid species this gene also has the usual start codons, ATG or GTG.
Acknowledgements
We are grateful to Dr Vladimir V. Besprozvannykh for his help in the identification of the specimens of C. koreanum at the early stages of our research, and to Dr Kirill A. Vinnikov for his helpful comments on the NGS technique.
Financial support
This study was supported by the Russian Science Foundation, project № 22-24-00896.
Conflicts of interest
None.
Ethical standards
The authors assert that all procedures contributing to this work comply with the ethical standards of the relevant national and institutional guides on the care and use of laboratory animals.