Introduction
Psylloidea (Hemiptera) or jumping plant-lice with about 4000 described species worldwide contain some economically important pests in agriculture and forestry. Psyllids can lower crop yield by direct feeding and by transmitting agents of plant diseases (Burckhardt, Reference Burckhardt1994; Burckhardt and Ouvrard, Reference Burckhardt and Ouvrard2012). One of the most important citrus pests is the Asian citrus psyllid, Diaphorina citri Kuwayama, 1908 that acts as a vector of Candidatus Liberibacter asiaticus, the pathogenic bacteria causing HLB ( = Huanglongbing or citrus greening disease) (Halbert and Manjunath, Reference Halbert and Manjunath2004; Gottwald, Reference Gottwald2010; Grafton-Cardwell et al., Reference Grafton-Cardwell, Stelinski and Stansly2013; Hall et al., Reference Hall, Richardson, Ammar and Halbert2013). The potential arrival of exotic insects is closely monitored in some countries as these can become economically important pests (Bové et al., Reference Bové, Danet, Bananej, Hassanzadeh, Taghizadeh, Salehi and Garnier2000; Tsai and Liu, Reference Tsai and Liu2000; French et al., Reference French, Kahlke and Da Graça2001; Halbert and Núñez, Reference Halbert and Núñez2004; Martoni et al., Reference Martoni, Bulman, Pitman, Taylor and Armstrong2018).
Diaphorina Löw, 1879, comprises 76 described species in the Old World (Ouvrard, Reference Ouvrard2019). The genus is particularly species-rich in more arid regions of the Palearctic, Afrotropical and Oriental zoogeographical realms (Burckhardt, Reference Burckhardt1985). Many constituent species are heavily pigmented and morphologically similar, making their identification difficult (Burckhardt, Reference Burckhardt1985; Hollis, Reference Hollis1987). In contrast to other psyllid groups, the male terminalia are fairly uniform and provide often few good characters for species identification. The shape of the genal processes and the female terminalia, as well as forewing shape and pattern, are used in species identification (Burckhardt, Reference Burckhardt1985). From the Kerman Province, Iran, the following five Diaphorina species have been reported: D. chobauti Puton, 1898, developing on Convolvulus L., D. citri on Citrus L., D. luteola Loginova, 1978 on Solanum L., D. lycii Loginova, 1978 on Lycium L. and D. zygophylli Loginova, 1978 on Zygophyllum L. The species are morphologically similar and difficult to identify.
DNA barcoding can be an effective method and a standard tool for species identification (Hebert et al., Reference Hebert, Ratnasingham and deWaard2003b). The 5′ half of the cytochrome oxidase 1 (COI) gene has been used as the barcode locus for most animals in DNA barcoding (Hebert et al., Reference Hebert, Cywinska, Ball and deWaard2003a). The approach has been successfully applied to several groups of psyllids (Percy et al., Reference Percy, Butterill and Malenovský2016; Percy, Reference Percy2017; Taylor et al., Reference Taylor, Fagan-Jeffries and Austin2016; Martoni et al., Reference Martoni, Bulman, Pitman, Taylor and Armstrong2018; Lashkari et al., Reference Lashkari, Shamsi Gushki and Mirzaei2019; Cho et al., Reference Cho, Malenovsý, Burckhardt, Inoue and Lee2020; Lashkari et al., Reference Lashkari, Burckhardt and Shamsi Gushki2020). The barcode locus used for psyllids is short (typically 472 bp), easy to amplify and it can be successfully sequenced even when the DNA is fairly degraded and fragmented.
Morphometrics is a useful method for analyzing shape and addressing a wide range of questions including evaluating phenetic differences between taxa (e.g., Burckhardt and Basset, Reference Burckhardt and Basset2000; Serbina et al., Reference Serbina, Burckhardt, Birkhofer, Syfert and Halbert2015; Serbina and Mennecart, Reference Serbina and Mennecart2018; Shamsi Gushki et al., Reference Shamsi Gushki, Lashkari and Mirzaei2018; Lashkari et al., Reference Lashkari, Burckhardt and Shamsi Gushki2020). In geometric morphometrics, the shape is usually conceptualized and has been defined as those components of form which remain after removing the size (Kendall, Reference Kendall1977; MacLeod, Reference MacLeod2017).
Digital image processing can be a useful and informative method for species identification. This method performs some operations on an image, in order to get an enhanced image or to extract useful information (Chanda and Majumder, Reference Chanda and Majumder2011). Although the progress in designing and implementing workable systems for automated identification is slow, there are several examples of this method from insects using the wings (Weeks et al., Reference Weeks, Gauld, Gaston and O'Neill1997; Watson et al., Reference Watson, O'Neill and Kitching2004; Crnojević et al., Reference Crnojević, Panić, Brkljač, Ćulibrk, Ačanski and Vujić2014) or the body shape (Umar et al., Reference Umar, Abbas and Gulzar2017). The method was also applied to flatworms (Platyhelminthes) (Leow et al., Reference Leow, Chew, Chong and Dhillon2015) and copepods (Kalafi et al., Reference Kalafi, Tan, Town and Dhillon2016) using the body shape.
As the identification of Diaphorina species is difficult for the non-specialist with traditional morphological comparison, we examine here three methods for species discrimination and compare their results: (1) cytochrome oxidase 1 (COI) sequences as DNA barcodes to discriminate the three Diaphorina species from Kerman, (2) geometric morphometrics of their forewings to define which morphometric characters are diagnostic, and (3) digital image processing to automate the identification of the studied species.
Materials and methods
Sample collection
Adult psyllids were collected in the Kerman province, Iran, during 2017 (table 1). Additionally, 15 COI sequences of Diaphorina species from GenBank were included in a cluster analysis. The collected adults were transferred to 95% ethanol and stored at 20°C. Species identification was done by the second author comparing with specimens from the collection of the Naturhistorisches Museum, Basel, Switzerland.
Table 1. Details of the specimens used for molecular, morphometric and automated identification of the three Diaphorina species

Geometric morphometric study
A total of 97 right forewings of females, belonging to the three Diaphorina species were used for geometric morphometrics (table 1). The forewings were mounted on microscopic slides in Canada balsam and images were captured at 40 times magnification with a digital camera mounted on a microscope. The forewing images were first processed by the tpsUtill programme (Rohlf, Reference Rohlf2019a), then a total of 19 landmarks, including 13 homologous landmarks (1–4, 6–9, 11, 13, 15, 16, 18) and six semi landmarks (5, 10, 12, 14, 17 and 19) were digitized at vein intersections or tip of the veins or vein margins by the tpsDig2 programme (fig. 1) (Rohlf, Reference Rohlf2017). To increase accuracy in semi landmark selection, two parallel lines were drawn and spots were chosen with the most significant convexity (fig. 1).

Figure 1. (a) Position of landmarks (circles) and semi landmarks (triangle) on the left forewing of Diaphorina citri. (b, c and d) forewings of D. citri, D. chobauti and D. zygophylli, respectively.
The coordinates of the landmarks obtained from tpsDig2 were used in the tpsRelw programme (Rohlf, Reference Rohlf2019b) to perform the procedure of geometric morphometric analysis (alignment and superimposition). The consensus shape was subtracted and maximal shape variation was shown along the first two relative-warp axes (RW1 and RW2). After that, the shape variables, Weight Matrix, were subjected to multivariate analysis of variance (MANOVA) to compare forewing shapes among the three species, using the SAS statistical programme. Differences in forewing shape among the three species were visualized by deformation grids using the tpsRelw programme (thin plate splines). To construct a dendrogram, an unweighted pair group method analysis (UPGMA) was performed on squared Mahalanobis distances using the NTSYSpc programme (Rohlf, Reference Rohlf2000). The size of each forewing specimen was calculated based on the centroid sizes (CS; the square root of the sum of squared distances between each landmark and the wing centroid) (Zelditch et al., Reference Zelditch, Swiderski and Sheets2012). CS was calculated using the tpsRelw programme (Rohlf, Reference Rohlf2019a, Reference Rohlf2019b) and processed by one-way analysis of variance procedures (ANOVA with post hoc Tukey's HSD test) in the SAS statistical programme.
Molecular study
DNA was extracted using DNA Extraction Kit DNPTM (SinaClon, Iran) according to the manufacturer's instructions. COI was partially amplified as DNA barcodes in a TAdvanced 96 SG thermocycler (Biometra GmbH, Germany), under the following cycling conditions (Percy et al., Reference Percy, Butterill and Malenovský2016): 94°C for 3 min, followed by 40 cycles of 92°C for 30 s, annealing at 50°C for 40 s and 72°C for 1 min, followed by a final extension time of 10 min at 72°C. The COI sequence was amplified with primers C1-J-1718 (5′ GGA GGA TTT GGA AAT TGA TTA GTT CC 3′) and C1-N-2191 (5′ CCC GGT AAA ATT AAA ATA TAA ACT TC 3′) that can make a polymerase chain reaction (PCR) product with about 472 bp in length (Simon et al., Reference Simon, Frati, Beckenbach, Crespi, Liu and Floors1994; Percy et al., Reference Percy, Butterill and Malenovský2016). The used primers were synthesized at Macrogen Inc. (Seoul, South Korea). Positive and negative controls were included to ensure the consistency of the results. PCR products were visualized on a 1.2% agarose gel, in TAE buffer (pH 8.0) at 80 V for 1 h to check the success of the PCR amplification. Purification and sequencing were performed at the Macrogen Inc. (Seoul, South Korea). All alignments were edited to remove ambiguous bases and primer sequences using software BioEdit version 7.2.6. After that, sequences were aligned using ClustalW in software MEGA X (Kumar et al., Reference Kumar, Stecher, Li, Knyaz and Tamura2018). Each obtained sequence was compared to sequences in GenBank using BLAST and sequences of D. citri, D. communis and D. lycii were included in cluster analyses (figs 7 and 8). The gene tree was constructed using the parsimony method in PAUP* 4.0a133 (Swofford, Reference Swofford2002). The DNA sequences are deposited in GenBank under accession numbers MT296823–MT296838. Pariaconus ohiacola (Crawford, 1918) (Hemiptera: Triozidae) (sequence in GenBank: accession number KY294469.1) was chosen as the out-group for the gene tree analyses.
Digital image processing
Digital image processing consists of two phases; the pre-processing phase and the classification phase in which the image is classified into one of the three types of species (D. chobauti, D. citri and D. zygophylli). In the first phase, several image processing tools are employed to enhance the quality of the images and to extract the important attributes, such as the colour pattern on the wing and geometric shapes. The most important step in the pre-processing phase is separating the foreground (wing) from the background. Since the transparency of the wings is high, it is difficult to separate them from the background. At the end of the pre-processing stage, the image is fed into the classification phase. Classification of wings is done in the same manner as an expert does, i.e. important features which are specific to each wing are extracted, and species are classified based on these features. These steps are explained in more details in the next subsection.
Pre-processing phase
In the pre-processing phase, the following steps are performed. Firstly, the RGB image is transformed into a grayscale image. Then, Sobel operator is performed on the grayscale image to acquire the edges of the wing. By doing this, the wing stands out from the background. In the next step, a morphological dilation operator is applied to the resulting image. Dilation is one of the two basic operators (dilation and erosion) in the area of mathematical morphology. It is usually applied to binary images, but there are versions that work on grayscale images.
Morphological transformation needs two inputs: the original image and a structuring element or kernel which decides the nature of operation (Raid et al., Reference Raid, Khedr, El-dosuky and Aoud2014). The basic effect of the dilation operator on a binary image is to gradually enlarge the boundaries of regions of foreground (i.e. typically white pixels). Therefore, areas of foreground pixels grow in size while holes within those regions become smaller. After removing small connected components (white pixels which are connected to each other) of the resulting image, the remaining white pixels are connected to each other so that a convex hull is created. This convex object specifies the boundaries of the wing. In the next step, image skew should be detected and corrected. To find the right direction, the images are rotated in such a way that the major axis falls on the horizontal line. Now, this image can be used as a mask to extract the foreground (wing) from the background. Finally, the resulting image is binarized using the Otsu global binarization method. Otsu's method tries to find a threshold in the image histogram that binarizes the image into two classes, the background and the foreground (Kashef et al., Reference Kashef, Nezamabadi-Pour and Rashedi2018). Figure 2 shows the steps described above.

Figure 2. Preprocessing steps: (a) original image; (b) grayscale image; (c) edge detected images using Sobel operator; (d) image dilation; (e) removal of small connected components; (f) convex hull of image in stage e; (g) skew correction (mask); (h) extract wing using the mask; (i) binarization.
Classification phase
In this phase, the wing images are assigned to one of the three studied species. By comparing the sum of white pixels in the wings, the species can be distinguished with high accuracy. As the images are captured in the same condition, the size of the wings can be used as a discriminant feature for classification. Therefore, after the colour pattern of the wings, the size was also checked. Analyses were performed with the Matlab software version R2018a. One-way analysis of variance procedures (ANOVA) and pairwise tests were applied, with post hoc Tukey's HSD test, in the SAS statistical programme.
Results
Geometric morphometric study
The MANOVA clearly showed that differences in the forewing were highly significant among the studied species (Wilks' Lambda = 0.01077021, F = 36.94, P < 0.001). The first 2 relative warps (RW1 and RW2) exhibited 28.88% and 17.01% of the total variation. The shape characters separate D. chobauti from D. citri as well as D. chobauti from D. zygophylli along RW1, whereas D. citri is separated from D. zygophylli along RW2. The position of each specimen is shown along the relative warps’ axis (fig. 3). The wing shapes were visualized by the TPS analysis. The TPS analysis showed that the shape deformations were mainly derived from the middle of the wing (landmarks 5, 7, 9, 12 and 13). The superimposed landmark configurations are shown in fig. 4a and the superimposed forewing shapes in fig. 4b. In the cluster analysis, the three species formed two groups: D. citri + D. zygophylli and D. chobauti (fig. 5). The forewing size comparison showed significant differences between some species (F = 200.43, P < 0.0001). Forewing size (CS) differed significantly between D. chobauti + D. citri (smaller) and D. zygophylli (larger) (fig. 6), but not between D. chobauti and D. citri.

Figure 3. Scatter plot for the RW1 and RW2 on the forewing of the three Diaphorina species. Circle: D. citri; square: D. chobauti; pentagon: D. zygophylli.

Figure 4. (a) Superimposed landmark configurations and (b) superimposed forewing shapes of the three Diaphorina species: D. citri ………, D. chobauti ––––– and D. zygophylli - - - -.

Figure 5. Phenogram plotted by UPGMA method based on the generalized distance matrix in three psyllid species of the genus Diaphorina.

Figure 6. Forewing size (centroid size) comparison in the three Diaphorina species.
Molecular study
The COI barcode sequences of the three species were successfully amplified. For the clustering analyses, sequences from GenBank of two additional species, viz. Diaphorina communis Mathur, 1975 and D. lycii, were included (figs 7 and 8). The two species were chosen as their sequences are similar to the other three species. No intraspecific genetic difference was found between individuals within the three studied species. The number of base substitutions per site (p-distance) between D. chobauti and D. citri is 0.244, and that between D. citri and D. zygophylli 0.103 (table 2). The three studied species formed separate groups in the clustering analyses (fig. 7).

Figure 7. COI gene tree of the three Diaphorina species. The scale bar shows 10 changes. Bootstrap support values are indicated above the nodes.

Figure 8. Diaphorina spp. (a, b) D. chobauti; (c, d) D. citri; (e, f) D. communis; (g, h) D. lycii, (i, j) D. zygophylli. (a, c, e, g, i): habitus; (b, d, f, h, j): head, dorsal view.
Table 2. Pairwise distance between Diaphorina chobauti, D. citri and D. zygophylli. Upper matrix, Molecular; Lower matrix, Morphometric

Digital image processing
The ratio of white to black pixels in some parts of the wing is shown in table 3. The proposed method based on image processing for classifying the three studied species was executed with 49 microscopic images. The images were split randomly into two sets, the training set and the test set, with 80/20 ratio. The accuracy of our model which is simulated by the Matlab software was 100%, i.e. all test samples were classified correctly. The results showed that in D. citri, the dark pattern spreads over areas along the anterior and posterior wing margins with a gap in the apical part of cell r 1. In D. chobauti, the dark pattern is similar but with an additional gap in cell m 2 in addition to that in cell r 1. In D. zygophylli, the dark spots are spread mostly over the apical third of the wing.
Table 3. Ratio of white to black pixels in the forewing of Diaphorina chobauti, D. citri and D. zygophylli.

Means with the same letter are not significantly different (P < 0.05).
Discussion
Although geometric morphometrics has been used to separate species in different taxa, including Hemiptera (Shamsi Gushki et al., Reference Shamsi Gushki, Lashkari and Mirzaei2018; Lashkari et al., Reference Lashkari, Burckhardt and Shamsi Gushki2020), it has not been applied to Diaphorina, a psyllid genus with morphologically similar species that are notoriously difficult to identify. The forewing shape of many Diaphorina species is oblong oval, widening toward the apical quarter as in the three studied species. In D. zygophylli, the fore margin is slightly curved proximally and sharply bent distally, while in D. chobauti and D. citri it is straight proximally and weakly bent distally. The TPS analysis also showed that forewings of D. chobauti are narrower than those of the other two species, as well as the angle of the junction of veins M and Cu as well as R and M + Cu is smaller than those in the other two species, therefore the perpendicular distance between the veins is shorter than in the other two species. D. citri and D. zygophylli share a similar forewing width. Based on the morphometric data, the best characters separating the three species are found in the middle portion of the forewing, related to the veins Rs, M and C + Sc. Similar good results using geometric morphometrics for separating morphologically close species in the psyllid genera Psyllopsis Löw, 1879, and Agonoscena Enderlein, 1914, were obtained by Shamsi Gushki et al. (Reference Shamsi Gushki, Lashkari and Mirzaei2018) and Lashkari et al. (Reference Lashkari, Burckhardt and Shamsi Gushki2020).
A suitable genetic marker for species identification must be (1) a conserved region, (2) short enough to be sequenced in a single reaction and (3) contain enough variability to be informative for species identification (Nicolas et al., Reference Nicolas, Schaeffer, Missoup, Kennis, Colyn, Denys, Tatard, Cruaud and Laredo2012). In many insect groups, the COI gene serves as a good standard for barcodes (Hebert et al., Reference Hebert, Penton, Burns, Janzen and Hallwachs2004a, Reference Hebert, Stoeckle, Zemlak and Francis2004b; Hajibabaei et al., Reference Hajibabaei, Janzen, Burns, Hallwachs and Hebert2006; Ashfaq et al., Reference Ashfaq, Hebert, Mirza, Khan, Zafar and Mirza2014a, Reference Ashfaq, Hebert, Mirza, Khan, Mansoor, Shah and Zafar2014b, Reference Ashfaq, Prosser, Nasir, Masood, Ratnasingham and Hebert2015). The primers used in the current study showed their effectiveness for the studied species and were also successfully applied in other studies on Psylloidea (Percy et al., Reference Percy, Butterill and Malenovský2016; Lashkari et al., Reference Lashkari, Shamsi Gushki and Mirzaei2019, Reference Lashkari, Burckhardt and Shamsi Gushki2020). Although the climatic conditions vary between the collection sites, no intraspecific genetic difference was found between populations and individuals of each studied species, including all Iranian geographic populations of D. citri. On the other hand, Lashkari et al. (Reference Lashkari, Manzari, Sahragard, Malagnini, Boykin and Hosseini2014) showed differences between populations of D. citri from Iran and Pakistan. Boykin et al. (Reference Boykin, De Barro, Hall, Hunter, McKenzie, Powell and Shatters2012) found eight COI haplotypes of D. citri among 52 collections worldwide. They used a novel COI primer in their analysis, which is different from the standard location of the DNA barcoding site. Therefore, regarding the positive features of the primers used in Percy's study (Percy et al., Reference Percy, Butterill and Malenovský2016), a new study of worldwide collections of D. citri should be performed for understanding better how to select suitable primers. Based on the cluster analysis, our molecular data support the results of morphometric analysis. Of the 76 described Diaphorina species worldwide, eight species have been reported from Iran, i.e. D. aegyptiaca Puton, 1892, D. chobauti, D. enormis Loginova, 1978, D. luteola, D. lycii, D. tamaricis Loginova, 1978, D. zygophylli (Burckhardt and Lauterer, Reference Burckhardt and Lauterer1993) and D. citri (Bové et al., Reference Bové, Danet, Bananej, Hassanzadeh, Taghizadeh, Salehi and Garnier2000). Of these, the following five species were reported from Kerman province: D. chobauti, D. citri, D. luteola, D. lycii and D. zygophylli. Except for D. citri which is a major pest in Iran, little is known about the occurrence and distribution of Diaphorina species in Iran. Based on the current knowledge, it seems that after D. citri, D. chobauti is the most widely distributed species in the Kerman Province.
In D. zygophylli, the fore margin of the forewing is relatively curved proximally, sharply bent distally and the colour pattern covers mostly the apical third of the wing. D. chobauti and D. citri possess similar forewing features, i.e, the fore margin is straight proximally, slightly bent distally and the dark pattern covers areas along the anterior and posterior wing margins. In D. citri, the colour pattern displays a gap in cell r 1, whereas, in D. chobauti, the colour pattern has gaps in cell r 1 and m 2. In this study, we show that the combination of three methods, molecular analyses, geometric morphometrics and digital image processing, can be very helpful in species identification.
Acknowledgements
The authors thank Dr Liliya Serbina and an anonymous reviewer for useful suggestions on a previous manuscript draft, as well as Dr Azadeh Habibi for technical advice. The authors would like to acknowledge the financial support by the Iran National Science Foundation (INSF) under grant number 51586 and by the Institute of Science and High Technology and Environmental Sciences, Graduate University of Advanced Technology, Kerman, Iran, under grant number 96/3318.