Introduction
Man-made oak savannah-like woodlands are landscapes widespread in southern Iberia (the so-called dehesas) with outstanding socio-economic, ecological and biodiversity values protected under the EU Habitats Directive (CEC, 1992; Montero et al., Reference Montero, San Miguel, Cañellas, Jiménez-Díaz and Lamo de Espinosa1998; Bugalho et al., Reference Bugalho, Caldeira, Pereira, Aronson and Pausas2011; Ramírez-Hernández et al., Reference Ramírez-Hernández, Micó, Marcos-García, Brustel and Galante2014). Among the most important wood-boring insects associated with these oaks, three large longhorn beetles stand out: two Cerambycinae, namely Cerambyx welensii Küster, 1846 (Cw) and Cerambyx cerdo Linnaeus, 1758 (Cc), and one Prioninae, Prinobius myardi Mulsant, 1842 (Pm) (Naveiro & Morcuende, Reference Naveiro and Morcuende1994; López-Pantoja et al., Reference López-Pantoja, Domínguez-Nevado and Sánchez-Osorio2008, Reference López-Pantoja, Domínguez-Nevado and Sánchez-Osorio2016; Torres-Vila et al., Reference Torres-Vila, Mendiola-Díaz and Sánchez-González2017a, Reference Torres-Vila, Zugasti-Martínez, Mendiola-Díaz, De-Juan-Murillo, Sánchez-González, Conejo-Rodríguez, Ponce-Escudero and Fernandez-Morenob). The life cycle of these longhorns is typically associated with old, decayed or diseased oak trees, and they are included among the highly diverse assemblage of saproxylic (wood-dwelling) insects. This functional group is important in wood degradation and hollow formation processes in old trees, later used as shelters for a large array of species, so these longhorns greatly contribute to biodiversity in oak forests (Speight, Reference Speight1989; Grove, Reference Grove2002; Buse et al., Reference Buse, Ranius and Assmann2008; Micó et al., Reference Micó, García-López, Sánchez, Juárez and Galante2015). However, since Cw and Cc are primary saproxylic beetles that need living and fresh wood to feed on and develop, they also colonize young/healthy living trees and can become harmful or pest species (López-Pantoja et al., Reference López-Pantoja, Domínguez-Nevado and Sánchez-Osorio2008, Reference López-Pantoja, Domínguez and Sánchez-Osorio2011; Carrasco, Reference Carrasco2009; Torres-Vila et al., Reference Torres-Vila, Sánchez-González, Ponce-Escudero, Martín-Vertedor and Ferrero-García2012, Reference Torres-Vila, Zugasti-Martínez, Mendiola-Díaz, De-Juan-Murillo, Sánchez-González, Conejo-Rodríguez, Ponce-Escudero and Fernandez-Moreno2017b; Torres-Vila, Reference Torres-Vila2017).
The pest and legal status of these longhorn species differs markedly depending on the geographical and forest context. Cw is currently considered an emerging pest involved in oak decline in Iberia (Martín et al., Reference Martín, Cabezas, Buyolo and Patón2005; López-Pantoja et al., Reference López-Pantoja, Domínguez-Nevado and Sánchez-Osorio2008; Carrasco, Reference Carrasco2009; Torres-Vila et al., Reference Torres-Vila, Sánchez-González, Ponce-Escudero, Martín-Vertedor and Ferrero-García2012, Reference Torres-Vila, Sánchez-González, Merino-Martínez, Ponce-Escudero, Conejo-Rodríguez, Martín-Vertedor and Ferrero-García2013); Cc is a protected species in Europe (CE, 1979; CEC, 1992; IUCN, 2010), although it has been also reported as a harmful or pest species (Torres-Vila, Reference Torres-Vila2017); and Pm is a secondary or minor pest (López-Pantoja et al., Reference López-Pantoja, Domínguez and Sánchez-Osorio2011; Torres-Vila et al., Reference Torres-Vila, Zugasti-Martínez, Mendiola-Díaz, De-Juan-Murillo, Sánchez-González, Conejo-Rodríguez, Ponce-Escudero and Fernandez-Moreno2017b). The three longhorns exhibit a typical western Palaearctic distribution with respective ranges widely overlapping in southern Europe. Main differences in geographical range are that Cw and Pm are rare or absent in central and northern Europe (attributed to their more thermophilic nature), and that Cw does not occur in North Africa (Bense, Reference Bense1995; Vives, Reference Vives2000; González-Peña et al., Reference González-Peña, Vives-Noguera and de Sousa-Zuzarte2007; Sama, Reference Sama and Audisio2013). In southern Iberia, Cw, Cc and Pm are widespread, share a similar ecological niche and often coexist in sympatry in the same stands, even in a single tree or branch section. Wood quality and host preference are major factors shaping larval resource partitioning between these longhorn species, although heterospecific adults may interact with one another in the same oak trunk (Torres-Vila et al., Reference Torres-Vila, Zugasti-Martínez, Mendiola-Díaz, De-Juan-Murillo, Sánchez-González, Conejo-Rodríguez, Ponce-Escudero and Fernandez-Moreno2017b).
It follows that there is a need to accurately distinguish these cerambycid species in forests harbouring mixed populations, not only for control, management or protection purposes, but also to investigate ecological aspects such as demography, habitat occupancy, larval assemblages and intraguild competition (Torres-Vila et al., Reference Torres-Vila, Mendiola-Díaz and Sánchez-González2017a, Reference Torres-Vila, Zugasti-Martínez, Mendiola-Díaz, De-Juan-Murillo, Sánchez-González, Conejo-Rodríguez, Ponce-Escudero and Fernandez-Morenob). The identification of these three species can be problematic, however, depending on species and development stage. Adults of Prinobius and Cerambyx are similar in size and brownish colour, but they can be easily recognized by their general look, pronotum shape and coxae prosternal process. Differences between Cw and Cc adults are subtler but reliable, even if identification may be sometimes difficult in intermediate or small adults, which are not uncommon in the wild (Starzyk & Strojny, Reference Starzyk and Strojny1985; Torres-Vila, Reference Torres-Vila2017; Torres-Vila et al., Reference Torres-Vila, Mendiola-Díaz and Sánchez-González2018). Main diagnostic characters are elytral shape and sculpture, apex truncation, sutural spine robustness and hairiness pattern of the hind-tarsus second-segment underside. As consequence, various useful dichotomous keys for the identification of these species in adult stage exist (Picard, Reference Picard1929; Villiers, Reference Villiers1946, Reference Villiers1978; Bense, Reference Bense1995; Vives, Reference Vives2000; Verdugo, Reference Verdugo2004; Özdikmen & Turgut, Reference Özdikmen and Turgut2009). Some studies also provide valuable information for egg taxonomy in Cerambyx species, which is mainly based in chorion sculpture and micropyle morphology (Hernández, Reference Hernández1991; Vitali, Reference Vitali2001; see also Duffy, Reference Duffy1953; Morcuende & Naveiro, Reference Morcuende and Naveiro1993).
Regarding larval taxonomy, various morphological descriptions (some of them succinct or very old) for Pm (Švácha & Danilevsky, Reference Švácha and Danilevsky1987; Naveiro & Morcuende, Reference Naveiro and Morcuende1994), Cc (Ratzeburg, Reference Ratzeburg1839 [as Cerambyx heros Fabricius]; Duffy, Reference Duffy1953; Švácha & Danilevsky, Reference Švácha and Danilevsky1988) and Cw (Xambeu, Reference Xambeu1895; Naveiro & Morcuende, Reference Naveiro and Morcuende1994) exist. Larvae of Cerambyx and Prinobius may be easily differentiated because Cerambyx larvae present a conspicuous ferruginous-pigmented band on the pronotum frontal margin and rounded mandible tips, while in Prinobius larvae the frontal band is missing and mandibles are pointed (Torres-Vila et al., Reference Torres-Vila, Zugasti-Martínez, Mendiola-Díaz, De-Juan-Murillo, Sánchez-González, Conejo-Rodríguez, Ponce-Escudero and Fernandez-Moreno2017b). The major taxonomic drawback arises based on the fact that Cw and Cc larvae are practically indistinguishable using morphometric criteria, a constraint further aggravated when larvae are small or coexist in larval assemblages with other related cerambycid species. To our knowledge, there is no integrating diagnostic key allowing an unambiguous differentiation of Cw and Cc larvae, assuming that larvae of both species could actually be differentiated due to subtle morphological differences and large intraspecific variability. In fact, larval diagnosis in the Cerambycidae family below the genus (or even subfamily) level through traditional taxonomy is difficult or even impossible (Kelley et al., Reference Kelley, Wingard, Szalanski and Stephen2006). In recent studies, we have used the alternative method of rearing field-derived larvae on artificial diet until adulthood, and then determining species membership (Torres-Vila et al., Reference Torres-Vila, Zugasti-Martínez, Mendiola-Díaz, De-Juan-Murillo, Sánchez-González, Conejo-Rodríguez, Ponce-Escudero and Fernandez-Moreno2017b, Reference Torres-Vila, Mendiola-Díaz and Sánchez-González2018), but this procedure entails constraints inherent to the long larval development (see Study species) such as mortality rates, consumption of time/resources and ultimately slow diagnosis. Larval taxonomic tools are important in ecological studies and are compulsory requirements for routine work in diagnostic laboratories and plant health services, both for an early detection of alien species and the diagnosis of pest species.
Precise and reliable larval diagnosis is a big challenge (Gossner & Hausmann, Reference Gossner and Hausmann2009) that may greatly complement classical taxonomy (Miller, Reference Miller2007), even if taxonomic expertise appears to be regrettably collapsing (Hebert et al., Reference Hebert, Cywinska, Ball and de Waard2003). The use of certain gene sequences (usually a fragment of the cytochrome c oxidase subunit I, [COI]) as molecular species-specific tags for identification purposes, the so-called DNA barcoding, has revolutionized taxonomy (e.g., Hebert & Gregory, Reference Hebert and Gregory2005; Bergsten et al., Reference Bergsten, Bilton, Fujisawa, Elliott, Monaghan, Balke, Hendrich, Geijer, Herrmann, Foster, Ribera, Nilsson, Barraclough and Vogler2012; Karahan et al., Reference Karahan, Douek, Paz, Stern, Kideys, Shaish, Goren and Rinkevich2017; Porter & Hajibabaei, Reference Porter and Hajibabaei2018). DNA barcoding could provide a solution for the diagnostic problem described above, ideally integrating DNA sequencing not only with traditional taxonomy but also with species ecological background (Dasmahapatra & Mallet, Reference Dasmahapatra and Mallet2006). For insects, DNA barcoding has succeeded as an effective identification tool especially in those species with no morphological keys for larvae/pupae (e.g., Ahrens et al., Reference Ahrens, Monaghan and Vogler2007), and is widely used in agronomy, conservation and pest control (see Bergsten et al., Reference Bergsten, Bilton, Fujisawa, Elliott, Monaghan, Balke, Hendrich, Geijer, Herrmann, Foster, Ribera, Nilsson, Barraclough and Vogler2012 for a review). Barcoding is supposed to allow the identification of any species at any life stage and permits a quick identification of large sample sizes to the species level.
The logic behind DNA barcoding relies on the structure of genetic variability at species level. Individuals of the same species are less genetically variable among themselves than with other species (Hebert & Gregory, Reference Hebert and Gregory2005; Bonal et al., Reference Bonal, Espelta and Vogler2011; Ratnasingham & Hebert, Reference Ratnasingham and Hebert2013; Porter & Hajibabaei, Reference Porter and Hajibabaei2018). COI intra-specific divergence rarely exceeds 2% whereas interspecific one usually surpasses it (Hebert et al., Reference Hebert, Cywinska, Ball and de Waard2003), hence, this value has been usually the limit established for species delimitation (see review in Ratnasingham & Hebert, Reference Ratnasingham and Hebert2013). DNA barcoding has some pitfalls that need to be considered to avoid misidentification, including heteroplasmy, the presence of NUMTs and Wolbachia infections (e.g., Arthofer et al., Reference Arthofer, Avtzis, Riegler and Stauffer2010). Also, hybridization artefacts among closely related species may occur when individuals externally looking like one species bear a haplotype of another related species, due to either current interspecific mating, past interbreeding or incomplete lineage sorting (e.g., Nicholls et al., Reference Nicholls, Challis, Mutun and Stone2012; van Velzen et al., Reference van Velzen, Weitschek, Felici and Bakker2012). Moreover, intraspecific genetic distance may increase along with the geographical scale, what may hamper the identification of query sequences if the reference DNA barcodes come from individuals sampled over a small range (Bergsten et al., Reference Bergsten, Bilton, Fujisawa, Elliott, Monaghan, Balke, Hendrich, Geijer, Herrmann, Foster, Ribera, Nilsson, Barraclough and Vogler2012, but see Huemer et al., Reference Huemer, Mutanen, Sefc and Hebert2014). These potential problems have triggered the development of alternative methods for putative species (operational taxonomic units, OTUs) delimitation not based on fixed intra- and interspecific genetic divergence thresholds (Hebert et al., Reference Hebert, Cywinska, Ball and de Waard2003). The General Mixed Yule Coalescent (GMYC) model (Pons et al., Reference Pons, Barraclough, Gómez-Zurita, Cardoso, Durán, Hazell, Kamoun, Sumlin and Vogler2006) or the Barcode Index Number (BIN) approach implemented in the Barcode of Life Data System (BOLD) (Ratnasingham & Hebert, Reference Ratnasingham and Hebert2013) are among them. Both methods also allow the identification of OTUs when DNA barcodes previously assigned to Linnaean species names are available.
The overall number of DNA barcodes reported for our target longhorn beetles is extremely scarce, especially in southern Europe. For the Iberian Peninsula, no barcode sequence is publicly available so far for Cw, Cc and Pm in GenBank or BOLD (http://www.barcodinglife.org). Moreover, the degree of phylogenetic closeness between Cc and Cw remains largely unknown, as well as associated potential problems like incomplete lineage sorting between them or potential hybridization in the wild that might reduce the accurateness of an identification based on DNA barcodes. In this study, we genotyped specimens of the three mentioned cerambycids collected at a wide scale sampling in southwestern Spain using one nuclear and three mitochondrial genes. We specifically investigated: (i) the resolution of each genetic marker for a reliable taxonomic identification of individuals at different life stages, and (ii) the phylogenetic proximity between Cc and Cw to assess species delimitation according to the GMYC model and the BIN approach. This proximity could provoke erroneous molecular species identification if hybridization occurs in the wild, an aspect serendipitously detected in our study with major phylogenetic and selective implications.
Materials and methods
Study species
Cw, Cc and Pm adults are univoltine, flying from late May to August, and peaking in late June and early July (Cw and Cc) or a little later (Pm). Adults are large (about 25–60 mm) with a blackish-brown body and show sexual dimorphism. Larvae are xylophagous mainly on oak species, and bore galleries into the wood that may cause huge physiological, mechanical and structural damage to host trees. Cw and Cc adults feed mainly on tree sap and exudates while those of Pm do not feed at all. Mated females lay eggs into bark crevices, pruning cuts and cork stripping wounds. After hatching, neonate larvae bore subcortically into the inner bark and then tunnel increasingly wider and longer galleries into sapwood and heartwood. Larval development usually lasts 2–4 years and pupation occurs in early (Pm) or late summer (Cw and Cc) within a pupal cell in the sapwood. Pupal stage is relatively short and lasts about 1 month. Pm adults leave the three in the same summer to reproduce, while emerged Cw and Cc adults overwinter protected within the pupal cell in a pre-reproductive stage until the following year. Colonized trees can be identified by the presence of adult exit holes (often with frass presence evidencing larval activity) but hole morphology is not species-specific and lacks taxonomic value. Daily activity of adults (feeding, flight, mating and egg-laying) occurs typically at dusk and early evening (Duffy, Reference Duffy1953; Bense, Reference Bense1995; Vives, Reference Vives2000; López-Pantoja et al., Reference López-Pantoja, Domínguez and Sánchez-Osorio2011; Torres-Vila et al., Reference Torres-Vila, Mendiola-Díaz, Conejo-Rodríguez and Sánchez-González2016, Reference Torres-Vila, Mendiola-Díaz and Sánchez-González2017a, Reference Torres-Vila, Zugasti-Martínez, Mendiola-Díaz, De-Juan-Murillo, Sánchez-González, Conejo-Rodríguez, Ponce-Escudero and Fernandez-Morenob).
Study site and insect sampling
The study site was the whole region of Extremadura in southwestern Spain, which extends over 41.634 km2 and houses more than one million ha of oak forests, most of them dehesa open woodlands (fig. 1). The climate is typically Mediterranean with dry and hot summers (up to 40°C). Samples of the three cerambycid species were collected in different development stages for several years (2012–2016) on different oak (Quercus) species, holm oak (Q. ilex Linnaeus, 1753), cork oak (Q. suber Linnaeus, 1753), pyrenean oak (Q. pyrenaica Willdenow, 1805) and gall oak (Q. faginea Lamarck, 1785). Geographical coordinates were taken for each sampled tree with a sub-metric GPS device (Trimble Geoexplorer® 6000, Westminster, USA) and altitudes were obtained from geographical coordinates using the MDT05-LIDAR Digital Elevation Model (CNIG-IGN, 2016). Altitude ranged from 180 m (Guadiana and Tajo river valleys) to about 1200 m (mountains of the Central System in the northern region).
Insects were collected either with traps or dissecting the host trees. Free-living adults were captured in summer during the flight period with baited traps (mostly Cw and Cc adults as Pm adults rarely go to traps) designed to prevent the captured adults from drowning in the bait (Torres-Vila et al., Reference Torres-Vila, Mendiola-Díaz and Sánchez-González2017a). Larvae and Cw/Cc overwintering adults were collected inside host trees taking advantage of a parallel study dealing with the larval ecology and assemblages of the target cerambycids (Torres-Vila et al., Reference Torres-Vila, Zugasti-Martínez, Mendiola-Díaz, De-Juan-Murillo, Sánchez-González, Conejo-Rodríguez, Ponce-Escudero and Fernandez-Moreno2017b). Collected insects were arranged individually in aerated plastic containers and carried to the laboratory. Larvae were fed on artificial diet (Morales-Rodríguez et al., Reference Morales-Rodríguez, Sánchez-González, Conejo-Rodríguez and Torres-Vila2015) and reared to adulthood when necessary. Additional samples of eggs and neonate larvae were obtained in the laboratory from field-derived ovipositing females. Insect sampling was often connected with other studies on the reproductive output of the target species (Torres-Vila et al., Reference Torres-Vila, Mendiola-Díaz, Conejo-Rodríguez and Sánchez-González2016, Reference Torres-Vila, Mendiola-Díaz and Sánchez-González2017a; Torres-Vila, Reference Torres-Vila2017), so that for most sequenced individuals there was available information of their mating history, fertility and/or lineage.
DNA extraction and gene amplification
Samples to be sequenced were chosen to adequately cover either geographical, altitudinal or host ranges. We included in the analyses a total of 172 individuals from 83 sites located all over Extremadura (fig. 1, table S1). Among these individuals, 137 (60 adults and 77 larvae) were collected in the field and the rest (35 samples) were either adults, eggs or larvae obtained from laboratory rearing. Biological fluids as larval haemolymph and defensive oral secretions (saliva) were also occasionally sampled. A total of 69 adults from either field or laboratory (23 Cc, 27 Cw and 19 Pm) were selected for molecular analyses. These specimens were carefully identified to the species level observing their morphological features under a stereomicroscope and the available barcoding sequences obtained from these adults (n = 62, see Results) were used as molecular references for further identification of larvae, eggs and biological samples. For the taxonomy of Cw and Cc adults we used as main diagnostic characters elytral shape and sculpture, elytral apex truncation, apical spine robustness and the hairiness in the second segment underside of the hind tarsus (see the above-cited taxonomic references).
DNA was extracted from insect tissue following the salt extraction protocol (Aljanabi & Martínez, Reference Aljanabi and Martínez1997). The body part used for the extraction depended on the development stage of each specimen. In adults, we used one leg or a half antenna. Medium- and large-sized larvae were cut transversally and one to two central segments were taken. In case of eggs and small larvae, complete samples were processed. After removing/processing each sample, scissors, scalpels and forceps were decontaminated by cleaning and flaming with ethanol. Biological fluids (larval haemolymph and saliva) were sampled with a disposable micro-syringe. Individual samples were transferred to eppendorf tubes under 1.5 ml 96% ethanol and stored until DNA extraction.
For each sample, we amplified a 658 base pairs long fragment of the mitochondrial gene COI using the universal primers pair LCOI1490/HCOI2198 (see Folmer et al., Reference Folmer, Black, Hoeh, Lutz and Vrijenhoek1994 for details on the primer sequences and PCR protocols). However, due to the existence of sequence ambiguities at the beginning/end of some sequences, some base pairs were removed to obtain a 625 base pairs long fragment. Besides, for a selection of 12 adults (six Cc and six Cw), we also sequenced fragments of another two mitochondrial genes (12S rRNA and 16S rRNA) codifying for the small ribosomal unit (lengths 351 and 370 base pairs, respectively) (see Kambhampati & Smith, Reference Kambhampati and Smith1995 for details on the PCR reactions), as well as a fragment 631 base pair long of the D3 fragment of the 28S rRNA nuclear gene (Whiting et al., Reference Whiting, Carpenter, Wheeler and Wheeler1997). We did so to assess whether a multiple gene approach could aid in the molecular diagnosis of the two Cerambyx congeneric species, which were expected to diverge much less between them than with Pm. All sequence chromatograms were assembled and edited using Sequencher 4.6 (Gene Codes Corp., Ann Arbor, MI, USA), and edited sequences were aligned by MUSCLE software (Edgard, Reference Edgard2004) as implemented in MEGA7 (Kumar et al., Reference Kumar, Strecher and Tamura2016) using the default alignment parameters.
Species determination
The alignment set including all the COI sequences was collapsed into unique haplotypes using the online fasta sequence toolbox FaBox (Villesen, Reference Villesen2007). These haplotypes were used to build a Neighbour-Joining tree in MEGA7 (Kumar et al., Reference Kumar, Strecher and Tamura2016). The Neighbour-Joining approach (Saitou & Nei, Reference Saitou and Nei1987) based on Kimura 2 Parameter model (Kimura, Reference Kimura1980) genetic distances is analogous to that provided in BOLD (Ratnasingham & Hebert, Reference Ratnasingham and Hebert2007). This analysis allows assessing whether the haplotypes corresponding to individuals morphologically classified within the same species cluster together into discrete groups separated from the others by significant genetic discontinuities (branch lengths). In addition, for the more closely related species (Cc and Cw), we calculated the pairwise distances between all distinct haplotypes using K2P model (Kimura, Reference Kimura1980). For the other three gene fragments (mitochondrial 12S and 16S or nuclear 28S), the K2P genetic distances between the different haplotypes of the two species were also calculated.
To confirm the species distinctiveness of Cc and Cw, we additionally used the GMYC single-locus method (Pons et al., Reference Pons, Barraclough, Gómez-Zurita, Cardoso, Durán, Hazell, Kamoun, Sumlin and Vogler2006; Fujisawa & Barraclough, Reference Fujisawa and Barraclough2013) and the BIN approach implemented in BOLD (Ratnasingham & Hebert, Reference Ratnasingham and Hebert2013) to delimit independently evolving evolutionary units (i.e., OTUs). For the BIN approach, the set of sequences has to be uploaded to an active project in BOLD. The query sequences are then aligned and their genetic divergence compared to assess the number of OTUs using the Refined Single Linkage (RESL) analysis. In this analysis, sequence alignments go through a Markov clustering with an optimality criterion that refines the structure of the OTUs (Ratnasingham & Hebert, Reference Ratnasingham and Hebert2013). Doing this it is possible to know the number of putative species, and whether they can be assigned to an already existing OTU in BOLD (with a Barcode Index Number, BIN). These OTUs with BIN sometimes include sequences previously uploaded that correspond to a Linnaean name, thus allowing the new query sequences to be identified to the species level.
The GMYC approach does not make any aprioristic consideration of species membership either. Rather to the contrary, the model delimits a number of independent evolutionary units closely correlated with the number of species (Fujisawa & Barraclough, Reference Fujisawa and Barraclough2013). On the basis of a clock-constrained phylogenetic tree, GMYC method delimits distinct clusters, each of which would correspond to a different unit (i.e., species). GMYC requires an ultrametric gene tree as input and we derived this from the COI sequence data using Beast 1.7.5 (Drummond et al., Reference Drummond, Suchard, Xie and Rambaut2012). In this tree, we included all the distinct haplotypes obtained with Cc and Cw in our study site plus all the sequences available at GenBank/BOLD for the species within the Cerambycini tribe to which the genus Cerambyx belong.
To select an appropriate partitioning scheme and substitution model for the three codon positions of COI, we used Partitionfinder 1.1.1 (Lanfear et al., Reference Lanfear, Calcott, Ho and Guindon2012). We tested between the available models with Beast software using the Bayesian Information Criteria (BIC) and between all possible partitioning schemes. Partitionfinder software supported three separate partitions for the codon positions, and three different models were selected according to the BIC scores (TrNef + I, HKY + I and TrN + G for first, second and third codon positions, respectively). We used a strict clock model with a rate fixed to n = 1 and a constant size coalescent tree prior, as this could be considered conservative towards the null model when testing against the GMYC model in a likelihood ratio test. The effect of tree reconstruction method and model for the GMYC results have been investigated and, in general, a Bayesian estimation under a coalescent tree prior has performed well in comparisons (Talavera et al., Reference Talavera, Dinca and Vila2013; Tang et al., Reference Tang, Humphreys, Fontaneto and Barraclough2014). Talavera et al. (Reference Talavera, Dinca and Vila2013) also found no difference in GMYC results between using a strict or relaxed clock model for inferring the input ultrametric tree. We ran two separate Markov chain Monte Carlo (MCMC) runs, each with 10 million generations, sampled every 1000 generations. Convergence of the two runs and effective sample size values for sampled model parameters were monitored in Tracer 1.6 as implemented in Beast 1.7.5 software (Drummond et al., Reference Drummond, Suchard, Xie and Rambaut2012). Sampled trees from the two runs were combined using Logcombiner (implemented in Beast) removing 25% from each run as burning, and resampling trees at half the original frequency. Treeannotator software (implemented in Beast) was used to select the maximum clade credibility tree (MCC tree) from the sampled trees with posterior median values used for node heights.
The MCC tree with branch lengths was imported in R version 3.3.1 (R Core Team, 2016) and the GMYC analysis was conducted using the Splits package 1.0 (Ezard et al., Reference Ezard, Fujisawa and Barraclough2009) assisted by the APE package (Paradis et al., Reference Paradis, Claude and Strimmer2004). We used the single threshold method, as the multiple threshold method does not seem to improve accuracy (Fujisawa & Barraclough, Reference Fujisawa and Barraclough2013). The Splits package calculates the likelihood of the tree under a single coalescent (null model) and under a GMYC model with the single threshold at every node in the tree. The threshold at the maximum likelihood solution delimits the number of evolutionary units, which have been shown close correlation with the number of species in the tree (Fujisawa & Barraclough, Reference Fujisawa and Barraclough2013).
After species delimitation was concluded, we used POPART software (Leigh & Bryant, Reference Leigh and Bryant2015) to build statistical parsimony networks (Templeton et al., Reference Templeton, Crandall and Sing1992) and so assess the evolutionary relationships among the COI haplotypes of each species.
Results
PCR success and sequencing reaction
The overall PCR success rate in our study was quite high with not significant differences between adults (93.1%) and larvae (93.7%) (χ2 = 0.23, df = 1, P = 0.66). DNA was also successfully extracted and sequenced from both Cw and Cc eggs (six out of six, 100%) and from samples of larval haemolymph/saliva (two out of three, 66%). The mitochondrial gene COI was sequenced in all individuals with successful PCRs, but we later discarded 10% of the sequences that had not met a minimum quality for further analyses. The total number of individuals with sequences for molecular identification was finally n = 137 (table S1).
DNA barcoding and species determination
The Neighbour-Joining tree based on COI (Kimura 2 Parameter model genetic distances) including all individuals retrieved three clusters that corresponded to Cc, Cw and Pm, respectively, and grouped together adults and larvae of the same species (fig. 2). Genetic variability within each cluster was very low and each species was separated from the remainder by long branches. In Cw and Pm, most individuals had the same haplotype, whereas in Cc, that dominance did not exist (GenBank accession numbers for each haplotype shown in table S1). In Cw and Pm, the statistical parsimony networks were star-like, with one dominant haplotype linked to many rare ones genetically very close (in all cases only one mutation away) (fig. 3).
Interspecific genetic divergence (K2P model) was in all cases higher than the minimum barcoding gap accepted for species identification (Bergsten et al., Reference Bergsten, Bilton, Fujisawa, Elliott, Monaghan, Balke, Hendrich, Geijer, Herrmann, Foster, Ribera, Nilsson, Barraclough and Vogler2012; Ratnasingham & Hebert, Reference Ratnasingham and Hebert2013), but in the case of Cw and Cc, it was just slightly over 3% (table 1). The analyses of interspecific differences in two additional mitochondrial genes (12S and 16S) confirmed the genetic closeness of Cc and Cw. In the case of these genes, which exhibit a lower mutation rate than COI, the maximum distance between the distinct haplotypes of each species was always below 2% (table 2) (GenBank accession numbers for 12S haplotypes MK084974 to MK084977 and for 16S haplotypes MK088074 to MK088076). In the nuclear 28S gene fragment, interspecific separation was not yet complete, as individuals of Cc and Cw even shared the same haplotype.
DNA taxonomy and species determination
As indicated, the phylogenetic tree was built using GMYC pooling together all the different haplotypes recorded in our study area for Cc (five haplotypes: H_01 to H_05) and for Cw (seven haplotypes: H_06 to H_12) together with all the sequences available at GenBank/BOLD for the Cerambycini tribe. The GMYC analyses delimited 19 distinct evolutionary units (also called GMYC taxa) that matched very well with the taxonomical species determination (Maximum Likelihood of the GMYC model = 327.66; likelihood ratio = 38.57; P < 0.0001) (fig. 4). Those clusters grouping several sequences corresponded to individuals previously identified as the same species, with the sole exception of a single specimen of Neoplocaederus ferrugineus Linnaeus surely due to a misidentification (fig. 4).
The GMYC species delimitation confirmed the genetic distinctiveness of Cc and Cw, which were classified as different species (fig. 4). The case of Cc showed a noticeable geographic pattern as Iberian haplotypes (H_01 to H_05) formed a cluster separated from that of the remainder European sequences (fig. 4). This fact was not observed in Cw, as the Iberian haplotypes (H_06 to H_12) grouped well with the only sequence available for Cw north of the Pyrenees (fig. 4). The BIN approach gave identical results. The Cerambyx sequences belonged to two different OTUs with BINs corresponding to Cc and Cw. Within each of those BINs, intraspecific divergence was below 1%, although the inclusion of the Iberian specimens of Cc increased genetic variability to almost 2%.
Hybridization between Cc and Cw and limits to DNA barcoding
All sequenced individuals were successfully ascribed to morphological species exception found in three Cerambyx larvae, which apparently were not adequately barcoded. To ascertain if these results were due either to an experimental error, barcoding inefficacy or other reasons, we studied each case separately.
The first unexpected result came from a laboratory larvae (Tube 79, table S1) barcoded as Cc despite its mother being morphologically Cw. The mother was collected early in the season (4-VI-2012) in a baited trap so their mating history prior to capture was unknown. Given the inconsistency between the maternal and filial phenotypes, the mother was also barcoded (Tube 98, table S1) and subsequently scored as Cc. A second PCR and sequencing reaction for both larval and adult samples was performed to discard manipulation errors. Female longevity, fecundity and fertility had normal values when compared (controlling for female size) with baseline data from either Cw × Cw or Cc × Cc conspecific crosses (Torres-Vila et al., Reference Torres-Vila, Mendiola-Díaz, Conejo-Rodríguez and Sánchez-González2016; Torres-Vila, Reference Torres-Vila2017).
The second unexpected results came from two L2 larvae (Tubes 82 and 83, table S1) barcoded as Cc. Both larvae were selected among the laboratory offspring of a Cw female collected in the field (30-III-2012) overwintering inside its pupal cell (which ensured that female was virgin) in a holm oak bolt for firewood. The female had been allowed to mate just once in the laboratory with a Cw male after which eggs were laid and viable offspring produced. It follows that these two sequenced larvae were full-sibs and were barcoded as Cc despite their mother being morphologically Cw. Given the inconsistency between the maternal and filial phenotypes, the mother was also barcoded (Tube 97, table S1) and subsequently scored as Cc. Male partner was barcoded as well (Tube 99, table S1) and genetically scored as Cw in agreement with its phenotype. As in the above case, all samples were also double-checked. Female longevity and reproductive parameters also have normal values when compared with baseline data as above. These results clearly show that females in both cases were in fact hybrid specimens despite their clear Cw morphological phenotype.
Our data also allowed some rough estimates of hybridization rate. From 43 sequenced adults morphologically determined as either Cc or Cw, the mismatch between morphological and genetic species identification only occurred in the two reported hybrid females (two out of 43, 4.7%) (table S1). From 22 sequenced eggs and neonate/small larvae obtained from laboratory crosses between adults morphologically identified to the species level, hybrid offspring was observed in three larvae (two of them full-sibs) from two crosses (two out of 22, 9.1%) (table S1). We are aware that by using mtDNA, the hybrid rate we can detect is underestimated as only in those cases in which the mitochondrial genome does not correspond to the expected species membership morphologically determined (or the mother's species membership in the case of neonate larvae or eggs) we could say that the target individual is a hybrid.
Discussion
Our results clearly showed that Cw, Cc and Pm can be readily distinguished by means of DNA barcoding with an overall high success rate. We also show that a relatively simple method of DNA extraction, such as the salt extraction protocol (Aljanabi & Martínez, Reference Aljanabi and Martínez1997), produced successful results for molecular analyses irrespective of insect development stage (adult, larvae and eggs) and can be useful for small amounts of biological fluids. This is important since these insects in the field may be collected in different life stages, and their identification allows a fine characterization of the differences in habitat selection among species taking into account the whole life cycle. The Neighbour-Joining tree based on COI retrieved three clusters corresponding accurately to the three target cerambycids (Cc, Cw and Pm). As expected, the distance was shorter between the congeneric species Cc and Cw (Cerambycinae), and longer in the case of Pm which belongs to a different subfamily (Prioninae). The clustering of sequences from larvae with those from adults (of known species) allowed a clear taxonomic assignment of the studied larvae. This is especially relevant in the case of Cc and Cw, as their larvae cannot be identified based on morphological traits only.
Besides the COI, we sequenced for the first time in Cw and Cc three more genes (the mitochondrial 12S and 16S and the nuclear 28S). The two mitochondrial ones differentiated well both species (interspecific divergence was higher than the intraspecific one), although the interspecific divergence was lower than in the case of COI. In the case of the nuclear 28S, there was no differentiation between species, as Cw and Cc shared haplotypes. This gene has also shown little or no variability among closely related species in other insect genera (Ács et al., Reference Ács, Melika, Pénzes, Pujade-Villar and Stone2007) due to its low mutation rate. The power of nuclear rDNA expansion segments for species identification increases when more than one of these genes are combined, but not when used isolated as in the present study (Raupach et al., Reference Raupach, Astrin, Hannig, Peters, Stoeckle and Wägele2010). These results confirm that the mitochondrial COI, the standard gene used in DNA barcoding (Hebert & Gregory, Reference Hebert and Gregory2005; Ratnasingham & Hebert, Reference Ratnasingham and Hebert2007), is also the more accurate for species differentiation between Cw and Cc.
Both the GMYC analyses and the BIN approach matched with the taxonomical species determination in the Cerambycinae tribe and confirmed the genetic distinctiveness of Cc and Cw, which were scored as different species as expected. The ultrametric gene tree showed that they are two nearby species sharing a common ancestor very close in evolutionary time, especially when compared with the common ancestor of the congeneric species Cerambyx miles Bonelli, 1823 and Cerambyx scopolii Füessly, 1775. The present study is the first providing sequences for Cc and Cw from Iberia, and showing a dissimilar genetic background between Cc populations from Iberia and those from European countries north of the Pyrenees, with important underlying taxonomic and evolutionary connotations. These two distinct Cc clusters show that the Iberian populations of this species have gone through periods of isolation from the rest. This isolation could date back to the Pleistocene glaciations, a period in which the populations of many species linked to forests remained isolated in different Peninsulas of southern Europe (Schmitt, Reference Schmitt2007).
Our phylogenetic data could also contribute to clarify the controversial taxonomic status of Cc at the subspecific level in the western Palaearctic (Vives, Reference Vives2000; Sama, Reference Sama2002; Özdikmen & Turgut, Reference Özdikmen and Turgut2009; Danilevsky, Reference Danilevsky2017). This exemplifies how molecular studies may reveal the occurrence of greater biodiversity within species than previously thought from classic taxonomic studies (Bickford et al., Reference Bickford, Lohman, Sodhi, Ng, Meier, Winker, Ingram and Das2007). Moreover, these results stress the problems of a reduced sampling in the reference collections for species identification using DNA barcoding (Bergsten et al., Reference Bergsten, Bilton, Fujisawa, Elliott, Monaghan, Balke, Hendrich, Geijer, Herrmann, Foster, Ribera, Nilsson, Barraclough and Vogler2012), even if in some insect taxa large geographic distances show small genetic impacts (Huemer et al., Reference Huemer, Mutanen, Sefc and Hebert2014). In the case of Cw, the Iberian sequences differed little from the European references available in BOLD, and could thus be matched according to sequence identity. However, this was not the case of Cc, as the genetic distance (K2P) between any of the Iberian sequences and the references from Central Europe was always above 1%, the minimum allowed in BOLD for unequivocal species identification (Ratnasingham & Hebert, Reference Ratnasingham and Hebert2007). It is thus necessary to add barcodes from different geographic origin, as we have done in this study with Cc, to have a reference collection that successfully identifies query sequences at a wide geographic scale.
The occurrence of natural cross-breeding between Cw and Cc is highly consistent with their recent phylogenetic divergence. Incongruities between morphology/lineage and barcodes in some individuals revealed natural hybridization between Cw and Cc. Interspecific mating between Cw and Cc has been reported in the field (Martínez-García, Reference Martínez-García2011) and also observed by us in the wild and laboratory. Our barcoding results coupled with laboratory crosses clearly showed that: (1) natural hybridization between Cc and Cw is possible, (2) genetic introgression between both species may occur in the wild, and last but not least, (3) hybrids produced from interspecific crosses may be fertile and able to reproduce. Note that we can identify as hybrids only those individuals that had a mitochondrial COI haplotype that did not correspond to its morphological species identity, a fact also reported in other beetle species (Solano et al., Reference Solano, Thomaes, Cox, Carpaneto, Cortellessa, Baviera, Bartolozzi, Zilioli, Casiraghi, Audisio and Antonini2016). Moreover, our results may have restrictions, as these hybrids could correspond to ancient cross-breeding between species in which the offspring would have subsequently mated with the father's species (Cw in this case, as all the hybrids were morphologically Cw with Cc mitochondrial genes). We miss those cases in which the hybrid offspring mated with the mother's species, as that would leave no trace in mitochondrial DNA. Further analyses using fast-evolving nuclear markers inherited from both parents (e.g., microsatellites or SNPs) will show the full picture of hybridization between these two species.
Our results suggest that hybridization is geographically widespread since the two detected cases come from populations relatively far away (almost 100 km apart) at least for low-dispersal beetles (Torres-Vila et al., Reference Torres-Vila, Mendiola-Díaz and Sánchez-González2017a). In spite of the fact that hybridization between Cw and Cc occurs in the wild, hybrid frequency is expected to be low. Pre- and post-mating interspecific barriers are acting against hybridization, among which mate recognition mediated by female-produced short-range sexual pheromones (Allison et al., Reference Allison, Borden and Seybold2004) is expected to be prominent in Cerambyx as suggested by laboratory-controlled mating observations (Torres-Vila et al., unpublished data). Nevertheless, the occurrence of natural cross-breeding between Cw and Cc is consistent with the high conservation in the pheromone chemistry between related species of cerambycids (Allison et al., Reference Allison, Borden and Seybold2004; Sweeney et al., Reference Sweeney, Silk, Gutowski, Wu, Lemay, Mayo and Magee2010; Barbour et al., Reference Barbour, Millar, Rodstein, Ray, Alston, Rejzek, Dutcher and Hanks2011; Mitchell et al., Reference Mitchell, Millar and Hanks2013; Hanks & Millar, Reference Hanks and Millar2013). Hence, cross-attraction between related species may be prevented or minimized just by the ratio of minor pheromone components that putatively act as synergists for conspecifics and antagonists for heterospecifics (Mitchell et al., Reference Mitchell, Reagel, Wong, Meier, Silva, Mongold-Diers, Millar and Hanks2015). It follows that a similar pheromone blend together with a high variability in EAG responses among adults depending on their physiological status (Sánchez-Osorio et al., Reference Sánchez-Osorio, Tapias, Domínguez and López Pantoja2009) could contribute to explain the occurrence of interspecific matings and hybridization between Cw and Cc in the wild.
In conclusion, despite its usefulness for quick species identification and biodiversity assessments, DNA barcoding has got a number of limitations (see Introduction). Hybridization is one of them, as the barcode gene (COI) is in the mitochondrial genome, which is only maternally inherited. Hybrids are thus identified as the maternal species when it is not completely correct. The problem aggravates when species morphologically distinguishable nowadays interbred in the past and share the same mitochondrial genome (Nicholls et al., Reference Nicholls, Challis, Mutun and Stone2012). An expected low hybrid rate minimizing the risk of larval misidentification will confirm DNA barcoding as a useful tool for species diagnosis within the Cerambyx genus. Our finding about natural hybridization between Cerambyx species is important from a phylogenetic and evolutionary perspective, but the extent in which cross-breeding occurs in the wild and the behavioural and ecological factors involved remain to be investigated.
Supplementary material
The supplementary material for this article can be found at https://doi.org/10.1017/S0007485318000925
Acknowledgements
The authors are grateful to all colleagues who provided technical assistance in the field and laboratory: Félix Fernández, Paco Ponce, F. Javier Mendiola, Álvaro Sánchez, Rafael López, Yolanda Conejo, Juan Gragera (SSV-Mérida), Carlos Zugasti, Jose M. de Juan (SSV-Cáceres), Juande del Pozo (SSV-Badajoz) and María Santoro (CSIC-UCLM). We are indebted to the owners of dehesas and oak forests for their kind permission to take samples. We also appreciate the constructive criticism of two anonymous reviewers. This research was supported by the Servicio de Sanidad Vegetal (SSV, Junta de Extremadura), and benefited from funding by the projects PPII-2014-01-P from Junta de Comunidades de Castilla-La Mancha/European Social Fund and AGL2014-54739-R from Spanish Ministry of Economy and Competitiveness.