Introduction
Taxon instability has commonly, but not exclusively, characterized fossils and has been a concern for paleontological phylogenetics (Gauthier et al. 1988; Novacek 1992; Springer et al. 2001). On the one hand, unstable taxa can be found in alternative positions with minor differences in fit, leading to low branch supports (Wilkinson 1996; Pei et al. 2020; Pol and Goloboff 2020). Methods devised to address this issue reveal wildcards that, upon pruning, improve support values (Buenaventura et al. 2017; Pei et al. 2020; Pol and Goloboff 2020). On the other hand, unstable taxa could have multiple positions across optimal trees, reducing the resolution of the (strict) consensus tree. In this case, the problem has been summarizing relationships within a highly collapsed consensus (Pol and Escapa 2009; Goloboff and Szumik 2015; Pol and Goloboff 2020). These problems do not necessarily have a one-to-one correspondence, because taxa reducing branch support may differ from those reducing resolution (Pol and Goloboff 2020). While the former affect support values by definition (Wilkinson 1996; Aberer et al. 2013; Pol and Goloboff 2020), the impact of the latter on support is less obvious. The present study focuses on this latter type of taxa that reduce consensus resolution and are referred to as “unstable.”
Because of the rarity of well-preserved fossils, missing data were often considered the most important factor causing instability (Gauthier et al. 1988; Wiens 1998, 2003; Pol and Escapa 2009; Mulcahy et al. 2010). Wiens and Morrill (2011) summarized several examples showing that missing data are problematic only in a limited number of cases. Moreover, it was suggested that the number of characters relative to the number of taxa is more important than taxon completeness for improving resolution, support, and accuracy (Huelsenbeck 1991; Wiens and Reeder 1995; Dragoo et al. 2007). Gernandt et al. (2016), conversely, studied taxa with different degrees of incompleteness (33.8%–74.5% missing data) and found that excluding the most incomplete fossils resulted in better resolutions.
In comparing the stability of fossils and extant taxa, Cobbett et al. (2007: p. 761) concluded that fossils are more likely to diminish branch support than extant taxa. The authors also stressed that fossils rarely increased homoplasy or the number of polytomies in a strict consensus and that taxon instability is not a simple function of missing data (Cobbett et al. 2007). Cobbett et al. (2007), nevertheless, did not discuss the extent to which these unstable taxa affect support, that is, whether the number of nodes lost due to taxon instability reduces the support at a given “rate.” Moreover, because not only terminals but also clades may be unstable (Pol and Escapa 2009), the impact on support may depend on the nature of the unstable taxa. As instability is caused by several factors (Wiens 2003; Cobbett et al. 2007; Pol and Escapa 2009), the relationship between wildcards and branch support could be more intricate than suggested (Cobbett et al. 2007; Wiens and Morrill 2011).
Another issue, seldom considered by previous authors (e.g., Wiens 2003; Cobbett et al. 2007; Manos et al. 2007), is that characters with missing data may display artificially low levels of homoplasy (Goloboff 2014). Weighting characters according to their homoplasy has been of interest to systematists, and proposing classifications based upon the most reliable characters is the core of taxonomy (Goloboff 1993, 1998). Gernandt et al. (2016) conducted tree searches under implied weighting (Goloboff 1993) and found that character weighting and sampling highly complete fossils improved the analyses in comparison with equal weighting and Bayesian inference. As opposed to “standard” implied weighting, extended implied weighting allows characters to be weighted against their homoplasy while assuming that missing entries would add homoplasy upon being observed (see details in Goloboff 2014). Gernandt et al. (2016), however, ran tree searches under standard implied weighting. Therefore, whether extrapolating a proportion of the observed homoplasy to missing entries diminishes the problems caused by taxon instability remains underexplored.
Here, we evaluate wildcards that reduce consensus resolution for their impact on branch support. Analyses were conducted in 30 empirical datasets using parsimony as the optimality criterion and under both equal weighting and extended implied weighting. In addition, factors that could (1) contribute to taxon instability or (2) be associated with changes in branch support were evaluated. In our analyses, pruning unstable terminals that either collapsed few nodes in the strict consensus or were located closer to the root node improved the support of the remaining groups in the main subtree. Pruning the complete sets of wildcards (i.e., clades and terminals) also improved the support in the main subtree in midsized datasets characterized by low proportions of missing data. Down-weighting characters based on their homoplasy and taking into account their missing entries consistently reduced taxon instability.
Methodology
Unstable taxa that reduce consensus resolution and their impact on branch support were assessed in 30 empirical morphological matrices spanning a wide range of organismal groups and taxonomic categories. Datasets were obtained from open databases (MorphoBank and TreeBase), excluding matrices that contained more than 65% missing data (Table 1). In the matrices, the number of sampled taxa and characters ranged from 11 to 173 and from 30 to 497, respectively. Except for a single dataset (Tavares et al. 2018), matrices included fewer taxa than characters (character:taxon ratio; Table 1). Although the Mk model, wherein transition rates among states are equal, is becoming widely used to analyze morphological data under Bayesian inference, the assumptions of model-based methods regarding morphological data are still controversial (O'Reilly et al. 2016; Goloboff et al. 2017, 2019; Goloboff and Arias 2019). Thus, the evaluations conducted here rely on parsimony as an optimality criterion as implemented in TNT 1.5 (Goloboff et al. 2008b; Goloboff and Catalano 2016).
Table 1. Datasets analyzed with the number of taxa and characters sampled in each case. The proportion of missing cells refers to the number of cells with either “?” or “-” (found in the matrices analyzed in the cited papers) relative to the total number of cells in each dataset (taxon number × character number).

Tree searches were conducted in TNT using tree drifting, sectorial searches, and tree fusing (xmult; Goloboff 1999) by running 10 initial replicates and finishing the run after hitting the best score 10 times. Branch support was estimated with symmetric resampling as frequency differences between the most frequent groups and their most frequent contradictory group (GC) (resample; Goloboff et al. 2003). Tree searches and branch support calculations were performed under both equal weighting and extended implied weighting (xpi; Goloboff 2014). Given the lack of a clear criterion to choose a reference concavity value under implied weighting, three concavity values were arbitrarily chosen: k5, k10, and k15. Characters were individually down-weighted by taking into account their homoplasy and their proportion of missing entries and extrapolating a proportion P of the observed homoplasy to the missing entries (P = 0.5; Goloboff 2014). Throughout the paper, “incompleteness” refers to the proportion of missing entries in a given terminal. All subsequent evaluations were implemented in TNT scripts by using the TNT macro language, available as Supplementary Material.
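The GC measure described above can be illustrated with a minimal Python sketch. This is not the TNT implementation: it assumes that each resampled tree is represented simply as a set of rooted groups (frozensets of taxon names), and the function names and the toy replicates are hypothetical.

```python
from collections import Counter

def contradicts(a, b):
    """Two rooted groups conflict if they overlap without one containing the other."""
    return bool(a & b) and not (a <= b or b <= a)

def gc_support(group, replicate_trees):
    """GC (in %): frequency of `group` minus that of its most frequent contradictory group."""
    n = len(replicate_trees)
    # Count in how many resampled trees each group appears.
    counts = Counter(clade for tree in replicate_trees for clade in tree)
    best_rival = max(
        (c for g, c in counts.items() if contradicts(group, g)), default=0
    )
    return 100 * (counts[group] - best_rival) / n

# Hypothetical toy input: 10 resampled trees, each given as the set of its groups.
AB, BC = frozenset("AB"), frozenset("BC")
replicates = [{AB}] * 7 + [{BC}] * 2 + [set()]
gc_ab = gc_support(AB, replicates)  # group AB in 7 trees, rival BC in 2
```

Under this reading, a group that is less frequent than one of its contradictory groups receives a negative GC value, which is why GC ranges from -100 to 100 rather than from 0 to 100.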
Resampling Morphological Characters
To assess whether the inclusion of different amounts of morphological data has any effect on the stability of wildcards, several subsets of characters were iteratively analyzed following a bootstrap-like routine. For each dataset, a number N of characters was chosen at random, and tree searches were conducted on this subset of N characters; N varied between five and the total number of characters in the respective datasets. Ten replicates per dataset and per concavity value were performed, leading to 900 replicates. Replicates analyzed under each concavity were subsequently analyzed under equal weighting, for a total of 1800 tree searches. Some authors have suggested that exacerbated levels of character conflict, as in random-like data (“noise” sensu Wenzel and Siddall 1999; Dávalos et al. 2014), can affect either stability or support. Here, iteratively resampling characters enables considering characters whose signal is obscured by character conflict in the complete datasets. Additionally, contrary to simulated data, this approach allowed us to evaluate the effect of adding higher proportions of missing entries relative to the complete dataset while maximizing realism (Prevosti and Chemisquy 2010).
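The resampling design can be sketched as follows. This is only an illustration, not the TNT script used in the analyses: the function name is hypothetical, and drawing characters without replacement is an assumption, because the text states only that N characters were chosen at random.

```python
import random

def draw_character_subset(n_chars, rng, min_chars=5):
    """Pick N at random in [min_chars, n_chars], then N distinct character indices.

    Sampling without replacement is an assumption made for this sketch.
    """
    n = rng.randint(min_chars, n_chars)
    return sorted(rng.sample(range(n_chars), n))

# Replicate counts as stated in the text.
datasets, concavities, reps = 30, (5, 10, 15), 10
implied_runs = datasets * len(concavities) * reps  # 900 replicates
total_searches = implied_runs * 2                  # each one re-run under equal weighting

rng = random.Random(0)                   # fixed seed, for reproducibility of the sketch
subset = draw_character_subset(30, rng)  # e.g., a matrix with 30 characters
```

The arithmetic reproduces the numbers given above: 30 datasets × 3 concavity values × 10 replicates = 900, and analyzing each replicate again under equal weighting gives 1800 tree searches.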
The analyses evaluate taxa that are unstable across multiple optimal trees under both equal weighting and extended implied weighting. Because implied weighting typically produces fewer optimal trees compared with equal weighting (Goloboff et al. 2008a), evaluations entailed replicates that produced multiple optimal trees under all weighting schemes. The present work is organized in two complementary sections. The first (1) explores the influence of three factors (character:taxon ratio, taxon incompleteness, and scored data for closely related taxa) on “terminal instability.” The second (2) evaluates the extent to which the branch support is affected by “unstable terminals” and “clades.”
Selecting Taxa Unstable across Optimal Trees, Computing Instability and Exploring the Influence of Taxon Incompleteness, Scored Data for Closely Related Taxa and Character Sampling (1)
Unstable terminals were selected from those that increased the resolution of the strict consensus by at least a single node after being pruned. To identify these unstable terminals, the optimal trees found in each replicate were subjected to iterPCR, a triplet-based method that calculates the positional congruence of the taxa across optimal trees (PC; Estabrook 1992; Pol and Escapa 2009; Goloboff and Szumik 2015). From the set of unstable taxa, any terminal located in an n-degree polytomy of the consensus was chosen at random (where n ≥ 3 and < T − 2, T being the number of taxa). Terminal instability was then estimated as the positional congruence score (PC score), where values below 1.0 indicate increasing instability (Pol and Escapa 2009). Additionally, instability was assessed as the number of collapsed nodes in the reference polytomic node (i.e., n) relative to the resolution of the equivalent group in the EPT(s). This accounts for the nodes spanned by any given number of alternative positions, regardless of the number of optimal trees recovered, because the number of nodes collapsed as a consequence of alternative positions is a direct function of the maximum “displacement” between them.
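The selection criterion in the first sentence (a terminal qualifies if pruning it adds at least one node to the strict consensus) can be sketched in Python. This does not reproduce iterPCR or the PC score; it assumes that each optimal tree is given as a set of nontrivial rooted groups (frozensets of taxon names), and all names and the toy trees are hypothetical.

```python
def prune(tree, taxon, all_taxa):
    """Remove `taxon` from every group and drop groups that become trivial."""
    kept = all_taxa - {taxon}
    pruned = {group - {taxon} for group in tree}
    # Keep only groups with more than one taxon and fewer than all remaining taxa.
    return {g for g in pruned if 1 < len(g) < len(kept)}

def strict_consensus(trees):
    """Groups present in every tree."""
    return set.intersection(*map(set, trees))

def nodes_gained(trees, taxon, all_taxa):
    """Increase in strict-consensus resolution after pruning `taxon` from all trees."""
    before = len(strict_consensus(trees))
    after = len(strict_consensus([prune(t, taxon, all_taxa) for t in trees]))
    return after - before

# Hypothetical example: X attaches to A in one optimal tree and to C in the other.
taxa = frozenset("ABCDX")
t1 = {frozenset("XA"), frozenset("XAB"), frozenset("CD")}
t2 = {frozenset("AB"), frozenset("XC"), frozenset("XCD")}
gain_x = nodes_gained([t1, t2], "X", taxa)  # consensus recovers (A,B) and (C,D)
gain_a = nodes_gained([t1, t2], "A", taxa)  # pruning a stable terminal adds nothing
```

In this toy case the strict consensus of the two trees is fully collapsed, pruning X recovers two groups, and pruning A recovers none, so only X would be flagged as unstable under the criterion.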
To explore the causes for terminal instability, we plotted both the PC scores and the number of collapsed nodes against three variables: (1) character:taxon ratio, (2) the proportion of missing entries in the selected wildcard (“taxon incompleteness”), and (3) the proportion of missing entries per character in a given clade (MC). The character:taxon ratio is defined as the ratio between the number of resampled characters (N) and the number of taxa in the full dataset. Taxon incompleteness is the sum of missing entries for the selected terminal. The proportion of missing entries per character (MC) is given by the number of missing entries M observed in T terminals originating from the reference polytomic nodes divided by T. Both taxon incompleteness and MC are scaled to N so that scored entries are minimized as both measures approach one. Both measures converge as they increase, although they express complementary aspects of the datasets. Taxon incompleteness reflects the preservation of the fossil record, whereas MC represents the degree of overlapping data between closely related taxa. Because the analyses are based on resampled matrices, both the MC and the taxon incompleteness can be higher than in the complete datasets.
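As a minimal sketch of the three variables, assume that the matrix is a dictionary from taxon names to strings of character states and that missing entries are coded as “?” or “-” (as in Table 1). The function names and the toy matrix are hypothetical, and reading MC as M / (T × N) is one interpretation of “divided by T” and “scaled to N.”

```python
MISSING = {"?", "-"}

def char_taxon_ratio(n_resampled, n_taxa):
    """Number of resampled characters (N) per taxon in the full dataset."""
    return n_resampled / n_taxa

def taxon_incompleteness(matrix, taxon, chars):
    """Missing entries of one terminal over the N resampled characters, scaled to N."""
    row = matrix[taxon]
    return sum(row[c] in MISSING for c in chars) / len(chars)

def mc(matrix, clade_taxa, chars):
    """Missing entries M in the T terminals of a polytomy, scaled to T and N."""
    m = sum(matrix[t][c] in MISSING for t in clade_taxa for c in chars)
    return m / (len(clade_taxa) * len(chars))

# Hypothetical toy matrix: four taxa scored for four characters.
matrix = {"A": "01?1", "B": "0??1", "C": "1101", "X": "?-?0"}
chars = [0, 1, 2, 3]                                 # indices of resampled characters
ratio = char_taxon_ratio(len(chars), len(matrix))    # 4 characters / 4 taxa
inc_x = taxon_incompleteness(matrix, "X", chars)     # 3 of 4 entries missing
mc_abx = mc(matrix, ["A", "B", "X"], chars)          # 6 of 12 entries missing
```

Both measures approach one as scored entries disappear, which matches the scaling described above.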
Changes in Branch Support after Pruning Terminals and Clades Unstable across Optimal Trees (2)
The previous set of analyses focused on the factors promoting instability of terminals across multiple optimal trees. As stated earlier, while there are unstable taxa that by definition reduce branch support, taxa that are unstable across multiple optimal trees may affect support in a less obvious way. A simple approach to assess the latter issue is by comparing the (sum of) support for the groups in the main subtree after pruning unstable taxa with the (sum of) support for the groups in the complete (unpruned) tree. This is a ratio wherein values above or below unity indicate increase or decrease in support in the main subtree after pruning unstable taxa, respectively. Throughout the paper, we refer to this as “change(s)” or “increase/decrease” in “support.” Commonly, more than a single taxon occupies multiple optimal positions in a given dataset, and entire clades may be unstable as well (Pol and Escapa 2009). Hence, changes in support values were also estimated after pruning both individual terminals and the complete set (either clades or terminals).
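The ratio can be written as a short Python sketch. The function name is hypothetical; the worked value uses the sums of supports reported in the caption of Figure 5 for the dataset of Holroyd and Strait (2008).

```python
def support_change(pruned_supports, unpruned_supports):
    """Sum of support in the main subtree after pruning over the sum in the unpruned tree."""
    return sum(pruned_supports) / sum(unpruned_supports)

# Sums of supports from the caption of Figure 5: 1775 (pruned) and 1886 (unpruned).
ratio = support_change([1775], [1886])  # about 0.94, i.e., below unity
support_decreased = ratio < 1.0
```

A value below unity, as in this example, corresponds to a decrease in support in the main subtree after pruning.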
To evaluate how terminal instability influences support estimates, changes in the support were plotted against two variables: (1) the number of nodes collapsed by the wildcards in the consensus and (2) the number of nodes (distance) of wildcards to the root in the consensus. Given that both the instability and the distance to the root may differ within a set of unstable taxa, two different variables were plotted in assessing the impact of the complete set of wildcards (i.e., either clades or terminals). In this latter case, changes in support were compared with (3) the number N of resampled characters (sampling effort) and (4) the proportion of missing cells in N for each replicate.
Results
Terminal Instability and the Influence of Missing Data and Sampling Effort (1)
Across the weighting schemes, unstable terminals frequently collapsed less than 0.3 of the nodes of a fully resolved tree (Fig. 1A). Although the strongest concavity value slightly increased the number of collapsed nodes in the range 0.25–0.5 (k5; Fig. 1A), there was little difference among concavity values (Fig. 1). On average, the proportion of nodes collapsed is higher under equal weighting (0.26) than under implied weighting (0.19; Fig. 1A). Implied weighting maximized PC, leading to higher average values (0.87) than equal weighting (0.73; Fig. 1B).

Figure 1. Instability of terminals across the complete set of replicates, under both equal weighting (ew) and extended implied weighting (xpi5, xpi10, and xpi15). A, Proportion of nodes collapsed by unstable terminals. B, Positional congruence scores (PC) of unstable terminals.
Increasing the number of sampled characters diminished taxon instability, measured as either the proportion of collapsed nodes or PC (Fig. 2A,B). For most replicates, fewer nodes were collapsed under extended implied weighting, even at character:taxon ratios below 1.0, resulting in a lower trend line than under equal weighting (Fig. 2A). Regarding PC, low values (0.25) are seen under equal weighting, even at character:taxon ratios above 2.0 (Fig. 2B). Among weighting schemes, the strongest concavity resulted in the steepest slope: instability was maximized at low character:taxon ratios and minimized at higher ratios (Fig. 2A,B).

Figure 2. Instability of terminals across the complete set of replicates against increasing character:taxon ratios, under both equal weighting (ew) and extended implied weighting (xpi5, xpi10, and xpi15). Results obtained under extended implied weighting (triangles) are compared against those achieved under equal weighting (circles). Trend lines for implied weighting and equal weighting are shown as dashed and solid lines, respectively. A, Proportion of nodes collapsed by unstable terminals. B, Positional congruence (PC) scores of unstable terminals.
Taxon incompleteness showed a positive relationship with instability of terminals (Fig. 3A). Even with well-scored taxa (<0.25), collapsed nodes tended to be more than 0.5 of a fully resolved tree under equal weighting (Fig. 3A). Likewise, terminals were highly unstable (PC values < 0.25), even at low values of incompleteness under equal weighting (<0.25; Fig. 3B). MC values below 0.3 resulted in mid- to high levels of instability for both the proportion of collapsed nodes and PC scores (0.5) when taxon incompleteness was low (<0.25; Fig. 3C,D). For both taxon incompleteness and MC, implied weighting reduced the tendency of unstable terminals to collapse more than 0.5 of the nodes or to have PC scores below 0.25, and all the concavities responded similarly to MC (Fig. 3C,D).

Figure 3. Instability of taxa under extended implied weighting (xpi5, xpi10, xpi15) and equal weighting (ew). A,B, Instability of taxa relative to taxon incompleteness, under extended implied weighting (triangles and dashed trend line) and equal weighting (circles and solid trend line). A, Instability measured as the proportion of collapsed nodes by unstable terminals. B, Instability measured as the positional congruence (PC) of unstable terminals. C,D, Instability of taxa relative to the proportion of characters scored for closely related taxa (MC) and taxon incompleteness, under implied weighting (triangles) and equal weighting (circles). C, Instability highlighted as the proportion of collapsed nodes. D, Instability highlighted as the PC scores of unstable terminals.
Changes in Support after Pruning Wildcards (2)
After pruning of unstable terminals, the support for the groups in the main subtree tended to improve compared with the unpruned tree in most replicates, whereas their support decreased only in a minor proportion of replicates (Fig. 4A,D). In the main subtree, support improved more than 20%–30% relative to the unpruned trees when unstable terminals collapsed less than 0.3 nodes in a fully resolved tree (Fig. 4A). Under equal weighting, as more nodes were collapsed, support values in the main subtree showed minor changes relative to the unpruned trees (Fig. 4A). A similar trend was seen regarding the distance to the root: changes in support diminished as unstable terminals were located more distantly from the root node (Fig. 4B). Although all the explored concavities responded similarly to pruning, k10 maximized improvements in support in both cases (Fig. 4A,B).

Figure 4. Changes in branch support in the main subtree after pruning unstable taxa compared with the unpruned tree, under equal weighting (ew) and extended implied weighting (xpi5, xpi10, and xpi15). A,B, Density plot showing changes in support after pruning unstable terminals; the dashed line indicates no change in support after pruning. A, Changes relative to the proportion of nodes collapsed (relative to a fully resolved tree). B, Changes relative to the distance of unstable terminals to the root node (relative to the consensus resolution). C,D, Branch support in the unpruned trees (x-axis) plotted against the support in the main subtrees after pruning the “complete set of wildcards” (y-axis); replicates above or below the diagonal imply a better performance by the respective approach. C, The proportion of missing cells is highlighted in each replicate. D, The character:taxon ratio is highlighted in each replicate.
Upon pruning of the complete set of wildcards (either terminals or clades), the branch support values tended to improve in most replicates (i.e., replicates above the diagonal; Fig. 4C). In matrices with less than 30% of missing cells, the branch support was improved under both equal and extended implied weighting (replicates above 50 in the x-axis; Fig. 4C). Replicates with a higher proportion of missing data, despite enhancing support, showed little improvement in the main subtree after pruning (replicates below 50 in the x-axis; Fig. 4C). In replicates with midrange character:taxon ratios (2.25–3), the support increased after pruning wildcards relative to both smaller and larger replicates (Fig. 4D). Extended implied weighting maximized the improvement of support after pruning wildcards compared with equal weighting, although all concavities led to similar improvements (i.e., distance of replicates to the diagonal; Fig. 4C,D).
Discussion
In the present study, taxa that are unstable across a set of optimal trees were assessed for their impact on branch support under different weighting schemes. In the first set of analyses, the causes underlying terminal instability across optimal trees were revisited. The second set of analyses assessed the extent to which unstable terminals affect support by considering the terminals’ instability and their distance to the root. In addition, the second set of analyses also addressed whether complete sets of unstable taxa obscure branch support by taking into account character sampling and the amount of missing data.
Our results expand on previous studies (Cobbett et al. 2007; Gernandt et al. 2016) by indicating that branch support is most affected by the same taxa that decrease the resolution of the consensus of optimal trees. Similarly, our analyses agree with previous studies in that the number of characters and missing entries affect wildcard stability (Wiens 2003; Wiens and Morrill 2011; Figs. 2, 3). Wildcards that collapsed fewer than 30% of the nodes of a fully resolved tree or were deeply located in the consensus had a strong impact on support values (Fig. 4). Pruning the complete set of unstable taxa leads to improved support estimates mostly in well-scored datasets (≤30% of missing cells) and replicates with character:taxon ratios of 2.25–3 (Fig. 4C,D). More importantly, extended implied weighting was shown to reduce taxon instability and to increase support improvements after pruning.
Instability of Terminals (1)
In our replicates, increasing the character:taxon ratio reduced the proportion of nodes collapsed by unstable terminals (Fig. 2), which agrees with previous studies (Manos et al. 2007; Wiens and Morrill 2011). A novel result here, not found in previous analyses running standard implied weighting (Gernandt et al. 2016), is that the number of collapsed nodes is reduced under extended implied weighting: as the character:taxon ratio decreased, fewer replicates collapsed more than 25% of a fully resolved tree compared with equal weighting (Fig. 2A). Similarly, while PC values below 0.25 are seen at character:taxon ratios of 0.3–0.5 under equal weighting, lower character:taxon ratios are needed under implied weighting to achieve such PC values (Fig. 2B). Although k5 maximized stability at higher character:taxon ratios (Fig. 2A,B), all concavities responded similarly to character addition. Overall, down-weighting homoplastic characters increases stability, even at low character:taxon ratios, compared with equal weighting.
Likewise, as seen in earlier studies (e.g., Gauthier et al. 1988; Wiens 1998, 2005), the proportion of missing entries in terminals correlates with their instability (Fig. 3A,B). Wiens (2003) documented that 25% of missing data is enough for collapsing half of the nodes of a fully resolved tree. In our datasets, more than half of the nodes in a fully resolved tree are susceptible to being collapsed at lower levels of incompleteness (below 25%) by a single wildcard (Fig. 3A). PC scores also dropped below 0.25 at low levels of incompleteness (<25%; Fig. 3B). However, under extended implied weighting, more than half of the nodes are prone to be collapsed only when taxon incompleteness approaches 50% (Fig. 3A). Only one concavity revealed slightly more unstable results at low levels of taxon incompleteness (k10); yet, extended implied weighting reduced instability compared with equal weighting (Fig. 3B). Despite missing data having a more severe impact than previously suggested (Wiens 2003), our results imply that down-weighting characters attenuates the influence of taxon incompleteness.
Although high levels of taxon incompleteness maximized instability, low levels of MC were sufficient to collapse more than 0.5 of the nodes in a fully resolved tree (Fig. 3C). Likewise, PC scores below 0.5 are achieved at low proportions of MC (Fig. 3D). While MC was here defined as unscored data for a specific node, it has been treated as “character incompleteness” by other authors (i.e., unscored taxa for a given character; Wiens and Morrill 2011). It is clear from our analyses that “character incompleteness,” whether seen as such or as unscored data for a given clade, depends on “taxon incompleteness” (Fig. 3C,D), an issue rarely considered before (see summary in Wiens and Morrill 2011). Here, instability was more strongly affected by taxon incompleteness, although low MC values collapsed more than 0.5 of the nodes. As before, extended implied weighting reduced the number of replicates that collapsed such nodes or had PC scores below 0.25 (Fig. 3C,D). Altogether, this suggests that both scoring few characters for closely related terminals and not considering the homoplasy of missing entries are also detrimental for taxon stability.
Changes in Branch Support after Pruning Wildcards (2)
Our second set of analyses showed that pruning unstable terminals makes little difference in branch support (in the main subtree) when they collapse several nodes or are distant from the root node (Fig. 4A,B). Based on previous studies (Cobbett et al. 2007; Wiens and Morrill 2011), it is expected that the more nodes are collapsed by wildcards, the higher the branch support should be in the remaining groups upon pruning. However, the opposite trend was seen here: pruning unstable terminals that collapsed less than 30% of a fully resolved tree maximized support changes (Fig. 4A). Terminals that collapse numerous nodes are typical of datasets with large amounts of missing data (Fig. 3B); therefore, pruning these wildcards led to little improvement of support in the main subtree. Likewise, it was seen that the closer the unstable terminal is to the root, the greater the influence is on support values (Fig. 4B). Although it has been suggested that the placement of a wildcard in a consensus tree can determine the number of affected nodes (e.g., Cobbett et al. 2007), the distance of a wildcard to the root does not always correlate with the number of surrounding nodes, because this also depends on topology.
Our analyses indicate that pruning the complete set of wildcards, whether terminals or entire clades, maximizes support in the main subtree when matrices have less than 30% of missing data (distance to the diagonal; Fig. 4C). This suggests that, regardless of the number of characters, wildcards are more detrimental in well-scored datasets, with instability becoming less harmful as missing data increase. Resampling datasets under a “nozeroweight” approach (see Pei et al. 2020), which produces low support only in the presence of character conflict, improved support in highly incomplete replicates, although well-scored replicates still maximized support changes (Supplementary Fig. S1). Wiens and Morrill (2011) stated that characters with missing entries are rarely disadvantageous. Our results do not allow us to conclude whether partially scored characters are detrimental. Instead, it can be argued that wildcards often affect branch support detrimentally in well-scored datasets, wherein much of the taxon instability is driven by character conflict (Pol and Escapa 2009). In highly incomplete datasets, conversely, character conflict is low, given the overall poor information content (Fig. 4C). In these conditions, enhancements in support are rarely achieved when wildcards are pruned, and instability is likely caused by missing data.
The fact that, in most replicates, support is improved in the main subtree after pruning (Fig. 4C,D) shows that the negative impact of wildcards on support is more common than expected based on the amount of missing data (Cobbett et al. 2007; Wiens and Morrill 2011). In this regard, Wiens (2003) hypothesized that as more informative characters are sampled, more missing entries are needed for an incomplete taxon to be detrimental. As seen in Figure 4C,D, support was maximized in the main subtree when replicates had a low proportion of missing data (<30%) and two to three sampled characters per taxon (character:taxon ratio 2.25–3). This situation can be thought of as the counter case of Wiens's (2003) hypothesis: given a low number of characters (between two and three per taxon), few missing cells, and consequently few wildcards, can negatively influence support values, with support being overturned after pruning.
However, Wiens's (2003) hypothesis does not fully account for those (few) replicates for which support in the main subtree decreased after pruning. This highlights the fact that, as stated earlier, wildcards that reduce the consensus resolution may differ from those that reduce branch support (Pei et al. 2020; Pol and Goloboff 2020). Consider, for instance, the study by Holroyd and Strait (2008), in which a single unstable clade of seven taxa reduced the resolution of the strict consensus (Fig. 5). Pruning this clade improves the consensus resolution (from 0.86 to 0.93, relative to a fully resolved tree) but leads to lower branch support in the main subtree (from 35.6 to 33.5; Fig. 5). When wildcards that are unstable with minor differences in fit are considered instead (Pei et al. 2020), four of those seven taxa, along with two others, are found to be responsible for the low support. If only those four taxa are pruned, support improves from 35.6 to 36.8. Conversely, in the study by Herrera and Dávalos (2016), pruning only the three taxa that reduce resolution is enough to improve support (from 28.7 to 32.7). In contrast to other studies (e.g., Gernandt et al. 2016), this shows that taxa reducing consensus resolution might not explain low support values in some cases.
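The resolution values quoted above are expressed relative to a fully resolved tree. A minimal sketch of such a measure is given below; it assumes that resolution is the number of non-root clades in the consensus divided by the maximum possible for the same number of taxa (n - 2 in a rooted tree). The nested-tuple tree encoding and the function names are hypothetical and are not taken from the original analyses, which may normalize differently.

```python
# Illustrative sketch: consensus resolution relative to a fully resolved tree.
# Trees are nested tuples of taxon names (hypothetical encoding).

def count_leaves(node):
    """Number of terminal taxa below (or at) this node."""
    if isinstance(node, str):
        return 1
    return sum(count_leaves(child) for child in node)


def count_clades(node, is_root=True):
    """Number of internal nodes, excluding the root."""
    if isinstance(node, str):
        return 0
    own = 0 if is_root else 1
    return own + sum(count_clades(child, False) for child in node)


def consensus_resolution(tree):
    """Non-root clades divided by the maximum (n - 2) for a rooted tree.

    Assumption: this normalization is one common convention; it is not
    necessarily the one used in the study.
    """
    n_taxa = count_leaves(tree)
    if n_taxa < 3:
        raise ValueError("resolution is undefined for fewer than three taxa")
    return count_clades(tree) / (n_taxa - 2)


# A wildcard "W" collapsing one node of an otherwise resolved tree.
unpruned = (("A", "B", "W"), "C", "D")   # W floats, so the consensus collapses
pruned = ((("A", "B"), "C"), "D")        # consensus after removing W

print(consensus_resolution(unpruned))
print(consensus_resolution(pruned))
```

Under this convention, pruning can only be compared across trees if the denominator is recomputed for the reduced taxon set, as done here.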

Figure 5. Effect of pruning a clade that is unstable across optimal trees on the branch support, as estimated by symmetric resampling under equal weighting, for the groups found in the dataset by Holroyd and Strait (2008) with all characters included. Unpruned tree: average support: 35.6 (sum of supports: 1886). Main subtree after pruning: average support: 33.5 (sum of supports: 1775). Note that Trogolemur myoides Matthew 1909 is sister to the unstable clade in the strict consensus.
Effect of Implied Weighting on Wildcards
In the unpruned trees, extended implied weighting reduced the number of nodes collapsed by unstable terminals or increased the PC (Figs. 1–3). Implied weighting has been previously shown to improve the analyses of morphological data regarding topological stability (upon the addition of new data) and branch support (Goloboff et al. 2008a). Here, wildcard stability increased after down-weighting characters and assuming that missing entries conceal some homoplasy. Under such an approach, more missing data or fewer sampled characters were required to achieve the same instability values seen under equal weighting (Figs. 1–3). More importantly, under extended implied weighting, support improvement is maximized when taxa that are unstable across optimal trees are pruned.
Implied weighting makes the more reliable (less homoplastic) characters more influential. Hence, upon conflict, groups are mainly decided on the basis of the best characters at the expense of gaining more steps in the homoplastic characters (Goloboff 1993). A possible mechanism for the reduction in collapsed nodes in our analyses could be based on the relative weight given to the “best” characters. By having more weight, a set of characters with less homoplasy could favor one of the multiple positions allowed by more homoplastic data, resulting in fewer collapsed nodes in the consensus. It could be argued that fewer nodes are collapsed (or more nodes are resolved) because of the precision of the concave fit function (Goloboff et al. 2008a), thus biasing the present analyses toward extended implied weighting. Our protocol to resample characters was designed to ensure that replicates simultaneously produced multiple optimal trees under both equal weighting and extended implied weighting. Likewise, the most unstable wildcards were selected from polytomies in the consensus for both weighting schemes. Therefore, although the problem of higher precision under extended implied weighting is not completely avoided, such a bias is reduced by this analytical design.
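The mechanism proposed above can be illustrated with a small numerical sketch. It uses the standard concave fit function of implied weighting, f = k / (k + h), where h is the number of extra steps (homoplasy) of a character and k is the concavity constant (Goloboff 1993). The value k = 3 and the per-character homoplasy values are arbitrary and hypothetical, chosen only for illustration. The sketch covers standard implied weighting only; the extrapolation of homoplasy to missing entries used in extended implied weighting is not modeled.

```python
# Illustrative sketch: how a concave fit function can break ties between
# trees that are equally optimal under equal weighting.

def fit(extra_steps, k=3.0):
    """Implied-weighting fit of one character: k / (k + homoplasy)."""
    return k / (k + extra_steps)


def total_fit(homoplasy_per_character, k=3.0):
    """Total fit of a tree: the sum of the character fits (to be maximized)."""
    return sum(fit(h, k) for h in homoplasy_per_character)


# Hypothetical extra steps of three characters on two alternative positions
# of an unstable taxon. Both positions imply four extra steps in total, so
# equal weighting cannot choose between them and the node collapses.
position_a = [0, 0, 4]   # homoplasy concentrated in one poor character
position_b = [2, 2, 0]   # homoplasy spread over two otherwise good characters

print(sum(position_a), sum(position_b))            # equal-weights cost
print(total_fit(position_a), total_fit(position_b))  # implied-weights fit
```

In this example the first position has the higher total fit, because additional steps in an already homoplastic character cost less than the first steps in characters that are otherwise free of homoplasy. A tie under equal weighting is therefore resolved in favor of a single position, which is consistent with fewer collapsed nodes in the consensus.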
Although standard implied weighting has been employed to analyze paleontological data (e.g., Gernandt et al. 2016), down-weighting characters against their homoplasy while extrapolating such homoplasy to missing entries has less frequently been assessed regarding taxon instability (e.g., Flores et al. 2020). Overall, the evaluations carried out here suggest that analyzing the data under extended implied weighting, especially taking into account the number of missing entries, is a desirable strategy to improve the stability of taxa across optimal trees and to lessen their potential impact on branch support.
Final Remarks
In the current study, taxa that reduce the consensus resolution were assessed for their impact on branch support under different weighting approaches and analytical conditions. To the best of our knowledge, this is the first empirical assessment of the utility of extended implied weighting to deal with such wildcards. It was shown that wildcards collapsing less than one-third of the nodes in a fully resolved tree could severely affect branch support (Fig. 4A). This outcome is relevant, as it highlights the extreme sensitivity of morphological data to wildcards. Although taxon incompleteness has commonly been seen as the cause of taxon instability, no conclusive evidence supporting this notion has been found (Wiens 1998, 2003; Cobbett et al. 2007; Wiens and Morrill 2011). Our results indicate that scoring characters for closely related taxa is also important in resolving collapsed nodes (Fig. 3). Other factors, such as character conflict (Pol and Escapa 2009), may affect taxon stability as well.
Note that taxa that are unstable with minimum differences in fit are known to reduce branch support, and devising methods to efficiently deal with them has been challenging (Aberer and Stamatakis 2011; Aberer et al. 2013; Goloboff and Szumik 2015; Pei et al. 2020; Pol and Goloboff 2020). Taxa that are unstable across optimal trees entail a different problem (increasing the resolution of a consensus tree upon pruning), and their influence on support might not be obvious in some datasets. Previous studies have found that taxa that are unstable across a set of optimal trees, typically fossils, are likely to reduce branch support (e.g., Cobbett et al. 2007; Wiens and Morrill 2011). Here, pitfalls arising from these latter taxa were considerably mitigated by extended implied weighting. Because both incompleteness and character conflict can be common in paleontological datasets, down-weighting characters against homoplasy while extrapolating a proportion of such homoplasy to missing entries is especially useful in these cases.
Data Availability Statement
Data available from the Dryad Digital Repository: https://doi.org/10.5061/dryad.sf7m0cg4w.
Acknowledgments
This study was supported by FONCyT, grant PICT 0810, and PIUNT, grant G631. P. Goloboff (Unidad Ejecutora Lillo, CONICET-FML; Argentina) and an anonymous reviewer provided useful comments to improve the article. We especially thank the Willi Hennig Society for making TNT 1.5 freely available to users.