Introduction
Field trials conducted in multiple years across several locations play an essential role in plant breeding and variety testing. In plant breeding, the lines are evaluated and the most promising lines are selected for further breeding. In variety testing, based on the results from the analysis of a series of field trials, the best varieties are recommended for cultivation. Usually, the analysis of the series of trials is performed using a two-stage approach, where each combination of year and site is treated as environment. In literature, several stability measures have been described, e.g. Shukla's stability variance (Shukla, Reference Shukla1972), regression on the environmental mean approach to assessments of stability (Finlay and Wilkinson, Reference Finlay and Wilkinson1963; Eberhart and Russell, Reference Eberhart and Russell1966; Digby, Reference Digby1979), additive main effects and multiplicative interaction (AMMI) model (Gauch, Reference Gauch1988) or genotype main effects and genotype-by-environment interaction effects (GGE) model (Yan and Kang, Reference Yan and Kang2003). However, with the increase of computing power of computers and the development of statistical methods, the assessment of stability based on linear mixed models has become more popular (see e.g. Piepho, Reference Piepho1999; Piepho and van Eeuwijk, Reference Piepho, van Eeuwijk and Kang2002; Smith and Cullis, Reference Smith and Cullis2018; Studnicki et al., Reference Studnicki, Paderewski, Piepho and Wójcik-Gront2017, Reference Studnicki, Derejko, Wójcik-Gont and Kosma2019). In Piepho (Reference Piepho1999), the above mentioned models were defined in terms of variance components.
In Poland, new varieties of important species are assessed before the registration in value-for-cultivation-and use (VCU) trials, and further in post-registration trials. Based on the results of post-registration trials, a recommendation to the farmers is given. Only stable and high-yielding varieties are recommended for cultivation. Looking at the variety recommendation process from the decision-theory perspective, the problem of recommending varieties for cultivation can be treated as a decision-making problem in a situation of uncertainty. In literature, similar problems were solved by using Bayesian decision theory (see e.g. Theobald and Talbot, Reference Theobald and Talbot2002; Theobald et al., Reference Theobald, Roberts, Talbot and Spink2006). For this reason, in the present work, a Bayesian approach was used. As in Theobald et al. (Reference Theobald, Talbot and Nabugoomu2002, Reference Theobald, Roberts, Talbot and Spink2006), we first fitted a Bayesian linear mixed model, and next the varieties were compared in terms of the posterior expected utility.
The Bayesian analysis has some benefits over the frequently used models in stability analysis. First of all, most of the frequently used stability measures are likelihood-based approaches. In Searle et al. (Reference Searle, Casella and McCulloch2006) (see also Harville, Reference Harville1977), it was pointed out that the problem with likelihood-based estimation with random effects is that, except the simplest situations of balanced and orthogonal data, the sampling distributions of variances are unknown. In consequence, since the sampling distributions do not have analytical closed-form solutions, it is very difficult to properly account for estimation error associated with variances when they are used to compute weights for estimation of means. In contrast to likelihood-based estimation, Bayesian estimation produces posterior distributions that properly account for uncertainty in the estimation of all parameters. Moreover, in Theobald et al. (Reference Theobald, Talbot and Nabugoomu2002), it was pointed out that a Bayesian approach can alleviate problems associated with estimating complex models, such as zero estimates of variance components.
The use of the Bayesian approach in the context of agricultural field experiments is rare (Theobald and Talbot, Reference Theobald and Talbot2002; Theobald et al., Reference Theobald, Talbot and Nabugoomu2002; Edwards and Jannink, Reference Edwards and Jannink2006; Theobald et al., Reference Theobald, Roberts, Talbot and Spink2006; Crossa et al., Reference Crossa, Perez-Elizalde, Jarquin, Cotes, Viele, Liu and Cornelius2011; Josse et al., Reference Josse, van Eeuwijk, Piepho and Denis2014; Orellana et al., Reference Orellana, Edwards and Carriquiry2014; de Oliveira et al., Reference de Oliveira, Silva, Nuvunga, Silva and Balestre2016; Edwards and Orellana, Reference Edwards and Orellana2015; Bernardo et al., Reference Bernardo, Silva, de Oliveira, Nuvunga, Pires, Von Pinho and Balestre2018) and is mainly focused on AMMI models (Crossa et al., Reference Crossa, Perez-Elizalde, Jarquin, Cotes, Viele, Liu and Cornelius2011; Josse et al., Reference Josse, van Eeuwijk, Piepho and Denis2014, Bernardo et al., Reference Bernardo, Silva, de Oliveira, Nuvunga, Pires, Von Pinho and Balestre2018) and GGE biplots (de Oliveira et al., Reference de Oliveira, Silva, Nuvunga, Silva and Balestre2016). The Bayesian counterpart of the Finlay–Wilkinson model has been described in Lian and de los Campos (Reference Lian and de los Campos2016). In the present study, Bayesian counterparts of two stability measures described in Piepho (Reference Piepho1999) are given. In contrast to Edwards and Jannink (Reference Edwards and Jannink2006), the stability measures are defined in terms of Kronecker products. Moreover, by adopting the approaches described in Theobald et al. (Reference Theobald, Talbot and Nabugoomu2002, Reference Theobald, Roberts, Talbot and Spink2006) to our variety recommendation problem, we compare varieties in terms of the posterior expected utility. Using the described methodology, we identify the most stable and highest yielding potato varieties for a Polish series of field trials conducted in years 2016–2018. As in Edwards and Jannink (Reference Edwards and Jannink2006) and Orellana et al. (Reference Orellana, Edwards and Carriquiry2014), the stability analysis was performed under the heterogeneity of error variances assumption.
Materials and methods
Data
The data set consists of potato field trials performed in years 2016–2018. The trials used in the study were conducted in experimental stations (sites) belonging to the Research Centre for Cultivar Testing (COBORU) (Table 1).
Table 1. Sites used in the 3-year variety trials conducted from 2016 to 2018
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20210128033659232-0393:S0021859620000945:S0021859620000945_tab1.png?pub-status=live)
Most of the trials were organized in a randomized complete block design with three replicates. Only in 2018, at six sites, the trials were conducted according to a 1-resolvable design with three replicates and two blocks. On each plot, there were 60 plants planted. Plots that experienced unusual damage, such as boar damage, were excluded from further analyses. During the 3 years of study, in total, 22 varieties were tested. Since we were interested in assessing yielding stability, we choose to analyse varieties which were tested for 3 years. The list of varieties used in the present study with their country of origin is given in Table 2. During the 3-year period, most of the varieties were tested in all sites, only variety Everest was tested at five sites in years 2016 and 2017.
Table 2. Varieties used in the 3-year variety trials conducted from 2016 to 2018 and their country of origin
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20210128033659232-0393:S0021859620000945:S0021859620000945_tab2.png?pub-status=live)
In Polish potato VCU and post-registration trials, one of the analysed characteristics is tuber yield, which is observed on plots. The observed tuber yields are expressed in decitonnes per hectare (dt/ha).
Statistical analysis
The observations were modelled according to a linear mixed model. For the sake of clarity, throughout the paper by environment we will mean a combination of year and location.
In the analysed data set, 1-resolvable design was used only in six out of 67 environments. Speed et al. (Reference Speed, Williams and Patterson1985) have shown that dropping the block effect for this design provides still a valid analysis. For this reason, we omitted in our model the effect of blocks nested within replicates and environments.
Let y ikr be the potato tuber yield for the i-th variety (i = 1, …, I) at the k-th environment (k = 1, …, K) and in the r-th replicate (r = 1, …, R). Then the model can be written as:
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20210128033659232-0393:S0021859620000945:S0021859620000945_eqn1.png?pub-status=live)
where μ i denotes the mean of the i-th variety. By u k, v ik , w kr and e ikr we denote in model (1) the random effects of environments, variety × environment interaction, of replicates nested within environment and of errors, respectively. Using vector notation, model (1) can be written as:
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20210128033659232-0393:S0021859620000945:S0021859620000945_eqn2.png?pub-status=live)
where y is a vector of potato tuber yields and β is the vector of fixed variety means. By u, v, w, we denote in model (2) the random vectors of environment effects, of environment × variety interaction effects, and of replicates nested within environment effects, respectively. Further in model (2), by ${\bi e} = \lsqb {\bi {e}^{\prime}}_1\comma \;\ldots \comma \;\;{\bi {e}^{\prime}}_K\rsqb ^{\prime}$, where ek = [e 1,k,1, …, e I,k,R]′ (k = 1, …, K), we denote the random vector of errors. The matrices ${\bf X}$
, ${\bf Z}_1$
, ${\bf Z}_2$
and ${\bf Z}_3$
in model (2) are design matrices for β, u, v and w, respectively.
To fit the model, a hierarchical Bayesian linear mixed model was used. The model has three hierarchies or stages. In the first, it was assumed that observations are exchangeable samples from normal distribution:
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20210128033659232-0393:S0021859620000945:S0021859620000945_eqnU1.png?pub-status=live)
where ${\bf R}_0 = {\rm diag\lpar }\sigma _1^2 \comma \;\sigma _2^2 \comma \;\ldots \comma \;\sigma _K^2 \rpar$ is a matrix of residual variances, n k is the number of plots in the k-th environment, and ${\oplus}$
denotes the direct sum of matrices. In case when the number of plots is equal in all trials, the above covariance matrix can be written as ${\bf R}_0\otimes {\bf I}$
, where ⊗ denotes the Kronecker product.
In the second, the prior distributions are assigned to β, u, v and w; these correspond to the assumptions made on fixed and random effects in a Bayesian linear model. For the vector of variety means β a vague, large-variance Gaussian prior was assigned, i.e.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20210128033659232-0393:S0021859620000945:S0021859620000945_eqnU2.png?pub-status=live)
The vector of environment effects u was assigned a normal distribution with mean vector of zeros as:
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20210128033659232-0393:S0021859620000945:S0021859620000945_eqnU3.png?pub-status=live)
whereas the vector of replicates nested within environments w was assigned a normal distribution with mean vector of zeros as:
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20210128033659232-0393:S0021859620000945:S0021859620000945_eqnU4.png?pub-status=live)
For each trial (k = 1, …, K), we assumed that ${\bi e}_k\vert \sigma _k^2 \sim N\lpar 0\comma \;\sigma _k^2 {\bf I}_{n_k}\rpar$. This implies that conditionally on $\sigma _1^2 \comma \;\ldots \comma \;\sigma _K^2$
, the vector e follows
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20210128033659232-0393:S0021859620000945:S0021859620000945_eqnU5.png?pub-status=live)
The vector of environment × variety interaction v was assigned a normal distribution with mean vector of zeros and covariance matrix ${\bf G}_0\otimes {\bf I}$, i.e.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20210128033659232-0393:S0021859620000945:S0021859620000945_eqnU6.png?pub-status=live)
Since we are interested in identifying the most stable varieties, by assuming different covariance matrices and modifying the list of random vectors in (2), we obtained different models. Our base model is Shukla model (SH) for which ${\bf G}_0 = {\rm diag\lpar }\sigma _{11}^2 \comma \;\ldots \comma \;\sigma _{II}^2 \rpar .$ Next, by removing from the model (2) the random vector of environment effects and assuming that ${\bf G}_0$
is completely unrestricted covariance matrix (i.e. the elements in the symmetric variance-covariance matrix ${\bf G}_0$
may take any value, as long ${\bf G}_0$
), we obtained the environmental variance model (ENVIR; Piepho, Reference Piepho1999). Since in ENVIR model, the random vector of environment effects u is excluded, the random vector v is no longer vector of interaction effects rather a random vector of environment-variety effects. For both models, the diagonal elements of covariance matrix ${\bf G}_0$
are variances corresponding to the I varieties, which are interpreted as stability measures.
In the final or last stage, prior distributions were assigned to hyperparameters $\sigma _u^2$, $\;\sigma _w^2$
, ${\bf R}_0{\bi \;}$
and $\;{\bf G}_0$
. We assumed that both $\sigma _u^2$
and $\;\sigma _w^2$
follow an inverse-Wishart distribution with 2.002 degrees of freedom and one-dimensional scale matrix equal to $\lpar 0.002/2.002\rpar {\bf I}$
. For covariance matrix ${\bf G}_0$
, as in Gelman et al. (Reference Gelman, Carlin, Stern, Dunson, Vehtari and Rubin2014), we assigned inverse-Wishart distribution prior distribution with ${\rm dim}{\bf G}_0 + 1$
degrees of belief. In our stability analysis, for covariance matrix ${\bf G}_0$
as the scale matrix, we used diagonal matrix of order ${\rm dim}{\bf G}_0$
with elements equal to empirical variances of each variety divided by ${\rm dim}{\bf G}_0$
. Finally, for matrix ${\bf R}_0$
, we assigned inverse-Wishart prior distribution with 2.002 degrees of belief. For matrix ${\bf R}_0$
as the scale matrix, we used the diagonal matrix of order K, with elements equal to (0.002/2.002). In the supplement of Longdon et al. (Reference Longdon, Hadfield, Webster, Obbard and Jiggins2011), it was pointed out that inverse-Wishart prior distribution defined in this way induces a marginal prior distribution on the variances that is equivalent to an inverse-γ distribution with shape and scale equal to 0.001.
Both SH and ENVIR models were fitted using R-package MCMCglmm (Hadfield, Reference Hadfield2012). Each model was run five times with different starting values. The length of each chain was set to 2 000 000 iterations with a burn-in period of 1 000 000 iterations and a thinning interval equal to 50. The convergence of chains was examined by visual inspection of trace plots and using the Gelman and Rubin (Reference Gelman and Rubin1992) convergence diagnostic (see also Cowles and Carlin, Reference Cowles and Carlin1996; Gelman et al., Reference Gelman, Carlin, Stern, Dunson, Vehtari and Rubin2014) implemented in the coda R-package (Plummer et al., Reference Plummer, Best, Cowles and Vines2006). For each model and chain, the deviance information criterion (Spiegelhalter et al., Reference Spiegelhalter, Best, Carlin and van der Linde2002; DIC; in smaller-is-better form; see also Sorensen and Gianola, Reference Sorensen and Gianola2002) was calculated. For clear presentation, only the results for the best model will be reported. For the best model, the estimates from all five chains were summarized using the summary function from coda R-package (Plummer et al., Reference Plummer, Best, Cowles and Vines2006).
Further, based on the obtained posterior mean yields and stability measure associated with the best model, all varieties were ranked. To combine the rankings, we applied Kang's (Reference Kang1988) non-parametric rank-sum stability procedure, where both posterior mean yield and posterior stability measure associated with the best model were used as a selection criterion. As in Kang (Reference Kang1988), the varieties with the lowest rank sum are the most desirable.
Further, for the best model, based on the variance components and the posterior variety means, coefficients of variation (CV) were calculated using the formula:
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20210128033659232-0393:S0021859620000945:S0021859620000945_eqn3.png?pub-status=live)
where $\hat{\tau }_i$ is equal to $\sqrt {\hat{\sigma }_u^2 + \hat{\sigma }_{ii}^2 }$
for the SH model, $\sqrt {\hat{\sigma }_{ii}^2 }$
for the ENVIR model, and $\hat{\sigma }_{ii}^2$
is the estimate of the i-th diagonal element of ${\bf G}_0$
either in the SH model or the ENVIR model.
Since choosing which of the number of varieties for future sowing is a decision theoretic problem, optimum Bayesian choice of the best variety requires a utility function to be specified. In the present work, the critical-level approach to stability was used and for each variety, the utility function was defined in the following way: when the yield y is below the critical level γ, the utility is zero; above γ, the utility is equal to a positive constant c times y. Given the model parameters and assuming normality, the expected utility for variety i is equal to
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20210128033659232-0393:S0021859620000945:S0021859620000945_eqn4.png?pub-status=live)
where τ i is equal to $\sqrt {\sigma _u^2 + \sigma _{ii}^2 }$ for the SH model, and $\sqrt {\sigma _{ii}^2 }$
for the ENVIR model,$\;\sigma _{ii}^2$
is the i-th diagonal element of ${\bf G}_0$
either in the SH model or the ENVIR model, φ( ⋅ ) is the density function of the standard normal distribution, and D, depending on the model, is a set of parameters of either the SH model or the ENVIR model. Given γ, the best variety is the one maximizing the posterior expected value of this expression. Since the parameters in the preceding equation are usually unknown, one has to use their estimates instead. From our observations, it follows that varieties with mean tuber yields above 42 t/ha are being recommended. For this reason, in the present work, we set the critical level γ equal to 42 t/ha. Moreover, from the discussions with farmers, it follows that the farmers' tuber yield is equal to 80% of the tuber yield obtained in the trials. For this reason, we set in (4) c equal to 0.8.
Next, for a risk analysis, the probabilities that the yields of varieties fall below a certain critical level in the environments were calculated. Since the probability of falling below a certain level depends on the variety mean and variance, it can be treated as a different stability measure which combines mean and variance in a meaningful way (Eskrigde, Reference Eskrigde1990; Eskridge and Mumm, Reference Eskridge and Mumm1992). We assumed that the tuber yield follows a normal distribution. The probability that the tuber yield of the i-th variety falls below a certain critical level in a randomly chosen environment is equal to (Eskrigde, Reference Eskrigde1990)
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20210128033659232-0393:S0021859620000945:S0021859620000945_eqn5.png?pub-status=live)
where μ i is the mean for the i-th variety, $\sigma _{ii}^2$ is the variance of i-th variety across environments, Φ( ⋅ ) is the standard normal c.d.f.
Finally, by expressing p(i 2) as a function of p(i 1) (Piepho, Reference Piepho1996)
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20210128033659232-0393:S0021859620000945:S0021859620000945_eqn6.png?pub-status=live)
where Φ−1( ⋅ ) is the inverse of the standard normal c.d.f., the relative risks for the best varieties from each country can be plotted. Since the parameters $\mu _{i_1}$, $\;\mu _{i_2}$
, $\sigma _{i_1i_1}^2$
, $\sigma _{i_2i_2}^2$
in the preceding equations are usually unknown, one has to use their estimates instead.
Results
Firstly, to improve convergence and mixing of the chains, the observed tuber yields were divided by 10. Mean, standard deviation (s.d.), minimum (min) median (med) and maximum (max) of tuber yield expressed in tonnes per hectare (t/ha) for the three studied years are given in Table 3.
Table 3. Summary statistics for tuber yield: mean, standard deviation (s.d.), minimum (min), median (med), maximum (max) of observations for tuber yield
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20210128033659232-0393:S0021859620000945:S0021859620000945_tab3.png?pub-status=live)
From Table 3, it can be seen that tuber yield varied from 8.95 to 85.99 t/ha. Furthermore, the lowest mean tuber yield was observed in 2018, while the highest mean tuber yield was observed in 2016.
The data set was analysed five times using model (2) (Table 4) with different starting values (see Supplements S1 and S2). For each model and chain, the values of DIC were calculated (Table 4).
Table 4. List of models, covariance matrices for 3-year series, and DIC values for different models
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20210128033659232-0393:S0021859620000945:S0021859620000945_tab4.png?pub-status=live)
DIC, deviance information criterion.
a Env, environment; Var, variety; Rep, replicate; Env.Rep, effect of replicates nested within environments. Depending on the model Env.Var denotes either environment × variety interaction (SH) or environment-variety effect (ENVIR).
It can be seen that, for each chain, the smallest value of DIC was obtained for the ENVIR model. This means that the ENVIR model was the best among the two fitted models, and that this model should be preferred. For clarity, the results for the ENVIR model are presented. Before looking at the estimates from the ENVIR model, we first examined the convergence of the random and fixed effects by visually inspecting the trace plots (Supplements S3 and S4). From the trace plots, one can observe that all chains show good convergence. This was confirmed by the Gelman and Rubin tests. The values of point estimates of potential scale reduction factor (Point est.) for all fixed and random effects were equal to one (see Supplement S7). The evolution of the Gelman and Rubin's shrink factor for the parameters of interests is shown in Supplements S5 and S6.
For the series of field trials, the analysis of the ENVIR model provided several estimated parameters and statistics (Table 5). The variance component for replicates nested within environments was equal to 2.728 (Supplement S7). The mean error variance from the series of field trials was equal to 5.804. A detailed inspection of the residual posterior variances revealed that the biggest error variance occurred in environments 201610, 201611, 201623, 201629 and 201710, i.e. in 2016 and sites Rychliki, Krzyżewo, Świebodzin, Zybiszów, and in 2017 and in site Rychliki (see Supplement S5).
Table 5. Posterior variety means, environmental variances for tuber yield, coefficients of variation and posterior expected utilities
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20210128033659232-0393:S0021859620000945:S0021859620000945_tab5.png?pub-status=live)
The numbers in square brackets denote the ranking of variety. Most favourable scores are highlighted in bold while least favourable scores are shown in italics.
Table 5 reports the estimates of posterior variety means, environmental stability variances, Kang's rank sums, coefficients of variations and posterior expected utilities.
The estimated posterior variety means with 95% credible intervals (CI) are reported in columns two and three. The posterior distributions of variety means are shown in Supplement S4. Among the tested varieties, variety Arielle had the highest tuber yield. This variety was also the best among the Dutch varieties tested in the series of field trials. For the Polish varieties, variety Denar had the highest tuber yield. The German variety Viviana had one of the lowest tuber yields in the series of field trials.
In columns four and five of Table 5, the estimates of the posterior environmental variances with their 95% CI are reported. One can observe that variety Miłek was the most stable variety, i.e. this variety had the smallest value of the environmental variance. This variety was also the most stable among the Polish varieties. The second best variety in terms of environmental variance was variety Riviera, which was also the most stable variety among the Dutch varieties. The German variety Viviana was the fourth best variety. On the other hand, variety Lord was the most unstable variety in the series of field trials. The posterior distributions of environmental variances are shown in Supplement S3.
The Kang's rank sums are reported in column six of Table 5. It can be noticed that variety Arielle had the smallest rank sum, which was equal to 4 and means that this variety is the most desirable. A different pattern was observed for the Polish varieties where the smallest rank sum was obtained for varieties Denar and Miłek and was equal to 9. The rank sum for variety Viviana was equal to 10.
The estimated posterior coefficients of variation (CV) with their 95% CI are given in columns six and seven of Table 5. One can observe that the smallest value of CV was obtained for variety Arielle and was equal to 22.42%. The second smallest value of CV was obtained for variety Riviera. Among the Polish varieties, the smallest value of CV was obtained for variety Miłek. On the other hand, variety Lord was the worst in terms of CV. Figure 1 shows the posterior distributions of coefficients of variations, from which the convergence of the Gibbs sampler can be inferred.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20210128033659232-0393:S0021859620000945:S0021859620000945_fig1.png?pub-status=live)
Fig. 1. Posterior distributions of coefficients of variation estimated in the Bayesian environmental variance model for varieties: Arielle (a), Denar (b), Everest (c), Impala (d), Lord (e), Miłek (f), Riviera (g) and Viviana (h).
In the last column of Table 5, the expected posterior utilities are reported. The highest value of the expected utility was obtained for variety Everest and was equal to 45.19. The second best variety in terms of the posterior expected utility was variety Arielle (44.85). On the other hand, the smallest value of the posterior expected utility was obtained for variety Miłek.
Next, the estimates from Table 5 (posterior variety means and environmental variances) were plugged in to (5) and the risks of all varieties as well as for each group of varieties were plotted against the critical level in Figs 2(a)–(c), respectively.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20210128033659232-0393:S0021859620000945:S0021859620000945_fig2.png?pub-status=live)
Fig. 2. Colour online. Risk functions for varieties in the series of field trials (top panel) and relative risk functions for the best Dutch, Polish and German varieties (bottom panel).
From Fig. 2(a), one cannot characterize any variety as the most stable across all critical levels. It can be seen that variety Arielle up to a critical level around 60 t/ha had the lowest risk. Above this level, variety Everest displayed the lowest risk probability and variety Arielle was more risky (Fig. 2(b)). A different pattern can be observed for the Polish varieties (Fig. 2(c)). It can be noticed that up to a critical level around 70 t/ha, variety Denar had the lowest risk. Above this level, both varieties Denar and Lord had equal risk. Finally, using (6), in Figs 2(d)–(f), the relative risk functions for varieties Arielle, Denar and Viviana are plotted. Comparing varieties Arielle and Denar, it can be noticed that up to a risk about 0.8, variety Denar was slightly more risky than variety Arielle, whereas above this level, both varieties Arielle and Denar had equal risk (Fig. 2(d)). Now, comparing variety Viviana with varieties Arielle and Denar, one can observe that Viviana was always more risky than Arielle and Denar (Figs 2(e) and (f)).
Discussion
Usually, the stability of plant traits is assessed in multi-environment trials, which are analysed using either a two-stage or one-stage approach (see e.g. Piepho et al., Reference Piepho, Möhring, Schulz-Streeck and Ogutu2012; Caliński et al., Reference Caliński, Czajka, Kaczmarek, Krajewski, Pilarczyk, Siatkowski and Siatkowski2017; Damesa et al., Reference Damesa, Möhring, Worku and Piepho2017; Studnicki et al., Reference Studnicki, Derejko, Wójcik-Gont and Kosma2019). In the two-stage approach, one first analyses each trial separately and next, variety means are analysed using AMMI models or linear mixed models. In the one-stage approach, the analysis is performed on plot data. In the present work, we followed a one-stage approach. In a Bayesian framework, a two-stage approach is commonly used in a network meta-analysis.
In the present study, the stability analysis was performed on plot data using a Bayesian linear mixed model. This approach was preferred for several reasons. First of all, the Bayesian approach allows to incorporate the agronomist's knowledge about the likely values of average yields and variance components into the assessment of variety performance in a systematic way by using proper prior distribution. In the present work, as a prior knowledge about the tested varieties, the values of empirical variances of each variety were used. A similar approach was used in Mathew et al. (Reference Mathew, Holand, Koistinen, Léon and Sillanpää2016), where the authors applied the multi-trait animal model in the analysis of plant breeding trials. However, instead of using empirical variances, one can use the knowledge about varieties from the previous studies. Further, Bayesian analysis offers a possibility of calculating posterior distributions of new quantities, which are functions of model parameters. In the present study, the posterior distribution for coefficients of variation of each variety was calculated. In a similar manner, plant breeders can calculate the posterior distribution of heritability of a given trait. An exemplary code can be found in Villemereuil (Reference Villemereuil2012). Moreover, by considering a multi-trait model and by using the Bayesian approach, plant breeders will be able to calculate the posterior distribution of heritability of each trait as well as the posterior distribution of the genetic correlation between the traits. Finally, under the Bayesian approach, the variety recommendation process can be treated as a formal decision problem.
In the present work, by assuming different covariance matrices for the random vector of environment × variety interaction and modifying the list of random effects in our model, we obtained different models, which were compared in terms of DIC. A similar strategy has been described in Piepho (Reference Piepho1999). In that paper, the author also analysed two-way data and his goal was also to find the most stable variety. He assumed different covariance matrices for the environment × variety interaction and modified the list of random effects, this way he obtained several models and using different goodness of fit statistics chose the best one. In the same paper, he pointed out that within unifying mixed modelling framework, the problem of choosing an appropriate stability measure can be regarded as the problem of selecting the most appropriate variance–covariance structure and concluded that the choice of stability measure is data-dependent. In a different study, Studnicki et al. (Reference Studnicki, Paderewski, Piepho and Wójcik-Gront2017) used a similar approach to find a model with the highest predictability for means of cultivar × location and of cultivar × region.
In our Bayesian model, we assumed the heterogeneity of random errors. However, in the literature, many authors in the analyses of two-way or three-way data assume homogeneity of error variances (see e.g. Moore and Dixon, Reference Moore and Dixon2015; Caliński et al., Reference Caliński, Czajka, Kaczmarek, Krajewski, Pilarczyk, Siatkowski and Siatkowski2017; Studnicki et al., Reference Studnicki, Derejko, Wójcik-Gont and Kosma2019). It should be stressed that the model under the residual error variances homogeneity assumption can be justified from the randomization theory (see e.g. Caliński et al., Reference Caliński, Czajka, Kaczmarek, Krajewski, Pilarczyk, Siatkowski and Siatkowski2017 and the references therein). From the practical point of view, this assumption does not fully reflect the reality and may affect stability analysis. In Hu et al. (Reference Hu, Yan and Shen2013, Reference Hu, Yan and Li2014), it was shown that residual error variances homogeneity assumption may limit the accuracy of genotype evaluations or reliability of varietal recommendations. Moreover, Edwards and Jannink (Reference Edwards and Jannink2006) in their study have found that for all sources of variability in their model, the variance components in homogeneous variance model were numerically larger than the marginal variance estimates in the heterogeneous variance model. The heterogeneity of error variances was assumed by Edwards and Jannink (Reference Edwards and Jannink2006) in Bayesian modelling of multi-environment trials. A similar assumption was used by Smith and Cullis (Reference Smith and Cullis2018) in construction of their plant breeding selection tools. The same assumption was also used by Malosetti et al. (Reference Malosetti, Ribout, Vargas, Crossa and van Eeuwijk2008) in mixed-model analyses for multi-trait multi-environment data. From the computational point of view, the heterogeneity of error variances assumption can be easily implemented in statistical packages like Genstat, SAS, ASREML or some R-packages like MCMCglmm. An exemplary Genstat code can be found in Malosetti et al. (Reference Malosetti, Ribout, Vargas, Crossa and van Eeuwijk2008).
The two models described in the present work can be easily programmed in MCMCglmm R-package. In this package, the syntax used to specify the model is similar to that by an R interface to ASREML (Gilmour et al., Reference Gilmour, Gogel, Cullis, Welham and Thompson2002; Butler et al., Reference Butler, Cullis, Gilmour and Gogel2007), so scientist familiar with ASREML will not have problems with specifying their own model. Moreover, this package allows to define the covariance matrices for random effects in terms of Kronecker product as in Genstat, SAS or ASREML, and which is impossible in programs such as WinBUGS or JAGS. However, in the MCMCglmm package, by default, only a single chain is generated, while in WinBUGS or JAGS, one can use more than one chain to estimate the parameters. To fit the model with different starting values using MCMCglmm package, one has to run the model several times. This may be time consuming, especially for large data sets. To overcome this problem, one can use parallel computing and run the model with different starting values on separate cores.
In the MCMCglmm package, by default, an inverse-Wishart distribution is used as prior for variance components and covariance matrices. As in Gelman et al. (Reference Gelman, Carlin, Stern, Dunson, Vehtari and Rubin2014), in the present study to estimate the covariance matrix ${\bf G}_0$, an inverse-Wishart distribution with ${\rm dim}\;{\bf G}_0 + 1$
degrees of belief was used. On the other hand, in Lunn et al. (Reference Lunn, Jackson, Best, Thomas and Spiegelhalter2013), it was pointed out that the least informative proper inverse-Wishart is obtained by setting degrees of belief equal to ${\rm dim}\;{\bf G}_0$
. As in Lunn et al. (Reference Lunn, Jackson, Best, Thomas and Spiegelhalter2013), a similar assumption regarding the degrees of belief in the inverse-Wishart distribution was used in Mathew et al. (Reference Mathew, Holand, Koistinen, Léon and Sillanpää2016). However, in literature concerning Bayesian statistics, it was argued against using inverse-Wishart priors for covariance matrices because they impose a degree of informativity and the posterior inferences are sensitive to the choice of hyper-parameters. Instead, it was proposed to use half-Cauchy distribution or inverse-gamma distribution as a prior for variance components (Gelman, Reference Gelman2006) and scaled inverse Wishart distribution (Gelman et al., Reference Gelman, Carlin, Stern, Dunson, Vehtari and Rubin2014) as prior for covariance matrices (Gelman et al., Reference Gelman, Carlin, Stern, Dunson, Vehtari and Rubin2014). To overcome this problem, one can use the guidelines found in the supplement of Longdon et al. (Reference Longdon, Hadfield, Webster, Obbard and Jigins2012) or in Villemereuil (Reference Villemereuil2012). In both of these papers, the authors have shown how to parameterize the inverse-Wishart distribution so that it corresponds to several commonly used prior distributions. Finally, it should be stressed that, at least for now, the list of covariance matrices in the MCMCglmm package, which can be used in modelling the environment × variety, is short. For example, one cannot fit the mixed-model counterparts of Finlay–Wilkinson model and of AMMI model (see Piepho, Reference Piepho1999), since the factor-analytic covariance matrix has not been implemented. However, by combining the available list of covariance matrices with the grouping of random effects as described in Piepho and van Eeuwijk (Reference Piepho, van Eeuwijk and Kang2002), one could try to extend the methodology described in the present study to the analysis of three-way data. We plan to explore this problem in future work.
In the literature, several Bayesian counterparts of the classical stability measures and models have been described (Crossa et al., Reference Crossa, Perez-Elizalde, Jarquin, Cotes, Viele, Liu and Cornelius2011; Josse et al., Reference Josse, van Eeuwijk, Piepho and Denis2014; de Oliveira et al., Reference de Oliveira, Silva, Nuvunga, Silva and Balestre2016; Lian and de los Campos, Reference Lian and de los Campos2016; Bernardo et al., Reference Bernardo, Silva, de Oliveira, Nuvunga, Pires, Von Pinho and Balestre2018). It would be interesting to compare the results of stability analyses obtained by different Bayesian counterparts of the classical stability measures and how similar the rankings of high-yielding and stable varieties obtained by different methods would be.
In the present work, we have shown that recommendation of varieties for future sowing can be treated as a formal Bayesian decision theoretic problem. For this purpose, we first fitted a Bayesian model, and next using a very simple utility function, we calculated the posterior expected utility of each variety. A similar strategy was described by Theobald et al. (Reference Theobald, Talbot and Nabugoomu2002, Reference Theobald, Roberts, Talbot and Spink2006). In these papers, the authors used Bayesian decision theory with more complicated utility functions to choose a crop variety and fertilizer dose, and to estimate economically optimum seed rates for winter wheat. For example, using the approach described in Theobald et al. (Reference Theobald, Talbot and Nabugoomu2002, Reference Theobald, Roberts, Talbot and Spink2006) or described in the present study, agronomists can choose variety and planting date or find the optimal plant density and variety combination, e.g. in soybean or maize.
In the present work, for each variety, the probabilities of falling below a certain level were calculated. These probabilities can be treated as a different stability measure which combines mean and variance in a meaningful way (Eskrigde, Reference Eskrigde1990; Eskridge and Mumm, Reference Eskridge and Mumm1992). Eskrigde (Reference Eskrigde1990), and Eskridge and Mumm (Reference Eskridge and Mumm1992) used this concept to find high-yielding and stable varieties. Recently, Marcholdt et al. (Reference Macholdt, Piepho and Honermeier2019) used this concept to evaluate the yield performance and stability of wheat depending on the fertilization level and combination. For example, combining the approach presented in this paper with the methodology described in Piepho (Reference Piepho1996,Reference Piepho1998) could gain additional information for recommending varieties. In agronomy studies, by using the concept of falling below a certain threshold, agronomists can calculate the probability of a new planting data or plant density outperforming the standard one. Moreover, one can calculate the probability of a new agronomic system outperforming the old one. We plan to explore this problem in a future work.
Supplementary material
The supplementary material for this article can be found at https://doi.org/10.1017/S0021859620000945
Financial support
This research received no specific grant from any funding agency, commercial or not-for-profit sectors.
Conflict of interest
None.
Ethical standards
Not applicable.