Buffalo milk is mainly used to make cheeses, especially mozzarella. As with any commercially produced milk, the economic value depends not only on the quantity produced, but also on the composition, mainly the fat and protein contents (Seno et al. Reference Seno, Cardoso and Tonhati2007). In Brazil, as a consequence of improvement in management, infrastructure and feeding practices on buffalo farms, over the last two decades the buffalo milk production per lactation has increased considerably (Tonhati et al. Reference Tonhati, Baruselli, Oliveira, Vasconcellos and Toledo1996; Malhado et al. Reference Malhado, Ramos, Carneiro, Souza and Piccinin2007; Aspilcueta et al. Reference Aspilcueta, Araujo Neto, Baldi, Bignardi, Albuquerque and Tonhati2010a). To date, the genetic evaluation for milking buffaloes is carried out for total milk yield using a repeatability model. This model assumes that genetic and phenotypic variances are constant along the lactation and the lactation curve is equal or the same for all the animals. Moreover, short-length lactations are common in buffaloes, and thus, lactation records must be extended to be included in a genetic evaluation. Alternatively, the production of milk and its constituents by dairy buffaloes can be represented by points that are related to the lactation trajectory, enabling their evaluation by random regression models (RRMs).
RRMs allow the fitting of random lactation curves to each individual, expressed as deviations from a mean curve of the population or groups of individuals. In fitting a RRM, a structure of covariances among the observations is implicitly assumed, determined by the covariances of the regression coefficients, so that it can be defined as a covariance function. The implementation of RRMs to estimate genetic parameters for characteristics of the yield of milk and its constituents is widespread in the dairy industry (Schaeffer & Jamrozik, Reference Schaeffer and Jamrozik2008).
Estimates of genetic parameters obtained by using random regression models for dairy buffalo traits are scarce in the literature and those that do exist only consider milk output (Breda et al. Reference Breda, Albuquerque, Euclydes, Bignardi, Baldi, Torres, Barbosa and Tonhati2010; Sesana et al. Reference Sesana, Bignardi, Aspilcueta, El Faro, Baldi, Albuquerque and Tonhati2010). These authors report heritability estimates of milk yield ranging from 0·19 to 0·54 during lactation. However, there are no reports of genetic parameter estimates for the constituents of buffalo milk using RRMs. In contrast, in the case of dairy cattle, such estimates of the production of milk and its constituents have been published for a number of breeds (Liu et al. Reference Liu, Reinhardt and Reents2000; Jakobsen et al. Reference Jakobsen, Madsen, Jensen, Pedersen, Christensen and Sorensen2002a, Reference Jakobsen, Madsen and Pedersenb; De Roos et al. Reference De Roos, Harbers and Jong2004; Silvestre et al. Reference Silvestre, Petim-Batista and Colaço2005).
Therefore, the objective of this work was to estimate covariance functions for the additive genetic and permanent environmental effects, and subsequently the genetic parameters, for yield of milk, fat and proteins, through the use of random regression models with Legendre polynomials, considering different residual variance structures.
Material and Methods
We analysed test-day results of milk, fat and protein production from 1433 first lactations of Murrah buffaloes, with ages from 24 to 48 months and born between 1985 and 2007, daughters of 113 sires, from 12 herds located in the state of São Paulo, Brazil. The milk yields records were obtained starting on the fifth day after calving and were truncated at 305 days of lactation, since only 12% of females had a lactation length greater than this period. Only cows that had their first test-day record before 45 d after calving were considered in the analyses.
The test-day productions were considered in monthly lactation classes, varying from 1 to 10 classes, and included animals with at least four tests. The contemporary groups were defined as herd-year-month of milk test, with the restriction that each group had to contain at least four animals. After data consistency, the descriptive statistics for milk, fat and protein yield along the lactation (first lactation) are shown in Table 1. A pedigree file containing 10 088 animals was used in all the analyses.
† C: monthly test; N: number of observations; CV: coefficients of variation
The characteristics of milk, fat and protein production were analysed by means of single-trait random regression models. All the models included additive genetic, permanent environmental and residual effects of the animal. The fixed effects considered were the contemporary group, number of milkings (1 or 2 daily), the linear and quadratic effects of the covariable cow age at calving and the average lactation curve of the population, modelled by a third-order orthogonal polynomial. In matrix form, the model can be represented by:
where y=vector of observations; b=fixed-effects vector (contemporary group, number of milkings; covariable age of the cow at calving and mean population curve); a=vector of solutions for the random additive genetic regression coefficients; ap=vector of solutions for the random permanent environmental regression coefficients; e=vector of the different residuals; and X, Z, W=incidence matrices for the fixed effects and random additive genetic and permanent environmental effects, respectively. The dimension of the vector a is k a Na coefficients, where k a represents the order of the polynomial and N a the number of animals in the numerator relationship matrix. The vector ap has dimension of k ap ×Nd coefficients, where k ap represents the order of the polynomial and N d , the number of animals with phenotypic records.
In the analysis we assumed that the records are distributed with mean Xβ, and for the random additive genetic, permanent environmental and residual effects we considered:
where: K a and K ap are the matrices of covariances between the random additive genetic regression and permanent environmental coefficients, respectively; A is the numerator relationship matrix among the individuals; I Nd is the identity matrix with dimension N d ; ⨷ is the Kroneker product between the matrices; and R represents a diagonal block matrix containing the residual variances. We assumed independence of the residuals.
The random additive genetic and permanent environmental effects were modelled by third- to sixth-order Legendre polynomials. Residual variances were modelled using a step function with 1, 3 (1–3, 4–8, 9–10), 4 (1, 2–3, 4–8, 9–10) or 10 classes for milk production, and with 1, 4 (1, 2–6, 7–8, 9–10), 6 (1, 2, 3–4, 5–6, 7–8, 9–10) or 10 classes for fat production, and with 1, 3 (1, 2–6, 7–10) or 10 classes for protein production.
The citation of the RRMs follows the pattern: LEGk a .kap_r, referring to the order of the covariance function for the additive genetic effects (k a ), the permanent environmental effects (k ap ) and the residual variances structure (r). For example, the LEG3,4_3 model denotes an analysis fitting a third- and fourth-order Legendre polynomial for the additive genetic and the permanent environmental effects, respectively, and the residual variances modelled with a step function with 3 classes.
The covariances functions were estimated by the restricted maximum likelihood method, employing the WOMBAT statistical program (Meyer, Reference Meyer2006).
The different models were compared by the logarithm of the likelihood function (log L), the likelihood ratio test (LRT) at 1% probability, the restricted maximum likelihood forms of the Akaike information criterion (AIC) and Bayesian information criterion (BIC) of Schwarz and by examining the variances and correlations estimated for the traits. The information criteria can be represented as:
where p is the number of parameters estimated, N is the number of data, r(X) the rank of the coefficient matrix of fixed effect in the model of analysis, and log L is the restricted maximum log-likelihood function (Wolfinger, Reference Wolfinger1993).
Results and Discussion
The average test-day milk yields (Table 1) show a typical lactation curve for buffaloes, starting at 8·24 kg, with increased production until a peak on the second test day (8·87 kg) and subsequent decline until the end of lactation (4·9 kg). For the yields of fat and protein, the test-day averages were 0·40 and 0·24 kg, with sd of 0·14 and 0·08 kg and coefficients of variation of 35·40 and 35·32%, respectively. As can be observed in Table 1, there was an increase in production in the initial lactation phase (0·40 kg for fat and 0·23 kg for protein) until the third month (0·49 kg for fat and 0·27 kg for protein). These yields both declined as of the fourth month, with the increase in the number of lactation days until the end (0·30 kg for fat and 0·22 kg for protein). There was greater variation in the fat and protein production at the start and end of lactation.
The best fit for the residual variance (Table 2) shows increases in the log-likelihood, significant (P<0·01) according to the likelihood ratio test (LRT), with an increasing number of heterogeneous classes. According to the criteria used (AIC and BIC) to assess the goodness of fit, the model considering homogeneity of the residual variances was inadequate. This indicated that the residual variances behaved differently during the lactation period, making it necessary to consider a heterogeneous variance structure for the residuals. The heterogeneous residual variances can be attributed to factors such as the stage of pregnancy, body condition and duration of the lactation interval, among others, since these factors are not easily incorporated in analytic models owing to the lack of information about them (Rekaya et al. Reference Rekaya, Carabaño and Toro2000; El Faro & Albuquerque, Reference El Faro and Albuquerque2003).
† Models: LEGka.kpe_r or LEGka.kpe_1, corresponding the functions Legendre polynomials (LEG), corresponding to the order of the covariance function for additive genetic (ka) and permanent environmental (kpe) effects and to the residual variance structure of variances modeled by a step function (r) assuming n variance classes or variance homogeneity (1)
‡ Verisimilitude ratio test among the hierarchical models.
** P<0,01;
n.s not significant
For the traits under study, the models with 10 residual classes should be applied, but for production of milk and fat, based on the LEG3,3_4 model, and for production of protein based on the LEG3,3_3 model, the changes in the log L were small in magnitude and not significant by the LRT (P>0·01). This can indicate that for the milk and fat yield models, four residual classes would be sufficient and three classes would be enough for protein yield to obtain a good residual variance fit. This would avoid using over-parameterized models, which in general present parameter estimation problems (Bignardi et al. Reference Bignardi, El Faro, Albuquerque, Cardoso and Machado2009; Sesana et al. Reference Sesana, Bignardi, Aspilcueta, El Faro, Baldi, Albuquerque and Tonhati2010).
Random regression models by means of Legendre polynomial functions require definition of the most appropriate order for each random effect considered in the model. For the milk, fat and protein output (Table 3), in general there was an improvement in the Log L criterion (P>0·01, by the LRT) and AIC by increasing the order of adjustment from three to four for the additive genetic variance part associated with order six for the permanent environmental variance. The BIC, more rigorous because of the parameterization, indicated LEG3,4_4, LEG3,5_4 and LEG3,4_3 as the best models for milk, fat and protein yield, respectively. Therefore these are the most suitable models to describe the variation of milk, fat and protein production during lactation. Pool et al. (Reference Pool, Janss and Meuwissen2000) reported that the lactation curves can be modelled with enough precision using a third-order Legendre polynomial for the additive genetic component and a fourth-order one for the permanent environmental component. Lopez-Romero & Carabaño (Reference López-Romero and Carabaño2003) also reported that Legendre polynomials of low order for the additive and permanent environmental variances can be most adequate.
† Models: LEGka.kpe_r, corresponding the functions Legendre polynomials (LEG), corresponding to the order of the covariance function for additive genetic (ka) and permanent environmental (kpe) effects and to the residual variance structure of variances modeled by a step function (r) assuming 3 or 4 variance classes
‡ Verisimilitude ratio test between the hierarchical models.
** P<0·01;
n.s not significant
§ Indicates the best model based on the AIC and BIC
Percentage of variance explained by eigenvalues associated with the random regression coefficients matrix for the additive genetic and permanent environmental effects, for the two best models for each of the traits according to the BIC, are presented in Table 4. The models of lower order are sufficient to capture all the variation of the trait, with a fourth coefficient for the additive part being associated with a zero eigenvalue, unlike for the permanent environmental effect. The eigenvalues analysed show that the first was responsible for over 79% of the variance of the data in the models adjusted to the traits under study. The variability of the data for the additive genetic and permanent environmental effects was mainly explained by the first two eigenvalues (more than 90%). According to the results of the eigenvalues, the dimension of the two random effects could be reduced without loss of information, in disagreement with the log L and Akaike information criteria for the additive genetic and permanent environmental effects and in agreement with the Bayesian information criterion for the additive genetic effect. However, Legarra et al. (Reference Legarra, Misztal and Bertrand2004) indicated that one must consider the fact that reducing the dimensionality resulting from eliminating eigenvalues near zero is not advisable in all cases, since adopting this criterion can result in an over-simplistic or inadequate model.
The estimates of the genetic additive, permanent environmental, residual and phenotypic variances for the yields of milk (model LEG3,4_4), fat (model LEG3,5_4) and protein (model LEG3,4_3) on the test day are presented in Fig. 1. In the estimates of the additive variance for milk output, there was an increase from the start until the fourth month, after which the yield declined until the end of lactation. The estimate of the permanent environmental variance remained practically constant during the lactation period and was greater than the additive and residual variances. The residual variance estimates declined from the fourth to the eighth month of lactation.
For the fat and protein yields (Fig. 1), the additive genetic variances presented the same tendency as the phenotypic variance, with higher values at the end of lactation. These higher estimates of the constituents might have been due to the number of data readings obtained or to the higher levels of fat and protein found at the end of lactation. In general, the trends of the additive and permanent environmental variances throughout the lactation period obtained in this study (for the three traits) are comparable to those found by Gengler et al. (Reference Gengler, Tijini, Wiggans and Philpot1997), Silvestre et al. (Reference Silvestre, Petim-Batista and Colaço2005), Muir et al. (Reference Muir, Kistemaker, Jamrozik and Canavesci2007) and De Groot et al. (Reference De Groot, Keown, Van Vleck and Kachman2007) all of whom reported higher additive and permanent environmental variances at the start and end of lactation for milk as well as fat and protein yield.
Figure 2 shows the heritability (h 2) and proportion of the phenotypic variance corresponding to the permanent environmental variance. The estimates of heritability for milk yield fluctuated between 0·16±0·05 and 0·29±0·05 and were greatest in the fourth and sixth months of measurement.
This result differs from those found by Sesana et al. (Reference Sesana, Bignardi, Aspilcueta, El Faro, Baldi, Albuquerque and Tonhati2010) and Breda et al. (Reference Breda, Albuquerque, Euclydes, Bignardi, Baldi, Torres, Barbosa and Tonhati2010) in which the estimates were higher at the extremes of the lactation curve. This can be attributed to the fact that these authors utilized weekly tests. However, the heritability estimates were near those found by Aspilcueta et al. (2010b, c) who used finite dimensional models. The heritability estimates for production of fat and protein (Fig. 2) showed a similar trend during the lactation, with a sharp increase at the end of this phase. These results follow the same trend observed in the literature by the majority of researchers in dairy cattle (Gengler et al. Reference Gengler, Tijini, Wiggans and Philpot1997; Silvestre et al. Reference Silvestre, Petim-Batista and Colaço2005, De Groot et al. Reference De Groot, Keown, Van Vleck and Kachman2007; Muir et al. Reference Muir, Kistemaker, Jamrozik and Canavesci2007). The heritability estimates for fat production varied from 0·20±0·05 (ninth month) to 0·30±0·08 (tenth month). However, the heritability estimates for protein varied from 0·18±0·06 (first month) to 0·27±0·08 (tenth month). These results are near the estimates obtained by the majority of researchers in dairy cattle using random regression models (Gengler et al. Reference Gengler, Tijini, Wiggans and Philpot1997; Liu et al. Reference Liu, Reinhardt and Reents2000; De Roos et al. Reference De Roos, Harbers and Jong2004; Muir et al. Reference Muir, Kistemaker, Jamrozik and Canavesci2007). Aspilcueta et al. (Reference Aspilcueta, Sesana, Muñoz-Berrocal, Seno, Bignardi, El Faro, Albuquerque, Camargo and Tonhati2010c) used finite dimensional models with monthly data on fat and protein yield of buffaloes and found heritability estimates of lower magnitude than those found in the present study.
The fractions of phenotypic variances referring to the permanent environmental variance of the traits in the present study were greater than the heritability estimates throughout the lactation period, a similar finding to that of El Faro & Albuquerque (Reference El Faro and Albuquerque2003) in dairy cattle.
The estimates of the genetic and phenotypic correlations for production of milk, fat and protein are presented in Table 5. The genetic correlations for milk production on the test day ranged from 0·18±0·130 to 0·99±0·002, and were higher the nearer the test days were to each other, declining as the interval between them increased. Among the test days in the middle of the lactation period, these genetic correlations were the highest, near one. This pattern is close to those reported by Sesana et al. (Reference Sesana, Bignardi, Aspilcueta, El Faro, Baldi, Albuquerque and Tonhati2010) and Breda et al. (Reference Breda, Albuquerque, Euclydes, Bignardi, Baldi, Torres, Barbosa and Tonhati2010). Nevertheless, those authors found negative genetic correlation estimates between the first and last test days. Aspilcueta et al. (2010b, c) reported higher estimates, probably owing to the finite dimensional model methodology.
† Monthly test day
The estimates of the genetic correlations (Table 5) for the fat and protein yields during the lactation period ranged from 0·44±0·080 to 0·99±0·004 and 0·41±0·080 to 0·99±0·004, respectively. Most of the estimated correlations were high, approaching one. It can also be seen that the estimates were higher when the test-day fat and protein yields were nearer to each other. Aspilcueta et al. (Reference Aspilcueta, Sesana, Muñoz-Berrocal, Seno, Bignardi, El Faro, Albuquerque, Camargo and Tonhati2010c) reported similar estimates using a finite dimensional model for fat and protein.
For the traits under investigation, the lowest genetic correlation estimates occurred between the yield in the first month and the other test-day months, probably owing to the difficulty of modelling the initial lactation tests, because during this phase the animals suffer post-partum stress and also have negative energy balance.
The phenotypic correlation estimates (Table 5) between the yields for the traits under study were lower as the interval between the test days increased. In general the phenotypic correlation estimates were lower than the genetic ones. This same pattern was reported by Aspilcueta et al. (Reference Aspilcueta, Sesana, Muñoz-Berrocal, Seno, Bignardi, El Faro, Albuquerque, Camargo and Tonhati2010c).
Conclusions
Random regression models employing Legendre polynomials were efficient to describe the genetic variation for test-day yield of milk, fat and protein among buffaloes.
The heritability estimates of the traits were moderate, which can help in the process of selecting animals to obtain genetic gains. The estimates of the genetic correlations were high among the test days, indicating that by either selection criterion, indirect genetic gains can be expected throughout the lactation curve.
This study was supported by the State of São Paulo Research Foundation (Fapesp) and the National Council of Technological and Scientific Development (CNPq). The authors thank the Brazilian Association of Breeders of Buffaloes (ABCB) for providing the pedigree and phenotypic data.