Egyptian buffalo have played an important role over centuries in securing food supply (milk and meat) and providing manure for agricultural operations in rural regions (SADS, 2009). In Egypt, there are around 3.4 million buffalo producing around 45 and 37% of total milk and red meat production, respectively (FAO, 2019). Buffalo milk has around 22% more protein, 47% more fat, 50% more iron and 50% less lactose as well as lower cholesterol content as compared with typical bovine milk (Barłowska et al., Reference Barłowska, Szwajkowska, Litwińczuk and Król2011). In addition, buffalo are more adapted to harsh environmental conditions than other dairy species. Likewise, buffalo are better able to utilize poor quality roughages, are more resistant to several infectious diseases and have longer productive life than cows (Wanapat et al., Reference Wanapat, Ngarmsang, Korkhuntot, Nontaso, Wachirapakorn, Beakes and Rowlinson2000; Deb et al., Reference Deb, Nahar, Duran and Presicce2016). Although considerable attention has been paid in the last decades for buffalo to achieve their production potential, genetic improvement for buffalo performance is still not satisfactory. The main reason for this delay is the lack of national recording systems and pedigree information. To partially overcome this problem, an intensive crossing with Italian buffalo was applied by the traditional breeding industry. Although significant progress has been achieved, milk yield is still considerably lower than potentially expected. An alternative and potentially more rapid approach is to use genomic information in breeding programs. Genomic data can identify essential molecular markers and biological pathways to better understand the genetic mechanisms underlying milk properties so as to accelerate breeding progress (Hayes and Daetwyler, Reference Hayes and Daetwyler2019).
Advances in genomic technologies have led to the identification of many thousands of genomic markers covering the whole genome, mainly on a single base (single nucleotide polymorphisms: SNPs). Consequently, genome-wide association studies (GWAS) have been widely accepted as a primary approach for localizing quantitative trait loci (QTL) and have achieved considerable success with detecting candidate genes associated with economically complex traits in livestock species and disease risks in human (Goddard and Hayes, Reference Goddard and Hayes2009; Visscher et al., Reference Visscher, Wray, Zhang, Sklar, McCarthy, Brown and Yang2017). For buffalo, development of Axiom Buffalo Genotyping Array (90k SNPs) opens new opportunities to identify candidate genes associated with buffalo milk production traits and provides the possibility of predicting genomic breeding values for buffalo breeds in different countries. So far, 87 significant and 354 suggestive genomic regions were reported for milk production traits in Brazilian, Chinese, Egyptian, Iranian, Italian and Philippino buffalo (de Camargo et al., Reference de Camargo, Aspilcueta-Borquis, Fortes, Porto-Neto, Cardoso, Santos, Lehnert, Reverter, Moore and Tonhati2015; El-Halawany et al., Reference El-Halawany, Abdel-Shafy, Shawky, Abdel-Latif, Al-Tohamy and Abd El-Moneim2017; Iamartino et al., Reference Iamartino, Nicolazzi, Van Tassell, Reecy, Fritz-Waters, Koltes, Biffani, Sonstegard, Schroeder, Ajmone-Marsan, Negrini, Pasquariello, Ramelli, Coletta, Garcia, Ali, Ramunno, Cosenza, de Oliveira, Drummond, Bastianetto, Davassi, Pirani, Brew and Williams2017; Mokhber, Reference Mokhber2017; da Costa Barros et al., Reference da Costa Barros, de Abreu Santos, Aspilcueta-Borquis, de Camargo, de Araujo Neto and Tonhati2018; Herrera et al., Reference Herrera, Flores, Duijvesteijn, Gondro and van der Werf2018; Liu et al., Reference Liu, Liang, Campanile, Plastow, Zhang, Wang, Salzano, Gasparrini, Cassandro and Yang2018; Abdel-Shafy et al., Reference Abdel-Shafy, Awad, El-Regalaty, Ismael, El-Assal and Abou-Bakr2020; Awad et al., Reference Awad, Abou-Bakr, El-Regalaty, El-Assal and Abdel-Shafy2020; Lu et al., Reference Lu, Duan, Li, Abdel-Shafy, Rushdi, Liang, Ma, Liang and Deng2020). Despite the fact that these loci were distributed on almost all chromosomes, none of these SNPs were overlapped between studies indicating the complexity of milk production traits and the modest effect of such SNPs (Abdel-Shafy et al., Reference Abdel-Shafy, Awad, El-Regalaty, Ismael, El-Assal and Abou-Bakr2020). Therefore, a number of QTL and novel genes underlying buffalo milk properties may still need to be identified and/or validated. Motivated by these reasons, the objectives of the current study were to identify genomic regions and detect potential candidate genes associated with milk production traits in Egyptian buffalo.
Materials and methods
The study was approved by the Institutional Animal Care and Use Committee (IACUC) of Cairo University, Egypt (approval number: CU-II-F-40-17).
Animals and phenotypes
Data consisted of 177 995 daily milk records from 1992 multiparous Egyptian buffalo cows belonging to eleven herds. To ensure homogenous dataset, a number of quality criteria were applied to animals and phenotypes (online Supplementary Table S1). Animals with unknown calving dates, records of lactation number above thirteen, and data below five and above 290 days in milk (DIM) were excluded. Furthermore, animal must have completed at least three months of lactation to be used in the analyses. After quality check, 161 479 daily milk records from 1670 animals remained. The average lactation length during the first thirteen lactations was 199 ± 54 d (mean ± sd). The pedigree file included 2121 animals. The average daily milk yield (DMY) in the filtered data was 7.8 ± 3.0 kg. All information was obtained from Cattle Information System/Egypt (CISE) and Animal Production Research Institute (APRI) of Agricultural Research Center (ARC), Egypt.
Milk samples from the above filtered animals were monthly collected and stored at −20°C until analysis. Time between subsequent milk samples for the same animal was 31 ± 9 d. The total number of analyzed samples was 60 318 from 1481 animals. Fat (FP), and protein (PP) percentages were determined by Infrared Milk Analyzer (Bentley). The average of FP and PP were 7 ± 2.0 and 3.7 ± 0.6, respectively. Fat (FY), and protein (PY) yields were calculated by multiplying percentages by milk yield at same test day. The averages of FY and PY were 0.62 ± 0.30 and 0.33 ± 0.13 kg/day, respectively.
Genotypic data
A total number of 114 animals were selected for genotyping according to their average daily milk, where animals with highest and lowest deviation from the population mean were used. Blood samples were collected from the jugular vein and kept in a 15 ml Falconer tubes containing 1 ml 0.5 M EDTA and chilled immediately on ice. Genomic DNA was extracted from whole blood samples using QIAamp® DNA Blood Mini Kit (QIAGEN, Hilden, Germany). Genotyping was performed with Axiom® Buffalo Genotyping panel 90 K according to the standard protocol of Thermo Fisher Scientific (https://www.thermofisher.com). The raw signal intensities files (CEL format) were converted into genotype calls and annotated to the latest reference assembly of buffalo genome (UOA_WB_1: GCA_003121395.1) using Genotyping Console™ 4.2.
Statistical analysis
A yield deviation (adjusted phenotype) is a weighted average of the animal's phenotype adjusted for non-genetic factors (VanRaden and Wiggans, Reference VanRaden and Wiggans1991). In this regard, yield deviation for each trait was computed with univariate animal model using BLUPF90 family (Misztal et al., Reference Misztal, Tsuruta, Strabel, Auvray, Druet and Lee2002). Given the vector of y representing the phenotypic observations on the tested trait (DMY, FP, PP, FY, and PY), the following univariate animal model was used: y = Xb + Zα + Wp + ɛ, where b is the vector of all fixed effects including milking frequency per day, lactation number, herd, and year-season of calving. In addition, linear regressions of age at calving, and fourth order Legendre polynomials of DIM were used. While α and p are the vectors of random additive genetic and permanent environmental effects, respectively, and ɛ is the vector of random residual. X, Z and W are incidence matrices relating observations of y to fixed, random animal, and random permanent environmental effects, respectively. The analysis was performed under the following assumptions: ${\propto} \;\sim \;N\;\lpar {0\comma \;\;A\sigma_a^{\,\,2} } \rpar $, $p\;\sim \;N\;\lpar {0\comma \;\;I\sigma_p^{\,\,2} } \rpar $, and $\varepsilon \;\sim \;N\;\lpar {0\comma \;\;I\sigma_\varepsilon^{\,\,2} } \rpar $, where A is a matrix of additive genetic relationship among animals based on available pedigree; I is an identity matrix; and σ2α, σ2p and σ2ɛ are additive, permanent environmental and residual variances, respectively.
The GWAS was performed using the linear regression model implemented in PLINK 1.9 (Chang et al., Reference Chang, Chow, Tellier, Vattikuti, Purcell and Lee2015), where the adjusted phenotype (DMY, FP, PP, FY, and PY) were regressed on the number of copies of the alleles using PLINK–linear option with population stratification as covariates. In this issue, a multidimensional scaling approach was used to adjust for a potential population structure. The scaling process identified eight significant clusters indicating the axes of ancestry at P < 0.0001. These clusters were included as covariates into the model when performing GWAS. The Manhattan and Q-Q plots were generated using qqman package in R (Turner, Reference Turner2014). For statistical inference, the Bonferroni method was applied to correct for multiple testing.
QTL and candidate genes
Beside previously reported loci in buffalo studies, reported QTL for milk production traits in cattle were retrieved from animal QTLdb (http://www.animalgenome.org/QTLdb), release 39 (Hu et al., Reference Hu, Park and Reecy2019), since buffalo and cattle are closely related. Candidate genes in each genomic region were extracted from the latest annotated file (na35.r2.a2) for Axiom® Buffalo Genotyping array provided by Thermo Fisher Scientific. Gene functions are extracted from UniProtKB (https://www.uniprot.org/) and GeneCards (https://www.genecards.org/) databases.
Results and discussion
Animals and phenotypes
To accurately detect genomic loci associated with complex traits, it is crucial to use a suitable number of individuals with accurate phenotypes (Goddard and Hayes, Reference Goddard and Hayes2009). The number of animals used in current GWAS is relatively small compared to other investigations performed in dairy cattle populations. The main limiting reason for increasing the sample size is the availability of accurate daily milk records, since it is not routinely recorded in buffalo farms. In addition, there is currently an intensive crossing with Italian buffalo in most of the commercial herds, which makes it difficult to find large scale data from pure Egyptian buffalo. To circumvent such a situation, we used data from institutional herds, where pure Egyptian buffalo are kept. In addition, we firstly corrected phenotypes for non-genetic factors using 161 479 daily milk records and 60 318 milk composition records from 1670 animals. We reduced the heterogeneity among animals and lactations as much as possible. In this regard, the coefficients of variation among lactations were 0.41, 0.27, 0.16, 0.43, and 0.36 for DMY, FP, PP, FY, and PY, respectively. Adjusted phenotypes (yield deviations) are commonly used when assessing the animal's own performance to reduce variability attributed to non-genetic factors, increase the accuracy of estimating the genetic effect and to reduce the error rate (VanRaden and Wiggans, Reference VanRaden and Wiggans1991). Later, we used adjusted phenotypes for GWAS analyses.
Genotype characteristics
In the most recent version of annotation file (na35.r2.a2) provided by Thermo Fisher Scientific, the number of chromosomes is 25 including the sex chromosomes. The previous version of assembly (r1) used in the previous investigations was annotated to bovine genome (UMD31 assembly: see de Camargo et al. (Reference de Camargo, Aspilcueta-Borquis, Fortes, Porto-Neto, Cardoso, Santos, Lehnert, Reverter, Moore and Tonhati2015); Iamartino et al. (Reference Iamartino, Nicolazzi, Van Tassell, Reecy, Fritz-Waters, Koltes, Biffani, Sonstegard, Schroeder, Ajmone-Marsan, Negrini, Pasquariello, Ramelli, Coletta, Garcia, Ali, Ramunno, Cosenza, de Oliveira, Drummond, Bastianetto, Davassi, Pirani, Brew and Williams2017); da Costa Barros et al. (Reference da Costa Barros, de Abreu Santos, Aspilcueta-Borquis, de Camargo, de Araujo Neto and Tonhati2018); Liu et al. (Reference Liu, Liang, Campanile, Plastow, Zhang, Wang, Salzano, Gasparrini, Cassandro and Yang2018); Herrera et al. (Reference Herrera, Flores, Duijvesteijn, Gondro and van der Werf2018); El-Halawany et al. (Reference El-Halawany, Abdel-Shafy, Shawky, Abdel-Latif, Al-Tohamy and Abd El-Moneim2017); Mokhber (Reference Mokhber2017)). Therefore, the number of chromosomes in the previous versions of this array was 30. The current version of the Axiom Buffalo Genotyping array comprises 123 040 SNPs. The average probe space between these markers in relation to the buffalo UOA_WB_1 assembly was one SNP every 34.46 kb across all loci.
In the current study, the raw genotypic data were subjected to quality control procedures performed with PLINK 1.9 (Chang et al., Reference Chang, Chow, Tellier, Vattikuti, Purcell and Lee2015). In this respect; almost 48% from the genotyped SNPs was omitted due to unknown or duplicated positions, missing genotype per a SNP >0.15, low minor frequency (MAF) <0.01, and/or significant (P < 0.0001) deviation from Hardy-Weinberg proportion (online Supplementary Table S2). Individuals with low call rate (>0.15) were also discarded. After adopting the filtering options, the number of genotyped animals and SNPs were 113 and 64 169, respectively. The genotyping rate for the remaining animals was 98.4%. To reduce the redundant information and to prevent bias that could skew the association tests for GWAS (Anderson et al., Reference Anderson, Pettersson, Clarke, Cardon, Morris and Zondervan2010; Laurie et al., Reference Laurie, Doheny, Mirel, Pugh, Bierut, Bhangale, Boehm, Caporaso, Cornelis, Edenberg, Gabriel, Harris, Hu, Jacobs, Kraft, Landi, Lumley, Manolio, McHugh, Painter, Paschall, Rice, Rice, Zheng and Weir2010), we excluded one of a pair of SNPs if the LD of r 2 > 0.5 within a sliding window of 50 SNPs and moving 5 SNPs per set using PLINK (Chang et al., Reference Chang, Chow, Tellier, Vattikuti, Purcell and Lee2015).This LD pruning option step led to reduce the SNP number to 44 985 markers.
The filtered SNPs covered ~2.6 Gb of the buffalo genome with average distance of 40.8 ± 32.0 kb between SNPs (Table 1) and average MAF of 0.29 ± 0.13 across chromosomes. While in the previous investigation in Egyptian buffalo, the average density and MAF were 62.66 ± 67.10 and 0.30 ± 0.13, respectively (El-Halawany et al., Reference El-Halawany, Abdel-Shafy, Shawky, Abdel-Latif, Al-Tohamy and Abd El-Moneim2017). Because most of SNP markers are outside and/or away from the candidate genes, the lower distance would lead us to detect genomic loci more accurately and facilitate detection of candidate genes (Brodie et al., Reference Brodie, Azaria and Ofran2016). The square of correlation coefficient (r 2) is commonly used to determine the level of LD between markers. It has been suggested that GWAS and QTL mapping are better performed when r 2 ≥0.3 (Ardlie et al., Reference Ardlie, Kruglyak and Seielstad2002). In the current study, the average r 2 between adjacent markers is equal to 0.43 ± 0.21 providing powerful information for performing GWAS.
Chr, chromosome; Mb, mega base; kb, kilo base; LD, linkage disequilibrium; SD, standard deviation; Max, maximum; Min, minimum; r2, square of the correlation coefficient between pairs of SNPs; dist, distance between the two SNPs that are in LD, UnPos, un-positioned SNPs.
a Distance between adjacent SNPs.
b Number of SNPs after applying filtering options (see materials and methods for further details) and duplicate SNP position.
Genome-wide association analyses
An important issue in GWAS is to ensure that associations between markers and tested traits are not spurious. The most important source of false associations is the variance distortion attributed to unrecognized population stratification and cryptic relatedness (Cardon and Palmer, Reference Cardon and Palmer2003). The genomic inflation factor (λ) is the most known measure to determine the success of accounting for population structure. The value of one would reflect the assumption that only a small proportion of loci show a true association (Devlin and Roeder, Reference Devlin and Roeder1999). In the current study, even after correction for population stratification, λ was still greater than one (e.g. 1.06 for DMY) and the Q-Q plots showed deviations from expected levels (online Supplementary Figure S1). The slight inflation would probably attribute to the polygenic nature of complex traits, in which a large number of genetic variants with small effect affecting the trait variation and/or the trait locus is in LD with multiple genomic regions (Abdel-Shafy et al., Reference Abdel-Shafy, Bortfeldt, Tetens and Brockmann2014).
With 5% Bonferroni genome-wide threshold (P value < 1.1 × 10−6), we identified 23 significant and 24 suggestive SNPs associated with at least one of the five tested traits (Table 2, and see online Supplementary Fig. S1 and Table S3). These SNPs are located within 36 QTL and distributed over 20 buffalo chromosomes (BBU) with MAF ranges from 0.01 to 0.43. The 47 SNPs were two for DMY, three for FP, 21 for PP, seven for FY, and 14 for PY. One SNP was overlapped for three traits (DMY, FY, and PY); six SNPs were associated with two traits (one for PP and PY; and five for FY and PY); while the rest of markers were associated with only one trait (Table 2 and online Supplementary Table S3). Along with previously reported loci for buffalo studies, the results of current GWAS generally point to the polygenic nature underlying milk production traits in buffalo. However, some chromosomal regions had more prominent associations with relevant traits in corresponding to statistical significance and biological function of nearest candidate genes, making these QTL and genes more likely to be candidates for causal effects. In this respect, our identified SNPs on BBU 1, 5, 6, 7, 9, 19 and 23 reside in the genomic regions where previously reported QTL for milk production traits have been mapped by GWAS in different buffalo populations (de Camargo et al., Reference de Camargo, Aspilcueta-Borquis, Fortes, Porto-Neto, Cardoso, Santos, Lehnert, Reverter, Moore and Tonhati2015; El-Halawany et al., Reference El-Halawany, Abdel-Shafy, Shawky, Abdel-Latif, Al-Tohamy and Abd El-Moneim2017; Mokhber, Reference Mokhber2017; da Costa Barros et al., Reference da Costa Barros, de Abreu Santos, Aspilcueta-Borquis, de Camargo, de Araujo Neto and Tonhati2018; Herrera et al., Reference Herrera, Flores, Duijvesteijn, Gondro and van der Werf2018; Liu et al., Reference Liu, Liang, Campanile, Plastow, Zhang, Wang, Salzano, Gasparrini, Cassandro and Yang2018). Albeit the same buffalo array was used in these studies, different SNPs were detected. Interestingly, the significant SNPs AX-85128442 and AX-85047828 on BBU 7 (PP: P-values 1.4 × 10−7 and 1.1 × 10−6, respectively) coincided with previously reported QTL that are repeatedly mapped for milk production traits in Brazilian and Egyptian buffalo (de Camargo et al., Reference de Camargo, Aspilcueta-Borquis, Fortes, Porto-Neto, Cardoso, Santos, Lehnert, Reverter, Moore and Tonhati2015; El-Halawany et al., Reference El-Halawany, Abdel-Shafy, Shawky, Abdel-Latif, Al-Tohamy and Abd El-Moneim2017; da Costa Barros et al., Reference da Costa Barros, de Abreu Santos, Aspilcueta-Borquis, de Camargo, de Araujo Neto and Tonhati2018). Also, these SNPs were located in genomic regions previously reported for the same traits around corresponding positions in bovine genome (13.2 and 81.0 Mb on BTA6, respectively) in Holstein and Jersey cattle (Cohen-Zinder et al., Reference Cohen-Zinder, Seroussi, Larkin, Loor, der Wind A, Lee, Drackley, Band, Hernandez, Shani, Lewin, Weller and Ron2005; Pryce et al., Reference Pryce, Bolormaa, Chamberlain, Bowman, Savin, Goddard and Hayes2010; Buitenhuis et al., Reference Buitenhuis, Janss, Poulsen, Larsen, Larsen and Sorensen2014; Jiang et al., Reference Jiang, Ma, Prakapenka, VanRaden, Cole and Da2019).
Chr, chromosome; MA, minor allele; MAF, minor allele frequency; β, change per minor allele; DMY, daily milk yield; DFY, daily fat yield; DPY, daily protein yield; DLY, daily lactose yield.
In the distance: ‘ + ’ for upstream and ‘-’ for downstream. Positions are given according to the latest reference assembly of buffalo genome (UOA_WB_1: GCA_003121395.1).
The identified SNPs on BBU 1 (AX-85065545, FY; P-value 9.5 × 10−6), 5 (AX-85118863 and AX-85095040, FY and PP; P-values 6.9 × 10−6 and 1.2 × 10−8, respectively) and 6 (AX-85112177, PP; P-value 1.6 × 10−9) were also supported by known QTL in Italian, Brazilian, and Philippine buffalo populations (de Camargo et al., Reference de Camargo, Aspilcueta-Borquis, Fortes, Porto-Neto, Cardoso, Santos, Lehnert, Reverter, Moore and Tonhati2015; da Costa Barros et al., Reference da Costa Barros, de Abreu Santos, Aspilcueta-Borquis, de Camargo, de Araujo Neto and Tonhati2018; Herrera et al., Reference Herrera, Flores, Duijvesteijn, Gondro and van der Werf2018; Liu et al., Reference Liu, Liang, Campanile, Plastow, Zhang, Wang, Salzano, Gasparrini, Cassandro and Yang2018). Likewise, they coincided with corresponding locations near to QTL previously mapped in bovine genome in different Holstein and Jersey cattle populations (Rodriguez-Zas et al., Reference Rodriguez-Zas, Southey, Heyen and Lewin2002; Daetwyler et al., Reference Daetwyler, Schenkel, Sargolzaei and Robinson2008; Pryce et al., Reference Pryce, Bolormaa, Chamberlain, Bowman, Savin, Goddard and Hayes2010; Cole et al., Reference Cole, Wiggans, Ma, Sonstegard, Lawlor, Crooker, Van Tassell, Yang, Wang, Matukumalli and Da2011; Buitenhuis et al., Reference Buitenhuis, Janss, Poulsen, Larsen, Larsen and Sorensen2014).
The detected markers on BBU 9 (AX-85108874 and AX-85120516, PP and PY; P-values, 8.5 × 10−7 and 2.6 × 10−6, respectively), 19 (AX-85103359, PY; P-value 9.6 × 10−6) and 23 (AX-85053529, PP; P-value 2.4 × 10−7) were located in QTL previously discovered in Brazilian, and Iranian buffalo (Mokhber, Reference Mokhber2017; da Costa Barros et al., Reference da Costa Barros, de Abreu Santos, Aspilcueta-Borquis, de Camargo, de Araujo Neto and Tonhati2018). Also, these SNPs are placed close to QTL reported for several milk production traits in Holstein, Jersey, Brown Swiss, Nordic Red, and Blonde d'Aquitaine cattle breeds (Pryce et al., Reference Pryce, Bolormaa, Chamberlain, Bowman, Savin, Goddard and Hayes2010; Cole et al., Reference Cole, Wiggans, Ma, Sonstegard, Lawlor, Crooker, Van Tassell, Yang, Wang, Matukumalli and Da2011; Cecchinato et al., Reference Cecchinato, Ribeca, Maurmayr, Penasa, De Marchi, Macciotta, Mele, Secchiari, Pagnacco and Bittante2012, Reference Cecchinato, Ribeca, Chessa, Cipolat-Gotet, Maretto, Casellas and Bittante2014; Meredith et al., Reference Meredith, Kearney, Finlay, Bradley, Fahey, Berry and Lynn2012; Iso-Touru et al., Reference Iso-Touru, Sahana, Guldbrandtsen, Lund and Vilkki2016; Michenet et al., Reference Michenet, Barbat, Saintilan, Venot and Phocas2016). Mapping the same genomic regions in independent buffalo and cattle populations with different experimental and analytical structures would increase the confidence that these QTL contain true causative mutations affecting milk production traits.
Beside significant SNPs that overlapped with previously reported loci, several new markers were identified which were not detected in previous buffalo GWAS. Most of these markers are placed on BBU 15, 2, and 4 (Table 2 and online Supplementary Table S3). The most interesting new genomic region is located on BBU 15 between 34.53 and 49.37 Mb, where several SNPs are associated with four milk production traits. One of these SNPs (AX-85055593) is associated with DMY, PY and FY (P-values 6.8 × 10−7, 6.7 × 10−6, and 1.7 × 10−7, respectively). Corresponding to bovine genome, this genomic region coincides with previously reported QTL for milk yield as well as protein and fat percentages in Irish and US Holstein along with German Holstein and Charolais crossing cattle (Meredith et al., Reference Meredith, Kearney, Finlay, Bradley, Fahey, Berry and Lynn2012; Friedrich et al., Reference Friedrich, Brand, Ponsuksili, Graunke, Langbein, Knaust, Kuhn and Schwerin2016; Jiang et al., Reference Jiang, Ma, Prakapenka, VanRaden, Cole and Da2019). Interestingly, minor allele of this SNP had a positive effect direction for three traits (1.42, 0.07, and 0.15 units, respectively). On the other hand, our study did not detect most of associations in QTL that have been previously suggested by GWAS in Egyptian buffalo (El-Halawany et al., Reference El-Halawany, Abdel-Shafy, Shawky, Abdel-Latif, Al-Tohamy and Abd El-Moneim2017). This result was probably due to different LD structure of genotyped animals (r 2: 0.43 ± 0.21 v. 0.31 ± 0.08), frequency of minor alleles (MAF: 0.29 ± 0.13 v. 0.30 ± 0.13), average distance between markers (40.8 ± 32.0 kb v. 62.66 ± 67.10), nature of the analyzed trait, and/or potential false associations.
Candidate genes
The location of all identified SNPs in relation to genes on the buffalo genome was calculated according to UOA_WB_1 (GCA_003121395.1) assembly. We found that almost half of SNPs were intronic (51%) while the rest were outside the known genes (Table 2 and online Supplementary Table S3). The nearest genes for those intergenic SNPs are located in a distance ranging from 0.02 to 0.77 Mb. Several identified SNPs are located within or close to numerous candidate genes. For example: the SNP AX-85055593 on BBU 15 associated with DMY, FY and PY is placed 115 and 189 kb far from the genes encoding tumor protein D52 (TPD52) and zinc finger and BTB domain containing 10 (ZBTB10), respectively. These proteins have functional roles during immune response (Tiacci et al., Reference Tiacci, Orvietani, Bigerna, Pucciarini, Corthals, Pettirossi, Martelli, Liso, Benedetti, Pacini, Bolli, Pileri, Pulford, Gambacorta, Carbone, Pasquarello, Scherl, Robertson, Sciurpi, Alunni-Bistocchi, Binaglia, Byrne and Falini2005; Szyda et al., Reference Szyda, Mielczarek, Fraszczak, Minozzi, Williams and Wojdak-Maksymiec2019). In addition, the TPD52 is promoting intracellular lipid storage within cultured cells (Kamili et al., Reference Kamili, Roslan, Frost, Cantrill, Wang, Della-Franca, Bright, Groblewski, Straub, Hoy, Chen and Byrne2015). Likewise, the position of the SNP AX-85047648 on BBU 17 that was associated with FY and PY is 197 kb away from the adhesion G protein-coupled receptor D1 (ADGRD1) gene. The protein product of this gene is affecting fatty acids concentration in chicken meat (Yang et al., Reference Yang, Wang, Wang, Shi, Ou, Wu, Zhang, Hu, Yuan, Wang, Cao and Liu2018). The SNP AX-85118863 on BBU 5 identified for FY and PY is located in the estrogen related receptor gamma gene (ESRRG), which plays a central function in lipid metabolism (Sanoudou et al., Reference Sanoudou, Duka, Drosatos, Hayes and Zannis2010). In addition, the SNP AX-85111822 (identified for FY and PY) on BBU 15 is placed at a distance of 0.77 and 1.5 Mb from RNA-binding raly-like (RALYL) and sorting nexin 16 (SNX16), respectively. These genes have been previously identified as candidates for milk yield and fatty acids content along with resistance to F. hepatica (Li et al., Reference Li, Sun, Zhang, Wang, Wu, Zhang, Liu, Li and Qiao2014; Yodklaew et al., Reference Yodklaew, Koonawootrittriron, Elzo, Suwanasopee and Laodim2017; Twomey et al., Reference Twomey, Berry, Evans, Doherty, Graham and Purfield2019). The SNP AX-85043902 on BBU 4 detected for PP and PY is in a distance of 0.55 Mb interval from the gene encoding glutamate receptor interacting protein 1 (GRIP1), which is involved in several pathways including metabolic hormones (Hong et al., Reference Hong, Kohli, Garabedian and Stallcup1997).
In conclusion, this is the first reported GWAS for milk components in Egyptian buffalo. In the current study, we detected 47 SNPs for five milk production traits (DMY, FP, PP, FY, and PY). These SNPs are located within 36 QTL and distributed over 20 buffalo chromosomes. Some of these loci (11 out of 36) overlap with previously reported QTL in buffalo and/or cattle populations, and some of them are placed within or close to potential candidate genes. The consistence of our identified genomic regions with known QTL and candidate genes provides further evidence for the importance of such loci for the variation in milk production traits. In addition, novel genomic loci were suggested. Further confirmation studies including larger population size should be performed to validate the findings and detect the causal genetic variants.
Supplementary material
The supplementary material for this article can be found at https://doi.org/10.1017/S0022029920000953
Acknowledgments
This study was financially supported by Cairo University, Egypt through the project ‘Genomic evaluation for milk production traits in Egyptian buffalo: A step forward for sustainable improvement and food security’.
Financial support
None.